Chi-Square Test of Association
The Chi-Square Test of Association determines whether two categorical variables, cross-classified in a contingency table, are statistically associated or independent of each other.
Understanding the Contingency Table
We begin with the following observed frequency table:
| | Cat X | Cat Y | Cat Z | Total |
|---|---|---|---|---|
| Cat A | ? | 60 | ? | 200 |
| Cat B | ? | ? | ? | 400 |
| Total | 150 | 180 | 270 | 600 |
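We want the expected frequency for Cell (1,1), the intersection of Cat A and Cat X. Expected frequencies depend only on the row, column, and grand totals, so the observed cells marked ? are not needed for this step.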
Calculating Expected Frequency
Under the null hypothesis of independence, the expected frequency for the cell in row \(i\) and column \(j\) is:
\[ E_{(i,j)} = \frac{\text{Row Total}_i \times \text{Column Total}_j}{\text{Grand Total}} \]
For Cell (1,1):
- Row Total (Cat A): 200
- Column Total (Cat X): 150
- Grand Total: 600
\[ E_{(1,1)} = \frac{200 \times 150}{600} = \frac{30{,}000}{600} = 50 \]
So, the expected value in Cell (1,1) is 50.
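The same arithmetic can be sketched in a few lines of Python. The function name `expected_frequency` is a hypothetical helper introduced here for illustration; the three totals come from the table above.

```python
def expected_frequency(row_total, col_total, grand_total):
    """Expected count for one cell under independence (hypothetical helper)."""
    return row_total * col_total / grand_total

# Cell (1,1): Cat A row total, Cat X column total, and the grand total.
e_11 = expected_frequency(200, 150, 600)
print(e_11)  # 50.0
```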
Completing the Table of Expected Frequencies
Now, use the same formula to compute all other expected values:
| | Cat X | Cat Y | Cat Z | Total |
|---|---|---|---|---|
| Cat A | 50 | 60 | 90 | 200 |
| Cat B | 100 | 120 | 180 | 400 |
| Total | 150 | 180 | 270 | 600 |
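As a rough check, the whole table of expected frequencies can be rebuilt from the marginal totals alone. This is a minimal sketch; the variable names (`row_totals`, `col_totals`, `expected`) are illustrative choices, not part of any library.

```python
# Marginal totals taken from the table above.
row_totals = {"Cat A": 200, "Cat B": 400}
col_totals = {"Cat X": 150, "Cat Y": 180, "Cat Z": 270}
grand_total = 600

# Each expected count is (row total * column total) / grand total.
expected = {
    row: {col: r * c / grand_total for col, c in col_totals.items()}
    for row, r in row_totals.items()
}

for row, cells in expected.items():
    print(row, cells)
```

Each row of expected counts should sum back to its row total, which is a quick way to catch arithmetic slips.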
Critical Value and Decision Rule
To evaluate the statistical significance:
- Choose a significance level (\(\alpha\)), commonly 0.05
- Determine degrees of freedom (df) using:
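\[ df = (r - 1)(c - 1) \] where \(r\) is the number of row categories and \(c\) is the number of column categories, excluding the totals. For the table above, \(df = (2 - 1)(3 - 1) = 2\).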
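Compute the Chi-Square test statistic, \(\chi^2 = \sum (O - E)^2 / E\), summed over all cells, where \(O\) is the observed and \(E\) the expected frequency. Compare it to the critical value from the Chi-Square distribution table for the chosen \(\alpha\) and df (for example, 5.991 when \(\alpha = 0.05\) and \(df = 2\)). If the test statistic is greater than the critical value, reject the null hypothesis of independence; otherwise, fail to reject it.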
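The decision rule can be sketched in Python as follows. The statistic sums (O - E)^2 / E over all cells. Two assumptions are worth flagging. First, the observed counts below are hypothetical: the source table leaves most cells unknown, so these values were made up only to be consistent with the given totals and the one known cell (60). Second, the helper `critical_value_df2` works only for df = 2, where the Chi-Square distribution has the closed-form CDF 1 - exp(-x/2); for other df, a distribution table or a statistics library would be needed.

```python
import math

def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all cells of two equally shaped tables."""
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )

def degrees_of_freedom(n_rows, n_cols):
    """df = (r - 1)(c - 1), counting categories only (no total row/column)."""
    return (n_rows - 1) * (n_cols - 1)

def critical_value_df2(alpha):
    """Critical value valid for df = 2 ONLY: solves exp(-x/2) = alpha."""
    return -2.0 * math.log(alpha)

# HYPOTHETICAL observed counts, invented for illustration; they match the
# row totals (200, 400), column totals (150, 180, 270) and the known cell 60.
observed = [[40, 60, 100], [110, 120, 170]]
# Expected counts from the completed table of expected frequencies.
expected = [[50, 60, 90], [100, 120, 180]]

df = degrees_of_freedom(2, 3)
statistic = chi_square_statistic(observed, expected)
critical = critical_value_df2(0.05)

# Reject independence only if the statistic exceeds the critical value.
reject_independence = statistic > critical
print(df, round(statistic, 3), round(critical, 3), reject_independence)
```

With these made-up counts the statistic falls below the critical value, so independence would not be rejected; real data could of course lead to the opposite conclusion.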