Chi-Square Test

Author: Renato A. Folledo, Jr.

Affiliation: Isabela State University

Introduction

The chi-square test of independence is used to analyze the frequency table (i.e., contingency table) formed by two categorical variables. It evaluates whether there is a significant association between the categories of the two variables.

In other words, the test examines whether the rows and columns of a contingency table are associated to a statistically significant degree.

Null hypothesis (H0): the row and column variables of the contingency table are independent.

Alternative hypothesis (H1): the row and column variables are dependent.
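For reference, the statistic behind these hypotheses compares the observed counts with the counts expected if H0 were true. The standard Pearson form is sketched below; the notation (O for observed counts, E for expected counts, r rows, c columns, n for the grand total) is introduced here for illustration and does not appear in the original.

```latex
$$
E_{ij} = \frac{(\text{row } i \text{ total}) \times (\text{column } j \text{ total})}{n},
\qquad
\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}},
\qquad
df = (r - 1)(c - 1)
$$
```

Under H0 the statistic approximately follows a chi-square distribution with df degrees of freedom, so a large value gives a small p-value and counts as evidence against independence.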

Example 1

Determine whether the passing rate is independent of gender.

# create a matrix of pass/fail tallies by gender (female/male)
dt <- matrix(c(45, 5, 15, 15), nrow = 2, byrow = TRUE)
rownames(dt) <- c("female", "male") # add row names
colnames(dt) <- c("pass", "fail")   # add column names
dt                                  # display matrix values
       pass fail
female   45    5
male     15   15
chisq <- chisq.test(dt)             # run chi-square test (Yates' correction is applied to 2x2 tables by default)
chisq$p.value                       # p-value; compare with the chosen significance level (e.g. 0.05)
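[1] 0.0001889622

At the conventional 0.05 significance level the p-value is far smaller than the threshold, so H0 is rejected: passing rate and gender are not independent in this data.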
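The p-value from Example 1 can also be reproduced by hand, which makes the steps inside chisq.test() visible. This is a minimal sketch rather than the author's method; the names expected, stat, df, and p_manual are illustrative. It assumes the default behaviour of chisq.test() for 2x2 tables, which applies Yates' continuity correction.

```r
# Rebuild the pass/fail table from Example 1
dt <- matrix(c(45, 5, 15, 15), nrow = 2, byrow = TRUE,
             dimnames = list(c("female", "male"), c("pass", "fail")))

# Expected counts under independence: row total * column total / grand total
expected <- outer(rowSums(dt), colSums(dt)) / sum(dt)

# Yates' continuity correction: shrink each |O - E| by 0.5 before squaring
# (every |O - E| in this table is 7.5, so the full 0.5 correction applies)
stat <- sum((abs(dt - expected) - 0.5)^2 / expected)

# Degrees of freedom: (rows - 1) * (columns - 1)
df <- (nrow(dt) - 1) * (ncol(dt) - 1)

# Upper-tail probability of the chi-square distribution
p_manual <- pchisq(stat, df = df, lower.tail = FALSE)

p_manual
chisq.test(dt)$p.value   # should agree with p_manual
```

Passing correct = FALSE to chisq.test() turns the continuity correction off; the statistic then becomes the plain Pearson sum and the p-value changes accordingly.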

Example 2

Determine whether who performs a household chore (wife, husband, alternating, or jointly) is independent of the type of chore.

# source: http://www.sthda.com/english/wiki/chi-square-test-of-independence-in-r
chores <- data.table::fread("C:/jun/FirstSem24-25/Stat 2024/Labs/Chi/Chores.csv")
chores
         Chore  Wife Alternating Husband Jointly
        <char> <int>       <int>   <int>   <int>
 1:    Laundry   156          14       2       4
 2:  Main_meal   124          20       5       4
 3:     Dinner    77          11       7      13
 4: Breakfeast    82          36      15       7
 5:    Tidying    53          11       1      57
 6:     Dishes    32          24       4      53
 7:   Shopping    33          23       9      55
 8:   Official    12          46      23      15
 9:    Driving    10          51      75       3
10:   Finances    13          13      21      66
11:  Insurance     8           1      53      77
12:    Repairs     0           3     160       2
13:   Holidays     0           1       6     153
chisq <- chisq.test(chores[, 2:5])  # test the four count columns (column 1 holds the chore names)
chisq

    Pearson's Chi-squared test

data:  chores[, 2:5]
X-squared = 1944.5, df = 36, p-value < 2.2e-16
contrib <- 100 * chisq$residuals^2 / chisq$statistic  # percentage contribution of each cell to the chi-square statistic
round(contrib, 2)
      Wife Alternating Husband Jointly
 [1,] 7.74        0.27    1.78    2.25
 [2,] 4.98        0.01    1.24    1.90
 [3,] 2.20        0.07    0.60    0.56
 [4,] 1.22        0.61    0.41    1.44
 [5,] 0.15        0.13    1.27    0.66
 [6,] 0.06        0.18    0.89    0.63
 [7,] 0.09        0.09    0.58    0.59
 [8,] 0.69        3.77    0.01    0.31
 [9,] 1.54        2.40    3.37    1.79
[10,] 0.89        0.04    0.03    1.70
[11,] 1.71        0.94    0.87    1.68
[12,] 2.92        0.95   21.92    2.28
[13,] 2.83        1.10    1.23   12.45
corrplot::corrplot(contrib, is.cor = FALSE)  # plot the contributions (is.cor = FALSE: not a correlation matrix)
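Because the CSV path in Example 2 is local to the author's machine, the same analysis can be sketched from the counts printed in the table. The block below is an illustration under that assumption: the matrix counts is typed in from the displayed output (row names spelled as printed), and the names counts, res, contrib, and idx are hypothetical. It checks that the cell contributions add up to 100% and locates the cell that contributes most.

```r
# Counts copied from the table printed in Example 2
counts <- matrix(c(
  156, 14,   2,   4,
  124, 20,   5,   4,
   77, 11,   7,  13,
   82, 36,  15,   7,
   53, 11,   1,  57,
   32, 24,   4,  53,
   33, 23,   9,  55,
   12, 46,  23,  15,
   10, 51,  75,   3,
   13, 13,  21,  66,
    8,  1,  53,  77,
    0,  3, 160,   2,
    0,  1,   6, 153
), ncol = 4, byrow = TRUE,
dimnames = list(
  c("Laundry", "Main_meal", "Dinner", "Breakfeast", "Tidying", "Dishes",
    "Shopping", "Official", "Driving", "Finances", "Insurance", "Repairs",
    "Holidays"),
  c("Wife", "Alternating", "Husband", "Jointly")
))

res <- chisq.test(counts)

# Pearson residuals are (O - E) / sqrt(E); their squares sum to the statistic,
# so each cell's share of the statistic can be expressed as a percentage
contrib <- 100 * res$residuals^2 / res$statistic
sum(contrib)                      # the shares add up to 100

# Locate the cell with the largest contribution
idx <- which(contrib == max(contrib), arr.ind = TRUE)
rownames(contrib)[idx[1, "row"]]  # chore
colnames(contrib)[idx[1, "col"]]  # who performs it
```

Keeping the chore names as row names also labels the rows of contrib, which the printed output in Example 2 lacks. In that output the largest shares belong to Repairs/Husband (21.92) and Holidays/Jointly (12.45).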