R Markdown

Chi-Square Test of Independence

The chi-square test of independence is used to analyze the frequency table (i.e., the contingency table) formed by two categorical variables. The test evaluates whether there is a statistically significant association between the categories of the two variables.

Graphical display of contingency tables

A contingency table can be visualized using the function balloonplot() [in the gplots package]. This function draws a graphical matrix in which each cell contains a dot whose size reflects the relative magnitude of the corresponding count.

# Create the observed data matrix
observed <- matrix(c(30, 50, 20, 40, 20, 10, 30, 30, 10), nrow = 3, byrow = TRUE)
rownames(observed) <- c("Car", "Bicycle", "Walk")
colnames(observed) <- c("Reading", "Gaming", "Socializing")
contigency_table <- observed
contigency_table
##         Reading Gaming Socializing
## Car          30     50          20
## Bicycle      40     20          10
## Walk         30     30          10
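The balloon plot described earlier could be drawn for this table roughly as follows. This is a minimal sketch that assumes the gplots package is installed. The plot title and the label and show.margins settings are illustrative choices, not part of the original analysis.

```r
# Rebuild the observed table so the example runs on its own
observed <- matrix(c(30, 50, 20, 40, 20, 10, 30, 30, 10), nrow = 3, byrow = TRUE)
rownames(observed) <- c("Car", "Bicycle", "Walk")
colnames(observed) <- c("Reading", "Gaming", "Socializing")

# balloonplot() has a method for table objects, so convert the matrix first
observed_table <- as.table(observed)

# Draw the plot only if gplots is available (assumption: it may not be installed)
if (requireNamespace("gplots", quietly = TRUE)) {
  gplots::balloonplot(observed_table,
                      main = "Transport by hobby",  # illustrative title
                      xlab = "", ylab = "",
                      label = FALSE, show.margins = FALSE)
}
```

Larger dots mark cells with higher counts, which gives a quick visual impression of where the two variables might be associated before any test is run.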

Hypothesis formulation

H0: hobbies and transport are independent.
H1: hobbies and transport are dependent.

Level of significance

alpha <- 0.05
# Calculate row totals and column totals
row_totals <- rowSums(observed)
col_totals <- colSums(observed)
total_obs <- sum(observed)
# Calculate expected frequencies
expected <- outer(row_totals, col_totals) / total_obs

expected
##          Reading   Gaming Socializing
## Car     41.66667 41.66667    16.66667
## Bicycle 29.16667 29.16667    11.66667
## Walk    29.16667 29.16667    11.66667
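To see what the outer() call is doing, a single expected count can be worked out by hand: the row total times the column total, divided by the grand total. The sketch below does this for the Car/Reading cell only. The variable name expected_car_reading is introduced here for illustration.

```r
# Rebuild the observed table so the example runs on its own
observed <- matrix(c(30, 50, 20, 40, 20, 10, 30, 30, 10), nrow = 3, byrow = TRUE)
rownames(observed) <- c("Car", "Bicycle", "Walk")
colnames(observed) <- c("Reading", "Gaming", "Socializing")

# Expected count under independence for one cell:
# (row total for Car) * (column total for Reading) / (grand total)
expected_car_reading <- sum(observed["Car", ]) * sum(observed[, "Reading"]) / sum(observed)
expected_car_reading
```

With a row total of 100, a column total of 100, and 240 observations overall, this gives 100 * 100 / 240, which matches the 41.66667 shown in the expected table.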
# Compute the chi-square statistic
chi2_stat <- sum((observed - expected)^2 / expected)
chi2_stat
## [1] 13.02857
# Degrees of freedom
dof <- (nrow(observed) - 1) * (ncol(observed) - 1)
dof
## [1] 4
# Print results
cat("Chi2 Statistic:", chi2_stat, "\n")
## Chi2 Statistic: 13.02857
cat("Degrees of Freedom:", dof, "\n")
## Degrees of Freedom: 4
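# Critical value: upper-tail quantile of the chi-square distribution
qchisq(1 - alpha, dof)
## [1] 9.487729

Reject H0 if the chi-square statistic is greater than or equal to the critical value. Here 13.03 is greater than 9.49, so we reject H0 and conclude that hobbies and transport are dependent.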
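An equivalent way to reach the decision is to convert the statistic into a p-value using the upper tail of the chi-square distribution. The sketch below hard-codes the statistic and degrees of freedom computed above so that it runs on its own. The name p_value is introduced here for illustration.

```r
# Values taken from the manual calculation above
chi2_stat <- 13.02857
dof <- 4
alpha <- 0.05

# Upper-tail probability: P(chi-square with dof degrees of freedom >= chi2_stat)
p_value <- pchisq(chi2_stat, df = dof, lower.tail = FALSE)
p_value

# TRUE means H0 is rejected at the chosen significance level
p_value < alpha
```

The result should agree, up to rounding, with the p-value that chisq.test() reports for the same table.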

Using the chisq.test() function

chi.sq <- chisq.test(contigency_table)
chi.sq
## 
##  Pearson's Chi-squared test
## 
## data:  contigency_table
## X-squared = 13.029, df = 4, p-value = 0.01114

Decision rule: reject H0 if the p-value is less than alpha.

# Compare the p-value with the significance level
if (chi.sq$p.value < alpha) {
  cat("Reject H0\n")
} else {
  cat("Fail to reject H0\n")
}
## Reject H0
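The object returned by chisq.test() also stores the intermediate quantities, so the manual steps can be cross-checked against it. The sketch below rebuilds the table, compares the function's expected counts with the outer() calculation, and prints the Pearson residuals, (observed - expected) / sqrt(expected). The names result and manual_expected are introduced here for illustration.

```r
# Rebuild the observed table so the example runs on its own
observed <- matrix(c(30, 50, 20, 40, 20, 10, 30, 30, 10), nrow = 3, byrow = TRUE)
rownames(observed) <- c("Car", "Bicycle", "Walk")
colnames(observed) <- c("Reading", "Gaming", "Socializing")

result <- chisq.test(observed)

# Expected counts computed by hand, as in the manual calculation
manual_expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Largest absolute difference between the two sets of expected counts
max(abs(result$expected - manual_expected))

# Pearson residuals: cells far from zero contribute most to the statistic
round(result$residuals, 2)
```

Cells with large positive or negative residuals are the ones where the observed counts depart most from what independence would predict.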