Welcome! This R Markdown document serves as a demo session focusing on chi-square tests. Specifically, we will cover two topics: the Chi-Square Test of Goodness of Fit and the Chi-Square Test of Independence.
The Chi-Square Test of Goodness of Fit is a statistical test used to determine whether the observed frequencies of categorical data match the expected frequencies. It helps assess whether a given distribution of data fits a specific theoretical model, allowing researchers to reject, or fail to reject, the null hypothesis that the observed frequencies follow the expected distribution.
Let’s start by creating a simulated dataset for the purpose of this demonstration.
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
# Create a sample dataset for Chi-Square Test of Goodness of Fit
set.seed(123)
categories <- c("Red", "Green", "Blue")
observed_counts <- sample(50:150, 3, replace = TRUE)
observed_counts
## [1] 80 128 100
# Create a data frame
data_goodness_of_fit <- data.frame(Category = categories, Observed = observed_counts)
# Calculate percentages
total_count <- sum(data_goodness_of_fit$Observed)
data_goodness_of_fit$Percentage <- (data_goodness_of_fit$Observed / total_count) * 100
# Compute positions for percentage labels
data_goodness_of_fit <- data_goodness_of_fit %>%
  arrange(desc(Category)) %>%
  mutate(LabelPosition = cumsum(Observed) - 0.5 * Observed)
data_goodness_of_fit
## Category Observed Percentage LabelPosition
## 1 Red 80 25.97403 40
## 2 Green 128 41.55844 144
## 3 Blue 100 32.46753 258
The research question that can be addressed using the Chi-Square Test of Goodness of Fit from the dataset is, “Do the observed frequencies of colors (Red, Green, Blue) differ from an equal distribution across the categories?”
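Stated formally, with each p denoting the population proportion of a color, the hypotheses being tested are:
\[ H_0: p_{\text{Red}} = p_{\text{Green}} = p_{\text{Blue}} = \tfrac{1}{3} \qquad \text{versus} \qquad H_1: \text{at least one } p_i \neq \tfrac{1}{3} \]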
We will use ggplot2 to create a stacked bar graph to visualize the dataset.
library(ggplot2)
# Create a stacked bar graph with custom colors and percentage annotations
ggplot(data_goodness_of_fit,
       aes(x = "Categories", y = Observed, fill = Category)) + # Base layer specifying data and aesthetics
  geom_bar(stat = "identity") + # Add bars using the provided data without aggregation
  scale_fill_manual(values = c("Red" = "#E63946", "Green" = "#2A9D8F", "Blue" = "#264653")) + # Manually set colors for each category
  geom_text(aes(y = LabelPosition, # Specify the position of the text labels
                label = paste0(round(Percentage, 1), "%")), # Add text labels for percentages
            color = "white",
            fontface = "bold") +
  labs(title = "Stacked Bar Graph of Observed Counts with Percentages", # Set the title of the graph
       x = "", # Remove x-axis label
       y = "Count") # Set the y-axis label
Before conducting the test, we should check the following assumptions: the observations are independent and each one falls into exactly one category, the data are counts (not percentages or proportions), and every expected frequency is at least 5. In our simulated data, these assumptions are met.
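As a quick check of the expected-frequency assumption: under an equal distribution, each color's expected count is simply the total count divided by the number of categories. A minimal sketch using the objects created above:
# Expected count per category under a uniform distribution (308 / 3 ≈ 102.7, well above 5)
total_count / length(categories)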
We will use the chisq.test() function in R to conduct the test.
# Conducting the chi-square test of goodness of fit
chi_sq_result <- chisq.test(data_goodness_of_fit$Observed)
# Display results
chi_sq_result
##
## Chi-squared test for given probabilities
##
## data: data_goodness_of_fit$Observed
## X-squared = 11.325, df = 2, p-value = 0.003474
The p-value will tell us if our observed distribution significantly differs from an expected uniform distribution.
Based on the test output provided, X-squared = 11.325 with df = 2 and p = 0.003474. Since the p-value is below 0.05, we reject the null hypothesis of an equal distribution: the observed color frequencies differ significantly from a uniform distribution.
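As an aside, chisq.test() compares against equal expected proportions by default; a different theoretical distribution could be tested by supplying its probabilities through the function's p argument. A minimal sketch with purely illustrative proportions (the 50/30/20 split is an assumption for demonstration, not part of our research question):
# Testing the observed counts against a hypothetical 50/30/20 split
chisq.test(data_goodness_of_fit$Observed, p = c(0.5, 0.3, 0.2))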
We can calculate the effect size using Cramér's V. The formula for the calculation is:
\[ V = \sqrt{\frac{\chi^2}{N(k - 1)}} \]
# Calculating Cramér's V
cramer_v <- sqrt(chi_sq_result$statistic / (sum(data_goodness_of_fit$Observed) * (min(length(data_goodness_of_fit$Category) - 1, 1))))
# Display Cramér's V
paste("Cramér's V: ", round(cramer_v, 2))
## [1] "Cramér's V: 0.19"
Interpretation Guidelines:
Df | Small | Medium | Large |
---|---|---|---|
1 | 0.10 | 0.30 | 0.50 |
2 | 0.07 | 0.21 | 0.35 |
3 | 0.06 | 0.17 | 0.29 |
4 | 0.05 | 0.15 | 0.25 |
5 | 0.04 | 0.13 | 0.22 |
The Cramér's V value of 0.19 suggests a small to medium effect size for the observed differences in the frequencies of the color categories (Blue, Red, and Green). This indicates a weak to moderate deviation from what would be expected if the colors were distributed equally.
The Chi-square Test of Independence is used to determine if there is a significant association between two categorical variables. Specifically, it tests whether the observed frequencies for categories are different from what we would expect under the assumption of independence.
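Under the null hypothesis of independence, the expected count for each cell is obtained from the table margins, and the test statistic sums the squared deviations of the observed counts from these expected counts:
\[ E_{ij} = \frac{R_i \times C_j}{N}, \qquad \chi^2 = \sum_{i,j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}} \]
where R_i and C_j are the row and column totals and N is the grand total.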
In our current study, we aim to examine the relationship between the method of reward used to train cats and the outcome of whether the cat can dance. The dataset, catsData, provides information on two variables: Training (whether the cat was rewarded with food or with affection) and Dance (whether or not the cat learned to dance).
The dataset can be structured in two ways: as raw data, with one row per cat, or as a contingency table of counts.
Let’s begin by exploring our dataset visually.
library(ggplot2)
library(scales)
library(dplyr)
# Provided dataset
catsData <- read.delim("cats.dat", header = TRUE)
head(catsData)
## Training Dance
## 1 Food as Reward Yes
## 2 Food as Reward Yes
## 3 Food as Reward Yes
## 4 Food as Reward Yes
## 5 Food as Reward Yes
## 6 Food as Reward Yes
# Prepare the data for a stacked bar graph
plot_data <- catsData %>%
  count(Training, Dance) %>%
  group_by(Training) %>%
  mutate(percent = n / sum(n)) # Calculate percentage within each Training group
# Generate labels for the bars
plot_data$label <- paste(as.character(plot_data$n),
                         '\n(',
                         as.character(percent(plot_data$percent)), ')',
                         sep = '') # Create labels with counts and percentages
# Display the data
plot_data # Output the processed data
## # A tibble: 4 × 5
## # Groups: Training [2]
## Training Dance n percent label
## <chr> <chr> <int> <dbl> <chr>
## 1 Affection as Reward No 114 0.704 "114\n(70.4%)"
## 2 Affection as Reward Yes 48 0.296 "48\n(29.6%)"
## 3 Food as Reward No 10 0.263 "10\n(26.3%)"
## 4 Food as Reward Yes 28 0.737 "28\n(73.7%)"
# Create the stacked bar graph
ggplot(plot_data, aes(x = Training, y = percent, fill = Dance)) +
  geom_col(position = "fill", width = 0.5) + # Create bars with fill position and specified width
  geom_text(aes(label = label), position = "fill", color = 'black', vjust = 2) + # Add text labels to bars
  scale_y_continuous(labels = percent) + # Format the y-axis labels as percentages
  scale_x_discrete(name = 'Types of training') + # Label the x-axis
  scale_fill_brewer(palette = "Pastel1", name = 'Dance') # Choose a color palette and label the legend
For cats trained with Affection as Reward, only about 29.6% were observed to dance, while the majority (70.4%) did not. In contrast, when Food as Reward was used, a substantial 73.7% of cats danced, with only 26.3% refraining.
The graph suggests that cats are more inclined to dance when offered food as a reward compared to affection.
Before running the chi-square test of independence, we should check the following assumptions: each cat contributes to only one cell of the table (independent observations), and all expected cell frequencies are at least 5. In our data, these assumptions are met.
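One way to verify the expected-frequency assumption directly is to inspect the expected counts that chisq.test() computes. A minimal sketch (the values should match the Expected Values reported by CrossTable() below):
# Expected cell counts under independence; all should be at least 5
chisq.test(table(catsData$Training, catsData$Dance))$expected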
To analyze the relationship between the type of training and a cat's ability to dance, we'll employ the CrossTable() function from the gmodels package. This function has several arguments:
- The data, supplied either as the raw vectors (catsData$Training, catsData$Dance) or as a contingency table (catsTable).
- fisher: If set to TRUE, it computes Fisher's exact test. Useful when the sample size is small.
- chisq: If set to TRUE, it calculates the chi-squared test.
- expected: If set to TRUE, it displays the expected frequencies.
- sresid: If set to TRUE, it shows the standardized residuals.
- format: This determines the output format. For our purpose, we've set it to "SPSS" for a clean presentation.
Let's start by entering our data:
# Entering data: the contingency table
table(catsData$Dance,catsData$Training)
##
## Affection as Reward Food as Reward
## No 114 10
## Yes 48 28
food <- c(10, 28)
affection <- c(114, 48)
catsTable <- cbind(food, affection)
catsTable
## food affection
## [1,] 10 114
## [2,] 28 48
Now, let’s proceed with our analysis:
library(gmodels)
# Using the raw data:
gmodels::CrossTable(catsData$Training,
                    catsData$Dance,
                    fisher = TRUE,
                    chisq = TRUE,
                    expected = TRUE,
                    sresid = TRUE,
                    format = "SPSS")
##
## Cell Contents
## |-------------------------|
## | Count |
## | Expected Values |
## | Chi-square contribution |
## | Row Percent |
## | Column Percent |
## | Total Percent |
## | Std Residual |
## |-------------------------|
##
## Total Observations in Table: 200
##
## | catsData$Dance
## catsData$Training | No | Yes | Row Total |
## --------------------|-----------|-----------|-----------|
## Affection as Reward | 114 | 48 | 162 |
## | 100.440 | 61.560 | |
## | 1.831 | 2.987 | |
## | 70.370% | 29.630% | 81.000% |
## | 91.935% | 63.158% | |
## | 57.000% | 24.000% | |
## | 1.353 | -1.728 | |
## --------------------|-----------|-----------|-----------|
## Food as Reward | 10 | 28 | 38 |
## | 23.560 | 14.440 | |
## | 7.804 | 12.734 | |
## | 26.316% | 73.684% | 19.000% |
## | 8.065% | 36.842% | |
## | 5.000% | 14.000% | |
## | -2.794 | 3.568 | |
## --------------------|-----------|-----------|-----------|
## Column Total | 124 | 76 | 200 |
## | 62.000% | 38.000% | |
## --------------------|-----------|-----------|-----------|
##
##
## Statistics for All Table Factors
##
##
## Pearson's Chi-squared test
## ------------------------------------------------------------
## Chi^2 = 25.35569 d.f. = 1 p = 4.767434e-07
##
## Pearson's Chi-squared test with Yates' continuity correction
## ------------------------------------------------------------
## Chi^2 = 23.52028 d.f. = 1 p = 1.236041e-06
##
##
## Fisher's Exact Test for Count Data
## ------------------------------------------------------------
## Sample estimate odds ratio: 6.579265
##
## Alternative hypothesis: true odds ratio is not equal to 1
## p = 1.311709e-06
## 95% confidence interval: 2.837773 16.42969
##
## Alternative hypothesis: true odds ratio is less than 1
## p = 0.9999999
## 95% confidence interval: 0 14.25436
##
## Alternative hypothesis: true odds ratio is greater than 1
## p = 7.7122e-07
## 95% confidence interval: 3.193221 Inf
##
##
##
## Minimum expected frequency: 14.44
# Using the contingency table:
gmodels::CrossTable(catsTable,
                    fisher = TRUE,
                    chisq = TRUE,
                    expected = TRUE,
                    sresid = TRUE,
                    format = "SPSS")
##
## Cell Contents
## |-------------------------|
## | Count |
## | Expected Values |
## | Chi-square contribution |
## | Row Percent |
## | Column Percent |
## | Total Percent |
## | Std Residual |
## |-------------------------|
##
## Total Observations in Table: 200
##
## |
## | food | affection | Row Total |
## -------------|-----------|-----------|-----------|
## [1,] | 10 | 114 | 124 |
## | 23.560 | 100.440 | |
## | 7.804 | 1.831 | |
## | 8.065% | 91.935% | 62.000% |
## | 26.316% | 70.370% | |
## | 5.000% | 57.000% | |
## | -2.794 | 1.353 | |
## -------------|-----------|-----------|-----------|
## [2,] | 28 | 48 | 76 |
## | 14.440 | 61.560 | |
## | 12.734 | 2.987 | |
## | 36.842% | 63.158% | 38.000% |
## | 73.684% | 29.630% | |
## | 14.000% | 24.000% | |
## | 3.568 | -1.728 | |
## -------------|-----------|-----------|-----------|
## Column Total | 38 | 162 | 200 |
## | 19.000% | 81.000% | |
## -------------|-----------|-----------|-----------|
##
##
## Statistics for All Table Factors
##
##
## Pearson's Chi-squared test
## ------------------------------------------------------------
## Chi^2 = 25.35569 d.f. = 1 p = 4.767434e-07
##
## Pearson's Chi-squared test with Yates' continuity correction
## ------------------------------------------------------------
## Chi^2 = 23.52028 d.f. = 1 p = 1.236041e-06
##
##
## Fisher's Exact Test for Count Data
## ------------------------------------------------------------
## Sample estimate odds ratio: 0.1519927
##
## Alternative hypothesis: true odds ratio is not equal to 1
## p = 1.311709e-06
## 95% confidence interval: 0.06086544 0.352389
##
## Alternative hypothesis: true odds ratio is less than 1
## p = 7.7122e-07
## 95% confidence interval: 0 0.3131634
##
## Alternative hypothesis: true odds ratio is greater than 1
## p = 0.9999999
## 95% confidence interval: 0.07015399 Inf
##
##
##
## Minimum expected frequency: 14.44
The main body of the table gives us the observed frequencies (counts) for each combination of the factors.
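As a check on where the Expected Values come from, each one is the product of the corresponding row and column totals divided by the grand total. A minimal sketch reproducing the "Affection as Reward" / "No" cell from the margins reported above:
# Row total 162, column total 124, N = 200; should reproduce the 100.440 shown in the table
162 * 124 / 200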
Pearson's Chi-squared test: Chi^2 = 25.36 with 1 degree of freedom and p = 4.77e-07, so we reject the null hypothesis of independence; training method and dancing are associated.
Yates' continuity correction: with the correction applied, Chi^2 = 23.52 and p = 1.24e-06, leading to the same conclusion.
Fisher's Exact Test: the two-sided p-value of 1.31e-06 again indicates a significant association. The estimated odds ratio of about 6.58 (from the raw-data table) means the odds of dancing are roughly 6.6 times higher for food-trained cats than for affection-trained cats.
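For reference, the same Fisher's exact test is available directly in base R, and a simple cross-product odds ratio can be computed from the raw cell counts. A minimal sketch using the contingency table built earlier; note that fisher.test() reports a conditional maximum-likelihood estimate, so it will differ slightly from the cross-product ratio, and its orientation depends on how the table's rows and columns are arranged:
# Fisher's exact test on the contingency table (rows: No/Yes, columns: food/affection)
fisher.test(catsTable)
# Cross-product odds ratio: odds of dancing for food-trained vs. affection-trained cats
(28 / 10) / (48 / 114)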
The effect size helps us understand the practical significance of our results. There are two measures of effect size that we will calculate: the odds ratio (already obtained above by setting fisher = TRUE in CrossTable()) and Cramér's V (computed with the rcompanion package).
library(rcompanion)
## Registered S3 method overwritten by 'DescTools':
## method from
## reorder.factor gdata
# A small example dataset (Species vs. Color) to demonstrate the cramerV() function
Species <- c(rep("Species1", 16), rep("Species2", 16))
Color <- c(rep(c("blue", "blue", "blue", "green"), 4),
           rep(c("green", "green", "green", "blue"), 4))
d <- as.data.frame(cbind(Species, Color))
head(d)
## Species Color
## 1 Species1 blue
## 2 Species1 blue
## 3 Species1 blue
## 4 Species1 green
## 5 Species1 blue
## 6 Species1 blue
# Compute Cramer's V
cramerV(matrix(table(d$Species, d$Color), ncol = 2), ci = T)
## Cramer.V lower.ci upper.ci
## 1 0.5 0.1909 0.7893
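The same function can be applied to the cats data. A minimal sketch using the contingency table built earlier; plugging the uncorrected chi-square reported above into the formula gives sqrt(25.36 / (200 * 1)) ≈ 0.36, so cramerV() should return a value in that neighborhood (the exact figure depends on whether a continuity correction is applied):
# Cramér's V for the association between training method and dancing
cramerV(catsTable)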
Interpretation Guidelines:
Df | Small | Medium | Large |
---|---|---|---|
1 | 0.10 | 0.30 | 0.50 |
2 | 0.07 | 0.21 | 0.35 |
3 | 0.06 | 0.17 | 0.29 |
4 | 0.05 | 0.15 | 0.25 |
5 | 0.04 | 0.13 | 0.22 |
The Cramér's V value of 0.5 for this example dataset indicates a large effect size, i.e., a strong association between Species and Color. Applying the same measure to the cats data (using the chi-square of 25.36 and N = 200 reported above) gives a V of roughly 0.36, a medium-to-large effect for df = 1. Taken together with the significant test results, this indicates a statistically significant and substantial association between the type of reward given to cats and their likelihood to dance. The method of rewarding cats (with food or affection) has a real influence on their behavior in this context: from our data, rewarding cats with food is more effective in training them to dance.