---
title: "Chi-Square Goodness-of-Fit — Dessert Preferences (Scenario 1)"
author: "Geetha Shivani"
date: "November 12, 2025"
output:
  html_document:
    toc: true
    toc_depth: 2
    number_sections: yes
---


Purpose

Test whether the observed dessert preferences match an expected equal distribution across three desserts (Chocolate Cake, Vanilla Cheesecake, Tiramisu).

Hypotheses
- H0: Preferences are equal across desserts (p1 = p2 = p3 = 1/3).
- H1: At least one dessert's proportion differs from 1/3.

Data Entry

🔁 Replace the numbers below with your observed counts from the restaurant.

# Observed counts (EDIT THESE if you have your own data)
desserts <- c(50, 45, 30)
names(desserts) <- c("Chocolate_Cake", "Vanilla_Cheesecake", "Tiramisu")

# Expected probabilities for equal preference
expected <- rep(1/3, 3)

desserts
##     Chocolate_Cake Vanilla_Cheesecake           Tiramisu 
##                 50                 45                 30
expected
## [1] 0.3333333 0.3333333 0.3333333
sum_obs <- sum(desserts)
sum_obs
## [1] 125
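
If the restaurant data arrive as one recorded choice per customer rather than as totals, the counts can be tallied first. The following is a minimal sketch under that assumption; the `responses` vector is hypothetical and is constructed only so that its totals match the counts above.

```r
# Hypothetical raw data: one recorded choice per customer (illustrative only,
# built here so the totals match the counts used in this report).
dessert_levels <- c("Chocolate_Cake", "Vanilla_Cheesecake", "Tiramisu")
responses <- rep(dessert_levels, times = c(50, 45, 30))

# Fix the level order so the counts line up with the expected probabilities;
# without this, table() would sort the categories alphabetically.
responses <- factor(responses, levels = dessert_levels)

# table() tallies each category; convert the result to a plain named vector.
tab <- table(responses)
desserts_from_raw <- setNames(as.vector(tab), names(tab))
desserts_from_raw
```

The resulting named vector can stand in for `desserts` in the rest of the analysis.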

Quick Visualization

op <- par(mfrow=c(1,2), las=2)
barplot(desserts, main="Observed Counts", ylab="Count")
barplot(expected * sum(desserts), main="Expected Counts (Equal)", ylab="Count")

par(op)

Assumption Check

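# The chi-square approximation is usually considered adequate when every
# expected count is at least 5, so compute the expected counts and check them.
exp_counts <- expected * sum(desserts)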
exp_counts
## [1] 41.66667 41.66667 41.66667
any_lt5 <- any(exp_counts < 5)
any_lt5
## [1] FALSE
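
Had any expected count fallen below 5, one common fallback is a Monte Carlo p-value, which `chisq.test()` supports through its `simulate.p.value` and `B` arguments. The sketch below shows one way to branch on the check; the seed and the number of replicates are arbitrary choices made for illustration, not part of the original analysis.

```r
# Same data as in this report, redefined so the block runs on its own.
desserts <- c(Chocolate_Cake = 50, Vanilla_Cheesecake = 45, Tiramisu = 30)
expected <- rep(1/3, 3)
exp_counts <- expected * sum(desserts)

set.seed(123)  # arbitrary seed so a simulated p-value would be reproducible

if (any(exp_counts < 5)) {
  # Small expected counts: estimate the p-value by simulation instead
  # of relying on the chi-square approximation.
  gof_check <- chisq.test(desserts, p = expected, rescale.p = TRUE,
                          simulate.p.value = TRUE, B = 10000)
} else {
  # Expected counts are large enough: use the usual asymptotic test.
  gof_check <- chisq.test(desserts, p = expected, rescale.p = TRUE)
}
gof_check$p.value
```

With the counts used here every expected count is about 41.67, so the asymptotic branch runs.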

Test: Chi-Square Goodness-of-Fit

gof <- chisq.test(desserts, p = expected, rescale.p = TRUE)
gof
## 
##  Chi-squared test for given probabilities
## 
## data:  desserts
## X-squared = 5.2, df = 2, p-value = 0.07427
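
As a cross-check, the statistic and p-value can be reproduced by hand from the definition X² = Σ (observed − expected)² / expected, with degrees of freedom equal to the number of categories minus one. This is a sketch for verification only; the `*_manual` names are introduced here for illustration.

```r
# Same data as in this report, redefined so the block runs on its own.
desserts <- c(Chocolate_Cake = 50, Vanilla_Cheesecake = 45, Tiramisu = 30)
expected <- rep(1/3, 3)
exp_counts <- expected * sum(desserts)

# Chi-square statistic: sum of squared deviations scaled by expected counts.
x2_manual <- sum((desserts - exp_counts)^2 / exp_counts)

# Degrees of freedom for a goodness-of-fit test with fully specified
# probabilities: number of categories minus one.
df_manual <- length(desserts) - 1

# Upper-tail probability of the chi-square distribution.
p_manual <- pchisq(x2_manual, df = df_manual, lower.tail = FALSE)

c(X_squared = x2_manual, df = df_manual, p_value = p_manual)
```

These values should agree with the `chisq.test()` output above (X-squared = 5.2, df = 2, p-value = 0.07427).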

Effect Size (Cohen’s w)

# Cohen's w for GOF: w = sqrt( sum((obs - exp)^2 / exp) / N ) = sqrt(X^2 / N).
# Equivalently, in proportions: w = sqrt( sum( (pi_obs - pi_exp)^2 / pi_exp ) )
pi_obs <- desserts / sum(desserts)
pi_exp <- expected
w <- sqrt( sum((pi_obs - pi_exp)^2 / pi_exp) )
w
## [1] 0.2039608
# Benchmarks: small = .10, medium = .30, large = .50
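
The same effect size can be obtained from the test statistic, since w = sqrt(X² / N). The sketch below checks that relationship and attaches a benchmark label. `label_w` is a hypothetical helper written for this example, not a function from any package, and treating each benchmark as the lower bound of its band is an assumption about how to read the benchmarks.

```r
# Same data as in this report, redefined so the block runs on its own.
desserts <- c(Chocolate_Cake = 50, Vanilla_Cheesecake = 45, Tiramisu = 30)
gof <- chisq.test(desserts, p = rep(1/3, 3))

# Cohen's w from the chi-square statistic and the total sample size.
w_from_x2 <- sqrt(unname(gof$statistic) / sum(desserts))

# Hypothetical helper: map w onto the benchmark bands, assuming each
# benchmark marks the lower edge of its band.
label_w <- function(w) {
  as.character(cut(w,
                   breaks = c(0, 0.10, 0.30, 0.50, Inf),
                   labels = c("below small", "small", "medium", "large"),
                   right = FALSE))
}

w_from_x2
label_w(w_from_x2)
```

Under that reading, the value of about 0.20 found here falls in the small band, between the small and medium benchmarks.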

Post Hoc: Contribution by Category

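# Pearson residuals, (observed - expected) / sqrt(expected); labelled StdResid below.
# Their squares are each category's contribution to the chi-square statistic.
std_resid <- (desserts - exp_counts) / sqrt(exp_counts)
contrib <- std_resid^2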
cbind(Observed=desserts, Expected=round(exp_counts,2), StdResid=round(std_resid,2), Contribution=round(contrib,2))
##                    Observed Expected StdResid Contribution
## Chocolate_Cake           50    41.67     1.29         1.67
## Vanilla_Cheesecake       45    41.67     0.52         0.27
## Tiramisu                 30    41.67    -1.81         3.27
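
The object returned by `chisq.test()` also stores residuals directly. Its `residuals` component holds the Pearson residuals, (observed − expected) / sqrt(expected), and its `stdres` component holds standardized residuals, which additionally divide by sqrt(1 − p) for each category. The sketch below extracts both for comparison; the column names are chosen here for illustration.

```r
# Same data as in this report, redefined so the block runs on its own.
desserts <- c(Chocolate_Cake = 50, Vanilla_Cheesecake = 45, Tiramisu = 30)
expected <- rep(1/3, 3)
gof <- chisq.test(desserts, p = expected)

# Pearson residuals: (observed - expected) / sqrt(expected).
pearson <- gof$residuals

# Standardized residuals: Pearson residuals divided by sqrt(1 - p).
adjusted <- gof$stdres

round(cbind(Pearson = pearson, Standardized = adjusted), 2)
```

Standardized residuals are the ones usually compared against normal-distribution cutoffs such as ±1.96, so it matters which of the two is being reported.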

APA-Style Reporting

cat(sprintf("A chi-square goodness-of-fit test indicated that dessert preferences %s an equal distribution across categories, ",
            ifelse(gof$p.value < 0.05, "differed significantly from", "did not differ significantly from")))

A chi-square goodness-of-fit test indicated that dessert preferences did not differ significantly from an equal distribution across categories,

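# APA style reports N alongside the degrees of freedom and drops the leading zero of p.
cat(sprintf("χ²(%d, N = %d) = %.2f, p = %s. ", gof$parameter, sum(desserts), gof$statistic,
            sub("^0", "", sprintf("%.3f", gof$p.value))))

χ²(2, N = 125) = 5.20, p = .074.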

cat(sprintf("Effect size was Cohen's w = %.2f. ", w))

Effect size was Cohen’s w = 0.20.

top <- which.max(abs(std_resid))
cat(sprintf("The largest deviation from expectation was for %s (residual = %.2f; see table above).", names(std_resid)[top], std_resid[top]))

The largest deviation from expectation was for Tiramisu (residual = -1.81; see table above).

Notes for RPubs