Chi-Square Goodness-of-Fit Test

This analysis is for Research Scenario 1 from Assignment 4. It tests whether sales are equally distributed across three desserts (chocolate cake, vanilla cheesecake, tiramisu).

Hypotheses

H0 (Null Hypothesis): The observed frequencies of dessert sales match the expected frequencies (an equal distribution).

H1 (Alternate Hypothesis): The observed frequencies of dessert sales do not match the expected frequencies.

# INSTALL REQUIRED PACKAGE
# The package only needs to be installed once.
# The code for this task is provided below. Remove the hashtag below to convert the note into code.

# install.packages("readxl")

# LOAD THE PACKAGE
# You must always reload the package you want to use. 
# The code for this task is provided below. Remove the hashtag below to convert the note into code.

 library(readxl)

# IMPORT THE EXCEL FILE INTO RSTUDIO
# Download the Excel file from OneDrive and save it to your desktop.
# Right-click the Excel file and click “Copy as path” from the menu.
# In RStudio, replace the example path below with your actual path.
# Replace backslashes \ with forward slashes / or double them \\:
# ✘ WRONG   "C:\Users\Joseph\Desktop\mydata.xlsx"
# ✔ CORRECT "C:/Users/Joseph/Desktop/mydata.xlsx"
# ✔ CORRECT "C:\\Users\\Joseph\\Desktop\\mydata.xlsx"
# Replace "dataset" with the name of your Excel data (without the .xlsx)

# An example of the code for this task is provided below.
# You can edit the code below and remove the hashtag to use the code below.

 dataset <- read_excel("D:/new file/RQ1.xlsx")

# =========================
# VISUALLY DISPLAY THE DATA
# =========================

# PURPOSE
# Visually display the data.
# A frequency table can be used instead of a bar graph to visually display the data.

# CREATE A FREQUENCY TABLE
# Replace "dataset" with the name of your dataset (without the .xlsx)
# Replace "Variable" with the R code name of your variable
# Remove the hashtag to use the code.
# The code for this task is provided below. Remove the hashtag below to convert the note into code.

 observed <- table(dataset$Dessert)

# VIEW YOUR FREQUENCY TABLE
# View the observed frequencies.
# The code for this task is provided below. Remove the hashtag below to convert the note into code.

 print(observed)

## 
## Cheesecake  ChocoCake   Tiramisu 
##        171        258        119

# VIEW THE CATEGORY ORDER
# The code for this task is provided below. Remove the hashtag below to convert the note into code.

 names(observed)

## [1] "Cheesecake" "ChocoCake"  "Tiramisu"

# ===============================
# CHI-SQUARE GOODNESS OF FIT CODE
# ===============================

# PURPOSE
# Determine if the null or alternate hypothesis was supported.

# DEFINE EXPECTED PROPORTIONS
# First, look at your methods/research design to determine the expected proportions for each category.
# Next, turn those proportions into decimals.
# The expected proportions MUST be in the same order as the categories.
# Percentages should be written as decimals (e.g., 0.30 = 30%) and add up to 1

# An example of the code for this task is provided below.
# You can edit the code below and remove the hashtag to use the code below.

 expected <- c(1/3, 1/3, 1/3)
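Before running the test, it can help to confirm that the expected proportions are well formed. The sketch below is illustrative only: it hard-codes the observed counts from the frequency table printed above rather than reading the Excel file, and the name `expected_counts` is an assumption introduced here. It checks that there is one proportion per category, that the proportions sum to 1, and that every expected count meets the common rule of thumb of at least 5.

```r
# Observed counts copied from the frequency table printed above
observed <- c(Cheesecake = 171, ChocoCake = 258, Tiramisu = 119)

# Equal expected proportions, in the same order as names(observed)
expected <- c(1/3, 1/3, 1/3)

# One proportion per category, and the proportions must sum to 1
stopifnot(length(expected) == length(observed))
stopifnot(isTRUE(all.equal(sum(expected), 1)))

# Expected counts under the null hypothesis (total sales x proportion)
expected_counts <- setNames(sum(observed) * expected, names(observed))
round(expected_counts, 2)

# Rule of thumb for the chi-square approximation: expected counts of at least 5
all(expected_counts >= 5)
```

With 548 total sales and equal proportions, each expected count is 548 / 3, or about 182.67.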


# CALCULATE CHI-SQUARED RESULTS
# Do NOT edit this code.
# Remove the hashtags to use the code below.

 chisq_gfit <- chisq.test(observed, p = expected)
 print(chisq_gfit)

## 
##  Chi-squared test for given probabilities
## 
## data:  observed
## X-squared = 54.004, df = 2, p-value = 1.876e-12
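To see where these numbers come from, the statistic can be recomputed by hand. This is a minimal sketch, assuming the observed counts shown in the frequency table above; the names `chi_sq`, `df`, and `p_value` are introduced here for illustration and are not part of the assignment template. It applies the Pearson formula, the sum of (observed - expected)^2 / expected, and looks up the upper-tail probability with `pchisq()`.

```r
# Observed counts copied from the frequency table printed above
observed <- c(Cheesecake = 171, ChocoCake = 258, Tiramisu = 119)

# Expected counts under an equal distribution
expected_counts <- sum(observed) * c(1/3, 1/3, 1/3)

# Pearson chi-square statistic: sum of (O - E)^2 / E over the categories
chi_sq <- sum((observed - expected_counts)^2 / expected_counts)

# Degrees of freedom: number of categories minus 1
df <- length(observed) - 1

# Upper-tail probability of the chi-square distribution
p_value <- pchisq(chi_sq, df = df, lower.tail = FALSE)

round(chi_sq, 3)
df
signif(p_value, 4)
```

These values should agree with the X-squared, df, and p-value reported by `chisq.test()` above.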

# DETERMINE STATISTICAL SIGNIFICANCE
# If results were statistically significant (p < .05), continue to the effect size section below.
# If results were NOT statistically significant (p >= .05), do NOT calculate the effect size.
# Instead, skip to the reporting section below.
# NOTE: Getting results that are not statistically significant does NOT mean you switch to a different test.


# ================
# EFFECT SIZE CODE
# ================

# PURPOSE
# Determine how large the difference was between what was observed and what was expected.

# DIRECTIONS
# Remove the hashtags to use the code below.
# Do NOT make any other edits to the code

 W <- sqrt(chisq_gfit$statistic / sum(observed))
 W

## X-squared 
## 0.3139217
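One way to read this value is against Cohen's conventional benchmarks for w (0.10 small, 0.30 medium, 0.50 large). The sketch below is illustrative only: it plugs in the chi-square statistic and sample size reported above, and the `label` variable and the "negligible" band below 0.10 are assumptions added here, not part of the assignment template.

```r
# Values taken from the chi-square output and frequency table above
chi_sq <- 54.004
n <- 548

# Cohen's w: square root of the chi-square statistic divided by sample size
w <- sqrt(chi_sq / n)

# Classify w using Cohen's benchmarks (0.10 small, 0.30 medium, 0.50 large)
label <- as.character(cut(
  w,
  breaks = c(0, 0.1, 0.3, 0.5, Inf),
  labels = c("negligible", "small", "medium", "large"),
  right = FALSE
))

round(w, 3)
label
```

These cutoffs are conventions rather than strict rules, so the label is best reported alongside the numeric value of W.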

Results

A chi-square goodness-of-fit test was performed to determine whether sales were equally distributed across the three desserts (chocolate cake, vanilla cheesecake, tiramisu). The observed sales frequencies were 258 for chocolate cake, 171 for vanilla cheesecake, and 119 for tiramisu. Under an equal distribution, the expected frequency for each dessert was approximately 182.67 (548 total observations divided by 3).

The test was statistically significant, χ²(2, N = 548) = 54.00, p < .001, indicating that the observed frequencies of dessert sales differed from the expected frequencies. The effect size was W = 0.31, a medium effect by Cohen's conventions. Therefore, we reject the null hypothesis: the three desserts did not sell equally, with chocolate cake selling above the expected count and both vanilla cheesecake and tiramisu selling below it.

Yixuan Liu

2025-11-12