According to the Centers for Disease Control and Prevention (Shah et al., 2025), roughly 32% of United States adults consume fast food on any given day. The fast food industry serves millions of customers daily, making the nutritional content of menu items a topic of public health interest.
McDonald’s and Burger King are two of the most well-known fast food franchises in the United States, both historically associated with hamburgers as their signature offering and both with roots dating back to the 1950s. Today, these chains offer a wide variety of fast food menu items.
This analysis explores the following research question:
Is the average calorie count significantly different between
McDonald’s and Burger King menu items?
The dataset used is fastfood, available through the
openintro R package (Çetinkaya-Rundel et al., 2024). It
contains 515 observations and 17 variables covering nutritional
information for entree items across 8 major fast food chains, scraped
from fastfoodnutrition.org and made available through the Tidy Tuesday
project in 2018. For this analysis, the dataset is filtered to two
restaurants only. The two variables used are:
calories: the total calorie count per menu item; this is a continuous, quantitative response variable.
restaurant: the fast food chain; filtered to McDonald’s and Burger King, this serves as the categorical grouping variable.
In this analysis, I use the fastfood dataset from the openintro package. I begin by using glimpse() to preview the dataset structure and unique() to identify all restaurant names. I then use filter() and select() to keep only the McDonald’s and Burger King rows and the two variables of interest, and explore the result with head() and summary(). I use mutate() to convert restaurant to a factor so that it can serve as the grouping variable. Finally, I use group_by() and summarise() to calculate the mean, standard deviation, and count of calories by restaurant.
library(tidyverse)
library(openintro)
# Load the fastfood dataset
data(fastfood)
# Preview the dataset structure
glimpse(fastfood)
## Rows: 515
## Columns: 17
## $ restaurant <chr> "Mcdonalds", "Mcdonalds", "Mcdonalds", "Mcdonalds", "Mcdon…
## $ item <chr> "Artisan Grilled Chicken Sandwich", "Single Bacon Smokehou…
## $ calories <dbl> 380, 840, 1130, 750, 920, 540, 300, 510, 430, 770, 380, 62…
## $ cal_fat <dbl> 60, 410, 600, 280, 410, 250, 100, 210, 190, 400, 170, 300,…
## $ total_fat <dbl> 7, 45, 67, 31, 45, 28, 12, 24, 21, 45, 18, 34, 20, 34, 8, …
## $ sat_fat <dbl> 2.0, 17.0, 27.0, 10.0, 12.0, 10.0, 5.0, 4.0, 11.0, 21.0, 4…
## $ trans_fat <dbl> 0.0, 1.5, 3.0, 0.5, 0.5, 1.0, 0.5, 0.0, 1.0, 2.5, 0.0, 1.5…
## $ cholesterol <dbl> 95, 130, 220, 155, 120, 80, 40, 65, 85, 175, 40, 95, 125, …
## $ sodium <dbl> 1110, 1580, 1920, 1940, 1980, 950, 680, 1040, 1040, 1290, …
## $ total_carb <dbl> 44, 62, 63, 62, 81, 46, 33, 49, 35, 42, 38, 48, 48, 67, 31…
## $ fiber <dbl> 3, 2, 3, 2, 4, 3, 2, 3, 2, 3, 2, 3, 3, 5, 2, 2, 3, 3, 5, 2…
## $ sugar <dbl> 11, 18, 18, 18, 18, 9, 7, 6, 7, 10, 5, 11, 11, 11, 6, 3, 1…
## $ protein <dbl> 37, 46, 70, 55, 46, 25, 15, 25, 25, 51, 15, 32, 42, 33, 13…
## $ vit_a <dbl> 4, 6, 10, 6, 6, 10, 10, 0, 20, 20, 2, 10, 10, 10, 2, 4, 6,…
## $ vit_c <dbl> 20, 20, 20, 25, 20, 2, 2, 4, 4, 6, 0, 10, 20, 15, 2, 6, 15…
## $ calcium <dbl> 20, 20, 50, 20, 20, 15, 10, 2, 15, 20, 15, 35, 35, 35, 4, …
## $ salad <chr> "Other", "Other", "Other", "Other", "Other", "Other", "Oth…
# View all unique restaurant names in the dataset
unique(fastfood$restaurant)
## [1] "Mcdonalds" "Chick Fil-A" "Sonic" "Arbys" "Burger King"
## [6] "Dairy Queen" "Subway" "Taco Bell"
# Filter to McDonald's and Burger King only
filtered_fastfood <- fastfood |>
  filter(restaurant %in% c("Mcdonalds", "Burger King")) |>
  select(restaurant, calories)
# View first rows of the filtered dataset
head(filtered_fastfood)
## # A tibble: 6 × 2
## restaurant calories
## <chr> <dbl>
## 1 Mcdonalds 380
## 2 Mcdonalds 840
## 3 Mcdonalds 1130
## 4 Mcdonalds 750
## 5 Mcdonalds 920
## 6 Mcdonalds 540
# Summary statistics for the filtered dataset
summary(filtered_fastfood)
## restaurant calories
## Length :127 Min. : 140.0
## N.unique : 2 1st Qu.: 380.0
## N.blank : 0 Median : 550.0
## Min.nchar: 9 Mean : 622.8
## Max.nchar: 11 3rd Qu.: 755.0
## Max. :2430.0
# Convert restaurant to a factor so it can serve as the grouping variable
restaurant_filtered_fastfood <- filtered_fastfood |>
  mutate(restaurant = as.factor(restaurant))
# Boxplot comparing calorie distributions
ggplot(restaurant_filtered_fastfood, aes(x = restaurant, y = calories, fill = restaurant)) +
  geom_boxplot() +
  geom_jitter(alpha = 0.2, width = 0.1) +
  labs(
    title = "Calorie Distribution: McDonald's vs Burger King",
    x = "Restaurant",
    y = "Calories",
    caption = "Source: OpenIntro Statistics, fastfood dataset"
  ) +
  theme_minimal() +
  theme(legend.position = "none")
The boxplot above displays the distribution of calories for McDonald’s and Burger King menu items. Individual menu items are shown as jittered points overlaid on each box, so the raw data can be seen alongside the summary statistics. The transparency of the points (alpha = 0.2) makes overlapping values appear darker, revealing where calorie counts are most densely concentrated within each restaurant’s menu.
McDonald’s shows greater variability in calorie content, as indicated by the wider spread of the box and the presence of high-calorie outliers. Burger King’s distribution appears more compact by comparison.
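The visual impression of outliers can also be checked numerically. The sketch below is a minimal illustration, not part of the original analysis: the helper name boxplot_outliers() and the toy vector are hypothetical, and the 1.5 × IQR rule is assumed to correspond to the default used by geom_boxplot().

```r
# Hypothetical helper: return the values lying more than 1.5 * IQR
# below the first quartile or above the third quartile.
boxplot_outliers <- function(x) {
  q <- quantile(x, c(0.25, 0.75), names = FALSE)
  fence <- 1.5 * (q[2] - q[1])
  x[x < q[1] - fence | x > q[2] + fence]
}

# Illustrative vector only (NOT taken from the fastfood data)
toy_calories <- c(300, 350, 400, 450, 500, 550, 1500)
boxplot_outliers(toy_calories)  # returns 1500

# Possible use on the real data, one result per restaurant:
# tapply(restaurant_filtered_fastfood$calories,
#        restaurant_filtered_fastfood$restaurant,
#        boxplot_outliers)
```

Applied per restaurant, a helper like this would list the specific calorie values that appear as isolated points beyond the whiskers.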
H₀: The mean calorie count is equal between McDonald’s and Burger King menu items.
H₁: The mean calorie count differs between McDonald’s and Burger King menu items.
Each menu item in the dataset is treated as a separate, independent observation: the calorie count of one menu item is assumed not to influence the calorie count of any other item, within or across restaurants. The sample sizes for both groups exceed 30 (McDonald’s n = 57, Burger King n = 70), so by the Central Limit Theorem the sampling distribution of each group mean should be approximately normal, as the two-sample t-test requires. Welch’s t-test is used because it does not assume equal variances between groups. This is an appropriate choice given the notable difference in variability between the two restaurants: McDonald’s has a standard deviation of 411 calories compared to Burger King’s 290 calories. The test is conducted at a significance level of α = 0.05 to determine whether mean calorie counts differ between the two restaurants.
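For reference, the standard form of Welch’s test statistic and its approximate degrees of freedom (the Welch-Satterthwaite approximation) is sketched below. The notation is an assumption introduced here for illustration: subscripts 1 and 2 denote the two restaurants, with sample means \bar{x}_i, sample standard deviations s_i, and sample sizes n_i.

```latex
t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}},
\qquad
\nu \approx \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^{2}}
{\dfrac{\left(s_1^2/n_1\right)^{2}}{n_1 - 1} + \dfrac{\left(s_2^2/n_2\right)^{2}}{n_2 - 1}}
```

Because each group contributes its own variance term, the statistic does not rely on a pooled standard deviation, which is why unequal variances are not a problem for this test.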
While this dataset represents a snapshot of fast food entree menu items scraped in 2018, it does not capture the full menu of either restaurant — sides, drinks, and desserts are excluded. It is therefore treated as a sample of each restaurant’s broader menu offerings, justifying the use of a hypothesis test to assess whether observed differences are statistically significant.
# State the hypotheses
# H0: The mean calorie count is equal between McDonald's and Burger King
# H1: The mean calorie count differs between
# McDonald's and Burger King
# Run the two-sample t-test
t_test_result <- t.test(calories ~ restaurant, data = restaurant_filtered_fastfood)
t_test_result
##
## Welch Two Sample t-test
##
## data: calories by restaurant
## t = -0.49248, df = 97.737, p-value = 0.6235
## alternative hypothesis: true difference in means between group Burger King and group Mcdonalds is not equal to 0
## 95 percent confidence interval:
## -159.84025 96.28135
## sample estimates:
## mean in group Burger King mean in group Mcdonalds
## 608.5714 640.3509
# Calculate mean calories by restaurant
restaurant_filtered_fastfood |>
  group_by(restaurant) |>
  summarise(
    mean_calories = mean(calories),
    sd_calories = sd(calories),
    n = n()
  )
## # A tibble: 2 × 4
## restaurant mean_calories sd_calories n
## <fct> <dbl> <dbl> <int>
## 1 Burger King 609. 290. 70
## 2 Mcdonalds 640. 411. 57
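As a rough cross-check, the Welch statistic can be recomputed by hand from the group means, standard deviations, and sample sizes reported above. This is a sketch rather than part of the original analysis; the variable names are illustrative, and because the standard deviations are rounded to whole calories, the results should agree with the t.test() output only approximately.

```r
# Summary statistics copied from the output above (SDs are rounded)
m_bk <- 608.5714; s_bk <- 290; n_bk <- 70   # Burger King
m_mc <- 640.3509; s_mc <- 411; n_mc <- 57   # McDonald's

# Per-group variance of the sample mean
v_bk <- s_bk^2 / n_bk
v_mc <- s_mc^2 / n_mc

# Welch standard error and test statistic (Burger King minus McDonald's)
se <- sqrt(v_bk + v_mc)
t_stat <- (m_bk - m_mc) / se

# Welch-Satterthwaite approximation to the degrees of freedom
df <- (v_bk + v_mc)^2 / (v_bk^2 / (n_bk - 1) + v_mc^2 / (n_mc - 1))

# Two-sided p-value from the t distribution
p_value <- 2 * pt(-abs(t_stat), df)

round(c(t = t_stat, df = df, p = p_value), 3)
```

The values should land close to the reported t = -0.49248, df = 97.737, and p = 0.6235, with small discrepancies attributable to rounding in the inputs.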
I conducted a two-sample Welch’s t-test to evaluate whether mean calorie counts differ significantly between McDonald’s and Burger King menu items. The analysis revealed that McDonald’s had a mean of 640 calories (SD = 411) and Burger King had a mean of 609 calories (SD = 290). Burger King’s menu shows greater consistency in calorie content, suggesting that its entrees cluster more tightly around the mean calorie level compared to McDonald’s.
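Despite this numeric difference of about 32 calories, the t-test produced t = -0.49 on approximately 97.7 degrees of freedom and a p-value of 0.6235, which is well above the significance level of α = 0.05. The 95% confidence interval for the difference in means (Burger King minus McDonald’s) runs from about -160 to 96 calories and includes zero. Therefore, I fail to reject the null hypothesis. There is insufficient evidence to conclude that mean calorie counts differ between the two restaurants.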
The high standard deviations for both restaurants, particularly McDonald’s at 411 calories, indicate considerable variability within each chain’s menu. Relative to this within-group variability, the observed difference of about 32 calories is small, which likely explains why no statistically significant between-group difference was detected. Both chains offer a wide range of entree items, and this spread may be masking any true difference in average calorie content.
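One way to make this concrete is to compare the observed difference with the margin of error implied by the reported confidence interval. The sketch below uses only numbers copied from the t.test() output; the variable names are illustrative and not part of the original analysis.

```r
# Values copied from the t.test() output
ci_lower <- -159.84025
ci_upper <- 96.28135
df <- 97.737

# Centre of the interval: observed difference (Burger King minus McDonald's)
observed_diff <- (ci_upper + ci_lower) / 2

# Half-width of the 95% interval: the margin of error
margin <- (ci_upper - ci_lower) / 2

# Standard error implied by the margin and the t critical value
se <- margin / qt(0.975, df)

round(c(observed_diff = observed_diff, margin = margin, se = se), 1)
```

The margin of error is roughly 128 calories, so the observed gap of about 32 calories is only around a quarter of the difference that would have been needed to reach significance at α = 0.05 with these sample sizes.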
Future research could explore calorie differences across specific menu categories such as burgers, sides, and beverages separately. Incorporating additional nutritional variables such as sodium, fat, and protein into a multiple regression model could also provide a more complete picture of nutritional differences between fast food chains.
Çetinkaya-Rundel, M., Hardin, J., Baumer, B., McNamara, A., & Bray, A. (2024). openintro: Data sets and supplemental functions from OpenIntro textbooks and labs. R package. https://www.openintro.org
R for Data Science Community. (2018, September 4). Fast food calories [Dataset]. Tidy Tuesday. https://github.com/rfordatascience/tidytuesday/tree/master/data/2018/2018-09-04
Shah, N. N., Fryar, C. D., Ahluwalia, N., & Akinbami, L. J. (2025). Fast-food intake among adults in the United States, August 2021–August 2023. NCHS Data Brief, (533), 1–12. https://dx.doi.org/10.15620/cdc/174606