Source: Time Magazine
This project explores the nutritional content of items on the McDonald’s USA menu. The data was collected and published by McDonald’s Corporation, which makes its nutritional information publicly available to consumers. The dataset contains 266 menu items across 9 categories, including Breakfast, Chicken & Fish, Desserts, and Beverages. The main categorical variable is Category, which groups menu items by food type. The quantitative variables include Calories, Total Fat (g), Saturated Fat (g), Trans Fat (g), Cholesterol (mg), Sodium (mg), Carbohydrates, Dietary Fiber, Sugars, and Protein. The central questions I explore are: which nutritional components best predict calorie content, and how do calorie levels differ across menu categories?
I chose this topic because I eat at McDonald’s frequently and wanted to better understand the nutritional value of the food I consume. Beyond personal curiosity, the analysis has broader relevance: McDonald’s is one of the most visited fast food chains in the United States, and insights from this data could help other Americans make more informed decisions about their eating habits.
According to the Centers for Disease Control and Prevention, about 37% of adults in the United States consume fast food on a given day, with consumption highest among younger age groups (CDC, 2018). This highlights the public health significance of understanding the nutritional content of fast food menus like McDonald’s.
# load the libraries
library(tidyverse)
## Warning: package 'tidyverse' was built under R version 4.5.2
## Warning: package 'ggplot2' was built under R version 4.5.2
library(readr)
library(ggthemes)
## Warning: package 'ggthemes' was built under R version 4.5.2
library(ggrepel)
## Warning: package 'ggrepel' was built under R version 4.5.2
library(highcharter)
## Warning: package 'highcharter' was built under R version 4.5.2
library(RColorBrewer)
# read the data (menu2_.csv is expected in the working directory)
menu <- read_csv("menu2_.csv")
## Warning: One or more parsing issues, call `problems()` on your data frame for details,
## e.g.:
## dat <- vroom(...)
## problems(dat)
## Rows: 266 Columns: 25
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (4): Category, Item, Serving Size, Calories
## dbl (20): Calories from Fat, Total Fat, Total Fat (% Daily Value), Saturated...
## lgl (1): Observ
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
head(menu)
## # A tibble: 6 × 25
## Category Item `Serving Size` Calories `Calories from Fat` `Total Fat`
## <chr> <chr> <chr> <chr> <dbl> <dbl>
## 1 Breakfast Egg McMuffin 4.8 oz (136 g) 300cal. 120 13
## 2 Breakfast Egg White D… 4.8 oz (135 g) 250 70 8
## 3 Breakfast Sausage McM… 3.9 oz (111 g) 370 200 23
## 4 Breakfast Sausage McM… 5.7 oz (161 g) 450 250 28
## 5 Breakfast Sausage McM… 5.7 oz (161 g) 400 210 23
## 6 Breakfast Steak & Egg… 6.5 oz (185 g) 430 210 23
## # ℹ 19 more variables: `Total Fat (% Daily Value)` <dbl>,
## # `Saturated Fat` <dbl>, `Saturated Fat (% Daily Value)` <dbl>,
## # `Trans Fat` <dbl>, Cholesterol <dbl>, `Cholesterol (% Daily Value)` <dbl>,
## # Sodium <dbl>, `Sodium (% Daily Value)` <dbl>, Carbohydrates <dbl>,
## # `Carbohydrates (% Daily Value)` <dbl>, `Dietary Fiber` <dbl>,
## # `Dietary Fiber (% Daily Value)` <dbl>, Sugars <dbl>, Protein <dbl>,
## # `Vitamin A (% Daily Value)` <dbl>, `Vitamin C (% Daily Value)` <dbl>, …
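The column specification above lists Calories as chr rather than dbl, which usually means at least one entry contains non-numeric text (the preview shows 300cal. in the first row). A minimal base R sketch for locating such entries follows; the vector is a made-up stand-in for the real column, and only the value "300cal." is taken from the output above.

```r
# Toy stand-in for the Calories column; values other than "300cal." are illustrative
calories_raw <- c("300cal.", "250", "370", "450 CAL", "")

# TRUE where the entry is not a plain non-negative number
is_bad <- !grepl("^[0-9]+(\\.[0-9]+)?$", calories_raw)

# Position and content of each offending entry
bad_entries <- data.frame(row = which(is_bad), value = calories_raw[is_bad])
bad_entries
```

Listing the offending entries first makes it easier to decide which suffixes need to be stripped before converting the column to numeric.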
# cleaning
names(menu) <- tolower(names(menu))
names(menu) <- gsub(" ","_",names(menu))
names(menu) <- gsub("[(). //-]", "_", names(menu))
mcdonalds <- menu|>
select(-observ)
head(mcdonalds)
## # A tibble: 6 × 24
## category item serving_size calories calories_from_fat total_fat
## <chr> <chr> <chr> <chr> <dbl> <dbl>
## 1 Breakfast Egg McMuffin 4.8 oz (136… 300cal. 120 13
## 2 Breakfast Egg White Delight 4.8 oz (135… 250 70 8
## 3 Breakfast Sausage McMuffin 3.9 oz (111… 370 200 23
## 4 Breakfast Sausage McMuffin … 5.7 oz (161… 450 250 28
## 5 Breakfast Sausage McMuffin … 5.7 oz (161… 400 210 23
## 6 Breakfast Steak & Egg McMuf… 6.5 oz (185… 430 210 23
## # ℹ 18 more variables: `total_fat__%_daily_value_` <dbl>, saturated_fat <dbl>,
## # `saturated_fat__%_daily_value_` <dbl>, trans_fat <dbl>, cholesterol <dbl>,
## # `cholesterol__%_daily_value_` <dbl>, sodium <dbl>,
## # `sodium__%_daily_value_` <dbl>, carbohydrates <dbl>,
## # `carbohydrates__%_daily_value_` <dbl>, dietary_fiber <dbl>,
## # `dietary_fiber__%_daily_value_` <dbl>, sugars <dbl>, protein <dbl>,
## # `vitamin_a__%_daily_value_` <dbl>, `vitamin_c__%_daily_value_` <dbl>, …
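To see why the cleaned names end up with doubled underscores and a leftover % sign, the three renaming steps can be traced on two of the original column names. The second half of the sketch is a hypothetical alternative, not the approach used in this report, that collapses every run of non-alphanumeric characters into a single underscore.

```r
# Two column names as they appear in the raw file
raw_names <- c("Serving Size", "Total Fat (% Daily Value)")

# The report's three cleaning steps, applied in order
step1 <- tolower(raw_names)
step2 <- gsub(" ", "_", step1)
step3 <- gsub("[(). //-]", "_", step2)
step3

# Hypothetical alternative: collapse runs of non-alphanumerics, then trim underscores
tidy_names <- gsub("^_|_$", "", gsub("[^a-z0-9]+", "_", tolower(raw_names)))
tidy_names
```

The alternative yields names that do not need backticks, but every later reference to the columns would have to use the new names.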
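# strip the calorie unit suffix ("cal.", "cal", "CAL") so the column can be converted to numeric
# note: "." in the first pattern is a regex wildcard, so "cal." matches "cal" plus any one character
mcdonalds$calories <- gsub("cal.", "", mcdonalds$calories)
mcdonalds$calories <- gsub("cal", "", mcdonalds$calories)
mcdonalds$calories <- gsub("CAL", "", mcdonalds$calories)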
head(mcdonalds)
## # A tibble: 6 × 24
## category item serving_size calories calories_from_fat total_fat
## <chr> <chr> <chr> <chr> <dbl> <dbl>
## 1 Breakfast Egg McMuffin 4.8 oz (136… 300 120 13
## 2 Breakfast Egg White Delight 4.8 oz (135… 250 70 8
## 3 Breakfast Sausage McMuffin 3.9 oz (111… 370 200 23
## 4 Breakfast Sausage McMuffin … 5.7 oz (161… 450 250 28
## 5 Breakfast Sausage McMuffin … 5.7 oz (161… 400 210 23
## 6 Breakfast Steak & Egg McMuf… 6.5 oz (185… 430 210 23
## # ℹ 18 more variables: `total_fat__%_daily_value_` <dbl>, saturated_fat <dbl>,
## # `saturated_fat__%_daily_value_` <dbl>, trans_fat <dbl>, cholesterol <dbl>,
## # `cholesterol__%_daily_value_` <dbl>, sodium <dbl>,
## # `sodium__%_daily_value_` <dbl>, carbohydrates <dbl>,
## # `carbohydrates__%_daily_value_` <dbl>, dietary_fiber <dbl>,
## # `dietary_fiber__%_daily_value_` <dbl>, sugars <dbl>, protein <dbl>,
## # `vitamin_a__%_daily_value_` <dbl>, `vitamin_c__%_daily_value_` <dbl>, …
# convert to numeric; any entry that is still non-numeric becomes NA
mcdonalds$calories <- as.numeric(mcdonalds$calories)
head(mcdonalds)
## # A tibble: 6 × 24
## category item serving_size calories calories_from_fat total_fat
## <chr> <chr> <chr> <dbl> <dbl> <dbl>
## 1 Breakfast Egg McMuffin 4.8 oz (136… 300 120 13
## 2 Breakfast Egg White Delight 4.8 oz (135… 250 70 8
## 3 Breakfast Sausage McMuffin 3.9 oz (111… 370 200 23
## 4 Breakfast Sausage McMuffin … 5.7 oz (161… 450 250 28
## 5 Breakfast Sausage McMuffin … 5.7 oz (161… 400 210 23
## 6 Breakfast Steak & Egg McMuf… 6.5 oz (185… 430 210 23
## # ℹ 18 more variables: `total_fat__%_daily_value_` <dbl>, saturated_fat <dbl>,
## # `saturated_fat__%_daily_value_` <dbl>, trans_fat <dbl>, cholesterol <dbl>,
## # `cholesterol__%_daily_value_` <dbl>, sodium <dbl>,
## # `sodium__%_daily_value_` <dbl>, carbohydrates <dbl>,
## # `carbohydrates__%_daily_value_` <dbl>, dietary_fiber <dbl>,
## # `dietary_fiber__%_daily_value_` <dbl>, sugars <dbl>, protein <dbl>,
## # `vitamin_a__%_daily_value_` <dbl>, `vitamin_c__%_daily_value_` <dbl>, …
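The suffix stripping and numeric conversion can also be written as a single case-insensitive pattern. This is a sketch on a toy vector, not the code used in this report; only "300cal." comes from the data preview, and the other values are assumptions about what messy entries might look like.

```r
# Toy input; values other than "300cal." are illustrative
raw <- c("300cal.", "250", "370 cal", "450CAL", "n/a")

# Remove optional whitespace, "cal" in any case, and an optional literal dot
stripped <- gsub("\\s*cal\\.?", "", raw, ignore.case = TRUE)

# as.numeric() turns anything still non-numeric into NA (normally with a warning)
calories <- suppressWarnings(as.numeric(stripped))
calories
```

Entries that cannot be repaired, such as "n/a" here, become NA, which is consistent with the calories column ending up numeric with a few missing values.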
colSums(is.na(mcdonalds))
## category item
## 0 0
## serving_size calories
## 0 3
## calories_from_fat total_fat
## 1 2
## total_fat__%_daily_value_ saturated_fat
## 2 2
## saturated_fat__%_daily_value_ trans_fat
## 1 2
## cholesterol cholesterol__%_daily_value_
## 2 1
## sodium sodium__%_daily_value_
## 2 1
## carbohydrates carbohydrates__%_daily_value_
## 3 1
## dietary_fiber dietary_fiber__%_daily_value_
## 2 0
## sugars protein
## 0 1
## vitamin_a__%_daily_value_ vitamin_c__%_daily_value_
## 1 2
## calcium__%_daily_value_ iron__%_daily_value_
## 1 1
Tableau Visualization: https://public.tableau.com/shared/NZ79Z3YDG
This Tableau chart breaks down the nutrients in each McDonald’s menu item by category. What stood out to me most is how much sodium dominates almost every item. It towers over the other nutrients largely because it is measured in milligrams while nutrients such as fat are measured in grams, so the raw values are not directly comparable. Breakfast items like the Big Breakfast are the tallest overall, which reflects high values across several nutrients at once. You can use the category filter on the right to zoom in on specific sections of the menu.
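The unit effect on sodium can be checked with a quick conversion. The sketch below uses the Egg McMuffin values that appear in this report's data previews (13 g total fat, 750 mg sodium, 260 mg cholesterol); the data frame and its column names are illustrative only.

```r
# Egg McMuffin values from the data previews; the data frame is a toy stand-in
item <- data.frame(
  nutrient = c("total_fat", "sodium", "cholesterol"),
  amount   = c(13, 750, 260),
  unit     = c("g", "mg", "mg")
)

# Put every nutrient on a gram scale so the amounts are comparable
item$amount_g <- ifelse(item$unit == "mg", item$amount / 1000, item$amount)
item
```

On a common gram scale, sodium (0.75 g) is much smaller than total fat (13 g), so the height of the sodium bars reflects the unit of measurement rather than the quantity.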
avg_calories <- mcdonalds |>
filter(!is.na(calories)) |>
group_by(category) |>
summarize(avg_cal = round(mean(calories, na.rm = TRUE), 1)) |>
arrange(desc(avg_cal))
hchart(avg_calories, "bar", hcaes(x = category, y = avg_cal)) |>
hc_title(text = "Average Calories by McDonald's Menu Category") |>
hc_xAxis(title = list(text = "Menu Category")) |>
hc_yAxis(title = list(text = "Average Calories")) |>
hc_tooltip(pointFormat = "Avg Calories: <b>{point.y}</b>") |>
hc_colors("#c8102e") |>
hc_caption(text = "Source: McDonald's USA Nutritional Facts") |>
hc_add_theme(hc_theme_flat())
This interactive bar chart displays the average calorie count for each menu category at McDonald’s. The data reveals that Chicken & Fish and Beef & Pork categories carry the highest average calorie counts, which makes sense given that these items tend to be larger, protein-heavy entrees. On the lower end, Beverages and Salads have the fewest average calories, reflecting their lighter composition. This visualization is useful for consumers who want to quickly identify which sections of the McDonald’s menu to approach with caution when managing calorie intake.
mcdonalds |>
filter(!is.na(calories), !is.na(total_fat)) |>
ggplot(aes(x = total_fat, y = calories, color = category)) +
geom_point(size = 2.5, alpha = 0.75) +
scale_color_brewer(palette = "Set1") +
theme_foundation() +
labs(
title = "Calories vs. Total Fat in McDonald's Menu Items",
subtitle = "Each point represents one menu item, colored by menu category",
x = "Total Fat (g)",
y = "Calories",
color = "Menu Category",
caption = "Source: McDonald's USA Nutritional Facts"
)
This scatter plot visualizes the relationship between total fat content and calorie count for every item on the McDonald’s menu, with each color representing a different menu category. A clear positive relationship is visible — as total fat increases, calories increase as well. Beef & Pork and Chicken & Fish items cluster toward the higher end of both axes, confirming they are among the most calorie-dense. Beverages and Coffee & Tea items cluster near the lower left, indicating lower fat and calorie content overall. The color coding by category makes it easy to spot patterns across menu sections at a glance.
# drop rows with missing values in the modeling variables to set up a correlation plot
mcdonalds1 <- mcdonalds |>
  drop_na(calories, total_fat, saturated_fat, trans_fat, sodium,
          carbohydrates, protein, cholesterol, dietary_fiber) |>
  select(calories, total_fat, saturated_fat, trans_fat, sugars, sodium,
         cholesterol, carbohydrates, dietary_fiber, protein)
head(mcdonalds1)
## # A tibble: 6 × 10
## calories total_fat saturated_fat trans_fat sugars sodium cholesterol
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 300 13 5 0 3 750 260
## 2 370 23 8 0 2 780 45
## 3 450 28 10 0 2 860 285
## 4 400 23 8 0 2 880 50
## 5 430 23 9 1 3 960 300
## 6 460 26 13 0 3 1300 250
## # ℹ 3 more variables: carbohydrates <dbl>, dietary_fiber <dbl>, protein <dbl>
#make correlation plot to look at which variables can determine calories.
library(DataExplorer)
plot_correlation(mcdonalds1)
#multiple linear regression
multiple_model <- lm(calories ~ total_fat + carbohydrates + protein + dietary_fiber + sodium + sugars + cholesterol,
data = mcdonalds1)
summary(multiple_model)
##
## Call:
## lm(formula = calories ~ total_fat + carbohydrates + protein +
## dietary_fiber + sodium + sugars + cholesterol, data = mcdonalds1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -23.230 -4.097 0.218 3.150 192.292
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1.6033480 1.8168448 -0.882 0.378
## total_fat 8.5798042 0.1483025 57.853 <2e-16 ***
## carbohydrates 4.1830134 0.1228323 34.055 <2e-16 ***
## protein 4.2604630 0.1825505 23.339 <2e-16 ***
## dietary_fiber -0.4348191 0.9015951 -0.482 0.630
## sodium -0.0008306 0.0057083 -0.146 0.884
## sugars -0.1719959 0.1273483 -1.351 0.178
## cholesterol 0.0087130 0.0135902 0.641 0.522
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 13.72 on 245 degrees of freedom
## Multiple R-squared: 0.9969, Adjusted R-squared: 0.9968
## F-statistic: 1.125e+04 on 7 and 245 DF, p-value: < 2.2e-16
Model equation: Calories = -1.60 + 8.58(Total Fat) + 4.18(Carbohydrates) + 4.26(Protein) - 0.43(Dietary Fiber) - 0.0008(Sodium) - 0.17(Sugars) + 0.009(Cholesterol)
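This multiple linear regression model predicts calorie content across McDonald’s menu items using seven nutritional predictors. The adjusted R² of 0.9968 indicates that 99.68% of the variation in calories is explained by the model, an exceptionally strong fit. Of the seven predictors, only three are statistically significant: total fat, carbohydrates, and protein (each with p < 2e-16). This is expected, because food calories are derived from these macronutrients at roughly 9 kcal per gram of fat and 4 kcal per gram of carbohydrate or protein, values close to the estimated coefficients of 8.58, 4.18, and 4.26.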
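As a worked example of the model equation, the fitted coefficients can be applied to a single item by hand. The coefficients below are copied from the regression summary; the item itself is hypothetical, and the 9/4/4 calories-per-gram rule used for comparison is the general nutritional convention for fat, carbohydrate, and protein rather than something estimated in this report.

```r
# Coefficients copied from summary(multiple_model)
coefs <- c(
  intercept = -1.6033480, total_fat = 8.5798042, carbohydrates = 4.1830134,
  protein = 4.2604630, dietary_fiber = -0.4348191, sodium = -0.0008306,
  sugars = -0.1719959, cholesterol = 0.0087130
)

# Hypothetical menu item (grams, except sodium and cholesterol in milligrams)
item <- c(total_fat = 10, carbohydrates = 30, protein = 15, dietary_fiber = 2,
          sodium = 500, sugars = 5, cholesterol = 40)

# Intercept plus the sum of coefficient * value over all predictors
predicted <- coefs[["intercept"]] + sum(coefs[names(item)] * item)

# Rule-of-thumb estimate: 9 kcal/g fat, 4 kcal/g carbohydrate, 4 kcal/g protein
rule_of_thumb <- 9 * item[["total_fat"]] + 4 * item[["carbohydrates"]] + 4 * item[["protein"]]

round(c(model = predicted, rule_of_thumb = rule_of_thumb), 1)
```

For this item the model gives about 271.8 calories and the rule of thumb gives 270, which illustrates why the three macronutrients carry almost all of the predictive power.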
#check basic assumptions and plots
plot(multiple_model)
The diagnostic plots broadly support the model’s validity. The Residuals vs Fitted plot shows mostly random scatter, suggesting the linearity assumption holds, though one clear outlier is visible: the largest residual is 192.3, compared with a residual standard error of 13.72. The Q-Q plot indicates the residuals are approximately normally distributed, with deviation at the upper tail due to that same outlier.
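An outlier like the one seen in the diagnostic plots can be located by row using standardized residuals. The sketch below runs on simulated data with one planted outlier, not on the McDonald’s dataset; with the report's model the same idea would apply to rstandard(multiple_model).

```r
set.seed(1)  # simulated data, not the McDonald's dataset

x <- 1:30
y <- 2 * x + rnorm(30)
y[10] <- y[10] + 50  # plant one large outlier at row 10

fit <- lm(y ~ x)

# Standardized residuals; absolute values beyond about 3 deserve a closer look
std_res <- rstandard(fit)
outlier_row <- unname(which.max(abs(std_res)))
outlier_row
```

Once the row is known, the corresponding menu item can be inspected to see whether it is a data entry problem or a genuinely unusual item.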
This project analyzed the nutritional composition of McDonald’s menu items using multiple linear regression. The regression model revealed that total fat, carbohydrates, and protein are the strongest and most statistically significant predictors of calorie content, together explaining approximately 99.68% of the variation in calories. The bar chart and scatter plot together reinforced this finding visually, showing that Beef & Pork and Chicken & Fish categories consistently rank highest in both calories and fat content, while Beverages and Salads sit at the lower end.
One surprising pattern was how dominant sodium appeared in the Tableau visualization relative to other nutrients. If given more time, I would have liked to explore changes in McDonald’s nutritional content over time, or compare McDonald’s data to other major fast food chains to provide broader context for the findings.
Centers for Disease Control and Prevention. (2018). FastStats: Obesity and overweight. U.S. Department of Health & Human Services. https://www.cdc.gov/nchs/fastats/obesity-overweight.htm
Image: https://time.com/4084668/mcdonalds-rebranding-sales-growth/