Fast Food

Author

M Sullivan

Source: https://www.partstown.com/about-us/what-is-considered-fast-food

Introduction

The following data analysis explores the nutritional content of menu items offered at several fast food restaurants. There are eight fast food restaurants in this dataset: McDonald’s, Chick-Fil-A, Sonic, Arby’s, Burger King, Dairy Queen, Subway and Taco Bell. Each restaurant sells different food items, including burgers, sandwiches, burritos, chicken nuggets and salads. For each food item, the dataset records a wide array of nutritional values: calories, fats, cholesterol, sodium, carbs, fiber, sugar, protein, vitamins and calcium. The source for this dataset is Data.World; however, there is no ReadMe file with any information on methodology.

One topic of background research I feel is relevant is income within the fast food industry. McDonald’s alone reported approximately $8.47 billion in 2023. For perspective, Chick-Fil-A generated about $6.4 billion in 2022, and Dairy Queen generated roughly $3.6 billion in 2023. It is also imperative to understand what each of these nutritional values means. A calorie is a unit of energy. Fats also supply energy, and they help the body absorb certain vitamins and support healthy hair and skin. Carbohydrates are another source of energy and include sugars, starches and fiber. Cholesterol contributes to building cells and producing hormones. Sodium supports muscle and nerve function. Fiber assists the digestive system and can help with weight management. Sugar is a simple carbohydrate that the body breaks down into glucose for energy. Protein gives structure to the body’s tissues and organs. A vitamin is an organic compound that helps many parts of the body function correctly. Lastly, calcium strengthens bones and teeth. For many of these nutrients, consuming too much or too little can lead to heart, blood and other internal health problems. I also looked up the unit for each nutritional value (grams, milligrams, etc.), as I was unaware of them. One correlation I will analyze is the relationship between sodium and calories in Chick-Fil-A food items.

It is essential to understand exactly what we are putting into our bodies, so we know which nutrients we may lack or need more of for healthy living. I enjoy learning how much of each nutrient a restaurant’s food contains. I eat at several of the restaurants in this dataset, especially McDonald’s. After all, McDonald’s is the highest-grossing fast food restaurant in the world, with Chick-Fil-A not far behind. I would like to know which nutrients we are predominantly consuming to evaluate which levels in our bodies are likely to be particularly high.

Load the libraries

library(tidyverse)
Warning: package 'ggplot2' was built under R version 4.4.1
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.1     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.1
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(ggthemes)
library(plotly)

Attaching package: 'plotly'

The following object is masked from 'package:ggplot2':

    last_plot

The following object is masked from 'package:stats':

    filter

The following object is masked from 'package:graphics':

    layout
setwd("C:/Users/micha/OneDrive/Documents/DATA 110")
fastfood <- read_csv("fastfood.csv")
Rows: 515 Columns: 17
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr  (3): restaurant, item, salad
dbl (14): calories, cal_fat, total_fat, sat_fat, trans_fat, cholesterol, sod...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

Let’s view the first few rows of the dataset

head(fastfood)
# A tibble: 6 × 17
  restaurant item       calories cal_fat total_fat sat_fat trans_fat cholesterol
  <chr>      <chr>         <dbl>   <dbl>     <dbl>   <dbl>     <dbl>       <dbl>
1 Mcdonalds  Artisan G…      380      60         7       2       0            95
2 Mcdonalds  Single Ba…      840     410        45      17       1.5         130
3 Mcdonalds  Double Ba…     1130     600        67      27       3           220
4 Mcdonalds  Grilled B…      750     280        31      10       0.5         155
5 Mcdonalds  Crispy Ba…      920     410        45      12       0.5         120
6 Mcdonalds  Big Mac         540     250        28      10       1            80
# ℹ 9 more variables: sodium <dbl>, total_carb <dbl>, fiber <dbl>, sugar <dbl>,
#   protein <dbl>, vit_a <dbl>, vit_c <dbl>, calcium <dbl>, salad <chr>

Create a new column

vitamin_total <- fastfood |>
  mutate(vitamin_total = vit_a + vit_c)
vitamin_total # This adds vitamin a and vitamin c to obtain a total vitamin number
# A tibble: 515 × 18
   restaurant item      calories cal_fat total_fat sat_fat trans_fat cholesterol
   <chr>      <chr>        <dbl>   <dbl>     <dbl>   <dbl>     <dbl>       <dbl>
 1 Mcdonalds  Artisan …      380      60         7       2       0            95
 2 Mcdonalds  Single B…      840     410        45      17       1.5         130
 3 Mcdonalds  Double B…     1130     600        67      27       3           220
 4 Mcdonalds  Grilled …      750     280        31      10       0.5         155
 5 Mcdonalds  Crispy B…      920     410        45      12       0.5         120
 6 Mcdonalds  Big Mac        540     250        28      10       1            80
 7 Mcdonalds  Cheesebu…      300     100        12       5       0.5          40
 8 Mcdonalds  Classic …      510     210        24       4       0            65
 9 Mcdonalds  Double C…      430     190        21      11       1            85
10 Mcdonalds  Double Q…      770     400        45      21       2.5         175
# ℹ 505 more rows
# ℹ 10 more variables: sodium <dbl>, total_carb <dbl>, fiber <dbl>,
#   sugar <dbl>, protein <dbl>, vit_a <dbl>, vit_c <dbl>, calcium <dbl>,
#   salad <chr>, vitamin_total <dbl>

First, let’s explore Chick-Fil-A data specifically

chick_fil_a <- fastfood |>
  filter(restaurant == "Chick Fil-A")
chick_fil_a
# A tibble: 27 × 17
   restaurant  item     calories cal_fat total_fat sat_fat trans_fat cholesterol
   <chr>       <chr>       <dbl>   <dbl>     <dbl>   <dbl>     <dbl>       <dbl>
 1 Chick Fil-A Chargri…      430     144        16     8           0          85
 2 Chick Fil-A Chargri…      310      54         6     2           0          55
 3 Chick Fil-A Chick-n…      270      99        11     2.5         0          45
 4 Chick Fil-A 1 Piece…      120      54         6     3           0          25
 5 Chick Fil-A 2 Piece…      230     108        12     3           0          55
 6 Chick Fil-A 3 Piece…      350     153        17     3           0          70
 7 Chick Fil-A 4 piece…      470     207        23     3           0          90
 8 Chick Fil-A Chicken…      500     207        23     7           0          75
 9 Chick Fil-A 4 piece…      130      54         6     1.5         0          40
10 Chick Fil-A 6 piece…      190      81         9     1.5         0          55
# ℹ 17 more rows
# ℹ 9 more variables: sodium <dbl>, total_carb <dbl>, fiber <dbl>, sugar <dbl>,
#   protein <dbl>, vit_a <dbl>, vit_c <dbl>, calcium <dbl>, salad <chr>
chick_vitamin_total <- chick_fil_a |>
  mutate(chick_vitamin_total = vit_a + vit_c)
chick_vitamin_total # This adds the same column to just our new Chick Fil-A dataset
# A tibble: 27 × 18
   restaurant  item     calories cal_fat total_fat sat_fat trans_fat cholesterol
   <chr>       <chr>       <dbl>   <dbl>     <dbl>   <dbl>     <dbl>       <dbl>
 1 Chick Fil-A Chargri…      430     144        16     8           0          85
 2 Chick Fil-A Chargri…      310      54         6     2           0          55
 3 Chick Fil-A Chick-n…      270      99        11     2.5         0          45
 4 Chick Fil-A 1 Piece…      120      54         6     3           0          25
 5 Chick Fil-A 2 Piece…      230     108        12     3           0          55
 6 Chick Fil-A 3 Piece…      350     153        17     3           0          70
 7 Chick Fil-A 4 piece…      470     207        23     3           0          90
 8 Chick Fil-A Chicken…      500     207        23     7           0          75
 9 Chick Fil-A 4 piece…      130      54         6     1.5         0          40
10 Chick Fil-A 6 piece…      190      81         9     1.5         0          55
# ℹ 17 more rows
# ℹ 10 more variables: sodium <dbl>, total_carb <dbl>, fiber <dbl>,
#   sugar <dbl>, protein <dbl>, vit_a <dbl>, vit_c <dbl>, calcium <dbl>,
#   salad <chr>, chick_vitamin_total <dbl>

Create a simple bar graph of food items from Chick-Fil-A and their calorie content

ggplot(data = chick_fil_a) +
  geom_col(aes(x = item, y = calories)) + # One bar per menu item; bar height is the calorie value
  labs(x = "Food Items", y = "Calories", title = "Chick Fil-A Food Item Calories") +
  theme(axis.text.x = element_text(angle = 80, hjust = 1)) # Rotate the long item names so they do not overlap

Create a scatterplot with a linear regression line

I am first going to measure sodium and calorie content in only Chick-Fil-A food items. The size of each dot is determined by the total fat in the item. I use the geom_smooth function to incorporate a dashed line to display a clear positive trend between the two variables.

ggplot(chick_vitamin_total,
       aes(x = sodium,
           y = calories,
           size = total_fat)) +
  geom_point(alpha = 0.6, color = "red") +
  xlim(200, 3700) +
  ylim(50, 1000) +
  labs(title = "Chick Fil-A Menu Items: Nutritional Content",
       caption = "Source: Data.World",
       x = "Sodium (in milligrams)",
       y = "Calories",
       size = "Total Fat (in grams)") +
  # linewidth replaces the size argument for lines, which was deprecated in ggplot2 3.4.0
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE, linetype = "dashed", linewidth = 0.6) +
  theme_economist(base_size = 12) # Change the theme
Warning: Removed 9 rows containing missing values or values outside the scale range
(`geom_smooth()`).

Let’s add some interactivity with plotly, a feature that adds mouse-over tooltip capabilities but also causes us to lose our Total Fat legend

ggplot(chick_vitamin_total,
       aes(x = sodium,
           y = calories,
           size = total_fat)) +
  geom_point(alpha = 0.6, color = "red") +
  xlim(200, 3700) +
  ylim(50, 1000) +
  labs(title = "Chick Fil-A Menu Items: Nutritional Content",
       caption = "Source: Data.World",
       x = "Sodium (in milligrams)",
       y = "Calories",
       size = "Total Fat (in grams)") +
  # linewidth replaces the size argument for lines, which was deprecated in ggplot2 3.4.0
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE, linetype = "dashed", linewidth = 0.6) +
  theme_economist(base_size = 12)
Warning: Removed 9 rows containing missing values or values outside the scale range
(`geom_smooth()`).

  ggplotly()

In this first visualization I plotted the sodium and calorie content of every Chick-Fil-A food item in the dataset. The scatterplot shows a very clear positive correlation between the two variables: when sodium increases in a food item, so does the number of calories. Because the size of each dot corresponds to how many grams of fat that item has, we can also see a second positive relationship: as sodium and calories increase, so does the amount of fat in that menu item.

Using a linear regression model

cor(chick_vitamin_total$sodium, chick_vitamin_total$calories)
[1] 0.9485251
fit1 <- lm(calories ~ sodium, data = chick_vitamin_total)
summary(fit1)

Call:
lm(formula = calories ~ sodium, data = chick_vitamin_total)

Residuals:
    Min      1Q  Median      3Q     Max 
-136.18  -43.09  -10.61   40.51  154.72 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 53.14759   26.02441   2.042   0.0518 .  
sodium       0.28771    0.01921  14.975 5.45e-14 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 71.21 on 25 degrees of freedom
Multiple R-squared:  0.8997,    Adjusted R-squared:  0.8957 
F-statistic: 224.3 on 1 and 25 DF,  p-value: 5.446e-14

Analysis

The model has the equation: calories = 0.29(sodium) + 53.15, which follows the form y = mx + b. The p-value for the model is 5.446e-14 (0.00000000000005446). This is an extremely small p-value, suggesting that there is a statistically significant linear relationship between sodium and calories in Chick-Fil-A food items. The sodium coefficient is also marked with three asterisks, which indicates significance at the 0.001 level and suggests it is a meaningful variable for explaining the linear increase in calories. The slope may be interpreted as follows: for each additional milligram of sodium in a Chick-Fil-A food item, there is a predicted increase of 0.29 calories. The adjusted R-squared value is 0.8957. This means that nearly 90% of the variation in calories is explained by this model, while about 10% of the variation is not.
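The numbers in this analysis can be checked by hand. The short sketch below copies the correlation and coefficients from the output above and recomputes R-squared, adjusted R-squared and one prediction. The 1,000 mg sodium value is a hypothetical item chosen only for illustration, and the sample size of 27 is taken from the row count of the Chick Fil-A tibble.

```r
# Values copied from the cor() and summary(fit1) output
r         <- 0.9485251   # correlation between sodium and calories
intercept <- 53.14759
slope     <- 0.28771
n         <- 27          # Chick Fil-A rows (25 residual df + 2 coefficients)
p         <- 1           # number of predictors in the model

# In simple linear regression, R-squared is the squared correlation
r_squared <- r^2

# Adjusted R-squared applies a penalty for the number of predictors
adj_r_squared <- 1 - (1 - r_squared) * (n - 1) / (n - p - 1)

# Predicted calories for a hypothetical item with 1,000 mg of sodium
sodium_mg <- 1000
predicted_calories <- intercept + slope * sodium_mg

round(r_squared, 4)
round(adj_r_squared, 4)
round(predicted_calories, 1)
```

The first two values round to 0.8997 and 0.8957, matching the summary output, and the hypothetical 1,000 mg item is predicted to have roughly 341 calories.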

Now let’s explore data from all eight fast food restaurants within the initial dataset

I am now measuring the total carbs and cholesterol in all food items for all restaurants.

ggplot(vitamin_total, aes(x = total_carb, y = cholesterol, color = restaurant)) +
  # A single jittered layer: stacking geom_point() and geom_jitter() would draw every item twice
  geom_jitter(alpha = 0.1) +
  scale_colour_viridis_d() +
  labs(title = "Fast Food Restaurant Carb and Cholesterol Content by Menu Item",
       caption = "Source: Data.World",
       x = "Carbs (in grams)",
       y = "Cholesterol (in milligrams)",
       color = "Restaurant") +
  theme_bw()

Remove a couple of outliers

vitamin_total2 <- vitamin_total[vitamin_total$item != "American Brewhouse King",]
vitamin_total3 <- vitamin_total2[vitamin_total2$item != "10 piece Sweet N' Spicy Honey BBQ Glazed Tenders",]

Clean the data

I am cleaning up the visualization now. This includes manually changing the color scheme to colors that appear brighter and are more distinct from one another, as well as changing the theme of the scatterplot to theme_stata, which seems more interesting and appropriate. I also increased the alpha slightly in an attempt to make each colored point stand out more without obscuring any neighboring points.

ggplot(vitamin_total3, aes(x = total_carb, y = cholesterol, color = restaurant)) +
  # A single jittered layer so the alpha setting actually takes effect
  geom_jitter(alpha = 0.3) +
  scale_color_manual(values = c("red", "blue", "brown", "green", "plum", "#E60", "cyan", "yellow")) +
  labs(title = "Fast Food Restaurant Carb and Cholesterol Content by Menu Item",
       caption = "Source: Data.World",
       x = "Carbs (in grams)",
       y = "Cholesterol (in milligrams)",
       color = "Restaurant") +
  theme_stata() # Change the theme

This second visualization plots every menu item in the dataset by its carb and cholesterol content on a scatterplot, color-coded by restaurant. The correlation between these two variables is, overall, much weaker than in the first scatterplot. There is much variety in the amount of carbs in each food item, while the amount of cholesterol is generally concentrated under 200 milligrams. Subway and Taco Bell, in particular, have menu items with a wide range of carbs: some items are under 25 grams and others are above 100 grams. Both of those restaurants sell a wider variety of food than other, more niche fast food places. Other restaurants such as Burger King and Sonic have food items that are much closer to 50 grams of carbs, on average. I wonder if these two restaurants have similar carb content in their food because they are both primarily burger joints. The dots for McDonald’s food items are very interesting, because they are all over the place. Three of the five highest-cholesterol items come from McDonald’s. The American Brewhouse King, which ranks as the number one highest-cholesterol food item in the dataset, was removed as an outlier before this plot was drawn.

Measuring fat from calories and vitamin totals for each restaurant

ggplot(vitamin_total3, aes(x = cal_fat, y = vitamin_total, color = restaurant)) +
  # A single jittered layer: stacking geom_point() and geom_jitter() would draw every item twice
  geom_jitter(alpha = 0.08) +
  scale_color_manual(values = c("red", "blue", "brown", "green", "plum", "#E60", "cyan", "yellow")) +
  facet_wrap(~restaurant) +
  labs(title = "Fast Food Restaurant Fat Calories and Vitamin Contents",
       x = "Calories From Fat",
       y = "Total Vitamins",
       color = "Restaurant",
       caption = "Source: Data.World") +
  theme_bw()
Warning: Removed 213 rows containing missing values or values outside the scale range
(`geom_point()`).

Remove the outlier menu item from Subway

vitamin_total4 <- vitamin_total3[vitamin_total3$item != "Footlong Turkey & Bacon Avocado",]

Remove Burger King from the visualization since there are no calories from fat or vitamin data for that restaurant

This will change the color of each remaining restaurant, so I am removing the blue color that was assigned to Burger King as well.
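One way to avoid editing the color list by hand whenever a restaurant is dropped is to give each color a name. The sketch below is only an illustration: it assumes the restaurant labels are spelled as in the filter() call in this document and that Burger King appears as "Burger King", and the small data frame is made up rather than taken from the dataset. With a named vector, scale_color_manual() matches colors to restaurants by name instead of by position.

```r
library(ggplot2)

# Named palette: each restaurant keeps its color no matter which others are plotted.
# Labels are assumed to match the restaurant column; "#EE6600" is the long form of "#E60".
restaurant_colors <- c(
  "Arbys"       = "red",
  "Burger King" = "blue",
  "Chick Fil-A" = "brown",
  "Dairy Queen" = "green",
  "Mcdonalds"   = "plum",
  "Sonic"       = "#EE6600",
  "Subway"      = "cyan",
  "Taco Bell"   = "yellow"
)

# Hypothetical toy data standing in for the real dataset (values are made up)
toy <- data.frame(
  restaurant    = c("Arbys", "Burger King", "Subway", "Taco Bell"),
  cal_fat       = c(200, 300, 150, 180),
  vitamin_total = c(20, 15, 60, 25)
)

# Drop Burger King; the palette itself does not need to change
toy_no_bk <- subset(toy, restaurant != "Burger King")

p <- ggplot(toy_no_bk, aes(x = cal_fat, y = vitamin_total, color = restaurant)) +
  geom_point() +
  scale_color_manual(values = restaurant_colors) # Colors are matched by name
p
```

Because the mapping is by name, Subway stays cyan and Taco Bell stays yellow in every plot, whether or not Burger King is included.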

Now let’s see our final visualization

vitamin_total5 <- vitamin_total4 |> filter(restaurant %in% c("Arbys", "Chick Fil-A", "Dairy Queen", "Mcdonalds", "Sonic", "Subway", "Taco Bell"))
ggplot(vitamin_total5, aes(x = cal_fat, y = vitamin_total, color = restaurant)) +
  # A single jittered layer: stacking geom_point() and geom_jitter() would draw every item twice
  geom_jitter(alpha = 0.08) +
  scale_color_manual(values = c("red", "brown", "green", "plum", "#E60", "cyan", "yellow")) +
  facet_wrap(~restaurant) +
  labs(title = "Fast Food Restaurant Fat Calories and Vitamin Contents",
       x = "Calories From Fat",
       y = "Total Vitamins",
       color = "Restaurant",
       caption = "Source: Data.World") +
  theme_bw()
Warning: Removed 144 rows containing missing values or values outside the scale range
(`geom_point()`).

Conclusion

The last visualization looks at calories specifically from fat and the total vitamins column I created from the vitamin A and vitamin C columns already given. Each restaurant has its own unique-looking scatterplot. Generally speaking, each restaurant has very few food items that contain more than 500 calories from fat. Once again, McDonald’s has the most variety, this time in both calories from fat and in total vitamins. I am a little surprised that McDonald’s has a handful of food items with such a high vitamin total given that it also has some of the highest-cholesterol items, and I do not typically associate those two nutrients with each other. I usually associate cholesterol with unhealthy food and vitamins with healthy food, so I suppose these results speak to the surprisingly wide variety in McDonald’s menu. Subway also has several food items with low to intermediate levels of calories from fat and total vitamins. Despite Chick-Fil-A being known for chicken and Arby’s being known for roast beef, the two restaurants have nearly identical plots: few calories from fat overall and a vitamin total under 100 for each item. The Taco Bell plot is the opposite of McDonald’s and, just like with its low cholesterol content, shows few calories from fat and low vitamin totals in its items.

I wish I had been able to add commas to the values on my x and y axes to make the values in the thousands clearer at first glance. In the first scatterplot, I also wanted to move the x and y axis titles farther away from the numbers on the scale, since they almost run into each other.
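Both of these adjustments appear to be possible with the scales package and ggplot2’s theme() function. The following is a minimal sketch rather than a drop-in fix for the plots in this document: the data frame is hypothetical, and the 12-point margins are arbitrary values chosen for illustration. If axis limits are needed, they could be passed through the limits argument of scale_x_continuous() and scale_y_continuous() instead of xlim() and ylim().

```r
library(ggplot2)
library(scales)

# Hypothetical data with values in the thousands (made up for illustration)
toy <- data.frame(sodium   = c(500, 1500, 2500, 3500),
                  calories = c(200, 480, 770, 1060))

p <- ggplot(toy, aes(x = sodium, y = calories)) +
  geom_point() +
  # label_comma() formats 1500 as "1,500"
  scale_x_continuous(labels = label_comma()) +
  scale_y_continuous(labels = label_comma()) +
  labs(x = "Sodium (in milligrams)", y = "Calories") +
  theme(
    # margin() adds space (in points) between each axis title and the tick labels
    axis.title.x = element_text(margin = margin(t = 12)),
    axis.title.y = element_text(margin = margin(r = 12))
  )
p
```

Increasing the t (top) and r (right) margins pushes the x and y axis titles farther from the tick labels.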

Bibliography

1. Lingo D. What Are Calories and How Many Do You Need? EatingWell. Published March 8, 2023. https://www.eatingwell.com/article/8033186/what-are-calories/

2. MedlinePlus. Carbohydrates. medlineplus.gov. Published 2022. https://medlineplus.gov/carbohydrates.html

3. MedlinePlus. Dietary fats explained: MedlinePlus Medical Encyclopedia. medlineplus.gov. Accessed July 8, 2024. https://medlineplus.gov/ency/patientinstructions/000104.htm

4. American Heart Association. What Is Cholesterol? www.heart.org. Published November 6, 2020. https://www.heart.org/en/health-topics/cholesterol/about-cholesterol

5. Gordon B. Is Sodium the Same Thing as Salt. www.eatright.org. Published August 8, 2019. https://www.eatright.org/health/essential-nutrients/minerals/is-sodium-the-same-thing-as-salt

6. Better Health Channel. Sugar. Better Health Channel. Published 2011. https://www.betterhealth.vic.gov.au/health/healthyliving/sugar

7. MedlinePlus. What are proteins and what do they do? medlineplus.gov. Published 2021. https://medlineplus.gov/genetics/understanding/howgeneswork/protein/

8. Brazier Y. Vitamins: What are they and what do they do? Medical News Today. Published December 15, 2020. https://www.medicalnewstoday.com/articles/195878

9. National Institutes of Health. Office of Dietary Supplements - Calcium. nih.gov. Published December 6, 2019. https://ods.od.nih.gov/factsheets/Calcium-Consumer/

10. International Dairy Queen Revenue - Zippia. www.zippia.com. Published December 14, 2021. https://www.zippia.com/international-dairy-queen-careers-27508/revenue/

11. Carter SM. Chick-fil-A lands behind McDonald’s as second-highest-grossing fast-food chain. FOXBusiness. Published May 15, 2020. https://www.foxbusiness.com/lifestyle/chick-fil-a-lands-behind-mcdonalds-as-second-highest-grossing-fast-food-chain