Research Question - Are sodium levels the same among different fast food restaurants?

Introduction of dataset

The dataset contains nutritional information for menu items from fast food restaurants. Its variables include sodium, cholesterol, protein, calories, and other nutritional measures. For my analysis I only need two variables: sodium and restaurant. The dataset consists of 515 observations and 17 variables and was accessed through OpenIntro. Link: https://www.openintro.org/data/index.php?data=fastfood The data cover 8 fast food restaurants: Arby’s, Burger King, Chick-fil-A, Dairy Queen, McDonald’s, Sonic, Subway, and Taco Bell.

Loading Data & Libraries

library(tidyverse)

## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.2.0     ✔ readr     2.1.6
## ✔ forcats   1.0.1     ✔ stringr   1.6.0
## ✔ ggplot2   4.0.2     ✔ tibble    3.3.1
## ✔ lubridate 1.9.5     ✔ tidyr     1.3.2
## ✔ purrr     1.2.1     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

library(dplyr)
library(ggplot2)
setwd("C:/Users/tonge/Downloads")
fastfood <- read_csv("fastfood.csv")

## Rows: 515 Columns: 17
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (3): restaurant, item, salad
## dbl (14): calories, cal_fat, total_fat, sat_fat, trans_fat, cholesterol, sod...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

Data Analysis

I will examine the sodium levels at each restaurant to see which one is highest. Next, to address my research question, I will create a chart that is grouped and color coded by restaurant, with sodium on the y axis, so one can see the differences across restaurants.

head(fastfood)

## # A tibble: 6 × 17
##   restaurant item       calories cal_fat total_fat sat_fat trans_fat cholesterol
##   <chr>      <chr>         <dbl>   <dbl>     <dbl>   <dbl>     <dbl>       <dbl>
## 1 Mcdonalds  Artisan G…      380      60         7       2       0            95
## 2 Mcdonalds  Single Ba…      840     410        45      17       1.5         130
## 3 Mcdonalds  Double Ba…     1130     600        67      27       3           220
## 4 Mcdonalds  Grilled B…      750     280        31      10       0.5         155
## 5 Mcdonalds  Crispy Ba…      920     410        45      12       0.5         120
## 6 Mcdonalds  Big Mac         540     250        28      10       1            80
## # ℹ 9 more variables: sodium <dbl>, total_carb <dbl>, fiber <dbl>, sugar <dbl>,
## #   protein <dbl>, vit_a <dbl>, vit_c <dbl>, calcium <dbl>, salad <chr>

str(fastfood)

## spc_tbl_ [515 × 17] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
##  $ restaurant : chr [1:515] "Mcdonalds" "Mcdonalds" "Mcdonalds" "Mcdonalds" ...
##  $ item       : chr [1:515] "Artisan Grilled Chicken Sandwich" "Single Bacon Smokehouse Burger" "Double Bacon Smokehouse Burger" "Grilled Bacon Smokehouse Chicken Sandwich" ...
##  $ calories   : num [1:515] 380 840 1130 750 920 540 300 510 430 770 ...
##  $ cal_fat    : num [1:515] 60 410 600 280 410 250 100 210 190 400 ...
##  $ total_fat  : num [1:515] 7 45 67 31 45 28 12 24 21 45 ...
##  $ sat_fat    : num [1:515] 2 17 27 10 12 10 5 4 11 21 ...
##  $ trans_fat  : num [1:515] 0 1.5 3 0.5 0.5 1 0.5 0 1 2.5 ...
##  $ cholesterol: num [1:515] 95 130 220 155 120 80 40 65 85 175 ...
##  $ sodium     : num [1:515] 1110 1580 1920 1940 1980 950 680 1040 1040 1290 ...
##  $ total_carb : num [1:515] 44 62 63 62 81 46 33 49 35 42 ...
##  $ fiber      : num [1:515] 3 2 3 2 4 3 2 3 2 3 ...
##  $ sugar      : num [1:515] 11 18 18 18 18 9 7 6 7 10 ...
##  $ protein    : num [1:515] 37 46 70 55 46 25 15 25 25 51 ...
##  $ vit_a      : num [1:515] 4 6 10 6 6 10 10 0 20 20 ...
##  $ vit_c      : num [1:515] 20 20 20 25 20 2 2 4 4 6 ...
##  $ calcium    : num [1:515] 20 20 50 20 20 15 10 2 15 20 ...
##  $ salad      : chr [1:515] "Other" "Other" "Other" "Other" ...
##  - attr(*, "spec")=
##   .. cols(
##   ..   restaurant = col_character(),
##   ..   item = col_character(),
##   ..   calories = col_double(),
##   ..   cal_fat = col_double(),
##   ..   total_fat = col_double(),
##   ..   sat_fat = col_double(),
##   ..   trans_fat = col_double(),
##   ..   cholesterol = col_double(),
##   ..   sodium = col_double(),
##   ..   total_carb = col_double(),
##   ..   fiber = col_double(),
##   ..   sugar = col_double(),
##   ..   protein = col_double(),
##   ..   vit_a = col_double(),
##   ..   vit_c = col_double(),
##   ..   calcium = col_double(),
##   ..   salad = col_character()
##   .. )
##  - attr(*, "problems")=<externalptr>

Checking for NAs - there are none in the variables I will need

colSums(is.na(fastfood))

##  restaurant        item    calories     cal_fat   total_fat     sat_fat 
##           0           0           0           0           0           0 
##   trans_fat cholesterol      sodium  total_carb       fiber       sugar 
##           0           0           0           0          12           0 
##     protein       vit_a       vit_c     calcium       salad 
##           1         214         210         210           0
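
Because only sodium and restaurant are used in this analysis, the NA check can also be restricted to those two columns. The following is a minimal sketch in base R; `toy` and its values are hypothetical stand-ins for `fastfood`, used only so the example runs on its own.

```r
# Hypothetical stand-in for the fastfood data; the values are made up.
toy <- data.frame(
  restaurant = c("Arbys", "Subway", "Sonic"),
  sodium     = c(1480, 1130, 1250),
  fiber      = c(3, NA, 2)   # a column with a missing value, for contrast
)

# Count NAs only in the two variables used in the analysis.
needed    <- c("restaurant", "sodium")
na_counts <- colSums(is.na(toy[, needed]))
na_counts
```

On the real data the same idea would be `colSums(is.na(fastfood[, c("restaurant", "sodium")]))`.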

Visualizations:

This interactive visualization shows that sodium fluctuates from item to item within each restaurant. However, McDonald’s has the highest sodium spikes. The code used is from my Data 110 notes (Maliha, 2026).

library(highcharter)

## Registered S3 method overwritten by 'quantmod':
##   method            from
##   as.zoo.data.frame zoo

## Highcharts (www.highcharts.com) is a Highsoft software product which is

## not free for commercial and Governmental use

library(RColorBrewer)
library(dplyr)
library(tidyverse)

cols <- brewer.pal(8, "Set1")  # one color per restaurant (8 groups)

highchart() |>
  hc_add_series(data = fastfood,
                type = "line",
                hcaes(y = sodium,
                      group = restaurant)) |>
  hc_colors(cols) |>
  hc_xAxis(title = list(text = "Menu item (row order within each restaurant)")) |>
  hc_yAxis(title = list(text = "Sodium (mg)"))
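
As a static complement to the interactive chart, a boxplot with restaurant on the x axis and sodium on the y axis makes the spread within each restaurant easier to compare. This is only a sketch, not part of the original analysis; `toy` and its values are hypothetical stand-ins for `fastfood`.

```r
library(ggplot2)

# Hypothetical stand-in for the fastfood data; the values are made up.
toy <- data.frame(
  restaurant = rep(c("Arbys", "Mcdonalds", "Taco Bell"), each = 4),
  sodium     = c(1100, 1480, 1600, 2100,
                  500, 1120, 1900, 3200,
                  700,  960, 1050, 1400)
)

# Restaurant on the x axis, sodium on the y axis, one fill color per restaurant.
p <- ggplot(toy, aes(x = restaurant, y = sodium, fill = restaurant)) +
  geom_boxplot() +
  labs(x = "Restaurant", y = "Sodium (mg)") +
  theme(legend.position = "none")
p
```

Replacing `toy` with `fastfood` would draw one box for each of the 8 restaurants.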

Summary of sodium (mg) by fast food restaurant - Here we can see the mean sodium for each restaurant. The highest average is seen at Arby’s (1515.273), followed by McDonald’s (1437.895).

fastfood_means <- fastfood |>
  group_by(restaurant) |>
  summarise(
    mean_sodium = mean(sodium, na.rm = TRUE),
    median_sodium = median(sodium, na.rm = TRUE),
    sd_sodium = sd(sodium, na.rm = TRUE),
    min_sodium = min(sodium, na.rm = TRUE),
    max_sodium = max(sodium, na.rm = TRUE))
fastfood_means

## # A tibble: 8 × 6
##   restaurant  mean_sodium median_sodium sd_sodium min_sodium max_sodium
##   <chr>             <dbl>         <dbl>     <dbl>      <dbl>      <dbl>
## 1 Arbys             1515.          1480      664.        100       3350
## 2 Burger King       1224.          1150      500.        310       2310
## 3 Chick Fil-A       1151.          1000      727.        220       3660
## 4 Dairy Queen       1182.          1030      610.         15       3500
## 5 Mcdonalds         1438.          1120     1036.         20       6080
## 6 Sonic             1351.          1250      665.        470       4520
## 7 Subway            1273.          1130      744.         65       3540
## 8 Taco Bell         1014.           960      474.        290       2260
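
To make the ranking easier to read, the means can be sorted from highest to lowest. The sketch below uses base R and the mean values as printed in the table above (rounded to whole mg); the names `means` and `ranked` are introduced here only for illustration.

```r
# Means as printed in the summary table above (rounded to whole mg).
means <- data.frame(
  restaurant  = c("Arbys", "Burger King", "Chick Fil-A", "Dairy Queen",
                  "Mcdonalds", "Sonic", "Subway", "Taco Bell"),
  mean_sodium = c(1515, 1224, 1151, 1182, 1438, 1351, 1273, 1014)
)

# Sort from highest to lowest mean sodium.
ranked <- means[order(means$mean_sodium, decreasing = TRUE), ]
ranked
```

With dplyr loaded, `arrange(fastfood_means, desc(mean_sodium))` should give the same ordering on the full summary table.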

Statistical Analysis

Hypothesis:

\(H_0\): \(\mu_A = \mu_B = \mu_C = \mu_D = \mu_E = \mu_F = \mu_G = \mu_H\)

\(H_a\): not all \(\mu_i\) are equal

ANOVA

Testing the mean sodium levels across 8 different fast food restaurants

anova_result <- aov(sodium ~ restaurant, data = fastfood)

anova_result

## Call:
##    aov(formula = sodium ~ restaurant, data = fastfood)
## 
## Terms:
##                 restaurant Residuals
## Sum of Squares    13382025 231300945
## Deg. of Freedom          7       507
## 
## Residual standard error: 675.4368
## Estimated effects may be unbalanced

summary(anova_result)

##              Df    Sum Sq Mean Sq F value   Pr(>F)    
## restaurant    7  13382025 1911718    4.19 0.000167 ***
## Residuals   507 231300945  456215                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
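
The F value and p-value in this table can be reproduced by hand from the sums of squares and degrees of freedom, which shows how the test statistic is built. This is a sketch of the arithmetic only; the variable names are introduced here for illustration.

```r
# Values copied from the ANOVA table above.
ss_between <- 13382025;  df_between <- 7
ss_within  <- 231300945; df_within  <- 507

ms_between <- ss_between / df_between   # mean square for restaurant
ms_within  <- ss_within / df_within     # mean square for residuals
f_value    <- ms_between / ms_within    # ratio of the two mean squares

# Upper-tail probability of the F distribution gives the p-value.
p_value <- pf(f_value, df_between, df_within, lower.tail = FALSE)

round(f_value, 2)
p_value
```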

Interpretation: The p-value (0.000167) is well below alpha = 0.05, so there is strong evidence against the null hypothesis. Overall, this test suggests that mean sodium levels are not the same across all of the fast food restaurants; at least one restaurant’s mean differs from the others.
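
ANOVA indicates that at least one mean differs, but not which restaurants differ from each other. One common follow-up, which was not run in this report, is Tukey's HSD for pairwise comparisons. The sketch below assumes a small made-up data frame (`toy`, with hypothetical groups A, B, and C) and converts the grouping variable to a factor as a precaution, since `TukeyHSD()` works on factor terms.

```r
# Hypothetical stand-in for the fastfood data; the values are made up.
toy <- data.frame(
  restaurant = factor(rep(c("A", "B", "C"), each = 5)),
  sodium     = c(1500, 1600, 1450, 1550, 1700,
                 1000,  950, 1100, 1050,  900,
                 1020,  980, 1080, 1010,  940)
)

fit   <- aov(sodium ~ restaurant, data = toy)
tukey <- TukeyHSD(fit)   # all pairwise comparisons, family-wise 95% level

tukey$restaurant         # columns: diff, lwr, upr, p adj
```

Each row is one pair of groups; a small adjusted p-value suggests that pair's means differ.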

Conclusion and Future Steps

The ANOVA test resulted in a statistically significant p-value at alpha = 0.05. Therefore, there is evidence that mean sodium levels differ among the fast food restaurants. Furthermore, the summaries show that Arby’s has the highest mean sodium (about 1515 mg), followed by McDonald’s (about 1438 mg), while the visualization and summary table show that McDonald’s has the highest single-item sodium value (6080 mg) and the largest spread. For consumers trying to stay healthy, this can help them decide which restaurants to avoid, especially McDonald’s, whose menu includes the most extreme high-sodium items. The results from this analysis can also be used by health experts to convince people to avoid fast food, since avoiding these high-sodium foods can help prevent the health conditions associated with high sodium intake, such as cardiovascular disease and hypertension. In the future, I could conduct an ANOVA to compare sodium levels across menu item categories. I could group food items into categories like burgers, sandwiches, tacos, chicken tenders, etc., and then test which categories have the highest sodium levels. Additionally, I could fit a linear regression model to test whether sodium increases as calories increase.
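
As a rough preview of the regression idea, the sketch below fits sodium on calories using only the first ten values shown in the `str()` output above. Ten McDonald’s items are far too few to draw conclusions; the example only illustrates the form of the model, and `first10` is a name introduced here for illustration.

```r
# First ten calories and sodium values, copied from the str() output above.
first10 <- data.frame(
  calories = c(380, 840, 1130, 750, 920, 540, 300, 510, 430, 770),
  sodium   = c(1110, 1580, 1920, 1940, 1980, 950, 680, 1040, 1040, 1290)
)

# Simple linear regression of sodium on calories.
fit <- lm(sodium ~ calories, data = first10)
coef(fit)   # intercept and slope (mg of sodium per additional calorie)
```

On the full dataset the call would be `lm(sodium ~ calories, data = fastfood)`, followed by `summary()` to inspect the slope and its p-value.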

References (APA)

Maliha, M. (2026). Working with Continuous Variables with DS Labs and HighCharter [Class notes]. Montgomery College. DATA 110.

OpenIntro. (n.d.). fastfood [Data set]. https://www.openintro.org/data/index.php?data=fastfood

Project 2 - Data 101

S Tonge

2026-03-31