Analysis of Cereal Nutrition
library(tidyverse)  # provides read_csv(), the dplyr verbs, stringr, and ggplot2
Cereal <- read_csv("https://myxavier-my.sharepoint.com/:x:/g/personal/berryp1_xavier_edu/EWp2YMBmFMRFkSUVJ_qbHZ4B6TU4_qrI56EaObOH7qBfbg?download=1")
Introduction
This dataset contains 76 cereals and their nutritional information, such as protein, carbs, fat, sugar, and fiber. It also records which shelf the cereal is placed on in a store, the manufacturer, and whether the cereal is served hot or cold. Each row represents a different cereal.
Research Question
Do cereals with the word "wheat" in their name have fewer calories than those without it? I find this interesting because keywords like this portray a cereal as healthy, so I wonder whether that branding is reflected in the actual calorie content.
Analysis
I will need to split the cereals into two categories: those with the word "wheat" in their name and those without it. I will do this by using mutate() to add a column that is TRUE if the name contains "wheat" (ignoring case) and FALSE if it does not. Next, I will create two side-by-side boxplots comparing calories for cereals with and without "wheat" in the name.
Cereal %>%
  # TRUE when the cereal name contains "wheat", regardless of capitalization
  mutate(has_wheat = str_detect(name, regex("wheat", ignore_case = TRUE))) %>%
  # Explicit levels keep FALSE -> "Other" and TRUE -> "Includes Wheat"
  ggplot(aes(x = factor(has_wheat, levels = c(FALSE, TRUE),
                        labels = c("Other", "Includes Wheat")),
             y = calories)) +
  geom_boxplot() +
  labs(
    title = "Comparison of Calorie Content in Cereals with and without 'Wheat' in the Name",
    x = "Cereal Name",
    y = "Calories")
This boxplot shows that cereals with the word “Wheat” in their name tend to have fewer calories per serving. The box on the right (“Includes Wheat”) has a lower median, Q1, and Q3 than the box for the other cereals. This is consistent with “Wheat” being used to signal a healthier alternative. Cereals with “Wheat” in their name also have a smaller IQR, indicating less variation in calorie content across the middle half of that group.
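The median, Q1, Q3, and IQR read off the boxplot can also be checked numerically. The following is a minimal sketch, not part of the original analysis: the helper name calorie_summary_by_wheat and its output column names are hypothetical, and it assumes the data frame has a character column name and a numeric column calories, as used in the plot above.

```r
library(dplyr)
library(stringr)

# Hypothetical helper: five-number-style summary of calories for cereals
# with and without "wheat" in the name.
# Assumes `data` has a character column `name` and a numeric column `calories`.
calorie_summary_by_wheat <- function(data) {
  data %>%
    # Same case-insensitive match used for the boxplot
    mutate(has_wheat = str_detect(name, regex("wheat", ignore_case = TRUE))) %>%
    group_by(has_wheat) %>%
    summarise(
      n_cereals  = n(),
      q1_cal     = quantile(calories, 0.25, na.rm = TRUE, names = FALSE),
      median_cal = median(calories, na.rm = TRUE),
      q3_cal     = quantile(calories, 0.75, na.rm = TRUE, names = FALSE),
      iqr_cal    = IQR(calories, na.rm = TRUE),
      .groups    = "drop"
    )
}
```

Calling calorie_summary_by_wheat(Cereal) should return one row per group. The n_cereals column is worth checking first, because if only a handful of cereals have "wheat" in their name, the quartiles for that group rest on very few observations.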