Question 1: Using tidyverse commands discussed in class, subset just the rows corresponding to food items sold at Arby’s, Subway, and Taco Bell. Share your code here.
groups_3 <- fastfood %>%
  filter(restaurant %in% c("Subway", "Arbys", "Taco Bell"))
Question 2: Using the mutate() function on the full dataset (not the one created in Question 1), create a column that subtracts the calories from fat from the total calorie content of each food item.
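# Assumes cal_fat is the dataset's calories-from-fat column (total_fat is grams of fat, not calories)
fastfood <-
  mutate(fastfood, cal_noFat = calories - cal_fat)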
Question 3: Share what commands you would use to select just the restaurant, item, and calories columns AND only include the food items with calorie counts > 1000. Do this all using piping.
fastfood %>%
  select(restaurant, item, calories) %>%
  filter(restaurant %in% c("Subway", "Arbys", "Taco Bell")) %>%
  filter(calories > 1000)
# A tibble: 5 x 3
  restaurant item                                     calories
  <chr>      <chr>                                       <dbl>
1 Arbys      Triple Decker Sandwich                       1030
2 Subway     Footlong Big Hot Pastrami                    1160
3 Subway     Footlong Carved Turkey & Bacon w/ Cheese     1140
4 Subway     Footlong Chicken & Bacon Ranch Melt          1140
5 Subway     Footlong Italian Hero                        1100
Question 4: Using piping, the group_by function, and the summarise function, compute the average calorie content, the standard deviation of the calorie content, and the sample size for Arby’s, Subway, and Taco Bell separately. Use the dataset you created in Question 1. Share your code here and fill in the following table with your results:
group <- groups_3 %>%
  group_by(restaurant) %>%
  summarise(mean_cal = mean(calories),
            sd_cal = sd(calories),
            n = n())
group
# A tibble: 3 x 4
  restaurant mean_cal sd_cal     n
  <chr>         <dbl>  <dbl> <int>
1 Arbys          533.   210.    55
2 Subway         503.   282.    96
3 Taco Bell      444.   184.   115
Question 5: What are the hypotheses for this ANOVA test?
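For a one-way ANOVA comparing the three restaurants, the hypotheses are usually written as below. The symbols are an assumed notation (they do not come from the assignment): each \(\mu\) stands for the population mean calorie content of the items at that restaurant.

```latex
\begin{aligned}
H_0 &: \mu_{\text{Arbys}} = \mu_{\text{Subway}} = \mu_{\text{Taco Bell}} \\
H_A &: \text{at least one of the three means differs from the others}
\end{aligned}
```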
Question 6: Create histograms of the calorie contents of each restaurant. Evaluate how normal the distribution of calories looks for each restaurant.
The distribution of calorie content for Subway looks normal, with no extreme outliers.
The distribution of calorie content for Arby’s looks normal, with no extreme outliers.
The distribution of calorie content for Taco Bell looks normal, with no extreme outliers.
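The histograms themselves are not shown in this write-up. One way they could be drawn with ggplot2 is sketched below. The helper name `calorie_histograms` and the bin width of 100 calories are assumptions made for illustration; the only requirement is a data frame with `restaurant` and `calories` columns, such as `groups_3` from Question 1.

```r
library(ggplot2)

# Hypothetical helper (not from the assignment): one histogram panel per restaurant.
# `data` is assumed to contain `restaurant` and `calories` columns, like groups_3.
calorie_histograms <- function(data, binwidth = 100) {
  ggplot(data, aes(x = calories)) +
    geom_histogram(binwidth = binwidth, color = "white") +
    facet_wrap(~ restaurant, ncol = 1) +   # stack panels so the shapes are easy to compare
    labs(x = "Calories", y = "Number of items")
}
```

Calling `calorie_histograms(groups_3)` would then give one panel each for Arbys, Subway, and Taco Bell on a shared calorie axis.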
Question 7: Create side-by-side boxplots of the calorie contents of each restaurant. Evaluate the constant variance assumption based on these boxplots. Feel free to reference the standard deviations you computed in Question 4.
The side-by-side boxplots of each restaurant’s calorie content show that the variability is approximately constant across the three restaurants.
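A possible sketch of the boxplots, together with a quick numeric check on the spread, is given below. The helper name `calorie_boxplots` is an assumption for illustration. The standard deviations are the rounded values from the Question 4 table, and comparing the largest to the smallest is only an informal rule of thumb, not a formal test.

```r
library(ggplot2)

# Hypothetical helper (not from the assignment): side-by-side boxplots by restaurant.
# `data` is assumed to contain `restaurant` and `calories` columns, like groups_3.
calorie_boxplots <- function(data) {
  ggplot(data, aes(x = restaurant, y = calories)) +
    geom_boxplot() +
    labs(x = "Restaurant", y = "Calories")
}

# Rounded standard deviations copied from the Question 4 table
sd_cal <- c(Arbys = 210, Subway = 282, `Taco Bell` = 184)

# Ratio of the largest to the smallest standard deviation
sd_ratio <- max(sd_cal) / min(sd_cal)
round(sd_ratio, 2)
```

A ratio well below 2 is often taken as informal support for treating the variances as roughly equal.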
Question 8: Using the above code (but filling in the appropriate variable names), carry out an ANOVA test using a significance level of 0.05. Fill in the following table with your results AND state your conclusions.
             Df   Sum Sq Mean Sq F value Pr(>F)
restaurant    2   352468  176234   3.351 0.0365 *
Residuals   263 13829781   52585
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The p-value of 0.0365 is smaller than the significance level of 0.05, so we reject the null hypothesis. That is, the data provide sufficient evidence that the average calorie content differs for at least one of the three restaurants.
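As a check on the ANOVA table above, the mean squares, F statistic, and p-value can be recomputed from the sums of squares and degrees of freedom. This is a minimal sketch in base R; the variable names are illustrative. A table like this would typically come from something like `summary(aov(calories ~ restaurant, data = groups_3))`, which is assumed here rather than shown in the original.

```r
# Values copied from the ANOVA table above
ss_between <- 352468
df_between <- 2
ss_within  <- 13829781
df_within  <- 263

# Mean square = sum of squares / degrees of freedom
ms_between <- ss_between / df_between
ms_within  <- ss_within / df_within

# F statistic and its upper-tail probability under the F(2, 263) distribution
f_value <- ms_between / ms_within
p_value <- pf(f_value, df_between, df_within, lower.tail = FALSE)

round(c(F = f_value, p = p_value), 4)
```

The results should agree with the F value and Pr(>F) columns of the table up to rounding.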
Question 9: Carry out a pairwise t-test using a Bonferroni correction. Share your results here and state whether you rejected any of the pairwise t-tests.
Pairwise comparisons using t tests with pooled SD

data:  groups_3$calories and groups_3$restaurant

          Arbys Subway
Subway    1.000 -
Taco Bell 0.056 0.187

P value adjustment method: bonferroni
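With three pairwise comparisons, the Bonferroni correction corresponds to a modified significance level of 0.05/3 = 0.0167 for the unadjusted p-values. The p-values reported by pairwise.t.test() with a Bonferroni correction are already adjusted (each raw p-value is multiplied by 3 and capped at 1), so they are compared directly to 0.05, which leads to the same decisions. All three adjusted p-values (1.000, 0.056, and 0.187) are larger than 0.05. With this method, then, there is not strong evidence of a difference in average calorie content between any pair of these restaurants.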
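The Bonferroni adjustment itself can be illustrated with base R's `p.adjust()`. The raw p-values below are made-up numbers chosen only to show the mechanics; they are not results from the fastfood data. The sketch shows that comparing adjusted p-values to 0.05 gives the same decisions as comparing raw p-values to 0.05/3.

```r
# Hypothetical raw (unadjusted) p-values for three pairwise comparisons.
# Illustrative only: these are NOT taken from the fastfood analysis.
raw_p <- c(0.010, 0.020, 0.500)
alpha <- 0.05
m <- length(raw_p)  # number of comparisons

# Bonferroni: multiply each p-value by the number of comparisons, capping at 1
adj_p  <- p.adjust(raw_p, method = "bonferroni")
manual <- pmin(1, raw_p * m)

# Two equivalent decision rules
reject_adjusted <- adj_p < alpha      # adjusted p-values vs. alpha
reject_raw      <- raw_p < alpha / m  # raw p-values vs. alpha / m

adj_p
identical(reject_adjusted, reject_raw)
```

The output shown in Question 9 names `groups_3$calories` and `groups_3$restaurant` as its data, so it presumably came from a call like `pairwise.t.test(groups_3$calories, groups_3$restaurant, p.adjust.method = "bonferroni")`.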
Subway \(\stackrel{?}{=}\) Arby’s, Subway \(\stackrel{?}{=}\) Taco Bell, Arby’s \(\stackrel{?}{=}\) Taco Bell
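In conclusion, we did not have sufficient evidence to reject the null hypothesis for any of the three pairwise comparisons, even though the overall ANOVA test was significant at the 0.05 level.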