setup

library(tidyverse)

## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.6
## ✔ forcats   1.0.1     ✔ stringr   1.6.0
## ✔ ggplot2   4.0.1     ✔ tibble    3.3.0
## ✔ lubridate 1.9.4     ✔ tidyr     1.3.1
## ✔ purrr     1.2.0     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

Why This Dataset?

Starbucks drinks are extremely popular
Many people don’t know how much sugar or how many calories they contain
Goal: help people make healthier choices

Research Questions

Which Starbucks drinks have the highest sugar and calories?
How does drink size affect calories and sugar?

Dataset

TidyTuesday Week 52 (Dec 21, 2021)
Contains:
- Drink name
- Milk type and whipped cream (milk, whip)
- Size
- Calories
- Sugar
- Fat
- Caffeine

Loading Data

library(tidyverse)

starbucks <- read_csv(
  "https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2021/2021-12-21/starbucks.csv"
)

## Rows: 1147 Columns: 15
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (4): product_name, size, trans_fat_g, fiber_g
## dbl (11): milk, whip, serv_size_m_l, calories, total_fat_g, saturated_fat_g,...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

Highest Sugar Drinks

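starbucks %>%
  # Each drink appears once per size/milk option, so collapse to one row per
  # drink (its highest-sugar variant); otherwise geom_col() stacks duplicates
  group_by(product_name) %>%
  summarise(sugar_g = max(sugar_g), .groups = "drop") %>%
  slice_max(sugar_g, n = 10, with_ties = FALSE) %>%
  ggplot(aes(x = reorder(product_name, sugar_g),
             y = sugar_g)) +
  geom_col(fill = "steelblue") +
  coord_flip() +
  labs(title = "Top 10 Highest-Sugar Drinks",
       x = "Drink",
       y = "Sugar (g)")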

Calories by Drink Size

starbucks %>%
  ggplot(aes(x = size, y = calories)) +
  geom_jitter(width = 0.2, height = 0, color = "darkgreen", alpha = 0.6) +
  labs(title = "Calories by Drink Size",
       x = "Drink Size",
       y = "Calories (kcal)")
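
The jitter plot covers calories only, while the second research question also asks about sugar. A minimal sketch of a per-size summary that covers both is shown here. The helper name summarise_by_size is hypothetical (not part of the original analysis), and it assumes the size, calories, and sugar_g columns used elsewhere in this deck.

```r
library(dplyr)

# Hypothetical helper: one row per drink size with typical calories and sugar.
# Assumes `drinks` has the columns size, calories, and sugar_g.
summarise_by_size <- function(drinks) {
  drinks %>%
    group_by(size) %>%
    summarise(
      n = n(),                                           # rows (drink variants) per size
      median_calories = median(calories, na.rm = TRUE),
      median_sugar_g  = median(sugar_g, na.rm = TRUE),
      .groups = "drop"
    ) %>%
    arrange(desc(median_calories))                       # largest sizes should rise to the top
}
```

Calling summarise_by_size(starbucks) returns one row per size, sorted from highest to lowest median calories. Medians are used here as an assumption that they are less sensitive than means to a few very large drinks.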

Key Findings

Frappuccinos are the highest in sugar
Larger drink sizes usually have more calories
Teas are consistently low in calories and sugar
Lattes vary depending on milk type (whole, non-fat, soy, etc.)
Popular drinks can contain 60–80 g of sugar or more
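
The findings above compare drink families (Frappuccinos, teas, lattes), but the column specification printed when loading the data shows no ready-made category column. One way to check such claims is to derive a rough category from product_name. The sketch below is an illustration under that assumption: the helper names add_drink_type and summarise_by_type and the keyword list are hypothetical, and keyword matching will misclassify some drinks.

```r
library(dplyr)
library(stringr)

# Hypothetical helper: tag each drink with a rough type based on keywords
# in product_name. The keyword list is an assumption for illustration.
add_drink_type <- function(drinks) {
  drinks %>%
    mutate(
      name_lower = str_to_lower(product_name),
      # Order matters: the first matching rule wins, so a "tea latte"
      # is counted as a Latte here.
      drink_type = case_when(
        str_detect(name_lower, "frappuccino") ~ "Frappuccino",
        str_detect(name_lower, "latte")       ~ "Latte",
        str_detect(name_lower, "\\btea\\b")   ~ "Tea",   # whole word, so "steamed" does not match
        TRUE                                  ~ "Other"
      )
    ) %>%
    select(-name_lower)
}

# Hypothetical helper: sugar and calorie summary per derived drink type.
summarise_by_type <- function(drinks) {
  drinks %>%
    add_drink_type() %>%
    group_by(drink_type) %>%
    summarise(
      n = n(),
      median_sugar_g  = median(sugar_g, na.rm = TRUE),
      max_sugar_g     = max(sugar_g, na.rm = TRUE),
      median_calories = median(calories, na.rm = TRUE),
      .groups = "drop"
    ) %>%
    arrange(desc(median_sugar_g))
}
```

Calling summarise_by_type(starbucks) gives one row per derived type, which can be read against each finding. The result depends on the keyword rules, so it is a sanity check and not a definitive classification.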

Conclusion

Starbucks drinks vary widely in nutrition
Many high-sugar drinks are not obvious to customers
Size choice has a big impact on calories
Data analysis helps reveal healthier vs. less healthy drink options
This project shows how TidyTuesday data can answer real-world questions

final project

Starbucks Nutrition: A TidyTuesday Analysis
