For Project 2, we will be tidying three datasets. This file focuses on the Generational Takeout Spending dataset. We will import the wide-format CSV file from GitHub into R and transform it into a tidy structure using the tidyr and dplyr packages from the tidyverse. We will reshape the monthly spending columns into a long format with pivot_longer(), then standardize column names with rename(). Our dataset did not contain any missing or inconsistent values, so we did not need functions such as drop_na() to prepare the data for analysis. Finally, we will analyze trends in takeout spending across generations and visualize the results using ggplot2. A potential challenge is restructuring the spending columns correctly during the wide-to-long transformation.
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.1.4 ✔ readr 2.1.6
✔ forcats 1.0.1 ✔ stringr 1.6.0
✔ ggplot2 4.0.2 ✔ tibble 3.3.0
✔ lubridate 1.9.4 ✔ tidyr 1.3.1
✔ purrr 1.2.0
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
We created the dataset based on studies of takeout spending habits across generations. We imported the raw CSV file from the main branch of our Project 2 public GitHub repository.
To perform an initial inspection of the dataset, we printed the generational_takeout tibble to the console. This provided a summary of the data structure, confirming dimensions of 5 rows and 7 columns. Viewing the data this way let us verify that the spending values were imported correctly and that the dataset is in a wide format.
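For readers without access to the repository, the wide table can be sketched from the values printed in the tidy output further down. This is a reconstruction under assumptions, not the original import step: the name wide_sketch is hypothetical, and the month column names assume the Jan.25, Feb.25 pattern mentioned in the reshaping comments.

```r
library(tibble)

# Hypothetical stand-in for generational_takeout, rebuilt from the printed
# tidy output. Column names assume the Jan.25 ... Jun.25 pattern.
wide_sketch <- tribble(
  ~Generation,    ~Jan.25, ~Feb.25, ~Mar.25, ~Apr.25, ~May.25, ~Jun.25,
  "Gen Z",            185,     172,     198,     210,     205,     225,
  "Millennials",      240,     228,     255,     270,     265,     285,
  "Gen X",            195,     188,     205,     215,     210,     220,
  "Baby Boomers",     120,     115,     130,     140,     135,     145,
  "Silent Gen",        75,      70,      82,      88,      85,      90
)

# One row per generation, one column per month plus the label column
dim(wide_sketch)  # 5 7
```

Each generation occupies a single row and each month a separate column, which is the wide layout that the reshaping step converts to one row per generation and month.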
We reshaped the dataset from wide to tidy (long) format. We normalized variable structures and renamed variables to follow a consistent naming convention. We did not have missing or inconsistent values to address.
tidy_gen_takeout <- generational_takeout %>%
  # Reshape from wide to long (tidy) format
  pivot_longer(
    cols = -Generation,     # all columns except 'Generation'
    names_to = "period",    # column names (Jan.25, Feb.25...) go into 'period'
    values_to = "expenses"  # values go into 'expenses'
  ) %>%
  # Split 'period' into month and year as part of the normalization process.
  # "\\." escapes the '.' so separate() splits on the literal dot;
  # '.' on its own is a regex wildcard character.
  separate(period, into = c("month", "year"), sep = "\\.") %>%
  # Standardize year to 4 digits and convert Generation to factor
  mutate(
    year = paste0("20", year),
    Generation = as.factor(Generation)
  ) %>%
  # Standardize column names manually
  rename(generation = Generation)

# Preview tidy dataset
tidy_gen_takeout %>%
  print(n = Inf)
# A tibble: 30 × 4
generation month year expenses
<fct> <chr> <chr> <int>
1 Gen Z Jan 2025 185
2 Gen Z Feb 2025 172
3 Gen Z Mar 2025 198
4 Gen Z Apr 2025 210
5 Gen Z May 2025 205
6 Gen Z Jun 2025 225
7 Millennials Jan 2025 240
8 Millennials Feb 2025 228
9 Millennials Mar 2025 255
10 Millennials Apr 2025 270
11 Millennials May 2025 265
12 Millennials Jun 2025 285
13 Gen X Jan 2025 195
14 Gen X Feb 2025 188
15 Gen X Mar 2025 205
16 Gen X Apr 2025 215
17 Gen X May 2025 210
18 Gen X Jun 2025 220
19 Baby Boomers Jan 2025 120
20 Baby Boomers Feb 2025 115
21 Baby Boomers Mar 2025 130
22 Baby Boomers Apr 2025 140
23 Baby Boomers May 2025 135
24 Baby Boomers Jun 2025 145
25 Silent Gen Jan 2025 75
26 Silent Gen Feb 2025 70
27 Silent Gen Mar 2025 82
28 Silent Gen Apr 2025 88
29 Silent Gen May 2025 85
30 Silent Gen Jun 2025 90
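The escaped separator in the separate() call is easy to get wrong, so a small illustration may help. The sketch below uses base R's strsplit() as a stand-in for separate(), since both treat a character separator as a regular expression by default; the variable names are hypothetical.

```r
period <- "Jan.25"

# Escaped dot: split only on the literal "."
escaped <- strsplit(period, "\\.")[[1]]

# Unescaped dot: "." matches every character, so every character is
# consumed as a separator and no month or year text survives
unescaped <- strsplit(period, ".")[[1]]

# Alternative: treat the separator as a fixed string instead of a regex
fixed <- strsplit(period, ".", fixed = TRUE)[[1]]

print(escaped)    # "Jan" "25"
print(unescaped)  # only empty strings
print(fixed)      # "Jan" "25"
```

Either escaping the dot or using a fixed-string match yields the month and the two-digit year as separate pieces.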
Analysis
We performed two analyses on our dataset. First, we calculated each generation’s percentage of each month’s total spending. We stored the result in the data frame gen_monthly_percentage, then presented the percentages in a bar graph and a summary table.
The second analysis calculated the average monthly spending for each generation over the entire 6-month period. We stored the result in a data frame named generational_analysis, ranked the generations in descending order, and presented the results in a bar graph and a summary table.
# Generational Share of Monthly Takeout Spending
gen_monthly_percentage <- tidy_gen_takeout %>%
  group_by(month) %>%                      # group by month
  mutate(total_month = sum(expenses)) %>%  # total spending across all generations per month
  ungroup() %>%
  mutate(percent = round((expenses / total_month) * 100)) %>%  # round to whole %
  select(month, generation, expenses, percent) %>%
  arrange(match(month, c("Jan", "Feb", "Mar", "Apr", "May", "Jun")), generation)

gen_monthly_percentage %>%
  mutate(
    expenses = paste0("$", expenses),
    percent = paste0(percent, "%")
  )
# A tibble: 30 × 4
month generation expenses percent
<chr> <fct> <chr> <chr>
1 Jan Baby Boomers $120 15%
2 Jan Gen X $195 24%
3 Jan Gen Z $185 23%
4 Jan Millennials $240 29%
5 Jan Silent Gen $75 9%
6 Feb Baby Boomers $115 15%
7 Feb Gen X $188 24%
8 Feb Gen Z $172 22%
9 Feb Millennials $228 29%
10 Feb Silent Gen $70 9%
# ℹ 20 more rows
# Month order
month_levels <- c("Jan", "Feb", "Mar", "Apr", "May", "Jun")

# Convert month column to ordered factor
gen_monthly_percentage <- gen_monthly_percentage %>%
  mutate(month = factor(month, levels = month_levels))

ggplot(gen_monthly_percentage, aes(x = month, y = percent, fill = generation)) +
  geom_bar(stat = "identity", position = "stack") +  # stacked by generation
  geom_text(
    aes(label = paste0(percent, "%")),
    position = position_stack(vjust = 0.5),  # labels inside each bar
    size = 3
  ) +
  labs(
    title = "Generational Percentage of Monthly Takeout Spending",
    x = "Month",
    y = "Percentage of Total Monthly Spending (%)",
    fill = "Generation"
  ) +
  theme_minimal()
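One detail worth checking in a stacked percentage chart is whether the rounded shares still add up to 100. The base R sketch below uses the February values from the tidy output; the vector names are hypothetical and the check is an illustration, not part of the original analysis.

```r
# February expenses by generation, copied from the tidy output
feb <- c("Gen Z" = 172, "Millennials" = 228, "Gen X" = 188,
         "Baby Boomers" = 115, "Silent Gen" = 70)

exact_share   <- feb / sum(feb) * 100  # unrounded percentages
rounded_share <- round(exact_share)    # whole percentages, as in the analysis

sum(exact_share)    # 100, up to floating-point error
sum(rounded_share)  # 99
```

Because each share is rounded independently, the February bar would stack to 99 rather than 100. If exact bar heights matter, one option is to plot the unrounded share and round only inside the text label.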
generational_analysis <- tidy_gen_takeout %>%
  # Group by Generation
  group_by(generation) %>%
  # Calculate the average spending across the months
  summarise(avg_monthly_spending = round(mean(expenses))) %>%
  # Order from highest to lowest
  arrange(desc(avg_monthly_spending))

# Add $ to amounts only for presentation, so the stored column stays numeric.
generational_analysis %>%
  mutate(avg_monthly_spending = paste0("$", avg_monthly_spending))
# A tibble: 5 × 2
generation avg_monthly_spending
<fct> <chr>
1 Millennials $257
2 Gen X $206
3 Gen Z $199
4 Baby Boomers $131
5 Silent Gen $82
# View the final ranked results
print(generational_analysis)
# A tibble: 5 × 2
generation avg_monthly_spending
<fct> <dbl>
1 Millennials 257
2 Gen X 206
3 Gen Z 199
4 Baby Boomers 131
5 Silent Gen 82
After transforming the dataset from wide to tidy, we performed two analyses: the generational percentage of total monthly spending and the average monthly spending by generation.
The share-of-monthly-spending analysis shows that Millennials account for the largest share of each month’s takeout spending, at 29–30%. Gen X and Gen Z hold almost identical shares each month, at 23–24% and 22–23%, respectively. Of the older generations, Baby Boomers account for only 15% each month, while the Silent Generation stays at 9–10%.
The average monthly spending analysis supports these findings: Millennials spend the most, averaging approximately $257 per month on takeout, followed by Gen X at $206 and Gen Z at roughly $199. Baby Boomers rank fourth with an average of $131, while the Silent Generation has the lowest average at $82 per month, $175 less than the highest spender.
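The averages and the gap between the highest and lowest spender can be re-derived in base R from the monthly values in the tidy output. This is a cross-check under the assumption that the printed values are complete; the object names are hypothetical.

```r
# Monthly expenses (Jan to Jun) per generation, copied from the tidy output
expenses <- list(
  "Millennials"  = c(240, 228, 255, 270, 265, 285),
  "Gen X"        = c(195, 188, 205, 215, 210, 220),
  "Gen Z"        = c(185, 172, 198, 210, 205, 225),
  "Baby Boomers" = c(120, 115, 130, 140, 135, 145),
  "Silent Gen"   = c(75, 70, 82, 88, 85, 90)
)

# Average over the six months, rounded to whole dollars
avg <- round(sapply(expenses, mean))
avg  # 257 206 199 131 82

# Difference between the highest and lowest average
gap <- max(avg) - min(avg)
gap  # 175
```

The Gen X mean is exactly 205.5, so its reported value of $206 depends on how the tie is rounded, which puts Gen X and Gen Z slightly closer than the rounded figures suggest.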
Overall, the visualizations and summary tables highlight several generational differences in spending behavior. Approximately 75% of monthly takeout spending comes from the three younger generations, and Millennials, the middle of the three, account for about 30% on their own, a disproportionately large share. The data also shows that Gen X and Gen Z spend almost on par each month, even though a generation sits between them. Members of Gen X are often the parents of Gen Z, so it could be informative to explore their spending habits in more depth. That the older generations, Baby Boomers and the Silent Generation, spent the least was expected.
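The combined share of the three younger generations can be checked with a short base R calculation. The sketch uses six-month totals summed from the tidy output, so it measures the share over the whole period rather than month by month; the object names are hypothetical.

```r
# Six-month totals per generation, summed from the tidy output
totals <- c("Gen Z" = 1195, "Millennials" = 1543, "Gen X" = 1233,
            "Baby Boomers" = 785, "Silent Gen" = 490)

younger <- c("Gen Z", "Millennials", "Gen X")

# Share of total spending, in percent
younger_share    <- sum(totals[younger]) / sum(totals) * 100
millennial_share <- totals[["Millennials"]] / sum(totals) * 100

round(younger_share, 1)     # 75.7
round(millennial_share, 1)  # 29.4
```

Over the full period the younger three generations account for roughly three quarters of spending, with Millennials just under 30%, which is consistent with the monthly figures.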
Citation
(Google DeepMind). (2026). Gemini Pro 3.1 [Large language model]. https://gemini.google.com. Accessed March 7th, 2026.