Business Problem
Valentine’s Day is one of the most significant retail holidays, driving substantial consumer spending on gifts, food, and beverages. To capitalize on this seasonal demand, it is essential to analyze different purchasing patterns across different product categories and demographic groups to maximze profit and optimize inventory management. By examining sales trends, price variations and different consumer segments, this analysis identifies key shopping behaviors that can inform decisions regarding inventory management, promotional strategy and pricing models.
Data & Methodology
The analysis utilized transactions data from the
completejourney package, including products and
demographics data for further analysis. The dataset was cleaned and
filtered to focus on five Valentine’s-related product categories:
Alcohol, Chocolates, Flowers, Cards & Gifts, and Frozen Meals.
Various metrics—total sales, quantity sold, and average price per
unit—were analyzed over time. Additionally, spending behavior was
segmented by income and age groups to identify key consumer trends. Data
visualizations were created to highlight patterns and variations.
Proposed Solution & Business Impact
This analysis provides actionable insights that can help the Regork CEO optimize marketing, pricing, and inventory management decisions for future Valentine’s Day promotions. Key proposed solution includes increased inventory for Chocolates, Flowers and Alcohol during peak demand days (February 10-14) and increased prices on high demand items such as Flowers and Alcohol closer to February 14th to maximize revenue. Early promotions, Bundled offers to increase basket size and demographic specific marketing can also be used to increase profits for Regork.
tidyverse, used
for efficient data manipulation, including filtering, summarizing, and
joining datasets.library(completejourney)
## Welcome to the completejourney package! Learn more about these data
## sets at http://bit.ly/completejourney.
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ ggplot2 3.5.1 ✔ tibble 3.2.1
## ✔ lubridate 1.9.4 ✔ tidyr 1.3.1
## ✔ purrr 1.0.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(dplyr)
library(ggplot2)
library(lubridate)
library(knitr)
For my data prepartion, I began by identifying key product categories associated with Valentine’s Day, including Flowers, Chocolates, Alcohol, Cards & Gifts, and Frozen Meals. These items were selected based on their traditional significance in Valentine’s Day celebrations, ranging from romantic gifts to frozen meals for at-home dining. Once the categories were established, I filtered transaction data to focus on the weeks leading up to Valentine’s Day (February 1-14, 2017), as well as a broader year-long timeframe (January to December 2017) to observe seasonal patterns.
With the key items identified, I prepared the data in R by merging the transactions dataset with product and demographic information to gain insight into purchasing behavior across different consumer segments. To ensure data integrity, missing values in sales values and product categories were removed, and transactions were grouped based on the Valentine’s category they belonged to. I also developed custom functions for data visualization to analyze key metrics, including total sales, quantity sold, and average price per unit. These functions allowed for an in-depth examination of purchasing trends over time, highlighting peak shopping periods and variations across different product categories while reducing code reptition. By structuring the dataset this way, I was able to explore which categories saw the highest sales spikes, when consumers started making purchases, and how shopping behaviors varied across demographics.
library(dplyr)
library(ggplot2)
library(lubridate)
library(completejourney)
# Data Cleaning & Preparation
transactions <- get_transactions()
# Ensure dates are in proper format
transactions <- transactions %>%
mutate(transaction_timestamp = as.Date(transaction_timestamp))
# Merge transactions with product and demographic data
transactions_cleaned <- transactions %>%
left_join(products, by = "product_id") %>%
left_join(demographics, by = "household_id")
transactions_cleaned <- transactions_cleaned %>%
rename(
income_bracket = income,
age_group = age
) # Rename necessary columns for clarity
transactions_cleaned <- transactions_cleaned %>%
filter(!is.na(sales_value), !is.na(product_category)) # Remove missing values in key columns
valentines_groups <- data.frame(
product_category = c(
# Flowers
"FLORAL-FRESH CUT", "FLORAL BALLOONS", "FLORAL-FLOWERING PLANTS", "FLORAL-ACCESSORIES", "FLORAL-FOLIAGE PLANTS", "ROSES",
# Chocolates
"CANDY - PACKAGED", "CANDY - CHECKLANE", "CANDY BARS (SINGLES)(INCLUDING)",
"SEASONAL CANDY BAGS-CHOCOLATE", "SEASONAL CANDY BAGS NON-CHOCOL",
"SEASONAL CANDY BOX-CHOCOLATE", "CANDY BAGS-CHOCOCLATE", "CANDY BOXED CHOCOLATES",
"GUM (PACKAGED)", "MINTS CANDY & BREATH (NOT LIF)", "NOVELTY CANDY", "CHEWING GUM",
"CANDY BAGS-CHOCOCLATE W/FLOUR", "CANDY BARS MULTI PACK W/FLOUR",
# Alcohol
"BEERS/ALES", "DOMESTIC WINE", "IMPORTED WINE", "MISC WINE", "LIQUOR",
# Cards & Gifts
"GREETING CARDS/WRAP/PARTY SPLY", "VALENTINE",
# Frozen Meals
"FRZN MEAT/MEAT DINNERS"
),
valentines_group = c(
rep("Flowers", 6), rep("Chocolates", 14), rep("Alcohol", 5),
rep("Cards & Gifts", 2), rep("Frozen Meals", 1)
)
) # Valentines Grouping
# Merge transactions with Valentine's categories
transactions_cleaned <- transactions_cleaned %>%
left_join(valentines_groups, by = "product_category")
transactions_cleaned <- transactions_cleaned %>%
filter(!is.na(valentines_group))
## Functions
#Plot 1
plot_valentines_trend <- function(metric) {
# Ensure valid
valid_metrics <- c("total_sales", "total_quantity", "avg_price")
if (!(metric %in% valid_metrics)) {
stop("Invalid metric. Choose from 'total_sales', 'total_quantity', or 'avg_price'.")
}
# Filter transactions for February 1-14
transactions_feb <- transactions_cleaned %>%
filter(transaction_timestamp >= as.Date("2017-02-01") & transaction_timestamp <= as.Date("2017-02-14"))
# Aggregate data
trend_data <- transactions_feb %>%
group_by(transaction_timestamp, valentines_group) %>%
summarise(
value = case_when(
metric == "total_sales" ~ sum(sales_value, na.rm = TRUE),
metric == "total_quantity" ~ sum(quantity, na.rm = TRUE),
metric == "avg_price" ~ sum(sales_value, na.rm = TRUE) / sum(quantity, na.rm = TRUE)
),
.groups = "drop"
)
# y-axis label
y_label <- case_when(
metric == "total_sales" ~ "Total Sales ($)",
metric == "total_quantity" ~ "Total Quantity Bought",
metric == "avg_price" ~ "Average Price per Unit ($)"
)
# line chart
ggplot(trend_data, aes(x = transaction_timestamp, y = value, color = valentines_group, group = valentines_group)) +
geom_line(size = 1.2) +
geom_point(size = 3) +
scale_color_manual(values = c("#D72638", "#3F88C5", "#F49D37", "#140F2D", "#9A031E")) +
scale_x_date(date_breaks = "1 day", date_labels = "%b %d", expand = c(0.02, 0.02)) +
labs(
title = paste(y_label, "Trends (Feb 1-14)"),
x = "Date",
y = y_label,
color = "Product Category"
) +
theme_minimal(base_size = 14) +
theme(
axis.text.x = element_text(angle = 45, hjust = 1, size = 12),
axis.text.y = element_text(size = 12),
plot.title = element_text(size = 18, face = "bold"),
legend.title = element_text(size = 12, face = "bold"),
legend.text = element_text(size = 10)
)
}
#Plot 2
plot_faceted_trend <- function(metric) {
# Ensure metric is valid
valid_metrics <- c("total_sales", "total_quantity", "avg_price")
if (!(metric %in% valid_metrics)) {
stop("Invalid metric. Choose from 'total_sales', 'total_quantity', or 'avg_price'.")
}
# Filter transactions for February 1-14
transactions_feb <- transactions_cleaned %>%
filter(transaction_timestamp >= as.Date("2017-02-01") & transaction_timestamp <= as.Date("2017-02-14"))
# Aggregate data
trend_data <- transactions_feb %>%
group_by(transaction_timestamp, valentines_group) %>%
summarise(
value = case_when(
metric == "total_sales" ~ sum(sales_value, na.rm = TRUE),
metric == "total_quantity" ~ sum(quantity, na.rm = TRUE),
metric == "avg_price" ~ sum(sales_value, na.rm = TRUE) / sum(quantity, na.rm = TRUE)
),
.groups = "drop"
)
# y-axis label
y_label <- case_when(
metric == "total_sales" ~ "Total Sales ($)",
metric == "total_quantity" ~ "Total Quantity Bought",
metric == "avg_price" ~ "Average Price per Unit ($)"
)
# Generate faceted line chart
ggplot(trend_data, aes(x = transaction_timestamp, y = value, group = valentines_group, color = valentines_group)) +
geom_line(size = 1.2) +
geom_point(size = 3) +
scale_color_manual(values = c("#D72638", "#3F88C5", "#F49D37", "#140F2D", "#9A031E")) +
scale_x_date(date_breaks = "3 days", date_labels = "%b %d", expand = c(0.02, 0.02)) +
labs(
title = paste(y_label, "Trends (Feb 1-14) Across Categories"),
x = "Date",
y = y_label,
color = "Product Category"
) +
theme_minimal(base_size = 14) +
theme(
axis.text.x = element_text(angle = 45, hjust = 1, size = 10),
axis.text.y = element_text(size = 12),
plot.title = element_text(size = 18, face = "bold"),
plot.subtitle = element_text(size = 14, face = "italic", color = "#666666"),
legend.position = "none",
strip.text = element_text(size = 14, face = "bold")
) +
facet_wrap(~valentines_group, scales = "free_y")
}
#Plot 3
plot_category_trend <- function(category_name, metric) {
# Ensure metric is valid
valid_metrics <- c("total_sales", "total_quantity", "avg_price")
if (!(metric %in% valid_metrics)) {
stop("Invalid metric. Choose from 'total_sales', 'total_quantity', or 'avg_price'.")
}
# Filter data for the selected category and date range
category_data <- transactions_cleaned %>%
filter(valentines_group == category_name,
transaction_timestamp >= as.Date("2017-01-01") & transaction_timestamp <= as.Date("2017-02-28")) %>%
group_by(transaction_timestamp) %>%
summarise(
value = case_when(
metric == "total_sales" ~ sum(sales_value, na.rm = TRUE),
metric == "total_quantity" ~ sum(quantity, na.rm = TRUE),
metric == "avg_price" ~ sum(sales_value, na.rm = TRUE) / sum(quantity, na.rm = TRUE)
),
.groups = "drop"
)
# y-axis label
y_label <- case_when(
metric == "total_sales" ~ "Total Sales ($)",
metric == "total_quantity" ~ "Total Quantity Bought",
metric == "avg_price" ~ "Average Price per Unit ($)"
)
# Plot area chart
ggplot(category_data, aes(x = transaction_timestamp, y = value)) +
geom_area(fill = "#FFC0CB", alpha = 0.8) +
scale_x_date(date_labels = "%b %d", date_breaks = "1 week") +
labs(
title = paste(y_label, "Trends (Jan-Feb):", category_name),
subtitle = paste("Tracking", y_label, "from Jan 1 to Feb 28"),
x = "Date",
y = y_label
) +
theme_minimal(base_size = 14) +
theme(
axis.text.x = element_text(angle = 45, hjust = 1, size = 10, face = "bold"),
axis.text.y = element_text(size = 12),
plot.title = element_text(size = 18, face = "bold"),
plot.subtitle = element_text(size = 14, face = "italic", color = "#666666")
)
}
#plot 4
plot_category_trend_yearly <- function(category_name, metric) {
valid_metrics <- c("total_sales", "total_quantity", "avg_price")
if (!(metric %in% valid_metrics)) {
stop("Invalid metric. Choose from 'total_sales', 'total_quantity', or 'avg_price'.")
}
category_data <- transactions_cleaned %>%
filter(valentines_group == category_name,
transaction_timestamp >= as.Date("2017-01-01") &
transaction_timestamp <= as.Date("2017-12-31")) %>%
group_by(transaction_timestamp) %>%
summarise(
value = case_when(
metric == "total_sales" ~ sum(sales_value, na.rm = TRUE),
metric == "total_quantity" ~ sum(quantity, na.rm = TRUE),
metric == "avg_price" ~ sum(sales_value, na.rm = TRUE) / sum(quantity, na.rm = TRUE)
),
.groups = "drop"
) %>%
complete(transaction_timestamp = seq.Date(from = as.Date("2017-01-01"),
to = as.Date("2017-12-31"),
by = "day"),
fill = list(value = 0))
y_label <- case_when(
metric == "total_sales" ~ "Total Sales ($)",
metric == "total_quantity" ~ "Total Quantity Bought",
metric == "avg_price" ~ "Average Price per Unit ($)"
)
ggplot(category_data, aes(x = transaction_timestamp, y = value)) +
geom_area(fill = "#FFC0CB", alpha = 0.8) +
scale_x_date(
date_labels = "%b %d",
breaks = seq(as.Date("2017-01-01"), as.Date("2017-12-31"), by = "1 month"),
limits = c(as.Date("2017-01-01"), as.Date("2017-12-31"))
) +
labs(
title = paste(y_label, "Trends (Jan-Dec):", category_name),
subtitle = paste("Tracking", y_label, "throughout the year"),
x = "Month",
y = y_label
) +
theme_minimal(base_size = 14) +
theme(
axis.text.x = element_text(angle = 45, hjust = 1, size = 10, face = "bold"),
axis.text.y = element_text(size = 12),
plot.title = element_text(size = 18, face = "bold"),
plot.subtitle = element_text(size = 14, face = "italic", color = "#666666")
)
}
In this section, we analyze Valentine’s Day sales trends by examining consumer behavior and demographic purchasing patterns.
The Consumer Trends section explores how Valentine’s Day spending changes over time by analyzing key sales metrics—Total Sales, Quantity Sold, and Price—across different product categories.
plot_valentines_trend("total_sales")
Summary: This graph tracks total sales across five product categories—Alcohol, Cards & Gifts, Chocolates, Flowers, and Frozen Meals—during the first two weeks of February leading up to Valentine’s Day.
Insights:
Sales for all categories rise significantly as Valentine’s Day approaches, with a notable spike on February 14, particularly for Flowers and Chocolates. Alcohol sales exhibit multiple peaks throughout the period, alongside categories, such as Chocolates and Cards & Gifts indicating that consumers may be purchasing these items together as complementary products.
Recommendations:
Retailers should prioritize inventory and promotions for flowers and chocolates in the final days before February 14, as demand surges sharply. Fresh flowers on February 11th with higher price could generate more revenue Cross-promotional bundling of chocolates, alcohol, and greeting cards may encourage higher spending.
plot_faceted_trend("total_sales")
Summary: The chart presents the total sales trends for different Valentine’s Day-related product categories from February 1 to February 14.
Insights: Alcohol sales exhibit multiple peaks, particularly around February 3-5 and February 10-12, suggesting early purchases for gatherings and pre-Valentine’s celebrations. Flowers, on the other hand, show a sharp increase right before Valentine’s Day, highlighting their last-minute nature as a gift purchase.
Recommendations: Retailers should consider targeted promotions on chocolates and alcohol in the early weeks of February to capitalize on early planners, while ensuring adequate stock and pricing strategies for flowers closer to Valentine’s Day.
Check out the graphs below to see how total sales varies from January to February and throughout the year
plot_category_trend("Alcohol", "total_sales")
plot_category_trend_yearly("Alcohol", "total_sales")
plot_category_trend("Chocolates", "total_sales")
plot_category_trend_yearly("Chocolates", "total_sales")
plot_category_trend("Flowers", "total_sales")
plot_category_trend_yearly("Flowers", "total_sales")
plot_category_trend("Frozen Meals", "total_sales")
plot_category_trend_yearly("Frozen Meals", "total_sales")
plot_category_trend("Cards & Gifts", "total_sales")
plot_category_trend_yearly("Cards & Gifts", "total_sales")
plot_valentines_trend("total_quantity")
plot_faceted_trend("total_quantity")
Summary: These graphs tracks the total quantity of units sold across five product categories—Alcohol, Cards & Gifts, Chocolates, Flowers, and Frozen Meals—during the first two weeks of February leading up to Valentine’s Day.
Insights: Chocolates consistently lead in quantity sold, with multiple peaks throughout the period, particularly on February 3-5, February 9-11, and a final surge on February 14, indicating high demand throughout the holiday period.
Recommendations: Stocking and promotions for chocolates should begin early, as demand remains steady before peaking on February 14. Bundling alcohol and chocolates as pre-Valentine’s celebration packs could maximize revenue, especially during the early-February peaks.
Check out the graphs below to see how total quantity units sold varies from January to February and throughout the year
plot_category_trend("Alcohol", "total_quantity")
plot_category_trend_yearly("Alcohol", "total_quantity")
plot_category_trend("Chocolates", "total_quantity")
plot_category_trend_yearly("Chocolates", "total_quantity")
plot_category_trend("Flowers", "total_quantity")
plot_category_trend_yearly("Flowers", "total_quantity")
plot_category_trend("Frozen Meals", "total_quantity")
plot_category_trend_yearly("Frozen Meals", "total_quantity")
plot_category_trend("Cards & Gifts", "total_quantity")
plot_category_trend_yearly("Cards & Gifts", "total_quantity")
plot_valentines_trend("avg_price")
plot_faceted_trend("avg_price")
Summary: These graphs track the average price per unit across five product categories—Alcohol, Cards & Gifts, Chocolates, Flowers, and Frozen Meals—during the first two weeks of February leading up to Valentine’s Day.
Insights: Flowers experience a significant price increase in the days leading up to February 14 peaking sharply, suggesting retailers adjust pricing to capitalize on last-minute demand. Chocolates exhibit a gradual rise in price, particularly after February 10, indicating a shift in pricing strategy as Valentine’s Day approaches. Cards & Gifts show a steady increase in price, suggesting higher-margin opportunities as demand rises closer to the holiday.
Recommendations: Dynamic pricing for flowers should be used in the final days before Valentine’s Day to maximize revenue, particularly from late shoppers. Chocolates and greeting cards could benefit from early promotional pricing, then a shift to premium pricing closer to February 14 to optimize both volume and profit margins.
Check out the graphs below to see how average price per unit varies from January to February and throughout the year
plot_category_trend("Alcohol", "avg_price")
plot_category_trend_yearly("Alcohol", "avg_price")
plot_category_trend("Chocolates", "avg_price")
plot_category_trend_yearly("Chocolates", "avg_price")
plot_category_trend("Flowers", "avg_price")
plot_category_trend_yearly("Flowers", "avg_price")
plot_category_trend("Frozen Meals", "avg_price")
plot_category_trend_yearly("Frozen Meals", "avg_price")
plot_category_trend("Cards & Gifts", "avg_price")
plot_category_trend_yearly("Cards & Gifts", "avg_price")
# Filter for transactions from Jan 1 - Feb 28
transactions_filtered <- transactions_cleaned %>%
filter(transaction_timestamp >= as.Date("2017-01-01") &
transaction_timestamp <= as.Date("2017-02-28"))
# Calculate total sales per income bracket per category
income_sales <- transactions_filtered %>%
group_by(income_bracket, valentines_group) %>%
summarise(total_sales = sum(sales_value, na.rm = TRUE), .groups = "drop")
# unique shoppers per income bracket
income_population <- transactions_filtered %>%
group_by(income_bracket) %>%
summarise(total_shoppers = n_distinct(household_id), .groups = "drop")
# Normalize sales per shopper
income_sales_normalized <- income_sales %>%
left_join(income_population, by = "income_bracket") %>%
mutate(sales_per_shopper = total_sales / total_shoppers)
# Plot
ggplot(income_sales_normalized, aes(x = valentines_group, y = sales_per_shopper, fill = income_bracket)) +
geom_bar(stat = "identity", position = "dodge") +
labs(
title = "Income Bracket Spending Per Shopper (Jan-Feb)",
subtitle = "Sales by number of shoppers per income bracket",
x = "Valentine's Product Category",
y = "Average Sales per Shopper ($)",
fill = "Income Bracket"
) +
theme_minimal(base_size = 14) +
theme(axis.text.x = element_text(angle = 45, hjust = 1))
Summary: This graph visualizes the average spending per shopper across different income brackets for Valentine’s Day product categories during January and February.
Insights: Higher-income brackets ($150K+) tend to spend significantly more per shopper, especially on Alcohol and Cards & Gifts, indicating a preference for premium or bulk purchases. Spending on Chocolates and Flowers is relatively uniform across income groups, suggesting that these items are widely accessible and purchased regardless of financial status. Lower-income shoppers allocate a greater proportion of their spending towards Frozen Meals compared to higher-income groups, potentially indicating different purchasing priorities or celebratory habits.
Recommendations: Retailers targeting high-income consumers should consider premium product offerings and exclusive bundles for Alcohol and Cards & Gifts. Chocolates and Flowers should be marketed broadly to all income segments, with differentiated pricing strategies to appeal to both budget-conscious and premium shoppers. Promotions on Frozen Meals could be optimized for lower-income segments, potentially bundling them with affordable chocolates to increase basket size.
# Calculate total sales per age group per category
age_sales <- transactions_filtered %>%
group_by(age_group, valentines_group) %>%
summarise(total_sales = sum(sales_value, na.rm = TRUE), .groups = "drop")
# unique shoppers per age group
age_population <- transactions_filtered %>%
group_by(age_group) %>%
summarise(total_shoppers = n_distinct(household_id), .groups = "drop")
# Normalize sales per shopper
age_sales_normalized <- age_sales %>%
left_join(age_population, by = "age_group") %>%
mutate(sales_per_shopper = total_sales / total_shoppers)
# Plot
ggplot(age_sales_normalized, aes(x = age_group, y = sales_per_shopper, fill = age_group)) +
geom_bar(stat = "identity") +
facet_wrap(~ valentines_group, scales = "free_y") +
labs(
title = "Age Group Spending Per Shopper (Jan-Feb)",
subtitle = "Sales by number of shoppers per age group",
x = "Age Group",
y = "Average Sales per Shopper ($)",
fill = "Age Group"
) +
theme_minimal(base_size = 14) +
theme(
axis.text.x = element_text(angle = 90, hjust = 1),
strip.text = element_text(size = 14, face = "bold")
)
Summary: This graph illustrates the average spending per shopper across various age groups for different Valentine’s Day product categories during January and February.
Insights: Younger consumers (19-24) exhibit the highest spending on Alcohol, suggesting that this demographic prioritizes beverages for celebrations. Middle-aged shoppers (35-54) dominate spending on Chocolates and Cards & Gifts, indicating their role as primary gift purchasers. Spending on Flowers is relatively uniform across younger and middle-aged groups but declines among shoppers aged 65+, implying a lower tendency for last-minute or traditional gifting. Frozen Meals see the highest spending among younger age groups, possibly due to convenience-driven purchases or single-person households.
Recommendations: Alcohol promotions should target younger consumers with bundle deals or discounts leading up to Valentine’s Day. Marketing campaigns for Chocolates and Cards & Gifts should focus on the 35-54 demographic, emphasizing premium and personalized options. Flower sales may benefit from targeted messaging to younger shoppers, promoting early purchases or pre-order incentives.
library(knitr)
# Create data frame
data <- data.frame(
valentines_group = c("Flowers", "Chocolates & Candy", "Alcohol", "Greeting Cards & Gifts", "Frozen Meat Dinners",
"Flowers", "Chocolates & Candy", "Alcohol", "Greeting Cards & Gifts", "Frozen Meat Dinners"),
total_sales_value = c(806.30, 2476.34, 5529.81, 1512.77, 3323.46,
1059.06, 957.51, 649.07, 959.58, 526.51),
shopper_type = c("Early (Feb 1–12)", "Early (Feb 1–12)", "Early (Feb 1–12)", "Early (Feb 1–12)", "Early (Feb 1–12)",
"Last-Minute (Feb 13–14)", "Last-Minute (Feb 13–14)", "Last-Minute (Feb 13–14)",
"Last-Minute (Feb 13–14)", "Last-Minute (Feb 13–14)")
)
# Display table
kable(data, caption = "Valentine’s Day Sales Summary by Shopper Type")
| valentines_group | total_sales_value | shopper_type |
|---|---|---|
| Flowers | 806.30 | Early (Feb 1–12) |
| Chocolates & Candy | 2476.34 | Early (Feb 1–12) |
| Alcohol | 5529.81 | Early (Feb 1–12) |
| Greeting Cards & Gifts | 1512.77 | Early (Feb 1–12) |
| Frozen Meat Dinners | 3323.46 | Early (Feb 1–12) |
| Flowers | 1059.06 | Last-Minute (Feb 13–14) |
| Chocolates & Candy | 957.51 | Last-Minute (Feb 13–14) |
| Alcohol | 649.07 | Last-Minute (Feb 13–14) |
| Greeting Cards & Gifts | 959.58 | Last-Minute (Feb 13–14) |
| Frozen Meat Dinners | 526.51 | Last-Minute (Feb 13–14) |
Problem Statement
The objective of this analysis was to understand consumer purchasing behavior surrounding Valentine’s Day by examining sales trends, quantity sold, price fluctuations, and demographic spending patterns across key product categories. The goal was to identify actionable insights that could help Regork Grocery Chain optimize its inventory, pricing strategies, and marketing efforts to maximize revenue and enhance customer experience.
Methodology & Data Used
The analysis utilized transactions data from
completejourney package, including demographics and
products data. The dataset was cleaned and filtered to focus on five
Valentine’s-related product categories: Alcohol, Chocolates, Flowers,
Cards & Gifts, and Frozen Meals. Various metrics—total sales,
quantity sold, and average price per unit—were analyzed over time.
Additionally, spending behavior was segmented by income and age groups
to identify key consumer trends. Data visualizations were created to
highlight patterns and variations.
Key Insights:
Demographic Trends:
Implications & Proposal to the Regork CEO:
Limitations & Areas for Improvement: