Understanding customer purchasing behavior is essential for optimizing sales and marketing strategies. This analysis explores whether smoked meat and bread are frequently purchased together, identifying trends and opportunities for increasing co-purchases. By leveraging transaction data, we can determine how often these items are bought together, when sales peak, and how strategic interventions—such as bundling, promotions, or pricing adjustments—can drive sales growth.
This report examines co-purchasing trends of smoked meat and bread using transactional sales data from a national grocery chain. Through exploratory data analysis and visualizations, we analyze purchase frequency, seasonal patterns, and correlations between these two product categories. Insights from this study will inform marketing strategies, such as bundle deals, in-store product placement improvements, and seasonal promotions, to encourage higher sales. The findings will help the company increase revenue and enhance the shopping experience by aligning product offerings with customer behavior.
knitr::opts_chunk$set(echo = TRUE, message = FALSE, warning = FALSE)
# Load necessary packages
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ ggplot2 3.5.1 ✔ tibble 3.2.1
## ✔ lubridate 1.9.4 ✔ tidyr 1.3.1
## ✔ purrr 1.0.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(completejourney)
## Welcome to the completejourney package! Learn more about these data
## sets at http://bit.ly/completejourney.
library(lubridate)
library(ggplot2)
library(gridExtra)
##
## Attaching package: 'gridExtra'
##
## The following object is masked from 'package:dplyr':
##
## combine
library(knitr)
# Load the datasets
transactions <- get_transactions()
products <- completejourney::products
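# Filter products
# Note: these filters work at the department level, so each set is a broad proxy.
# It includes every product in the listed departments, not only smoked meat or bread.
smoked_meat <- products %>% filter(department %in% c("MEAT-PCKGD", "MEAT", "DELI"))
bread <- products %>% filter(department %in% c("PASTRY", "GROCERY"))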
# Check number of products
cat("Smoked Meat Products:", nrow(smoked_meat), "\nBread Products:", nrow(bread))
## Smoked Meat Products: 7328
## Bread Products: 41172
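Because the filters above select whole departments, a narrower definition may be worth trying. The following is a minimal sketch, not the method used in this report: it matches on `product_category` and `product_type` with `str_detect()`. The toy rows and the two search patterns are assumptions for illustration; the real category labels should be inspected first (for example with `distinct(products, product_category)`).

```r
library(dplyr)
library(stringr)

# Toy stand-in for completejourney::products (hypothetical rows, illustration only)
toy_products <- tibble(
  product_id       = c("1", "2", "3", "4", "5"),
  department       = c("MEAT-PCKGD", "MEAT", "GROCERY", "GROCERY", "PASTRY"),
  product_category = c("SMOKED MEATS", "BEEF", "BAKED BREAD/BUNS/ROLLS",
                       "SOFT DRINKS", "CAKES"),
  product_type     = c("SMOKED SAUSAGE", "GROUND BEEF", "SANDWICH BREAD",
                       "COLA", "LAYER CAKE")
)

# Assumed patterns; adjust after checking the real category and type labels
smoked_pattern <- regex("smoked", ignore_case = TRUE)
bread_pattern  <- regex("bread", ignore_case = TRUE)

# Keep a product if either its category or its type matches the pattern
smoked_meat_narrow <- toy_products %>%
  filter(str_detect(product_category, smoked_pattern) |
           str_detect(product_type, smoked_pattern))

bread_narrow <- toy_products %>%
  filter(str_detect(product_category, bread_pattern) |
           str_detect(product_type, bread_pattern))

cat("Smoked Meat Products:", nrow(smoked_meat_narrow),
    "\nBread Products:", nrow(bread_narrow), "\n")
```

Swapping `toy_products` for `products` would apply the same filter to the full catalogue; the downstream counts would then differ from the figures reported here.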
transactions_products <- transactions %>%
  left_join(products, by = "product_id")
co_purchases <- transactions_products %>%
  group_by(basket_id) %>%
  summarise(
    has_smoked_meat = any(product_id %in% smoked_meat$product_id),
    has_bread = any(product_id %in% bread$product_id)
  ) %>%
  mutate(both_purchased = has_smoked_meat & has_bread)
summary_table <- co_purchases %>%
  summarise(
    total_transactions = n(),
    smoked_meat_only = sum(has_smoked_meat & !has_bread),
    bread_only = sum(has_bread & !has_smoked_meat),
    both_purchased = sum(both_purchased),
    percentage_both = (both_purchased / total_transactions) * 100
  )
kable(summary_table, caption = "Summary of Co-Purchase Transactions")
| total_transactions | smoked_meat_only | bread_only | both_purchased | percentage_both |
|---|---|---|---|---|
| 155848 | 3322 | 67917 | 55075 | 35.33892 |
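The raw percentage alone does not show whether the two categories appear together more often than chance would predict. As a rough sketch, the counts in the table above can be turned into conditional shares and a lift value (observed joint share divided by the share expected if the two were independent). The variable names below are illustrative, and the calculation inherits the broad department-level definitions used earlier.

```r
# Counts copied from the summary table above
total_baskets    <- 155848
smoked_meat_only <- 3322
bread_only       <- 67917
both             <- 55075

# Baskets containing each category at all
with_smoked_meat <- smoked_meat_only + both
with_bread       <- bread_only + both

# Share of smoked-meat baskets that also contain bread, and vice versa
p_bread_given_meat <- both / with_smoked_meat
p_meat_given_bread <- both / with_bread

# Lift: observed joint share relative to the share expected under independence
lift <- (both / total_baskets) /
  ((with_smoked_meat / total_baskets) * (with_bread / total_baskets))

round(c(p_bread_given_meat = p_bread_given_meat,
        p_meat_given_bread = p_meat_given_bread,
        lift = lift), 3)
```

With these counts, roughly 94% of smoked-meat baskets also contain bread, about 45% of bread baskets also contain smoked meat, and the lift is about 1.2. A lift above 1 points to a positive association, although the very broad bread definition (GROCERY and PASTRY) makes a high overlap unsurprising.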
## 1. Bar Chart - Frequency of Co-Purchases
co_purchase_plot <- ggplot(co_purchases, aes(x = factor(both_purchased, labels = c("Not Paired", "Paired")))) +
  geom_bar(fill = "steelblue") +
  labs(title = "Frequency of Smoked Meat & Bread Purchases",
       x = "Purchase Category",
       y = "Count") +
  theme_minimal()
co_purchase_plot
## 2. Time Series Plot - Trend of Smoked Meat and Bread Purchases Over Time
# Convert transaction timestamp to date format
transactions_products$transaction_date <- as.Date(transactions_products$transaction_timestamp)
# Count line items from either category per day
# (this is total category volume, not only baskets that contain both)
co_purchases_time <- transactions_products %>%
  filter(product_id %in% c(smoked_meat$product_id, bread$product_id)) %>%
  group_by(transaction_date) %>%
  summarise(total_purchases = n())
# Create the plot
time_series_plot <- ggplot(co_purchases_time, aes(x = transaction_date, y = total_purchases)) +
  geom_line(color = "blue") +
  labs(title = "Trend of Smoked Meat & Bread Purchases Over Time",
       x = "Date",
       y = "Number of Purchases") +
  theme_minimal()
time_series_plot
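The daily series above counts line items from either category, so it tracks overall volume rather than co-purchasing itself. One possible complement, sketched here under assumptions, is the weekly share of baskets that contain both categories, using `lubridate::floor_date()` to bucket dates. The toy table and its column names are hypothetical; in the report, the equivalent input would come from joining `co_purchases` to one transaction date per basket.

```r
library(dplyr)
library(lubridate)

# Hypothetical basket-level data: one row per basket with its date and co-purchase flag
toy_baskets <- tibble(
  basket_id        = c("a", "b", "c", "d"),
  transaction_date = as.Date(c("2017-01-02", "2017-01-04",
                               "2017-01-09", "2017-01-10")),
  both_purchased   = c(TRUE, FALSE, TRUE, TRUE)
)

# Bucket baskets into weeks starting on Monday, then take the share with both categories
weekly_rate <- toy_baskets %>%
  mutate(week = floor_date(transaction_date, unit = "week", week_start = 1)) %>%
  group_by(week) %>%
  summarise(
    baskets          = n(),
    co_purchase_rate = mean(both_purchased),
    .groups = "drop"
  )

weekly_rate
```

Plotting `co_purchase_rate` against `week` would show whether the pairing itself strengthens in particular periods, separately from changes in total store traffic.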
## 3. Heatmap - Frequency of Co-Purchases
co_purchases_heatmap <- co_purchases %>%
  group_by(has_smoked_meat, has_bread) %>%
  # Drop the grouping after counting to avoid the regrouping message
  summarise(count = n(), .groups = "drop") %>%
  ggplot(aes(x = has_smoked_meat, y = has_bread, fill = count)) +
  geom_tile() +
  scale_fill_gradient(low = "white", high = "red") +
  labs(title = "Heatmap of Smoked Meat & Bread Purchases",
       x = "Smoked Meat Purchased",
       y = "Bread Purchased") +
  theme_minimal()
co_purchases_heatmap
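The heatmap displays a 2x2 table of basket counts, which can also be checked formally. The sketch below rebuilds that table from the summary figures and runs a chi-squared test of independence with base R's `chisq.test()`. The "neither" cell is not reported directly; it is derived here as the total minus the other three cells. Treating baskets as independent observations is an assumption, since many baskets come from the same households.

```r
# Cell counts from the summary table; "neither" is derived, not reported
total_baskets    <- 155848
smoked_meat_only <- 3322
bread_only       <- 67917
both             <- 55075
neither          <- total_baskets - smoked_meat_only - bread_only - both

# Rows: has_smoked_meat, columns: has_bread (matrix() fills column by column)
counts <- matrix(
  c(neither, smoked_meat_only,
    bread_only, both),
  nrow = 2,
  dimnames = list(has_smoked_meat = c("FALSE", "TRUE"),
                  has_bread       = c("FALSE", "TRUE"))
)

# Chi-squared test of independence between the two indicators
test_result <- chisq.test(counts)

counts
test_result$p.value
```

With a sample this large, even a small association yields a tiny p-value, so the result is best read alongside an effect-size measure rather than on its own.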
## 4. Scatter Plot - Relationship Between Purchases
co_purchase_scatter <- transactions_products %>%
  filter(product_id %in% c(smoked_meat$product_id, bread$product_id)) %>%
  group_by(basket_id) %>%
  summarise(
    smoked_meat_count = sum(product_id %in% smoked_meat$product_id),
    bread_count = sum(product_id %in% bread$product_id)
  ) %>%
  ungroup()
scatter_plot <- ggplot(co_purchase_scatter, aes(x = smoked_meat_count, y = bread_count)) +
  geom_point(alpha = 0.5, color = "blue") +
  labs(title = "Relationship Between Smoked Meat & Bread Purchases",
       x = "Smoked Meat Items",
       y = "Bread Items") +
  theme_minimal()
scatter_plot
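Because both axes are small integer counts, many baskets land on the same point and overplot. A possible alternative, shown here as a sketch on made-up data, is `geom_count()`, which sizes each point by the number of baskets at that coordinate, paired with a Spearman rank correlation as a single summary number. The toy data frame is hypothetical and only mirrors the column names of `co_purchase_scatter`.

```r
library(ggplot2)

# Hypothetical per-basket item counts (same column names as co_purchase_scatter)
toy_counts <- data.frame(
  smoked_meat_count = c(0, 1, 1, 1, 2, 3),
  bread_count       = c(1, 1, 1, 2, 2, 4)
)

# Rank correlation is less sensitive to a few very large baskets than Pearson
rho <- cor(toy_counts$smoked_meat_count, toy_counts$bread_count,
           method = "spearman")

# geom_count() maps the number of overlapping observations to point size
count_plot <- ggplot(toy_counts, aes(x = smoked_meat_count, y = bread_count)) +
  geom_count(color = "blue", alpha = 0.5) +
  labs(title = "Relationship Between Smoked Meat & Bread Purchases",
       x = "Smoked Meat Items",
       y = "Bread Items",
       size = "Baskets") +
  theme_minimal()

rho
```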
The analysis shows that 55,075 of 155,848 baskets (35.3%) contain both smoked meat and bread, while 67,917 contain bread without smoked meat and 3,322 contain smoked meat without bread. The time series plot highlights seasonal trends, with sales increasing during summer BBQ months and holiday weekends, presenting an opportunity to optimize marketing efforts. The heatmap further suggests that while co-purchasing occurs, it is not automatic, indicating potential for strategic interventions.
To increase co-purchases, the store should implement bundled promotions, such as discounted meal combos or BOGO deals on complementary items. Optimizing store layout by placing smoked meat and bread closer together, along with clear signage, can encourage joint purchases. Additionally, seasonal promotions around peak buying periods, such as BBQ combo deals in summer, can further drive sales.
Adjusting pricing strategies may also be effective, such as offering a discount on bread with the purchase of smoked meat. Together, bundling, store layout optimization, and targeted promotions can increase revenue and enhance the shopping experience, turning occasional co-purchasers into consistent buyers.
ggsave("co_purchases_plot.png", co_purchase_plot, width = 6, height = 4)
ggsave("co_purchases_time_series.png", time_series_plot, width = 6, height = 4)
ggsave("co_purchases_heatmap.png", co_purchases_heatmap, width = 6, height = 4)
ggsave("co_purchases_scatter.png", scatter_plot, width = 6, height = 4)