Introduction

The problem I am going to tackle is how Regork can improve bacon sales. Using an analytical approach, I look at which demographics purchase bacon, which products are frequently purchased with it, what time of year it is purchased, and how often consumers use coupons when buying it. Regork’s CEO can use this information to decide when to run promotions, what to include in those promotions, and which consumer demographics to target with them.

I addressed this problem by examining the main aspects of Regork’s bacon sales: demographics, timing, complementary products, and coupon usage. The analysis uses the completejourney dataset in R.

With this information, the Regork CEO will be able to make more educated decisions about how bacon is marketed in stores through coupons and promotions.

Packages/Libraries Required

completejourney: Provides past data on Regork that can be used for future analysis

ggplot2: Allows for graphical production

dplyr: Makes data manipulation easier

lubridate: Makes date-time data easier to work with

tidyr: Helps create tidy data

stringr: Provides pattern matching functions
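The code in the sections below refers to objects named transactions, products, demographics, and coupons. As a minimal sketch of the setup it appears to rely on, the packages can be loaded and the full transaction table fetched as follows. This assumes the transactions object was created with get_transactions(), which downloads the full data; the package itself ships only a sample of the transactions.

```r
# Load the packages listed above
library(completejourney)
library(ggplot2)
library(dplyr)
library(lubridate)
library(tidyr)
library(stringr)

# Assumption: the analysis uses the full transaction table, which is
# downloaded on request rather than bundled with the package
transactions <- get_transactions()

# products, demographics, and coupons are available once the package is loaded
glimpse(transactions)
```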

Exploratory Data Analysis

What Ages of Consumers are Buying Bacon?

# Parse the timestamp so date functions can be applied to it later
transactions <- transactions %>%
  mutate(transaction_timestamp = ymd_hms(transaction_timestamp))
# Attach product details to every transaction line
transactions_products <- transactions %>%
  inner_join(products, by = "product_id")
# Total bacon quantity purchased by each age group
bacon_sales_age <- transactions_products %>%
  filter(product_category == "BACON") %>%
  inner_join(demographics, by = "household_id") %>%
  group_by(age) %>%
  summarize(total_bacon_purchased = sum(quantity, na.rm = TRUE)) %>%
  arrange(age)
ggplot(bacon_sales_age, aes(x = age, y = total_bacon_purchased)) +
  geom_col(fill = "firebrick") +
  labs(
    title = "Bacon Purchases by Age Group",
    x = "Age Group",
    y = "Quantity of Bacon Purchased"
  ) +
  theme_minimal()

This chart shows bacon purchases by age group. Bacon is most commonly purchased by middle-aged consumers. This is valuable because people in this age range generally have families. The steep drop-off at the 55-64 group could indicate that bacon purchases fall once children move out of the house.
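Raw totals per age group partly reflect how many households fall into each group, so a per-household figure can help separate "more households" from "more bacon per household". The sketch below illustrates the idea on made-up toy data; the toy_ tables and the bacon_per_household column are hypothetical names, and only household_id, age, and quantity mirror columns used above.

```r
library(dplyr)

# Toy stand-ins for bacon transactions and the demographics table.
# All values are invented for illustration.
toy_bacon <- data.frame(
  household_id = c(1, 1, 2, 3, 3, 3, 4),
  quantity     = c(1, 2, 1, 1, 1, 2, 1)
)
toy_demographics <- data.frame(
  household_id = c(1, 2, 3, 4, 5),
  age          = c("35-44", "35-44", "45-54", "55-64", "55-64")
)

# Number of households in each age group, including non-buyers
households_by_age <- toy_demographics %>%
  count(age, name = "households")

# Total bacon per age group, divided by the size of the group
bacon_per_household <- toy_bacon %>%
  inner_join(toy_demographics, by = "household_id") %>%
  group_by(age) %>%
  summarize(total_bacon_purchased = sum(quantity), .groups = "drop") %>%
  inner_join(households_by_age, by = "age") %>%
  mutate(bacon_per_household = total_bacon_purchased / households)

print(bacon_per_household)
```

If the per-household figures follow the same pattern as the totals, the age effect is less likely to be an artifact of group size.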

What Income Range is Purchasing the Most Bacon?

# transactions_products (transactions joined to products) was built in the age analysis
# Total bacon quantity purchased by each income range
bacon_sales_income <- transactions_products %>%
  filter(product_category == "BACON") %>%
  inner_join(demographics, by = "household_id") %>%
  group_by(income) %>%
  summarize(total_bacon_purchased = sum(quantity, na.rm = TRUE)) %>%
  arrange(income)
ggplot(bacon_sales_income, aes(x = income, y = total_bacon_purchased)) +
  geom_col(fill = "firebrick") +
  labs(
    title = "Bacon Purchases by Income",
    x = "Income Range",
    y = "Quantity of Bacon Purchased"
  ) +
  theme_minimal()

This chart shows that bacon is most commonly purchased by middle-class households. Sales peak in the $50-74K range, closely followed by the $35-49K range. This suggests that Regork could look for ways to market to higher-income households in order to increase sales.

What Time of the Year is Bacon Most Purchased?

# transactions_products (transactions joined to products) was built in the age analysis
# Sum bacon quantity by week of the year
weekly_bacon_sales <- transactions_products %>%
  filter(product_category == "BACON") %>%
  mutate(week = week(transaction_timestamp)) %>%
  group_by(week) %>%
  summarize(weekly_bacon_sales = sum(quantity, na.rm = TRUE)) %>%
  # Fill weeks with no bacon sales with 0 so the plotted line has no gaps
  complete(week = 1:52, fill = list(weekly_bacon_sales = 0)) %>%
  arrange(week)
ggplot(data = weekly_bacon_sales, aes(x = week, y = weekly_bacon_sales)) +
  geom_line(color = "firebrick") +
  geom_smooth(method = "loess", color = "tan") +
  labs(
    title = "Bacon Sales by Week of the Year",
    x = "Week Number",
    y = "Quantity of Bacon Purchased"
  ) +
  theme_minimal()

This chart shows a slight increase in bacon sales during the summer and a drop-off in the colder months, which suggests that Regork could run promotions during the winter to keep its products moving.
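One detail worth keeping in mind when reading a week-of-year chart: lubridate's week() counts 7-day blocks starting on January 1, so December 31 of a non-leap year falls into a one-day "week 53" that will look like a sharp drop. The sketch below shows this on invented toy data, along with how complete() fills weeks that have no sales; toy_sales and its values are hypothetical.

```r
library(dplyr)
library(lubridate)
library(tidyr)

# Toy bacon transactions; timestamps and quantities are invented
toy_sales <- data.frame(
  transaction_timestamp = ymd_hms(c(
    "2017-01-02 09:15:00", "2017-01-03 18:40:00",
    "2017-01-20 12:05:00", "2017-12-31 10:30:00"
  )),
  quantity = c(1, 2, 1, 1)
)

weekly <- toy_sales %>%
  mutate(week = week(transaction_timestamp)) %>%
  group_by(week) %>%
  summarize(weekly_bacon_sales = sum(quantity), .groups = "drop") %>%
  # Add a zero row for every week that had no sales
  complete(week = 1:53, fill = list(weekly_bacon_sales = 0)) %>%
  arrange(week)

# December 31 lands in week 53, which covers a single day here
tail(weekly, 2)
```

When interpreting the end of the year, it may be safer to drop or flag that partial week rather than read it as a real decline.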

What Products are Most Commonly Purchased with Bacon?

# Bacon products, matched on product_type
bacon_products <- products %>%
  filter(str_detect(tolower(product_type), "bacon")) %>%
  select(product_id, product_type)
# Transaction lines for bacon, used to identify baskets that contain bacon
bacon_transactions <- transactions %>%
  filter(product_id %in% bacon_products$product_id) %>%
  select(basket_id, product_id)
# Count how often each non-bacon product appears in those baskets
co_purchased_products <- transactions %>%
  filter(basket_id %in% bacon_transactions$basket_id) %>%
  filter(!product_id %in% bacon_products$product_id) %>%
  group_by(product_id) %>%
  summarise(purchase_count = n(), .groups = "drop") %>%
  arrange(desc(purchase_count))
# Look up product types and keep the ten most frequent products
top_co_purchased <- co_purchased_products %>%
  inner_join(products, by = "product_id") %>%
  select(product_type, purchase_count) %>%
  head(10)
ggplot(top_co_purchased, aes(x = product_type, y = purchase_count)) +
  geom_bar(stat = "identity", fill = "firebrick") +
  coord_flip() +
  labs(
    title = "Top Products Purchased with Bacon",
    x = "Product Type",
    y = "Number of Purchases"
  ) +
  theme_minimal()

This chart provides valuable insight into why people are buying bacon. If the top products were buns and hamburgers, it would suggest that bacon is bought for grilling out. Because the complementary products are popular breakfast items, it suggests instead that bacon is bought as a breakfast item and could be featured in breakfast-related promotions.
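Raw co-purchase counts tend to favor items that sell well everywhere, so it can help to express each item as a share of bacon baskets. The following is a minimal sketch on invented toy data; toy_baskets, co_purchase_share, and share_of_bacon_baskets are hypothetical names, and the product labels are placeholders rather than values taken from the dataset.

```r
library(dplyr)

# Toy basket contents; one row per product type in a basket
toy_baskets <- data.frame(
  basket_id    = c(1, 1, 1, 2, 2, 3, 3, 4),
  product_type = c("BACON", "EGGS", "MILK", "BACON", "EGGS",
                   "BACON", "BUNS", "EGGS")
)

# Baskets that contain bacon
bacon_basket_ids <- toy_baskets %>%
  filter(product_type == "BACON") %>%
  distinct(basket_id)

# Share of bacon baskets that also contain each other product type
co_purchase_share <- toy_baskets %>%
  semi_join(bacon_basket_ids, by = "basket_id") %>%
  filter(product_type != "BACON") %>%
  distinct(basket_id, product_type) %>%
  count(product_type, name = "baskets") %>%
  mutate(share_of_bacon_baskets = baskets / nrow(bacon_basket_ids)) %>%
  arrange(desc(share_of_bacon_baskets))

print(co_purchase_share)
```

Comparing this share with the same item's share across all baskets would show whether an item is bought with bacon more often than it is bought in general.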

How Do Coupons and Discounts Affect Bacon Sales?

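# Bacon products, matched on product_type
bacon_products <- products %>%
  filter(str_detect(tolower(product_type), "bacon")) %>%
  select(product_id, product_type)
# Every bacon transaction line with its quantity and sales value
bacon_transactions <- transactions %>%
  filter(product_id %in% bacon_products$product_id) %>%
  select(household_id, basket_id, product_id, quantity, sales_value)
# Products that appear in at least one coupon campaign
coupon_product_ids <- coupons %>%
  distinct(product_id)
# Flag each transaction by whether its product is covered by a coupon campaign.
# Matching on distinct product IDs keeps exactly one row per transaction line,
# so quantities and revenue are neither duplicated nor dropped.
bacon_sales <- bacon_transactions %>%
  mutate(promotion_flag = ifelse(
    product_id %in% coupon_product_ids$product_id,
    "With Promotion", "Without Promotion"
  ))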
bacon_summary <- bacon_sales %>%
  group_by(promotion_flag) %>%
  summarise(
    total_quantity = sum(quantity, na.rm = TRUE),  
    total_revenue = sum(sales_value, na.rm = TRUE)
  )
bacon_summary_long <- bacon_summary %>%
  pivot_longer(cols = c(total_quantity, total_revenue), 
               names_to = "metric", 
               values_to = "value")
ggplot(bacon_summary_long, aes(x = promotion_flag, y = value, fill = metric)) +
  geom_bar(stat = "identity", position = "dodge") +
  labs(
    title = "Impact of Bacon Promotions on Sales Volume & Revenue",
    x = "Promotion Status",
    y = "Units Sold / Revenue ($)",
    fill = "Metric"
  ) +
  theme_minimal()

This chart of bacon quantity purchased and revenue generated shows that far more bacon is bought, and more money is spent, on bacon products that are covered by some sort of promotion. Regork’s CEO could use this information to decide how many promotions to run, depending on how much they want customers to rely on promotions.
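The promotion flag above marks products that appear in a coupon campaign, which is not the same as a coupon actually being redeemed at checkout. One way to look at redemption directly is sketched below on invented toy data. It assumes the transactions table carries a coupon_disc column that is greater than zero when a coupon was applied; toy_bacon_transactions, coupon_flag, and coupon_summary are hypothetical names.

```r
library(dplyr)

# Toy bacon transaction lines; all values are invented
toy_bacon_transactions <- data.frame(
  basket_id   = c(1, 2, 3, 4, 5),
  quantity    = c(1, 2, 1, 1, 3),
  sales_value = c(4.99, 8.50, 3.99, 5.49, 12.00),
  coupon_disc = c(0, 1.00, 0, 0.50, 0)
)

# Assumption: coupon_disc > 0 means a coupon was applied to that line
coupon_summary <- toy_bacon_transactions %>%
  mutate(coupon_flag = ifelse(coupon_disc > 0, "Coupon Used", "No Coupon")) %>%
  group_by(coupon_flag) %>%
  summarize(
    transactions   = n(),
    total_quantity = sum(quantity),
    total_revenue  = sum(sales_value),
    .groups = "drop"
  )

print(coupon_summary)
```

Reporting the number of transactions next to the totals makes it easier to see whether coupon purchases are larger or simply more frequent.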

Summary

The problem I chose to address was to examine the main aspects of Regork’s bacon sales so that the CEO can use the data going forward to make informed decisions that maximize bacon revenue.

I approached this problem by analyzing the completejourney dataset and looking at demographics, timing, complementary products, and coupon usage as they relate to bacon purchases. This ultimately allowed me to assess who is buying bacon, when they are buying it, what they are buying it with, and whether they are using a savings method.

The most interesting insight from my analysis was a very clear picture of which demographic buys the most bacon: middle-class families. Both the income and the age results peaked where you would expect a middle-class adult to be.

This information could encourage the Regork CEO to seek a new demographic to target for bacon sales. I would recommend targeting higher-income consumers, because bacon is not necessarily a middle-class food, yet for whatever reason higher-income consumers are choosing not to buy it in stores.

While my analysis provides valuable and useful information that the CEO of Regork could use, it is not perfect. To build on it, one could go more in depth to get a better picture of the exact demographic. I would also like to see what older and higher-income consumers are buying as a substitute for bacon, if anything, and what is being bought in the winter months instead of bacon.