To be able to grow as a company at Regork, we (the marketing team) believe it is important to assess where we are at holistically. We have used data analysis to compile our monthly sales revenue from each month as well as to determine what our best promotional campaigns have been. What we have found is that we don’t have to reinvent the wheel. Our analyses have shown us that we should be able to meet our ultimate goal of maximizing profits and increasing sales revenue by simply answering our two research questions below:
Research Question 1: Which Promotional Campaigns led
to the most sales and transactions?
Research Question 2: How can we utilize these
Promotional Campaigns to maximize profit?
By assessing our top-selling products and cross-referencing the products that sell the best without promotions, we are able to work on key strategies to maximize sales for the products that are typically not in promotions (i.e. soft drinks).
Packages
library(dplyr) # manipulating and transforming data
library(stringr) # text manipulation
library(tidyverse) # tidying data and working with other R packages
library(lubridate) # working with dates and times
library(tidyr) # tidying or cleaning up messy data
library(RColorBrewer) # allows different color palettes for visuals
library(ggplot2) # data visualization
library(gt) # creates more aesthetically pleasing tables
library(completejourney) # grocery store shopping transactions data
Data Frames Created
transacts<-get_transactions()
promos<-get_promotions()
demo<-demographics
cd<-campaign_descriptions
camp<-campaigns
coup<-coupons
cr<-coupon_redemptions
prods<-products
As mentioned, we first wanted to look at where we may have gaps. We looked at the historical data and saw the amount of sales per month for the whole year. We also looked at our top selling products (we will compare these later). Lastly, we wanted to see what our best marketing campaigns were in terms of transactions and sales value. Coincidentally, they were both the same. See below!
sales_by_month <- transacts%>%
mutate(month = month(transaction_timestamp, label = TRUE))%>%
group_by(month)%>%
summarize(total_sales = sum(sales_value, na.rm = TRUE), .groups = "drop")%>%
arrange(month)
ggplot(sales_by_month, aes(x = month, y = total_sales, group = 1)) +
geom_line(size = 1) +
geom_point(size = 2) +
labs(title = "Regork Total Sales by Month (2017)",
x = "Month",
y = "Total Sales") +
theme_minimal()
Looking at this chart, February and
September stand out as opportunities for us to pull
some resources into bolstering these sales months.
highest_selling_products <- transacts%>% #Highest Selling Products df
group_by(product_id)%>%
summarize(
total_sales = sum(sales_value, na.rm = TRUE),
total_units = n(),
.groups = "drop"
)%>%
inner_join(prods, by = "product_id")%>%
select(product_id, total_sales, total_units, product_type)%>%
arrange(desc(total_sales))%>%
slice_max(order_by = total_sales, n = 15)
highest_selling_products%>% #Highest Selling Products Table Creation
gt()%>%
tab_header(title = "Top 15 Highest Selling Products")%>%
fmt_currency(columns = vars(total_sales),
currency = "USD")%>%
cols_label(
product_id = "Product ID",
total_sales = "Total Sales",
total_units = "Total Units",
product_type = "Product Type"
)
| Top 15 Highest Selling Products | |||
| Product ID | Total Sales | Total Units | Product Type |
|---|---|---|---|
| 6534178 | $303,116.02 | 11582 | GASOLINE-REG UNLEADED |
| 6533889 | $27,467.61 | 823 | GASOLINE-REG UNLEADED |
| 1029743 | $22,729.71 | 7874 | FLUID MILK WHITE ONLY |
| 6534166 | $20,477.54 | 722 | GASOLINE-REG UNLEADED |
| 6533765 | $19,451.66 | 1150 | GASOLINE-REG UNLEADED |
| 1082185 | $17,219.59 | 16992 | BANANAS |
| 916122 | $16,120.01 | 2497 | CHICKEN BREAST BONELESS |
| 1106523 | $15,629.95 | 5424 | FLUID MILK WHITE ONLY |
| 995242 | $15,602.59 | 7441 | FLUID MILK WHITE ONLY |
| 5569230 | $13,410.46 | 2771 | SOFT DRINKS 12/18&15PK CAN CAR |
| 1127831 | $11,500.01 | 3551 | STRAWBERRIES |
| 1044078 | $11,445.70 | 2464 | LEAN |
| 844179 | $11,270.46 | 2208 | PRIMAL |
| 1133018 | $8,436.41 | 4231 | FLUID MILK WHITE ONLY |
| 874972 | $7,905.59 | 710 | SELECT BEEF |
While Gasoline remains our top selling item, it is not an item that would be beneficial in campaigning as the demand for gas will always be high. As you will see in the Marketing Plan tab, there are other products on this Highest Selling Products list that we can market better.
top_campaigns<-camp%>%
inner_join(cd, by = "campaign_id")
top_campaigns<-top_campaigns%>%
full_join(coup, by = "campaign_id")
top_campaigns_view<-top_campaigns%>% #Top Campaigns in terms of Coupon Redemptions
semi_join(cr)%>%
group_by(campaign_id, campaign_type, start_date, end_date)%>%
summarize(total_redemptions = n())%>%
arrange(desc(total_redemptions))
transacts_with_coups<-transacts%>%
full_join(coup, by = "product_id")%>%
semi_join(cr, by = "coupon_upc")
group_campaigns_sales<- transacts_with_coups %>% #Top Campaigns in terms of Sales Value
group_by(campaign_id) %>%
summarize(total_sales = sum(sales_value, na.rm = TRUE))%>%
arrange(desc(total_sales))
ggplot(top_campaigns_view, aes(x = reorder(as.factor(campaign_id), total_redemptions),
y = total_redemptions / 1000,
fill = campaign_type)) +
geom_bar(stat = "identity") +
labs(title = "Coupon Redemptions by Campaign (in Thousands)",
x = "Campaign ID",
y = "Total Redemptions (in Thousands)",
fill = "Campaign Type") +
coord_flip() +
theme_minimal()
ggplot(group_campaigns_sales, aes(x = reorder(as.factor(campaign_id), total_sales), y = total_sales / 1000)) +
geom_bar(stat = "identity", fill = "forestgreen") +
labs(title = "Total Sales by Campaign (in $1000's)",
x = "Campaign ID",
y = "Total Sales ($1000's)") +
theme_minimal()
From the tables above you can see that 3 of the 4 Type A Campaigns far
and away bring in the most business for Regork in terms of sales and
coupon redemptions.
We can take full advantage of this knowledge to create the optimal Marketing Plan moving forward!
Our plan is simple:
1. We are going to cut Campaign Types B and C AND run more Type A campaigns.
Reasoning: Roughly 91% of our stores total coupon redemptions comes from Type A campaign. We can allocate the resources used from the other 23 campaigns to develop more Type A ones.
2. Our new Type A campaigns will focus on the top selling items that have not already been attached to a promotion.
Reasoning: If those items are already selling well (Top 15 product overall) then adding a promotion to a target demographic should only benefit Regork in sales and customer traction.
3. We will focus on our least profitable months.
Reasoning: Focusing on our least profitable months allows us to create more demand when sales are traditionally low. This will ultimately improve overall profitability by changing off-peak periods into growth opportunities.
Let’s start by looking at our top selling items that are “non-coupon” transactions.
non_coupon_transacts <- transacts%>%
anti_join(coup, by = "product_id")
highest_selling_non_coupon_products <- non_coupon_transacts %>% #Highest Selling without Coupons
group_by(product_id)%>%
summarize(
total_sales = sum(sales_value, na.rm = TRUE),
total_units = n(),
.groups = "drop"
)%>%
inner_join(prods, by = "product_id")%>%
select(product_id, total_sales, total_units, product_category, product_type)%>%
arrange(desc(total_sales))
group_of_non_coupon<-highest_selling_non_coupon_products%>%
group_by(product_category)%>%
summarize(
total_category_sales = sum(total_sales, na.rm = TRUE),
total_category_units = sum(total_units, na.rm = TRUE),
.groups = "drop"
) %>%
arrange(desc(total_category_sales))%>%
slice_max(order_by = total_category_sales, n = 15)
group_of_non_coupon%>%
gt()%>%
tab_header(title = "Top 15 Product Categories Without Coupons",
subtitle = "Aggregated by Total Sales and Total Units Sold") %>%
fmt_number(columns = vars(total_category_sales, total_category_units),
decimals = 0)%>%
cols_label(
product_category = "Product Category",
total_category_sales = "Total Sales",
total_category_units = "Total Units"
)
| Top 15 Product Categories Without Coupons | ||
| Aggregated by Total Sales and Total Units Sold | ||
| Product Category | Total Sales | Total Units |
|---|---|---|
| COUPON/MISC ITEMS | 379,394 | 14,705 |
| SOFT DRINKS | 114,118 | 42,212 |
| BEERS/ALES | 81,943 | 9,938 |
| BAKED BREAD/BUNS/ROLLS | 67,858 | 35,705 |
| DELI MEATS | 58,197 | 13,668 |
| BAG SNACKS | 54,936 | 25,205 |
| CIGARETTES | 54,350 | 6,834 |
| CANDY - PACKAGED | 40,994 | 18,569 |
| DOMESTIC WINE | 35,928 | 3,294 |
| SOUP | 35,818 | 19,989 |
| BAKED SWEET GOODS | 31,690 | 15,931 |
| COFFEE | 30,364 | 6,471 |
| CANNED JUICES | 29,566 | 12,362 |
| WATER - CARBONATED/FLVRD DRINK | 28,100 | 9,638 |
| CHEESES | 26,704 | 7,147 |
Gasoline and Gift Cards comprise the Coupon/Misc Items product category despite not actually being attached to any coupon promotions. As mentioned earlier we are going to ignore that category. We have devised a plan to use a section of our store that has huge potential.
top_selling_feb_no_coupons <- non_coupon_transacts %>% #February
mutate(month = month(transaction_timestamp, label = TRUE)) %>%
filter(month %in% c("Feb")) %>%
group_by(product_id) %>%
summarize(
total_sales = sum(sales_value, na.rm = TRUE),
total_units = n(),
.groups = "drop"
) %>%
arrange(desc(total_sales)) %>%
inner_join(prods, by = "product_id")
top_selling_feb_no_coupons<-top_selling_feb_no_coupons%>%
filter(product_category != "COUPON/MISC ITEMS") %>%
group_by(product_category)%>%
summarize(
total_category_sales = sum(total_sales, na.rm = TRUE),
total_category_units = sum(total_units, na.rm = TRUE),
.groups = "drop"
) %>%
arrange(desc(total_category_sales))%>%
slice_max(order_by = total_category_sales, n = 10)
top_selling_sep_no_coupons <- non_coupon_transacts %>% #September
mutate(month = month(transaction_timestamp, label = TRUE)) %>%
filter(month %in% c("Sep")) %>%
group_by(product_id) %>%
summarize(
total_sales = sum(sales_value, na.rm = TRUE),
total_units = n(),
.groups = "drop"
) %>%
arrange(desc(total_sales)) %>%
inner_join(prods, by = "product_id")
top_selling_sep_no_coupons<-top_selling_sep_no_coupons%>%
filter(product_category != "COUPON/MISC ITEMS") %>%
group_by(product_category)%>%
summarize(
total_category_sales = sum(total_sales, na.rm = TRUE),
total_category_units = sum(total_units, na.rm = TRUE),
.groups = "drop"
) %>%
arrange(desc(total_category_sales))%>%
slice_max(order_by = total_category_sales, n = 10)
top_selling_sep_no_coupons <- top_selling_sep_no_coupons %>%
mutate(month = "Sep")
top_selling_feb_no_coupons <- top_selling_feb_no_coupons %>%
mutate(month = "Feb")
combined_top_categories <- bind_rows(top_selling_feb_no_coupons, top_selling_sep_no_coupons)
unique_categories <- length(unique(combined_top_categories$product_category))
mycolors <- colorRampPalette(brewer.pal(8, "Paired"))(unique_categories)
ggplot(combined_top_categories, aes(x = month, y = total_category_sales, fill = product_category)) +
geom_col(color = "black") +
scale_fill_manual(values = mycolors) +
labs(title = "Top 10 Product Categories by Month",
subtitle = "Items not linked to campaign promotions (excludes COUPON/MISC ITEMS)",
x = "Month",
y = "Total Sales",
fill = "Product Category") +
theme_minimal()
Soft drinks, Beers & Ales, as well as Domestic Wine bring in the
most collective amount of money as a group. In the months of February
and September that still rings true. We believe we can take advantage of
special promotions and the “smaller” holidays to bolster sales in this
market.
The targeted promotions will be sent to a specific age demographic. This group is a range of 25-54 year-olds (comprised of 3 separate age ranges: 25-34, 35-44, 45-54) as they were the collective age demographic that purchased 81% of the drink sales (see graph below).
sales_by_age <- non_coupon_transacts %>%
inner_join(prods, by = "product_id") %>%
inner_join(demo, by = "household_id") %>%
group_by(product_category, age) %>%
filter(product_category %in% c("SOFT DRINKS", "BEERS/ALES", "DOMESTIC WINE")) %>%
summarize(
total_sales = sum(sales_value, na.rm = TRUE),
total_units = n(),
.groups = "drop"
) %>%
arrange(product_category, desc(total_sales))
ggplot(sales_by_age, aes(x = product_category, y = total_sales, fill = age)) +
geom_bar(stat = "identity", position = "dodge", color = "black") +
labs(title = "Total Sales by Product Category and Age Group",
x = "Product Category",
y = "Total Sales",
fill = "Age Group") +
theme_minimal() +
coord_flip()
February Type A Campaign
Implementation strategy is to run a Type A campaign of our drink section that runs from December 29th-February 28th to the designated age demographic. During this time we would have promotions for Soft Drinks, Beers and Ales, and Domestic Wines through the following holidays or special events with added sales the day before through the day after the event. While we understand this will allow sales to grow in January - which is obviously a good thing - it should also show growth in February. Any growth in this month is a huge win for us.
List of of Holidays and Special Events:
-New Years Eve (January)
-The Super Bowl (February)
-Valentine’s Day (February)
September Type A Campaign
Implementation strategy is to run a Type A campaign of our drink section that runs the entire month of September to the designated age demographic. While the only major holiday is Labor Day, we believe running sales during this time, ‘Peak Fall Sport and Tailgating Season,’ can lead to some exponential growth and movement of product.
Our primary research questions were: Which promotional campaigns drive the most sales and transactions? And how can we leverage these campaigns to boost profit? We needed to create a comprehensive review of Regork and its performance to maximize profits.
What we did:
We analyzed monthly sales data, promotional campaign performance, and product sales (both with and without coupons). Using this data, we were able to create a marketing campaign from transactions, promotions, as well as demographics. We employed R (using packages like dplyr, ggplot2, gt, and others) to clean, summarize, and visualize the data.
What we found:
Our analysis revealed that Type A campaigns account for roughly 91% of coupon redemptions. This indicates that there is a strong correlation with sales. We found that there are other items that could benefit from promotional opportunities outside of gasoline sales. This opportunity lies in promoting high-margin items (e.g., soft drinks, beers, domestic wine) which aren’t typically attached to promotions. We found that February and September emerged as potential months to boost sales while targeting these promotions to 25-54 year-old consumers.
As mentioned in our plan, we propose cutting Campaign Types B and C and reallocating resources to increase Type A campaigns. Focusing these campaigns on top-selling items that were not previously promoted can drive both revenue growth and customer engagement, especially in months that had low-sales historically.
Limitations
The analysis is limited by data quality as well as information provided almost eight years ago. Future work should be done employing these strategies in a fixed experiment to see whether or not this strategy proves to be effective.