The business problem we addressed is how different types of promotion campaigns affect sales value across various product departments and display/mailer locations at Regork. Understanding which campaign types and locations leading to higher sales value is crucial for optimizing promotion strategies, improving customer engagement, and boosting revenue.
We utilized sales transaction data, focusing on coupon types and their redemption rates, campaign display locations, and total sales across different departments. The methodology included analyzing sales value distributions across these variables to identify patterns in consumer purchases.
This analysis offers actionable insights, such as which campaign types drive the most sales and which locations are most effective. This can help Regork stakeholders make data-driven decisions on future promotions, improve resource allocation, and enhance overall sales strategies, especially by focusing on high-performing areas like grocery or drugstore departments.
library(tidyverse) # collection of R packages for data manipulation and visualization
library(here) # simplifies file path management in projects
here() starts at /Users/ezishr/Documents/CINCY/Fall 2024/BANA 4080
library(lubridate) # tools for working with date and time data
library(ggplot2) # data visualization using "Grammar of Graphics"
library(readr) # fast reading and writing of data files
library(scales) # provides functions for scaling and formatting
Attaching package: ‘scales’
The following object is masked from ‘package:purrr’:
discard
The following object is masked from ‘package:readr’:
col_factor
library(completejourney)
Welcome to the completejourney package! Learn more about these data sets at
http://bit.ly/completejourney.
c(promotions, transactions) %<-% get_data(which = 'both', verbose = FALSE)
# Extract the date for transactions and change data type for product_id
transactions <- transactions %>% mutate(date = as.Date(transaction_timestamp))
transactions$product_id <- as.double(transactions$product_id)
promotions$product_id <- as.double(promotions$product_id)
products$product_id <- as.double(products$product_id)
coupons$product_id <- as.double(coupons$product_id)
# Vectors of display and mailer locations to actual names
display_location_labels <- c("0"="No display",
"1"="Store front",
"2"="Store rear",
"3"="Front end cap",
"4"="Mid-aisle end cap",
"5"="Rear end cap",
"6"="Side aisle end cap",
"7"="In-aisle",
"9"="Secondary location display",
"A"="In-shelf")
mailer_location_labels <- c("0"="Not on ad",
"A"="Interior page feature",
"C"="Interior page line item",
"D"="Front page feature",
"F"="Back page feature",
"H"="Wrap front feature",
"J"="Wrap interior coupon",
"L"="Wrap back feature",
"P"="Interior page coupon",
"X"="Free on interior page",
"Z"="Free on front page/back page/or wrap")
Assumption: transactions with any discount are redeemed coupons
# Select only relevant columns in transactions
trans1 <- transactions %>%
dplyr::filter(retail_disc>0 | coupon_disc>0 | coupon_match_disc>0) %>%
select(household_id, store_id, product_id, sales_value:coupon_match_disc, week, date)
# Join coupons, coupon_redemptions, campaign_description to get full information of coupons
redemptions <- coupons %>%
inner_join(campaign_descriptions) %>%
inner_join(coupon_redemptions, relationship = 'many-to-many') %>%
rename(date = redemption_date)
Joining with `by = join_by(campaign_id)`Joining with `by = join_by(coupon_upc, campaign_id)`
# Get every transaction with redemption
sample_df <- redemptions %>%
left_join(trans1, by = c('household_id','product_id','date'), relationship = 'many-to-many')
colSums(is.na(sample_df))
coupon_upc product_id campaign_id campaign_type start_date
0 0 0 0 0
end_date household_id date store_id sales_value
0 0 0 2167372 2167372
retail_disc coupon_disc coupon_match_disc week
2167372 2167372 2167372 2167372
Note: There are roughly 2M coupons redemption without transaction records. We got rid of those transactions in further EDA.
# Combine necessary information for transactions having coupon redeemed
transWithRedemptions <- redemptions %>%
inner_join(trans1, by = c('household_id','product_id','date'), relationship = 'many-to-many') %>%
arrange(desc(product_id))
# Total sales from transactions with redemption, grouped by product_id and store_id
transWithRedemptions_groupedByProductStoreId <- transWithRedemptions %>%
group_by(product_id, store_id) %>%
summarise(total_sales = sum(sales_value))
`summarise()` has grouped output by 'product_id'. You can override using the `.groups` argument.
# Change for shorter name
transWithRedemptions_grouped <- transWithRedemptions_groupedByProductStoreId
#### Check if there is duplicate comb of product_id and store_id in every row
duplicates <- transWithRedemptions_grouped %>%
dplyr::group_by(product_id, store_id) %>%
dplyr::summarize(count = n()) %>%
dplyr::filter(count > 1) %>%
print(duplicates)
`summarise()` has grouped output by 'product_id'. You can override using the `.groups` argument.
Note: There is no duplicate, which is expected.
# Get unique existing product id and store id from the above
product_info_id <- unique(transWithRedemptions_groupedByProductStoreId$product_id)
store_info_id <- unique(transWithRedemptions_groupedByProductStoreId$store_id)
# Get rid of display_location=0 (not display)
promos_display <- promotions %>%
select(-(mailer_location:week)) %>%
dplyr::distinct(product_id, store_id, display_location, .keep_all = TRUE) %>%
dplyr::filter(display_location != 0)
# Join display location to transactions with redemption by keys: product_id and store_id
transWithRedemptions_displayLocation <- transWithRedemptions_groupedByProductStoreId %>%
inner_join(promos_display, by=c('product_id', 'store_id')) %>%
left_join(products, by='product_id') %>%
select(-c('manufacturer_id', 'brand','package_size'))
# Get rid of mailer_location=0 (not presented in mail)
promos_mailer <- promotions %>%
select(-c(display_location, week)) %>%
dplyr::distinct(product_id, store_id, mailer_location, .keep_all = TRUE) %>%
dplyr::filter(mailer_location != 0)
# Join mailer location to transactions with redemption by keys: product_id and store_id
transWithRedemptions_mailerLocation <- promos_mailer %>%
inner_join(transWithRedemptions_groupedByProductStoreId, by=c('product_id', 'store_id')) %>%
left_join(products, by='product_id') %>%
select(-c('manufacturer_id', 'brand','package_size'))
print(paste("Structure of transactions with redemptions, display location ver:"))
[1] "Structure of transactions with redemptions, display location ver:"
print(transWithRedemptions_displayLocation, n = 5)
print(paste("Structure of transactions with redemptions, mailer location ver:"))
[1] "Structure of transactions with redemptions, mailer location ver:"
print(transWithRedemptions_mailerLocation, n = 5)
# Promotions with display_location = 0, which is not displayed
promos_display0 <- promotions %>%
select(-c(mailer_location, week)) %>%
dplyr::filter(display_location == 0) %>%
dplyr::distinct(product_id, store_id, .keep_all = TRUE)
# Join display location = 0 to transactions with redemption by keys product_id and store_id
transWithRedemptions_displayLocationIs0 <- transWithRedemptions_grouped %>%
semi_join(promos_display0, by = c("product_id", "store_id")) %>%
left_join(products, by='product_id') %>%
select(-c('manufacturer_id', 'brand','package_size'))
# Promotions with mailer_location = 0, which is not displayed
promos_mailer0 <- promotions %>%
select(-c(display_location, week)) %>%
dplyr::filter(mailer_location == 0) %>%
dplyr::distinct(product_id, store_id, .keep_all = TRUE)
# Join mailer location to transactions with redemptions by keys product_id and store_id
transWithRedemptions_mailerLocationIs0 <- transWithRedemptions_grouped %>%
semi_join(promos_mailer0, by = c("product_id", "store_id")) %>%
left_join(products, by='product_id') %>%
select(-c('manufacturer_id', 'brand','package_size'))
print(paste("Structure of transactions with redemptions with no display location:"))
[1] "Structure of transactions with redemptions with no display location:"
print(transWithRedemptions_displayLocationIs0, n = 5)
print(paste("Structure of transactions with redemptions with no mailer location:"))
[1] "Structure of transactions with redemptions with no mailer location:"
print(transWithRedemptions_mailerLocationIs0, n = 5)
full_transactions <- transWithRedemptions
trans_w_display <- transWithRedemptions_displayLocation
trans_w_mailer <- transWithRedemptions_mailerLocation
trans_w_display0 <- transWithRedemptions_displayLocationIs0
trans_w_mailer0 <- transWithRedemptions_mailerLocationIs0
We began by analyzing the performance of each campaign type in terms
of redemption counts and total sales value. To do this, we first merged
the coupon-related datasets to gather all available coupon information.
Next, we joined this with the transactions dataset to isolate those
transactions where coupons were redeemed, creating a dataset called
full_transactions. We then grouped the data by
campaign_type to examine the trend in total sales value and
redemption counts.
summary_transactions <- full_transactions %>%
group_by(campaign_type) %>%
summarise(
total_sales = sum(sales_value),
count_type = n()
)
ggplot(data = summary_transactions, aes(x = campaign_type)) +
geom_col(aes(y = count_type, fill = 'Count of campaign type'), width = 0.5) +
geom_line(aes(y = total_sales, color = 'Total Sales'), group = 1) +
geom_point(aes(y = total_sales, color = 'Total Sales'), size = 3.0) +
scale_y_continuous(
name = "Count of campaign type",
breaks = seq(0, max(summary_transactions$count_type) + 1, by = 400),
sec.axis = sec_axis(~.,
name = "Total Sales",
breaks = seq(0, max(summary_transactions$total_sales) + 1, by = 400),
labels = scales::dollar)
) +
labs(x = 'Campaign Types', title = "Total Sales Value and Count of Redemptions Per Campaign Type") +
scale_fill_manual(name = '', values = c("Count of campaign type" = "lightblue")) +
scale_color_manual(name = '', values = c("Total Sales" = 'red')) +
theme_classic() +
theme(
plot.title = element_text(size = 20),
axis.title.x = element_text(margin = margin(t = 10), size = 14),
axis.title.y.left = element_text(margin = margin(r = 10), size = 14),
axis.title.y.right = element_text(margin = margin(l = 10), size = 14),
legend.position = 'bottom',
legend.title = element_text(size = 14),
legend.text = element_text(size = 14)
) +
annotate("text", x = 1, y = 7603, label = "Exceptionally High", color = "blue", size = 6, vjust = -1)
This analysis revealed that Campaign Type A significantly outperformed other campaigns in both metrics, indicating its effectiveness in driving customer engagement with coupon-related products.
Building on this, we further analyzed the total sales value by department for each campaign to better understand their impact across different product departments.
sample <- full_transactions %>%
group_by(product_id, campaign_type) %>%
summarise(total = sum(sales_value), .groups="drop") %>%
left_join(products, by="product_id") %>%
select(campaign_type, department, total) %>%
group_by(campaign_type, department) %>%
summarise(total_new = sum(total), .groups="drop")
ggplot(data = sample, aes(x = campaign_type, y = department, fill = total_new)) +
geom_tile() +
scale_fill_gradient2(low="white",
mid="lightblue",
high="blue",
name = "Sales Value",
labels = label_dollar()) +
geom_text(aes(label = label_dollar()(total_new)), color="black", size=4) +
theme_classic() +
labs(title = "Distribution of Sales Value by Campaign Types and Departments",
x = "Campaign Types",
y = "Deparments") +
theme(plot.title = element_text(size = 20),
axis.title.x = element_text(margin = margin(t = 10), size = 14),
axis.title.y = element_text(margin = margin(r = 10), size = 14),
legend.title = element_text(size = 14),
legend.text = element_text(size = 14)
)
The GROCERY department consistently led in total
sales value across all campaign types, likely due to the essential
nature of these products. For further analysis, we can exclude the
‘GROCERY’ department to focus on the performance of other
departments.
Campaign Type A consistently drove the highest sales across all departments, suggesting its promotional strategy was highly effective and widely appealing. The strong performance of Type A indicates its potential as a model for future campaigns.
Campaign Types B and C showed no sales in several departments, possibly indicating that their promotions were not well-targeted or lacked appeal in certain product categories.
As previously noted, we excluded the GROCERY
department from this EDA to focus on other departments.
Using the dataset that includes transaction details with display locations, we calculated the total sales value by both display location and department for a more detailed analysis.
sample <- trans_w_display %>%
dplyr::filter(department != "GROCERY") %>%
group_by(display_location, department) %>%
summarize(total = sum(total_sales), .groups = 'drop')
print(sample, n = 5)
a. Stack bar plot with actual sales value for each display location
ggplot(data = sample, aes(x = display_location, y = total, fill = department)) +
geom_bar(stat = "identity", position = "stack") +
scale_fill_manual(values = c("DELI" = "#FF9999",
"DRUG GM" = "#66CC99",
"MEAT-PCKGD" = "#FFCC00",
"NUTRITION" = "#3399FF",
"MEAT" = "#FF66CC",
"PASTRY" = "#FF9966",
"SEAFOOD-PCKGD" = "#99CCFF",
"COSMETICS" = "#b3b3b3",
"PRODUCE" = "#ff7f00")) +
scale_x_discrete(labels = display_location_labels) +
scale_y_continuous(labels = label_dollar()) +
theme_classic() +
labs(title = "Total Sales by Department and Display Locations",
x = "Display Locations",
y = "Sales Value",
fill = "Departments") +
theme(plot.title = element_text(size = 20),
axis.title.x = element_text(margin = margin(t = 10), size = 14),
axis.title.y = element_text(margin = margin(r = 10), size = 14),
axis.text.y = element_text(size = 14),
legend.title = element_text(size = 14),
legend.text = element_text(size = 14)
)
With display locations included, we identified 9 distinct
departments, with the DRUG GM department consistently
having the highest total sales value. The location
Store rear exhibited the most significant total sales value
across all departments, particularly for MEAT-PCKGD. This
pattern suggests that customers are more likely to purchase products
displayed at the store rear, especially packaged meat.
The COSMETICS department showed notable sales value
at the Front end cap but lacked visibility in other
locations, possibly due to low exposure in those areas.
Although the products dataset contains 32 unique departments, only 9 departments were represented in the graph with transactions involving coupon redemptions. Regork might want to explore further how display locations impact sales value if coupons are introduced for the remaining departments.
To provide a clearer perspective on department contributions, we converted the above graph into a percentage format, highlighting how each department contributed to the sales value at different display locations.
b. Stack bar plot with sales value being normalize to percentage
total_sales_per_display_location <- sample %>%
group_by(display_location) %>%
summarise(total_sales_location = sum(total), .groups = "drop")
final_df <- sample %>%
left_join(total_sales_per_display_location, by = "display_location") %>%
mutate(percentage = round((total / total_sales_location) * 100,2))
ggplot(final_df, aes(x = display_location, y = percentage, fill = department)) +
geom_bar(stat = "identity", position = "stack") +
scale_fill_manual(values = c("DELI" = "#FF9999",
"DRUG GM" = "#66CC99",
"MEAT-PCKGD" = "#FFCC00",
"NUTRITION" = "#3399FF",
"MEAT" = "#FF66CC",
"PASTRY" = "#FF9966",
"SEAFOOD-PCKGD" = "#99CCFF",
"COSMETICS" = "#b3b3b3",
"PRODUCE" = "#ff7f00")) +
scale_x_discrete(labels = display_location_labels) +
scale_y_continuous(labels = scales::percent_format(scale = 1)) +
theme_classic() +
labs(title = "Total Sales by Department and Campaign Type in Percentage",
x = "Display Locations",
y = "Percentage Sales",
fill = "Departments") +
theme(plot.title = element_text(size = 20),
axis.title.x = element_text(margin = margin(t = 10), size = 14),
axis.title.y = element_text(margin = margin(r = 10), size = 14),
axis.text.y = element_text(size = 14),
legend.title = element_text(size = 14),
legend.text = element_text(size = 14)
)
In this percentage view, it was evident that DRUG GM
consistently maintained its strong performance across multiple
locations. In contrast, PRODUCE had a significant
contribution at the In Shelf location but underperformed in
others.
To optimize sales, we should consider relocating more
PRODUCE products from the underperforming locations or
adjusting coupon strategies to boost sales in those areas. This could
help distribute sales more evenly and take advantage of the untapped
potential in different locations.
# Display
sample1 <- trans_w_display %>%
group_by(department) %>%
summarize(total = sum(total_sales), .groups = 'drop') %>%
arrange(desc(department)) %>%
mutate(type = "Being Displayed")
# Not Display
sample2 <- trans_w_display0 %>%
group_by(department) %>%
summarise(total = sum(total_sales)) %>%
arrange(desc(department)) %>%
mutate(type = "Not Being Displayed")
combined <- bind_rows(sample1, sample2)
combined
ggplot(data = combined, aes(x = department, y = total, fill = type)) +
geom_bar(stat = "identity", position = "dodge") +
scale_y_continuous(breaks = seq(0, max(combined$total) + 1, by = 500),
labels = label_dollar()) +
geom_text(aes(label = scales::label_dollar()(total)),
position = position_dodge(width = 0.9),
vjust = -1,
color = "black") +
labs(title = "Total Sales Across Departments with Display/Not Display",
x = "Departments",
y = "Total Sales",
fill = "Display Type") +
theme_classic() +
theme(plot.title = element_text(size = 20),
axis.title.x = element_text(margin = margin(t = 10), size = 14),
axis.title.y = element_text(margin = margin(r = 10), size = 14),
axis.text.y = element_text(size = 14),
legend.title = element_text(size = 14),
legend.text = element_text(size = 14)
) +
scale_fill_manual(values = c("Being Displayed" = "lightblue",
"Not Being Displayed" = "lightgreen"))
Despite not being actively displayed, the GROCERY
department continued to lead in total sales value, followed by the
DRUG GM department. Notably, the MEAT
department, which likely includes fresh meat, appeared to be purchased
primarily when not displayed.
Another observation is that the TRAVEL & LEISURE
department only saw purchases when its products were not displayed,
suggesting a unique purchasing behavior in this category.
Similar to display locations, we also wanted to learn the total sales value across departments with mailer locations.
We first obtained the data by performing grouping and summary:
sample <- trans_w_mailer %>%
dplyr::filter(department != "GROCERY") %>%
group_by(mailer_location, department) %>%
summarise(total = sum(total_sales), .groups = "drop") %>%
arrange(desc(total))
print(sample, n=5)
a. Stack bar plot with actual sales value for each mailer location
ggplot(data = sample, aes(x = mailer_location, y = total, fill = department)) +
geom_bar(stat = "identity", position = "stack") +
scale_x_discrete(labels = mailer_location_labels) +
scale_fill_manual(values = c("DRUG GM" = "#8da0cb",
"MEAT" = "#a6d854",
"MEAT-PCKGD" = "#e78ac3",
"PRODUCE" = "#66c2a5",
"SEAFOOD-PCKGD" = "#fb9a99",
"DELI" = "#e5c494",
"SEAFOOD" = "#b3b3b3",
"NUTRITION" = "#ffd92f",
"COSMETICS" = "#33a02c",
"COUPON" = "#ff7f00",
"PASTRY" = "#1f78b4",
"TRAVEL & LEISURE" = "#e31a1c")) +
scale_y_continuous(breaks = seq(0, 3500, by = 500),
labels = label_dollar()) +
theme_classic() +
theme(plot.title = element_text(size = 20),
axis.title.x = element_text(margin = margin(t = 10), size = 14),
axis.title.y = element_text(margin = margin(r = 10), size = 14),
axis.text.y = element_text(size = 14),
legend.title = element_text(size = 14),
legend.text = element_text(size = 14)
) +
labs(title = "Total Sales by Department and Mailer Locations",
x = "Mailer Locations",
y = "Total Sale",
fill = "Departments")
With mailer locations, more departments were represented compared
to display locations, but MEAT and MEAT-PCKGD
still dominated in total sales value across various locations.
Interestingly, DRUG GM now concentrated its sales value
primarily on the Interior page feature, unlike the broader
distribution seen in display locations.
As noted earlier, while the products dataset
includes 32 unique departments, only 13 were reflected in this analysis.
Regork may want to explore opportunities to promote the remaining
departments using coupons in mailer locations.
The TRAVEL & LEISURE department, which wasn’t
seen in the display locations analysis, showed a small sales value
across three different mailer locations. It may be worth conducting a
survey to determine whether increasing this department’s presence in
mailers, particularly in the Interior page feature, could
boost its sales value.
b. Stack bar plot with sales value being normalize to percentage
total_sales_per_mailer_location <- sample %>%
group_by(mailer_location) %>%
summarise(total_sales_location = sum(total), .groups = "drop")
final_df <- sample %>%
left_join(total_sales_per_mailer_location, by = "mailer_location") %>%
mutate(percentage = round((total / total_sales_location) * 100,2))
ggplot(final_df, aes(x = mailer_location, y = percentage, fill = department)) +
geom_bar(stat = "identity", position = "stack") +
scale_fill_manual(values = c("DRUG GM" = "#8da0cb",
"MEAT" = "#a6d854",
"MEAT-PCKGD" = "#e78ac3",
"PRODUCE" = "#66c2a5",
"SEAFOOD-PCKGD" = "#fb9a99",
"DELI" = "#e5c494",
"SEAFOOD" = "#b3b3b3",
"NUTRITION" = "#ffd92f",
"COSMETICS" = "#33a02c",
"COUPON" = "#ff7f00",
"PASTRY" = "#1f78b4",
"TRAVEL & LEISURE" = "#e31a1c")) +
scale_x_discrete(labels = mailer_location_labels) +
theme_classic() +
scale_y_continuous(expand = c(0,2),
labels = scales::percent_format(scale = 1)) +
theme(plot.title = element_text(size = 20),
axis.title.x = element_text(margin = margin(t = 10), size = 14),
axis.title.y = element_text(margin = margin(r = 10), size = 14),
axis.text.y = element_text(size = 14),
legend.title = element_text(size = 14),
legend.text = element_text(size = 14)
) +
labs(title = "Total Sales by Department and Campaign Type in Percentage",
x = "Mailer Locations",
y = "Percentage Sales",
fill = "Departments")
DELI department contribution to
sales value was seen more clearly now. There were too many departments
being presented at Interior page feature. We can try to
minimize this and further study the concentration of each department on
specific mailer location.# Mail presented
sample1 <- trans_w_mailer %>%
dplyr::filter(department!="GROCERY") %>%
group_by(department) %>%
summarize(total = sum(total_sales), .groups = 'drop') %>%
arrange(desc(department)) %>%
mutate(type = "Mail Presented")
# Not Mail presented
sample2 <- trans_w_mailer0 %>%
dplyr::filter(department != "GROCERY") %>%
group_by(department) %>%
summarise(total = sum(total_sales)) %>%
arrange(desc(department)) %>%
mutate(type = "Not Mail Presented")
combined <- bind_rows(sample1, sample2)
combined
ggplot(data = combined, aes(x = department, y = total, fill = type)) +
geom_bar(stat = "identity", position = "dodge") +
scale_y_continuous(breaks = seq(0, max(combined$total) + 100, by = 500),
labels = label_dollar()) +
geom_text(aes(label = scales::label_dollar()(total)),
position = position_dodge(width = 0.9),
vjust = -1,
color = "black") +
labs(title = "Total Sales Across Departments with Mail Presence",
x = "Departments",
y = "Total Sales",
fill = "Mail Presence Type") +
theme_classic() +
theme(plot.title = element_text(size = 20),
axis.title.x = element_text(margin = margin(t = 10), size = 14),
axis.title.y = element_text(margin = margin(r = 10), size = 14),
axis.text.y = element_text(size = 14),
legend.title = element_text(size = 14),
legend.text = element_text(size = 14)
) +
scale_fill_manual(values = c("Mail Presented" = "lightblue",
"Not Mail Presented" = "lightgreen"))
It is evident that no department generated higher sales value without
being featured in the mail. Notably, COUPON and
TRAVEL & LEISURE were the only two departments that saw
no sales when not included in mail promotions. In contrast to the
display location analysis, the MEAT department showed a
significant increase in sales when featured in mailers, with sales
approximately 200 times higher than not being displayed. This
underscores the importance of ensuring that MEAT continues
to be prominently featured in mail campaigns.
Next, we aimed to compare the sales values between departments based on their presentation type: display location and mailer location. This analysis will help us identify which promotional strategy—display or mailer—is more effective for each department, allowing us to better understand how presentation type influences customer purchasing behavior and overall sales performance across departments.
mailer_promo <- trans_w_mailer %>%
select(department, total_sales) %>%
group_by(department) %>%
summarise(total = sum(total_sales), .groups="drop") %>%
mutate(promo_type = "Mail")
display_promo <- trans_w_display %>%
ungroup() %>%
select(department, total_sales) %>%
group_by(department) %>%
summarise(total = sum(total_sales), .groups="drop") %>%
mutate(promo_type = "Display")
summary_data <- bind_rows(mailer_promo, display_promo)
ggplot(data = summary_data, aes(x = department, y = total, color = promo_type)) +
geom_point(alpha = 0.6, size = 10) +
geom_segment(aes(xend = department, yend = 0), linetype = "dotted", linewidth = 1.5) +
labs(title = "Sales Comparison Between Display and Mailer Promotions",
x = "Department",
y = "Total Sales",
color = "Presented Type") +
scale_y_continuous(breaks = seq(0, max(summary_data$total) + 1, by = 300),
labels = label_dollar()) +
theme_classic() +
theme(plot.title = element_text(size = 20),
axis.title.x = element_text(margin = margin(t = 10), size = 14),
axis.title.y = element_text(margin = margin(r = 10), size = 14),
axis.text.y = element_text(size = 12),
legend.title = element_text(size = 14),
legend.text = element_text(size = 14)
)
Nevertheless, the GROCERY department remained the top
performer across all promotion types. However, it appears that
presenting promotions through mail generally led to higher sales. This
suggests that customers tend to purchase and redeem coupons more
frequently when they see promotions in the mail, indicating a shift in
consumer behavior toward online shopping.
Regork could consider conducting a study on online shopping behavior and specifically test the effectiveness of Type A promotion campaigns via mailing.
1. Insights
The Type A promotion campaign proved to be the most effective promotional strategy, generating over $7,600 in sales value and approximately 2,400 redemption counts across all departments. This success suggests that Type A can serve as a model for future marketing campaigns. However, there are departments that were not included in this campaign. We recommend conducting further analysis to determine how well Type A strategies could be applied to those departments and whether it could lead to an increase in their sales value.
Additionally, Type B and Type C campaigns did not generate sales from several departments. This warrants further investigation into the factors that may have contributed to this lack of sales, which could help refine future promotional strategies and increase their effectiveness.
a. Display Locations
In the analysis of display locations, the MEAT-PCKGD
department had significantly higher sales value at the
Store rear compared to other locations. This suggests that
Regork might want to consider placing more products from the
MEAT-PCKGD department in the store rear. A similar strategy
could be applied to the PRODUCE department, where placing
products in the In shelf location could boost
sales.
When comparing sales value between displayed and non-displayed
products, the results were surprising. Except for GROCERY,
MEAT had significantly higher sales value when not
displayed compared to when it was displayed. This trend was also
observed in the PRODUCE and
TRAVEL & LEISURE departments, with the latter not
appearing in the display location analysis. We recommend running a
campaign that places TRAVEL & LEISURE products in
display locations with coupons to see if this generates a noticeable
increase in sales value.
b. Mailer Locations
In the mailer locations analysis, more departments showed sales
value compared to the display locations. Specifically, departments like
TRAVEL & LEISURE and DELI, which were not
seen in the display location analysis, became more prominent when
looking at the percentage view. The total sales value in mail locations
was generally higher than in display locations, suggesting a shift in
consumer behavior toward online shopping. Regork may want to consider
investing further in technologies that support online promotions to
capitalize on this trend and drive higher revenue.
A comparison between mail presence and no mail presence indicates that sales value was significantly higher when promotions were presented via mail. This approach should be continued, and further campaign strategies should be developed to optimize the effectiveness of this promotional method.
c. Display vs. Mailer
2. Limitations
We have not yet considered the impact of time frames on these promotional campaigns. Future work should include an analysis of how the timing of promotions influences sales performance.
The assumption made at the beginning of the report, that all transactions with any discount are redeemed coupons, has yet to be confirmed. Further validation of this assumption is necessary to ensure the accuracy of the analysis.
There are approximately 2 million coupon redemptions without corresponding transaction records. Collecting more data on these redemptions could provide valuable insights and potentially affect the total sales value of the different campaign types.