Introduction

Growth Opportunity/Business Problem

We saw an opportunity for Regork to look at what marketing campaigns are successful among different income ranges. We want to help show which campaigns should be prioritized and adjust prices within those product categories to further increase revenue.

Our Process

We took information from our demographics, coupon redemption data, transactions, and product data to pull together what campaigns were most effective to each income range. From there we were able to dissect what campaigns targeted which categories of products, and looking at the average sales value for these products we can find new opportunities in raising prices or decreasing coupon values.

Use of our Report

This report breaks down the marketing campaign effectiveness across different income ranges for Regork. It allows the marketing team and financial team to review their campaigns as well as look for opportunities to grow revenues across products. It gives the company a snapshot of what is being discounted the most and what is driving sales within the marketing campaigns.

Packages Used

CompleteJourney

  • This package provides all the data from Regork company that we used to form our analysis and solution.

Tidyverse

  • This package allows us to neatly and easily transform and present data.

lubridate

  • This package allows us to neatly and easily transform date data.

gridExtra

  • This package allows us to execute simple visualization with combining plots.

DT

  • This package allows us to execute simple visualization with data tables.
library(tidyverse)
library(lubridate)
library(completejourney)
library(gridExtra)
library(DT)

Data Analysis

Sales Distribution

This graph allows us to see which income ranges are driving sales for Regork. We find that Income Ranges 35-49K, 50-74k , and 75-99k are driving approximately 55% of total sales for Regork. (18.45%, 24.34%, 12.79% respectively).

transactions <- get_transactions()
promotions <- get_promotions()

df1 <- transactions %>%
  inner_join(demographics, by = "household_id") %>%
  group_by(income) %>%
  summarize(totalval = (sum(sales_value))/1000) %>%
  arrange(desc(totalval))

ggplot(df1,aes(x = income, y = totalval)) +
  geom_bar(stat = "identity", fill = "skyblue") +
  theme_minimal() +
  theme(axis.text.x = element_text(size = 8)) +
  labs(title = "Who is driving the Sales?",
       subtitle = "Based on Income Ranges",
       x = "Income Ranges",
       y = "Sales Value (in 1000$)",
       caption = "data source: https://cran.r-project.org/web/packages/completejourney/vignettes/completejourney.html")

df1 %>%
   mutate(percent_sales = 100 * (totalval*1000 / sum(totalval*1000))) %>%
   group_by("Income" = income) %>%
   summarise("Percent of Sales" = percent_sales) %>%
   arrange(desc("Percent of Sales")) %>%
  datatable(df1, options = list(pageLength = 12),
          colnames = c("Income Range", "Percent of Revenue"))

Top Marketing Campaigns

Here we find that the top 3 marketing campaigns are #13, #18, and #8 across the top 3 income ranges.

transactions <- get_transactions()
demo_data <- demographics
promotions <- get_promotions()
coupon_data <- coupons
redeem_data <- coupon_redemptions

data1 <- redeem_data %>%
  inner_join(demo_data, by  = "household_id") %>%
  filter(income %in% c("50-74K","35-49K","75-99K")) %>%
  group_by(campaign_id) %>%
  summarize(total_redemptions = n(), income = income) %>% 
  ggplot(aes(total_redemptions, fct_reorder(campaign_id, total_redemptions), fill = income)) +
  geom_col() +
  facet_wrap(~income, nrow = 1, scales = "free_y") +
  theme(panel.spacing = unit(1,"lines"), legend.position = "none") +
  labs(title = "Total Coupon Redemptions",
       subtitle = "Sorted by Campaign ID and Income Range",
       x = "Total Redemptions",
       y = "Campaign ID"
  )
data1

Marketing Strategy for Each Campaign

Here are 3 data tables highlighting the top 50 methods of displays and mailers for each marketing campaign. This is important for Regork’s marketing team so they can identify which method of advertising is most effective within the selected marketing campaigns for these 3 income ranges of shoppers. The top method we find for each campaign is a main display with a mailer that has an interior page feature. This means that these customers are paying attention to the feature within each mailer as well as continuing to go in store and shop at these main displays.

  • This data shows the top 50 methods for each campaign.
coupon_info <- function(campaign_num)
{coupon_data %>%
  inner_join(promotions) %>%
  filter(campaign_id == campaign_num) %>%
  mutate(display_type = case_when(
    display_location == 0 ~ "Display", 
    display_location == 1 ~ "Store Front",
    display_location == 2 ~ "Store Rear",
    display_location == 3 ~ "Front End Cap", 
    display_location == 4 ~ "Mid-aisle End Cap",
    display_location == 5 ~ "Rear End Cap",
    display_location == 6 ~ "Side Aisle End Cap",
    display_location == 7 ~ "In-aisle",
    display_location == 9 ~ "Secondary Display",
    display_location == "A" ~ "In-shelf"
  )) %>%
    mutate(mailer_type = case_when(
    mailer_location == "0" ~ "Not on Add", 
    mailer_location == "A" ~ "Interior Page Feature",
    mailer_location == "C" ~ "Interiror Page Line Item",
    mailer_location == "D" ~ "Front Page Feature", 
    mailer_location == "F" ~ "Back Page Feature",
    mailer_location == "H" ~ "Wrap Front Feature",
    mailer_location == "J" ~ "Wrap Interior Coupon",
    mailer_location == "L" ~ "Wrap Back Feature",
    mailer_location == "P" ~ "Interior Page Coupon",
    mailer_location == "X" ~ "Free on Interior Page",
    mailer_location == "Z" ~ "Free on Any Page"
  )) %>%
  mutate(num_of_displays = nchar(display_type), num_of_mailers = nchar(mailer_type)) %>%
  group_by(display_type, mailer_type) %>%
  summarize(across(c(num_of_displays, num_of_mailers), sum)) %>%
    arrange(desc(num_of_displays))}

Campaign 13

datatable(head(coupon_info("13"), 50), options = list(pageLength = 10),
          colnames = c("Display Type", "Mailer Type", "Number of Displays", "Number of Mailers"))

Campaign 18

datatable(head(coupon_info("18"), 50), options = list(pageLength = 10),
          colnames = c("Display Type", "Mailer Type", "Number of Displays", "Number of Mailers"))

Campaign 8

datatable(head(coupon_info("8"), 50), options = list(pageLength = 10),
          colnames = c("Display Type", "Mailer Type", "Number of Displays", "Number of Mailers"))

Types of Products in Each Campaign

In our next step of analysis, we wanted to identify which types of products were successful in each campaign.

find_data3 <- function(campaign_input) 
  {redeem_data %>%
    filter(campaign_id == campaign_input) %>%
    inner_join(coupon_data, by = "campaign_id") %>%
    inner_join(products, by = "product_id") %>%
    group_by(product_category) %>%
    summarise(redemptions = n()) %>%
    arrange(desc(redemptions)) %>%
    slice(1:3)}

df1 <- find_data3("13")
df2 <- find_data3("18")
df3 <- find_data3("8")


p1 <- ggplot(df1, aes(y = product_category, x = redemptions), positon = position_stack) +
  geom_col(fill = "#11b5e4") +
  labs(
    y = "",
    x = "",
    title = "Total Redemptions for Top 3 Product Categories",
    subtitle = "For Campaign #13"
  ) +
  scale_x_continuous(labels = scales:: comma, limits = c(0, 1600000)) +
  theme(axis.text.y = element_text(angle = 0, vjust = 0.5, hjust = 1)) +
  theme(axis.ticks.x = element_blank())

p2 <- ggplot(df2, aes(y = product_category, x = redemptions)) +
  geom_col(fill = "#1481ba") +
  labs(
    y = "Product Category",
    x = "",
    subtitle = "For Campaign #18",
  ) +
  scale_x_continuous(labels = scales:: comma, limits = c(0, 1600000)) +
  theme(axis.text.y = element_text(angle = 0, vjust = 0.5, hjust = 1)) +
  theme(axis.ticks.x = element_blank())

p3 <- ggplot(df3, aes(y = product_category, x = redemptions)) +
  geom_col(fill = "#01438A") +
  labs(
    y = "",
    x = "Number of Redemptions",
    subtitle = "For Campaign #8",
  ) +
  scale_x_continuous(labels = scales:: comma, limits = c(0, 1600000)) +
  theme(axis.text.y = element_text(angle = 0, vjust = 0.5, hjust = 1))

grid.arrange(p1, p2, p3, nrow = 3)

Through this we see that campaigns 13 and 18 target and are redeemed for similar products. Campaign 8 stands out as it targets more meat product categories than the others. We see that the highest number of coupons redeemed is within the Makeup and Treatment category. This gives us insight that maybe a certain demographic within the 35-99K income range is being effectively targeted. This data helps Regrok understand which products are successful in marketing campaigns and how the financial team may leverage these redemptions in a way to further increase revenue.

How the Product Categories are Discounted

After determining which product categories the campaigns targeted, we wanted to take a look at how these categories are driving sales. Additionally, we want to analyze the current discounts given on these products to help highlight areas the finance team could increase revenues.

discount_finder <- function(data_num, campaign_num)
{data_num %>%
    inner_join(products, by = "product_category") %>%
    inner_join(coupon_data, by = "product_id") %>%
    inner_join(transactions, by = "product_id") %>%
    filter(campaign_id == campaign_num) %>%
    group_by("Product Category" = product_category) %>%
    summarise("Total Sales Value in $1,000" = sum(sales_value)/1000,
              "Average Discount in $" = mean(retail_disc)) %>%
    arrange(desc("Total Sales Value"))
}

Campaign 13

datatable(discount_finder(df1, "13"))

Campaign 18

datatable(discount_finder(df2, "18"))

Campaign 8

datatable(discount_finder(df3, "8"))

When are the Campaigns Effective?

Lastly, we wanted to address when the campaigns are effective by looking at the redemption dates for coupons from these top 3 campaigns.

year_trend <- function(data_num, campaign_num)
{data_num %>% 
    inner_join(products, by = "product_category") %>%
    inner_join(coupon_data, by = "product_id") %>%
    inner_join(transactions, by = "product_id") %>%
    inner_join(redeem_data, by = "campaign_id") %>%
    filter(campaign_id == campaign_num) %>%
    group_by(month_val = month((redemption_date), label = TRUE)) %>%
    summarise(redemptions = n())
}

Campaign 13

year_plot1 <- ggplot(year_trend(df1, "13"), aes(x = month_val, y = redemptions)) +
  geom_col(fill = "#11b5e4") +
  labs(title = "Redemptions Across the Year", 
       x     = 'Months',
       y     = 'Number of Redemptions') +
  scale_y_continuous(labels = scales::comma) +
  geom_text(
    aes(label = scales:: comma(redemptions)),
    vjust = -.5, hjust = .5, size = 4, color = "black", fontface = "bold")
year_plot1

Campaign 18

year_plot2 <- ggplot(year_trend(df1, "18"), aes(x = month_val, y = redemptions)) +
  geom_col(fill = "#1481ba") +
  labs(title = "Redemptions Across the Year", 
       x     = 'Months',
       y     = 'Number of Redemptions') +
  scale_y_continuous(labels = scales::comma) +
  geom_text(
    aes(label = scales:: comma(redemptions)),
    vjust = -.5, hjust = .5, size = 4, color = "black", fontface = "bold")
year_plot2

Campaign 8

year_plot3 <- ggplot(year_trend(df1, "8"), aes(x = month_val, y = redemptions)) +
  geom_col(fill = "#01438A") +
  labs(title = "Redemptions Across the Year", 
       x     = 'Months',
       y     = 'Number of Redemptions') +
  scale_y_continuous(labels = scales::comma) +
  geom_text(
    aes(label = scales:: comma(redemptions)),
    vjust = -.5, hjust = .5, size = 4, color = "black", fontface = "bold")
year_plot3

Summary

Our goal was to identify which marketing campaigns are most successful based of the income ranges that drive sales for Regork. Through this, we looked to find where the company can increase prices or decrease discount values to further increase revenues from marketing campaigns.

Business Problem Solution

To get to the solution, first we had to dig into the demographic data collected by Regork to find what income group was driving sales. After identifying the top 3 income ranges that drive over 50% of revenues, we filtered down the data to show which campaigns were targeting these 3 groups. This allowed us to see which campaigns were most effective in the 3 major income ranges that drives sales for Regork. This data was driven by the number of redemptions for each campaign, not sales value. We decided that the number of redemptions would best represent the effectiveness of the marketing campaigns over the total sales value. Once we pin pointed the marketing campaigns 13, 18, and 8. We re-analyzed the data sets and filtered for the way these campaigns were marketed to customers. We took the top 50 methods to see which forms of advertisement were most effective. We found that for all campaigns, having a main display paired with a interior page feature on the mailer was the top method. This helps the marketing team analyze what is being read and responded to within successful marketing campaigns and how they can incorporate that into less successful campaigns. After looking at the marketing methods, we analyzed the products within each campaign to further understand the demographics within income ranges driving sales. For campaigns 13 and 18, targeted the same product categories where as campaign 8 targeted more food product categories. To get a better understanding of these targeted product categories for each campaign, we looked at the average discount value give to customers for products within these categories. We found that the Makeup and Treatments had the largest discount value but drove the least amount of sales dollars. The Frozen Meats products drove most of the sales dollars with the middle discount value. Lastly we wanted to fully understand the scope of each campaign, so we plotted the redemptions per month to see when these purchases were occurring. This allowed us to fully analyze what marketing campaigns were successful and why, by understanding the time frame, products, and discount values offered by these campaigns.

Our Recommendation

We recommend increasing product prices and or decreasing the average dollar value discounted on products. The exact amounts would have to be calculated by the financial and marketing team to ensure that the correct price changes would maximize revenue while minimizing losses in sales. We also see an opportunity in targeting certain product groups during certain seasons as we saw some trends within the campaign coupon redemptions. Smaller items like makeup, treatments, and hair care all had high redemptions during holiday months. For example, November and December where there are many get-togethers, proms, parties, and holidays celebrated so these items may be used as gifts. There were also increased redemptions for food product categories in May and June where many are having cookouts, graduation parties, and celebrating holidays such as Memorial Day. By following this trend of seasonality, Regork may be able to grow revenues. Additionally, Regork should limit the use of certain ads on mailers to reduce costs and further maximize revenue by focusing on the most effective types and areas.

Limitations to our Analysis

The main limitation to our analysis is the lack of more data. This data set is only for the year of 2017, so in order to fully view trends and effectiveness of campaigns in the long run, we would like more data to be considered. We would also like to view the gender of consumers because this would help the marketing department make even more effective campaigns to a more niche target audience. As we saw in our initial analysis, many coupons were for Makeup and Hair care which we can assume was driven by female customers; however, we would love to get more demographic data to confirm this assumption. We would also like to note that this data is only given by households and customers that provide demographic data so this only sums up a sample of customers. Given a larger data set and more customer involvement in demographics, we feel that this analysis will be more effective in the long run for Regork.