Regork is looking to optimize its promotional strategies to encourage customer engagement and increase sales. The key question is whether promotional efforts should focus on loyalty programs or coupons to maximize their impact. Understanding which method drives more purchases across different demographics will help Regork allocate its marketing budget more effectively. By addressing this question, the CEO can make data-driven decisions to enhance customer retention and revenue growth.
To answer this question, transaction data was combined with product and household information to assess purchasing behavior. Indicator variables were then derived to flag whether each transaction used a coupon or a loyalty discount, and transactions were grouped by customer demographics. Using data visualization, we compared trends across customer segments to find which promotional strategy had the greatest impact.
The results from this data analysis will help the Regork CEO better understand which promotional method drives more sales and engagement. If loyalty programs are more effective, Regork should increase member benefits and targeted incentives. If coupons generate higher sales, increasing targeted coupon distribution could be more beneficial. Our analysis is intended to help direct promotional spending toward customer retention and overall profitability.
# Packages
library(completejourney) # grocery store shopping data
library(dplyr) # transforming and manipulating data
library(ggplot2) # plotting system for data visualization
library(ggrepel) # repositions overlapping text labels in ggplot2
library(tidyverse) # collection of packages for tidying and analyzing data
In order to prepare the data, the full list of transactions needs to be retrieved and assigned to a dataframe for ease of access.
transactions <- get_transactions()
transactions
## # A tibble: 1,469,307 × 11
## household_id store_id basket_id product_id quantity sales_value retail_disc
## <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl>
## 1 900 330 31198570044 1095275 1 0.5 0
## 2 900 330 31198570047 9878513 1 0.99 0.1
## 3 1228 406 31198655051 1041453 1 1.43 0.15
## 4 906 319 31198705046 1020156 1 1.5 0.29
## 5 906 319 31198705046 1053875 2 2.78 0.8
## 6 906 319 31198705046 1060312 1 5.49 0.5
## 7 906 319 31198705046 1075313 1 1.5 0.29
## 8 1058 381 31198676055 985893 1 1.88 0.21
## 9 1058 381 31198676055 988791 1 1.5 1.29
## 10 1058 381 31198676055 9297106 1 2.69 0
## # ℹ 1,469,297 more rows
## # ℹ 4 more variables: coupon_disc <dbl>, coupon_match_disc <dbl>, week <int>,
## # transaction_timestamp <dttm>
This section of the data preparation builds the summary dataframes used by the visualizations in this R project.
# Attach product attributes (department, brand) to every transaction
prod_trans <- transactions %>%
  inner_join(products, by = "product_id")
# Count grocery purchases by brand type
grocery_counts <- prod_trans %>%
  filter(department == "GROCERY") %>%
  group_by(brand) %>%
  summarize(count = n()) %>%
  filter(brand %in% c("Private", "National"))
# Flag coupon and loyalty usage per transaction, then count by brand type
grocery_discounts <- prod_trans %>%
  filter(department == "GROCERY") %>%
  mutate(
    used_coupon = ifelse(coupon_disc > 0, "Yes", "No"),
    used_loyalty = ifelse(retail_disc > 0, "Yes", "No")
  ) %>%
  group_by(brand, used_coupon, used_loyalty) %>%
  summarise(count = n(), .groups = "drop") %>%
  filter(brand %in% c("Private", "National"))
# Loyalty usage by income bracket, private brand groceries
private_grocery_loyalty_income <- prod_trans %>%
  inner_join(demographics, by = "household_id") %>%
  mutate(used_loyalty = ifelse(retail_disc > 0, "Yes", "No")) %>%
  filter(department == "GROCERY") %>%
  group_by(brand, income, used_loyalty) %>%
  summarise(count = n(), .groups = "drop") %>%
  filter(brand %in% c("Private"))
# Loyalty usage by income bracket, national brand groceries
national_grocery_loyalty_income <- prod_trans %>%
  inner_join(demographics, by = "household_id") %>%
  mutate(used_loyalty = ifelse(retail_disc > 0, "Yes", "No")) %>%
  filter(department == "GROCERY") %>%
  group_by(brand, income, used_loyalty) %>%
  summarise(count = n(), .groups = "drop") %>%
  filter(brand %in% c("National"))
# Loyalty usage by number of kids, private brand groceries
private_grocery_loyalty_kids <- prod_trans %>%
  inner_join(demographics, by = "household_id") %>%
  mutate(used_loyalty = ifelse(retail_disc > 0, "Yes", "No")) %>%
  filter(department == "GROCERY") %>%
  group_by(brand, kids_count, used_loyalty) %>%
  summarise(count = n(), .groups = "drop") %>%
  filter(brand %in% c("Private"))
# Loyalty usage by number of kids, national brand groceries
national_grocery_loyalty_kids <- prod_trans %>%
  inner_join(demographics, by = "household_id") %>%
  mutate(used_loyalty = ifelse(retail_disc > 0, "Yes", "No")) %>%
  filter(department == "GROCERY") %>%
  group_by(brand, kids_count, used_loyalty) %>%
  summarise(count = n(), .groups = "drop") %>%
  filter(brand %in% c("National"))
This bar graph compares the number of grocery purchases between private and national brands. Understanding which brand type is more frequently purchased helps identify consumer preferences. If private label brands are purchased more often, Regork may benefit from expanding its private brand selection. If national brands dominate, negotiating better supplier deals could be a key strategy. This insight helps align promotions with consumer demand.
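The plotting code for this chart does not appear in the report. One way it could be drawn with ggplot2 is sketched below, assuming the same completejourney tables used in the data preparation. The object names `grocery_brand_counts` and `p_brand`, along with the titles and axis labels, are illustrative choices rather than the original code.

```r
library(completejourney)
library(dplyr)
library(ggplot2)

# Same source tables as the data preparation section
transactions <- get_transactions()

# Number of grocery transaction rows per brand type
grocery_brand_counts <- transactions %>%
  inner_join(products, by = "product_id") %>%
  filter(department == "GROCERY", brand %in% c("Private", "National")) %>%
  count(brand, name = "count")

# Simple bar chart: one bar per brand type
p_brand <- ggplot(grocery_brand_counts, aes(x = brand, y = count, fill = brand)) +
  geom_col(show.legend = FALSE) +
  scale_y_continuous(labels = scales::comma) +
  labs(
    title = "Grocery purchases by brand type",
    x = "Brand type",
    y = "Number of purchases"
  )

p_brand
```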
This stacked bar chart examines how frequently coupons are used for private versus national brand groceries. Coupons are a key promotional tool, and this graph helps determine if discount strategies drive sales for specific brands. If one brand type benefits more from coupon usage, Regork can adjust its promotional efforts accordingly. Understanding coupon effectiveness ensures that marketing budgets are used efficiently to maximize customer engagement.
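A stacked bar chart of this kind could be produced roughly as follows. This is a sketch under the assumption that a transaction counts as coupon-assisted when `coupon_disc > 0`, matching the flag defined in the data preparation. The names `coupon_by_brand` and `p_coupon` are hypothetical.

```r
library(completejourney)
library(dplyr)
library(ggplot2)

transactions <- get_transactions()

# Count grocery transactions by brand type and coupon usage
coupon_by_brand <- transactions %>%
  inner_join(products, by = "product_id") %>%
  filter(department == "GROCERY", brand %in% c("Private", "National")) %>%
  mutate(used_coupon = ifelse(coupon_disc > 0, "Yes", "No")) %>%
  count(brand, used_coupon, name = "count")

# geom_col() stacks the "Yes"/"No" segments within each brand by default
p_coupon <- ggplot(coupon_by_brand, aes(x = brand, y = count, fill = used_coupon)) +
  geom_col() +
  scale_y_continuous(labels = scales::comma) +
  labs(
    title = "Coupon usage in grocery purchases by brand type",
    x = "Brand type",
    y = "Number of purchases",
    fill = "Coupon used"
  )

p_coupon
```

If the coupon segment is too thin to read on a count scale, `geom_col(position = "fill")` shows the same data as proportions within each brand.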
This visualization highlights the impact of loyalty rewards on grocery purchases, split by brand type. If loyalty programs drive higher sales for a specific brand, Regork can strengthen its reward partnerships for those products. If national brands benefit more, exclusive loyalty discounts for private brands could encourage more purchases. The goal is to optimize loyalty rewards to increase overall basket size and retention.
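The original chart code is not included, so the following is only a sketch of one possible version. It assumes, as the data preparation does, that `retail_disc > 0` marks a purchase that received a loyalty discount. The names `loyalty_by_brand` and `p_loyalty` are illustrative.

```r
library(completejourney)
library(dplyr)
library(ggplot2)

transactions <- get_transactions()

# Count grocery transactions by brand type and loyalty discount usage
loyalty_by_brand <- transactions %>%
  inner_join(products, by = "product_id") %>%
  filter(department == "GROCERY", brand %in% c("Private", "National")) %>%
  mutate(used_loyalty = ifelse(retail_disc > 0, "Yes", "No")) %>%
  count(brand, used_loyalty, name = "count")

# Side-by-side bars make the Yes/No comparison within each brand easy to read
p_loyalty <- ggplot(loyalty_by_brand, aes(x = brand, y = count, fill = used_loyalty)) +
  geom_col(position = "dodge") +
  scale_y_continuous(labels = scales::comma) +
  labs(
    title = "Loyalty discount usage in grocery purchases by brand type",
    x = "Brand type",
    y = "Number of purchases",
    fill = "Loyalty discount used"
  )

p_loyalty
```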
These histograms analyze income levels of customers using loyalty rewards for private and national brands. Income-based insights help Regork determine whether loyalty programs appeal more to higher- or lower-income shoppers. If lower-income groups use loyalty programs more, discount-driven strategies may be more effective. If higher-income shoppers use them, premium loyalty benefits could be explored. This analysis ensures loyalty rewards are tailored to the right customer segments.
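Because `income` is recorded as brackets rather than as a continuous value, the sketch below draws the distribution with bars per bracket. It combines both brand types in one faceted figure instead of two separate plots, which is a presentation choice made here and not necessarily how the original figures were built. The names `loyalty_by_income` and `p_income` are hypothetical.

```r
library(completejourney)
library(dplyr)
library(ggplot2)

transactions <- get_transactions()

# Count grocery transactions by brand type, income bracket, and loyalty usage.
# The inner join keeps only households that have demographic records.
loyalty_by_income <- transactions %>%
  inner_join(products, by = "product_id") %>%
  inner_join(demographics, by = "household_id") %>%
  filter(department == "GROCERY", brand %in% c("Private", "National")) %>%
  mutate(used_loyalty = ifelse(retail_disc > 0, "Yes", "No")) %>%
  count(brand, income, used_loyalty, name = "count")

# One panel per brand type, income brackets along the x axis
p_income <- ggplot(loyalty_by_income, aes(x = income, y = count, fill = used_loyalty)) +
  geom_col(position = "dodge") +
  facet_wrap(~ brand, ncol = 1, scales = "free_y") +
  scale_y_continuous(labels = scales::comma) +
  labs(
    title = "Loyalty discount usage by income bracket",
    x = "Household income",
    y = "Number of purchases",
    fill = "Loyalty discount used"
  ) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

p_income
```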
These bar graphs explore how family size (number of kids) influences loyalty reward usage for different brand types. If larger households use loyalty rewards more, promotions could be targeted toward bulk purchases or family-sized products. If there is no clear trend, Regork may need broader engagement strategies for families. Understanding the relationship between household size and loyalty usage allows for better-targeted promotions to increase repeat purchases.
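A possible version of these bar graphs is sketched below, assuming the `kids_count` column of the `demographics` table as the measure of family size. Both brand types are shown as facets of a single figure for brevity. The names `loyalty_by_kids` and `p_kids` are illustrative and do not come from the original code.

```r
library(completejourney)
library(dplyr)
library(ggplot2)

transactions <- get_transactions()

# Count grocery transactions by brand type, number of kids, and loyalty usage
loyalty_by_kids <- transactions %>%
  inner_join(products, by = "product_id") %>%
  inner_join(demographics, by = "household_id") %>%
  filter(department == "GROCERY", brand %in% c("Private", "National")) %>%
  mutate(used_loyalty = ifelse(retail_disc > 0, "Yes", "No")) %>%
  count(brand, kids_count, used_loyalty, name = "count")

# One panel per brand type, number of kids along the x axis
p_kids <- ggplot(loyalty_by_kids, aes(x = kids_count, y = count, fill = used_loyalty)) +
  geom_col(position = "dodge") +
  facet_wrap(~ brand, scales = "free_y") +
  scale_y_continuous(labels = scales::comma) +
  labs(
    title = "Loyalty discount usage by number of kids in household",
    x = "Number of kids",
    y = "Number of purchases",
    fill = "Loyalty discount used"
  )

p_kids
```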
The primary business problem addressed in this analysis is determining the effectiveness of Regork’s loyalty rewards program and its influence on consumer purchasing behavior. Specifically, the analysis explores whether customers are more likely to use loyalty rewards or coupons when purchasing private vs. national brand groceries and how factors such as income and household size impact loyalty program participation. Understanding these trends can help optimize promotional efforts and drive consumer engagement.
To analyze the impact of loyalty rewards and coupons on grocery purchases, we used transaction data joined with product and household demographic information, giving details on brand type, sales, coupon usage, loyalty discount usage, income range, and number of children in the household. The methodology involved data filtering and grouping to categorize purchases based on these factors, followed by visualizing trends using bar graphs and histograms. The analysis focused on comparing loyalty program usage across demographics, highlighting which income groups and household sizes engage most with the program.
The analysis revealed that a loyalty discount is applied in more grocery purchases than not, suggesting loyalty rewards play a major role in consumer purchasing decisions. The two income groups with the highest engagement were $50K–$74K and $35K–$49K, indicating that middle-income consumers are the most active users of loyalty rewards. Additionally, households without children (0 kids) showed the highest participation in the program, meaning single-person or child-free households may be the most engaged segment. While both private and national brands benefit from loyalty usage, these insights suggest opportunities to further refine promotional strategies to target key demographic groups more effectively.
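The comparisons above rest on raw transaction counts, so a segment that contains more households will show more loyalty purchases even if each household behaves the same way. One way to check the findings is to compare the share of purchases that carry a loyalty discount in each segment. The sketch below does this by income bracket, under the assumption that `retail_disc > 0` is a reasonable proxy for loyalty usage. The name `loyalty_share_by_income` is hypothetical.

```r
library(completejourney)
library(dplyr)

transactions <- get_transactions()

# Share of grocery purchases with a loyalty discount, per income bracket
loyalty_share_by_income <- transactions %>%
  inner_join(products, by = "product_id") %>%
  inner_join(demographics, by = "household_id") %>%
  filter(department == "GROCERY", brand %in% c("Private", "National")) %>%
  group_by(income) %>%
  summarise(
    households = n_distinct(household_id),  # size of the segment
    purchases = n(),                        # raw transaction count
    loyalty_share = mean(retail_disc > 0),  # proportion with a loyalty discount
    .groups = "drop"
  ) %>%
  arrange(desc(loyalty_share))

loyalty_share_by_income
```

Replacing `income` with `kids_count` in the `group_by()` call gives the same check for the household-size finding.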
Given how frequently loyalty discounts appear in grocery purchases, Regork should expand and enhance its loyalty program to further incentivize purchases. Since middle-income consumers ($35K–$74K) are the most engaged, targeted promotions such as cash-back rewards or tiered discounts for this income range could be beneficial. Additionally, since households without children are the most frequent users, the program could introduce more promotions on single-serving or convenience-focused grocery items. These strategic adjustments are expected to help increase customer retention, drive higher sales, and strengthen brand loyalty.
One limitation is that the analysis does not account for external factors like competitor promotions, seasonal trends, or product availability, which could influence purchasing behavior. Additionally, while we identified income and household size trends, we did not examine purchase frequency or customer lifetime value, which could provide deeper insights. Future research could incorporate customer surveys, competitor benchmarking, or time-series analysis to provide a more comprehensive view. Extending the analysis to include purchase frequency and basket composition would further refine recommendations and improve the accuracy of the findings.