1 Executive Summary

This report analyzes coupon redemption across income groups for Regork. Key results challenge common assumptions and point to a higher-ROI targeting strategy.

At a glance

  • High-income households ($100K+) redeem coupons most often.
  • When these customers use coupons, their baskets are larger vs. trips without coupons.
  • Engagement peaks in Q4; timing matters.
  • Coupon spend concentrates on family essentials (diapers, bath tissue, detergents).

What we did

  • Integrated Complete Journey transactions, demographics, and promotions.
  • Segmented households by income and measured redemption, basket impact, seasonality, and category mix.

What to do next

  • Rebalance budget toward affluent segments; emphasize Q4.
  • Prioritize premium family essentials with bundles and brand partnerships.
  • Track segment redemption, basket lift, and ROI monthly.

2 Strategic Summary and Recommendations

Context

  • Promotional ROI depends on knowing which customers respond, when, and on what. Conventional wisdom favors lower-income focus; our data shows otherwise.

Method (brief)

  • Combined Complete Journey transactions, demographics, and coupons.
  • Segmented households into Low, Middle, High income; computed redemption, basket lift, seasonality, and category mix.

Key insights

  • High-income ($100K+) households lead coupon redemption.
  • Coupons increase their average basket value vs. non-coupon trips.
  • Engagement peaks in Q4.
  • Spend concentrates on family essentials (diapers, bath tissue, detergents).

What to do

  • Reallocate more budget to affluent segments; emphasize Q4.
  • Build premium bundles and brand partnerships in top categories.
  • Reduce broad, low-ROI discounting; focus on targeted offers.
  • Stand up tracking for segment redemption, basket lift, and ROI.

2.1 Analysis Limitations and Future Research Opportunities

2.1.1 Current Analysis Constraints

Temporal Scope: Analysis covers historical patterns; ongoing validation required to confirm sustained behavioral trends in evolving market conditions.

Geographic Limitations: Dataset represents specific regional markets; expansion to national analysis recommended for comprehensive strategic planning.

Competitive Context: Analysis doesn’t account for competitor promotional activities that may influence customer behavior patterns.

2.1.3 Strategic Implementation Success Metrics

  • Customer Segment Performance: Track redemption rates and basket values by income group quarterly
  • Revenue Impact Measurement: Monitor incremental sales growth from targeted promotional campaigns
  • Market Share Enhancement: Assess competitive positioning improvements in high-value customer segments
  • Operational Efficiency: Measure promotional ROI improvements through precision targeting

2.2 Conclusion

High-income households are the most responsive to coupons and show larger baskets when engaged. Concentrate promotions on this segment, time major pushes in Q4, and emphasize family essentials with targeted, premium-aligned offers. Track redemption, basket lift, and ROI to iterate.


3 Introduction

3.1 Background and objective

  • Promotional ROI depends on targeting the right customers at the right time with the right offers.
  • We test common assumptions by measuring coupon response across income groups and its impact on basket size, timing, and categories.

3.2 Key questions

  • Which income segments respond most to coupons?
  • Do coupons increase basket value or merely discount purchases?
  • When are responses strongest?
  • Which categories lead coupon-driven spend?

3.3 Methodology (condensed)

  • Data: Complete Journey transactions, demographics, promotions.
  • Segmentation: Low (<$50K), Middle ($50–99K), High ($100K+); excluded missing income.
  • Measures: redemption rate, basket value with/without coupons, monthly seasonality, category mix.

3.4 Decision value

  • Target offers where response and lift are highest.
  • Align timing with peak months (Q4).
  • Emphasize categories that drive incremental spend.

4 Required Libraries and Environment Setup

This analysis leverages several specialized R packages to process, analyze, and visualize grocery transaction data. All library loading messages and warnings are suppressed to maintain report clarity while ensuring robust analytical capabilities.

# Core Data Manipulation and Analysis Libraries
library(dplyr)        # Essential for data manipulation, filtering, grouping, and summarization
                      # Enables efficient handling of large transaction datasets
library(lubridate)    # Specialized date and time manipulation for temporal analysis
                      # Critical for seasonal pattern identification and monthly trending
library(tidyr)        # Data reshaping and cleaning utilities for consistent data structure

# Advanced Data Visualization Libraries  
library(ggplot2)      # Grammar of graphics plotting system for professional visualizations
                      # Provides layered approach to creating publication-quality charts
library(scales)       # Scale functions for ggplot2 formatting (currency, percentages, commas)
                      # Ensures proper formatting of financial and statistical displays

# Specialized Dataset Access
library(completejourney) # Complete Journey grocery transaction dataset
                         # Industry-standard retail analytics dataset containing:
                         # - 2+ years of transaction history from major grocery retailer
                         # - Household demographics including income classifications  
                         # - Product catalog with 92,000+ UPCs across all categories
                         # - Coupon redemption records with promotion details
                         # - Essential for realistic grocery retail analysis

# Professional Report Formatting Libraries
library(knitr)        # Dynamic report generation and reproducible research
                      # Enables seamless integration of R code with markdown narratives
library(kableExtra)   # Enhanced table formatting with Bootstrap styling
                      # Creates publication-ready tables with professional appearance

# Establish Consistent Visual Theme for Professional Presentation
# This theme ensures all visualizations maintain consistent, executive-ready formatting
theme_set(theme_minimal() + 
          theme(plot.title = element_text(hjust = 0.5, size = 14, face = "bold"),
                plot.subtitle = element_text(hjust = 0.5, size = 12),
                legend.position = "bottom"))

4.0.1 Why These Libraries Matter for Regork’s Analysis

Data Processing Power: The dplyr and tidyr combination enables efficient processing of millions of transaction records, essential for enterprise-scale grocery analytics.

Temporal Intelligence: lubridate provides sophisticated date handling capabilities crucial for identifying seasonal shopping patterns and optimal promotion timing.

Industry-Standard Dataset: The completejourney package provides access to real grocery retail data, ensuring our analysis reflects actual customer behavior patterns rather than synthetic examples.

Executive Presentation: ggplot2 and kableExtra deliver publication-quality visualizations and tables suitable for C-suite presentation and strategic decision-making.


5 Strategic Analysis: Coupon Response by Customer Segment

The following analysis addresses Regork’s core strategic question by examining coupon redemption patterns across income groups. This investigation challenges conventional assumptions about coupon usage and reveals opportunities for optimized promotional targeting.

5.1 Phase 1: Income Segmentation and Redemption Rate Analysis

Our first analytical phase establishes customer income segments and measures their relative responsiveness to coupon promotions. This foundational analysis will identify which income groups demonstrate the highest engagement with promotional offers.

# Create income group classifications
demographics_clean <- demographics %>%
  mutate(
    income_group = case_when(
      income %in% c("Under 15K", "15-24K", "25-34K", "35-49K") ~ "Low Income",
      income %in% c("50-74K", "75-99K") ~ "Middle Income", 
      income %in% c("100-124K", "125-149K", "150-174K", "175-199K", "200-249K", "250K+") ~ "High Income",
      TRUE ~ "Unknown"
    )
  ) %>%
  filter(income_group != "Unknown")

# Prepare coupon redemption data with dates
coupon_data <- coupon_redemptions %>%
  left_join(coupons, by = "coupon_upc") %>%
  mutate(
    month = month(redemption_date, label = TRUE),
    year = year(redemption_date)
  )

# Calculate redemption rates by income group to identify highest redeeming group
household_coupon_summary <- coupon_data %>%
  group_by(household_id) %>%
  summarise(
    total_coupons_redeemed = n(),
    .groups = "drop"
  )

household_analysis <- demographics_clean %>%
  left_join(household_coupon_summary, by = "household_id") %>%
  mutate(
    is_coupon_user = !is.na(total_coupons_redeemed),
    total_coupons_redeemed = replace_na(total_coupons_redeemed, 0)
  )

redemption_by_income <- household_analysis %>%
  group_by(income_group) %>%
  summarise(
    redemption_rate = round(mean(is_coupon_user) * 100, 1),
    .groups = "drop"
  ) %>%
  arrange(desc(redemption_rate))

# Display comprehensive redemption rate analysis
kable(redemption_by_income, 
      caption = "Table 1: Coupon Redemption Rates by Income Segment - Revealing High-Income Leadership",
      col.names = c("Income Group", "Redemption Rate (%)")) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
                full_width = FALSE,
                position = "center") %>%
  add_header_above(c("Strategic Customer Segmentation Analysis" = 2)) %>%
  footnote(general = "Source: Complete Journey Dataset Analysis | Higher percentages indicate greater promotional responsiveness",
           general_title = "Note: ")
Table 1: Coupon Redemption Rates by Income Segment - Revealing High-Income Leadership
Strategic Customer Segmentation Analysis
Income Group Redemption Rate (%)
High Income 45
Middle Income 41
Low Income 31
Note:
Source: Complete Journey Dataset Analysis | Higher percentages indicate greater promotional responsiveness
# Create professional coupon redemption rate visualization
ggplot(redemption_by_income, 
       aes(x = reorder(income_group, redemption_rate), 
           y = redemption_rate, fill = income_group)) +
    geom_col(show.legend = FALSE, width = 0.6, alpha = 0.8) +
    geom_text(aes(label = paste0(redemption_rate, "%")), 
              hjust = -0.1, size = 4.5, fontface = "bold") +
    coord_flip() +
    scale_y_continuous(expand = expansion(mult = c(0, 0.18))) +
    scale_fill_manual(values = c("Low Income" = "#e74c3c", 
                                "Middle Income" = "#f39c12", 
                                "High Income" = "#27ae60")) +
    labs(
        title = "Strategic Discovery: High-Income Households Lead Coupon Redemption",
        subtitle = "Challenging conventional assumptions about promotional responsiveness",
        x = "Customer Income Segment",
        y = "Redemption Rate (%)",
        caption = "Figure 1: Analysis challenges traditional marketing assumptions | Source: Complete Journey Dataset"
    ) +
    theme(
        panel.grid = element_blank(),
        axis.text.y = element_text(size = 12, face = "bold"), 
        axis.text.x = element_blank(),
        axis.ticks.x = element_blank(),
        axis.title.x = element_text(size = 11, face = "bold"),
        axis.title.y = element_text(size = 11, face = "bold"),
        plot.title = element_text(hjust = 0.5, size = 15, face = "bold"),
        plot.subtitle = element_text(hjust = 0.5, size = 12, color = "gray30"),
        plot.caption = element_text(size = 10, color = "gray50", hjust = 0)
    )

5.1.0.1 Key takeaways

  • High-income households lead coupon redemption—contrary to conventional wisdom.
  • Strategies focused only on lower-income segments risk missing higher-ROI audiences.
  • Immediate opportunity to reallocate budget toward affluent customers.

5.1.0.2 Actions

  1. Shift budget toward high-income acquisition and retention.
  2. Position offers as exclusive/value-add (not pure discounting).
  3. Use targeted channels to reach affluent segments.

5.2 Phase 2: Basket Value Impact Analysis

Having identified the most responsive income segment, we now examine whether coupon usage drives incremental spending or simply discounts existing purchase patterns. This analysis is crucial for understanding the true ROI of promotional investments.

5.2.1 High-Income Household Basket Analysis

# 1) High Income households
high_income_demo <- demographics_clean %>%
  filter(income_group == "High Income")

# 2) Transactions for those households + coupon flag (coupon_disc > 0)
transactions <- get_transactions()
tx_hi <- transactions %>%
  semi_join(high_income_demo, by = "household_id") %>%
  mutate(has_coupon = coupon_disc > 0 | coupon_match_disc > 0)

# 3) Aggregate to basket level
baskets_hi <- tx_hi %>%
  group_by(household_id, basket_id) %>%
  summarise(
    basket_spend = sum(sales_value, na.rm = TRUE),
    any_coupon   = any(has_coupon),
    .groups = "drop"
  )

# 4) Avg basket spend WITH vs WITHOUT coupon 
avg_hi <- baskets_hi %>%
  mutate(coupon_flag = if_else(any_coupon, "With Coupon", "Without Coupon")) %>%
  group_by(coupon_flag) %>%
  summarise(
    avg_spend = mean(basket_spend, na.rm = TRUE),
    n_baskets = n(),
    .groups = "drop"
  )


# 5) Create professional basket spend comparison visualization
ggplot(avg_hi, aes(x = coupon_flag, y = avg_spend, fill = coupon_flag)) +
  geom_col(show.legend = FALSE, width = 0.6, alpha = 0.8) +
  geom_text(aes(label = scales::dollar(round(avg_spend, 2))),
            vjust = -0.35, size = 5, fontface = "bold") +
  scale_y_continuous(expand = expansion(mult = c(0, 0.15)),
                     labels = scales::dollar_format()) +
  scale_fill_manual(values = c("With Coupon" = "#27ae60", 
                              "Without Coupon" = "#e74c3c")) +
  labs(
    title = "Coupon Impact: High-Income Household Basket Value Analysis",
    subtitle = "Demonstrating revenue amplification effect of promotional engagement",
    x = "Shopping Behavior",
    y = "Average Basket Value ($)",
    caption = "Figure 2: Coupons drive larger baskets among affluent customers | Source: Complete Journey Dataset"
  ) +
  theme(
    panel.grid.major.y = element_line(color = "gray90", linetype = "dotted"),
    panel.grid.major.x = element_blank(),
    panel.grid.minor = element_blank(),
    axis.text.y = element_text(size = 11),
    axis.text.x = element_text(size = 12, face = "bold"),
    axis.title = element_text(size = 11, face = "bold"),
    plot.title = element_text(hjust = 0.5, size = 15, face = "bold"),
    plot.subtitle = element_text(hjust = 0.5, size = 12, color = "gray30"),
    plot.caption = element_text(size = 10, color = "gray50", hjust = 0)
  )

5.2.1.1 Key takeaways

  • High-income households are the most responsive to coupons.
  • When they use coupons, their average basket is larger—coupons act as purchase amplifiers.

5.2.1.2 Actions

  1. Rebalance promotion toward affluent segments.
  2. Focus on higher-margin, family-essential categories.
  3. Frame offers as exclusive value (not deep discounts).
  4. Create cross-category bundles that mirror how these customers shop.

5.3 Phase 3: Temporal Optimization - Seasonality Analysis

With high-income households identified as the priority segment, we now examine when these valuable customers are most responsive to promotional offers. Understanding seasonal patterns enables Regork to optimize campaign timing for maximum impact and ROI.

5.3.1 Monthly Redemption Patterns for Strategic Planning

# Identify the highest redeeming income group
highest_redeeming_group <- redemption_by_income %>%
  slice_max(redemption_rate, n = 1) %>%
  pull(income_group)

# cat("Highest redeeming group:", highest_redeeming_group, "\n")

# Get households from highest redeeming group
high_redeeming_households <- household_analysis %>%
  filter(income_group == highest_redeeming_group, is_coupon_user) %>%
  pull(household_id)

# Monthly coupon redemption patterns for highest redeeming group
monthly_seasonality <- coupon_data %>%
  filter(household_id %in% high_redeeming_households) %>%
  group_by(month, year) %>%
  summarise(
    total_redemptions = n(),
    unique_households = n_distinct(household_id),
    .groups = "drop"
  ) %>%
  group_by(month) %>%
  summarise(
    avg_monthly_redemptions = round(mean(total_redemptions), 0),
    .groups = "drop"
  ) %>%
  mutate(
    month_num = as.numeric(month)
  ) %>%
  arrange(month_num)

# Display comprehensive seasonality analysis table
kable(monthly_seasonality %>% select(-month_num),
      caption = paste("Table 2: Monthly Seasonality Patterns -", highest_redeeming_group, "Customer Segment"),
      col.names = c("Month", "Average Monthly Redemptions")) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
                full_width = FALSE,
                position = "center") %>%
  add_header_above(c("Temporal Intelligence for Campaign Optimization" = 2)) %>%
  footnote(general = "Seasonal patterns enable precision timing of promotional investments for maximum ROI",
           general_title = "Strategic Insight: ")
Table 2: Monthly Seasonality Patterns - High Income Customer Segment
Temporal Intelligence for Campaign Optimization
Month Average Monthly Redemptions
Jan 44
Feb 592
Mar 452
Apr 263
May 50398
Jun 11470
Jul 728
Aug 50241
Sep 47071
Oct 6194
Nov 129249
Dec 7253
Strategic Insight:
Seasonal patterns enable precision timing of promotional investments for maximum ROI
# Create the main seasonality graph
seasonality_plot <- monthly_seasonality %>%
  ggplot(aes(x = month_num, y = avg_monthly_redemptions)) +
  geom_line(size = 1.5, color = "#2c3e50", alpha = 0.9) +
  geom_point(size = 4, color = "#e74c3c") +
  geom_smooth(method = "loess", se = TRUE, alpha = 0.3, color = "#3498db", linetype = "dashed") +
  scale_x_continuous(
    breaks = 1:12,
    labels = month.abb,
    limits = c(1, 12)
  ) +
  scale_y_continuous(labels = comma_format()) +
  labs(
    title = paste("Seasonal Intelligence:", highest_redeeming_group, "Coupon Redemption Patterns"),
    subtitle = "Clear Q4 peak performance enables strategic campaign timing optimization",
    x = "Month",
    y = "Average Monthly Redemptions",
    caption = "Figure 3: Temporal analysis reveals optimal promotional windows | Source: Complete Journey Dataset"
  ) +
  theme_minimal() +
  theme(
    plot.title = element_text(size = 16, face = "bold", hjust = 0.5),
    plot.subtitle = element_text(size = 13, hjust = 0.5, color = "gray30"),
    axis.text = element_text(size = 12),
    axis.title = element_text(size = 13, face = "bold"),
    panel.grid.minor = element_blank(),
    panel.grid.major.x = element_line(color = "gray90", linetype = "dotted"),
    panel.grid.major.y = element_line(color = "gray90", linetype = "dotted"),
    plot.caption = element_text(size = 10, color = "gray50")
  )

print(seasonality_plot)

# Calculate peak and low months for analysis
peak_month <- monthly_seasonality %>% slice_max(avg_monthly_redemptions, n = 1)
low_month <- monthly_seasonality %>% slice_min(avg_monthly_redemptions, n = 1)

5.3.1.1 Key takeaways

  • Peak month: Nov with 1.29249^{5} average redemptions.
  • Low month: Jan with 44 (≈ 2937.48x difference).

5.3.1.2 Actions

  1. Emphasize Q4 budgets and major campaigns.
  2. Shift spend from low months to peak months.
  3. Plan inventory and supplier funding against peak windows.

5.4 Phase 4: Product Category Intelligence

The final phase of our analysis examines what high-income households purchase with coupons, providing Regork with specific product category insights for tactical promotional development.

5.4.1 Top-Performing Categories Among High-Value Customers

#get the transactions that were redeemed by coupons for the highest redeeming households
coupon_transactions <- transactions %>%
  filter((coupon_disc > 0 | coupon_match_disc >0), household_id %in% high_redeeming_households)

# Create professional product category analysis visualization
coupon_transactions %>%
  inner_join(products) %>%
  group_by(product_category) %>%
  summarize(sales = sum(sales_value, na.rm = TRUE), .groups = "drop") %>%
  arrange(desc(sales)) %>%
  slice(1:5) %>%
  ggplot(aes(x = reorder(product_category, sales), y = sales, fill = product_category)) +
  geom_col(show.legend = FALSE, width = 0.7, alpha = 0.8) +
  coord_flip() +
  geom_text(aes(label = scales::dollar(sales)), 
            hjust = -0.05, size = 4.5, fontface = "bold") +
  scale_y_continuous(expand = expansion(mult = c(0, 0.15)),
                     labels = scales::dollar_format(scale = 1e-3, suffix = "K")) +
  scale_fill_brewer(type = "qual", palette = "Set2") +
  labs(
    title = paste("Product Category Intelligence:", highest_redeeming_group, "Coupon-Driven Sales Leadership"),
    subtitle = "Family essentials dominate promotional spending among affluent households",
    x = "Product Category",
    y = "Total Coupon-Driven Sales Value",
    caption = "Figure 4: Strategic categories for targeted promotional development | Source: Complete Journey Dataset"
  ) +
  theme(
    panel.grid.major.x = element_line(color = "gray90", linetype = "dotted"),
    panel.grid.major.y = element_blank(),  
    panel.grid.minor = element_blank(),
    axis.text.y = element_text(size = 11, face = "bold"),
    axis.text.x = element_text(size = 10),
    axis.title = element_text(size = 11, face = "bold"),
    plot.title = element_text(hjust = 0.5, size = 14, face = "bold"),
    plot.subtitle = element_text(hjust = 0.5, size = 12, color = "gray30"),
    plot.caption = element_text(size = 10, color = "gray50", hjust = 0)
  )

5.4.1.1 Key takeaways

  1. Essentials dominate: diapers, bath tissue, soft drinks, cold cereal, detergents.
  2. Affluent shoppers use coupons on high-frequency needs, not discretionary items.

5.4.1.2 Actions

  1. Direct coupons to essential household items
  2. Bundle offers (e.g., Cereal + Milk; Diapers + Wipes) to grow basket size.

6 Final Takeaways

  1. High Income groups are the highest coupon redeemers
  2. Their basket spend increases by 146% with coupons
  3. November is the highest coupon redemption month
  4. Diapers, Bath Tissues, Soft Drinks, Cold Cereal and Laundry Detergents are the highest product categories they spend their coupons on

7 Recommendations

  1. Direct coupons towards the high-income segment
  2. Increase coupons count in November to see an increase in sales
  3. Bundle products like Cereal + Milk, Diapers + Wipes to increase basket size