Executive Summary

  • Big question: Which campaign type (A/B/C) actually drives extra spend, and who reacts the most?
  • How we tested: For each household, we compared during-campaign $/day vs. their own 30-day pre-campaign $/day → uplift ($ per HH per day).
  • Headline: Type A shows the highest average uplift; Types B/C are mostly flat/negative.
  • Who moves: Strongest lift in ages 25–44 with higher incomes (see “Top Responding Segments”).
  • What to do: Run a 60-day pilot scaling Type A into those segments, keep overlaps ≤ 1, and tune coupon depth. Rough math: ~$216.8k incremental for 100k HHs × 60 days.
  • Caveats: Not randomized; 30-day baseline can miss seasonality; overlaps/selection bias; we measure redemptions, not exposure.

Introduction

What we’re trying to figure out. Regork runs different campaign types (A/B/C), but we don’t really know which one actually drives extra dollars beyond what shoppers normally spend. We also want to know who reacts the most (by age/income), so we can point future spend in the right direction.

Why this matters for the CEO. If we focus money on the campaigns and people that actually lift sales, we can grow revenue without just blasting coupons everywhere. That’s cheaper, cleaner, and easier to scale.

What we did (short version). Using the full Complete Journey data (transactions, campaigns + descriptions, coupons, products, demographics), we compared each household’s during-campaign daily spend to their own 30-day pre-campaign baseline. That gives us uplift ($ per household per day). Then we sliced that by campaign type and by age × income to see who responds the most. We also looked at overlaps, coupon depth, top categories, and household engagement to understand the “why.”

What we’re proposing. Double down on the best-lifting campaign type, aim it at the top responding segments, keep overlaps to ≤1, and right-size coupon depth. The “Illustrative Impact” table shows what that could mean in dollars.


Packages / Libraries Required
  • completejourney — loads the course datasets (transactions, campaigns, etc.).
  • dplyr, tidyr — joining tables, filtering, summarizing, reshaping.
  • lubridate — clean date handling (30-day windows, start/end dates).
  • ggplot2 — the charts.
  • scales — nice currency/percent labels.
  • gt — cleaner, presentation-ready tables.

Methods at a Glance (for non-technical readers)

  1. Line up the data. Join campaigns to transactions by household and keep only the purchases that happened inside each campaign’s date window.

  2. Make a baseline. For the same households, look at the 30 days before each campaign starts to get “normal” daily spend.

  3. Compute uplift.

    • rate_in = dollars during campaign / days active

    • rate_base = dollars in prior 30 days / 30

    • uplift = rate_in − rate_base → extra $/HH/day we can credit to the campaign

  4. Compare groups. Average that uplift by campaign type and by age × income to find winners.

  5. Reality check. Look at overlaps, coupon depth, top categories, and household engagement so the recommendations are practical.


ggplot(campaign_sales, aes(x = campaign_id, y = total_sales, fill = campaign_type)) +
  geom_col() +
  scale_y_continuous(labels = scales::dollar_format()) +
  labs(
    title = "Total Sales by Campaign (Unadjusted)",
    x = "Campaign ID", y = "Total Sales ($)", fill = "Type"
  ) +
  theme_minimal(base_size = 12) +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  scale_fill_manual(values = camp_cols)

Campaign Timelines

What it shows: Calendar windows for each campaign by type.

ggplot(campaign_timeline, aes(x = start_date, xend = end_date,
                              y = reorder(factor(campaign_id), start_date),
                              yend = factor(campaign_id),
                              color = campaign_type)) +
  geom_segment(size = 6) +
  scale_x_date(date_labels = "%b", date_breaks = "1 month") +
  labs(title = "Campaign Timelines Across the Year", subtitle = "Colored by Type",
       x = "Date", y = "Campaign ID", color = "Type") +
  scale_color_manual(values = camp_cols) +
  theme_minimal(base_size = 12)

Average Campaign Duration by Type

What it shows: Typical length differs by campaign type.

ggplot(avg_duration, aes(x = campaign_type, y = avg_duration_days, fill = campaign_type)) +
  geom_col(width = 0.6) +
  labs(title = "Average Campaign Duration", x = "Type", y = "Average Days", fill = "Type") +
  scale_fill_manual(values = camp_cols) +
  theme_minimal(base_size = 12)

Coupon Effectiveness by Type

What it shows: Redemption volume and average discount size.

coupon_summary %>%
  dplyr::mutate(
    coupons_redeemed = scales::comma(coupons_redeemed),
    total_discount = fmt_dollar(total_discount),
    avg_discount_per_coupon = fmt_dollar(avg_discount_per_coupon)
  ) %>%
  gt::gt() %>%
  gt::tab_header(title = "Coupon Effectiveness (Unadjusted)") %>%
  gt::cols_label(
    campaign_type = "Type",
    coupons_redeemed = "Coupons Redeemed",
    total_discount = "Total Discount",
    avg_discount_per_coupon = "Avg Discount / Coupon"
  )
Coupon Effectiveness (Unadjusted)
Type Coupons Redeemed Total Discount Avg Discount / Coupon
Type A 373,468 $29,372.27 $0.08
Type B 46,121 $4,320.17 $0.09
Type C 17,476 $917.55 $0.05
ggplot(coupon_summary, aes(x = campaign_type, y = avg_discount_per_coupon, fill = campaign_type)) +
  geom_col(width = 0.6) +
  labs(title = "Average Discount per Coupon by Type", x = "Type", y = "Avg Discount ($)", fill = "Type") +
  scale_fill_manual(values = camp_cols) +
  theme_minimal(base_size = 12)

Top Product Categories (within campaign windows)

What it shows: Where dollars concentrate during campaign windows.

ggplot(top_products, aes(x = reorder(product_category, total_sales),
                         y = total_sales, fill = campaign_type)) +
  geom_col(position = "dodge") +
  coord_flip() +
  scale_fill_manual(values = camp_cols) +
  scale_y_continuous(labels = scales::dollar_format()) +
  labs(title = "Top 5 Product Categories by Campaign (Unadjusted Totals)",
       x = "Product Category", y = "Total Sales ($)", fill = "Type") +
  theme_minimal(base_size = 12)

top_products %>%
  dplyr::slice_max(total_sales, n = 15) %>%
  dplyr::mutate(total_sales = fmt_dollar(total_sales)) %>%
  gt::gt() %>%
  gt::tab_header(title = "Top Categories (Top 15 Across Campaigns)") %>%
  gt::cols_label(
    campaign_id = "Campaign",
    campaign_type = "Type",
    product_category = "Category",
    total_sales = "Total Sales"
  ) %>%
  gt::tab_source_note(gt::md("**Note.** Totals are within campaign windows (unadjusted)."))
Top Categories (Top 15 Across Campaigns)
Campaign Type Category Total Sales
18 Type A COUPON/MISC ITEMS $44,163.53
8 Type A COUPON/MISC ITEMS $40,905.00
13 Type A COUPON/MISC ITEMS $40,363.30
18 Type A SOFT DRINKS $18,710.70
8 Type A SOFT DRINKS $17,746.60
18 Type A BEEF $17,378.91
8 Type A BEEF $16,434.61
13 Type A SOFT DRINKS $15,782.80
13 Type A BEEF $15,710.55
14 Type C COUPON/MISC ITEMS $14,189.96
18 Type A FLUID MILK PRODUCTS $13,760.93
18 Type A CHEESE $12,910.02
11 Type B COUPON/MISC ITEMS $11,974.15
13 Type A FLUID MILK PRODUCTS $10,556.46
8 Type A FLUID MILK PRODUCTS $10,187.88
Note. Totals are within campaign windows (unadjusted).

Household Engagement

What it shows: How many households engaged and how strongly by type.

household_engagement %>%
  dplyr::mutate(
    avg_transactions_per_household = round(avg_transactions_per_household, 2),
    total_sales = fmt_dollar(total_sales),
    avg_sales_per_household = fmt_dollar(avg_sales_per_household)
  ) %>%
  gt::gt() %>%
  gt::tab_header(title = "Household Engagement (Unadjusted)") %>%
  gt::cols_label(
    campaign_type = "Type",
    unique_households = "Unique HHs",
    total_transactions = "Transactions",
    total_sales = "Total Sales",
    avg_transactions_per_household = "Tx / HH",
    avg_sales_per_household = "Sales / HH"
  )
Household Engagement (Unadjusted)
Type Unique HHs Transactions Total Sales Tx / HH Sales / HH
Type A 1465 475004 $1,500,846.33 324.23 $1,024.47
Type B 968 254546 $816,228.44 262.96 $843.21
Type C 393 105041 $334,913.86 267.28 $852.20

Demographic Top Groups (unadjusted)

What it shows: Highest-sales groups during campaign windows.

top_income_groups %>%
  dplyr::group_by(campaign_type) %>%
  dplyr::slice_head(n = 5) %>%
  dplyr::ungroup() %>%
  dplyr::mutate(total_sales = fmt_dollar(total_sales)) %>%
  gt::gt() %>%
  gt::tab_header(title = "Top Income Groups by Sales (per Type)") %>%
  gt::cols_label(
    campaign_type = "Type",
    income = "Income",
    total_sales = "Total Sales"
  )
Top Income Groups by Sales (per Type)
Type Income Total Sales
Type A 50-74K $252,427.86
Type A 35-49K $182,955.78
Type A 75-99K $119,551.91
Type A 25-34K $81,223.32
Type A Under 15K $73,714.93
Type B 50-74K $134,149.08
Type B 35-49K $102,213.55
Type B 75-99K $76,249.79
Type B 150-174K $47,023.21
Type B 25-34K $45,911.62
Type C 50-74K $68,777.36
Type C 35-49K $43,408.25
Type C 75-99K $36,822.23
Type C Under 15K $23,281.18
Type C 125-149K $17,429.75
top_age_groups %>%
  dplyr::group_by(campaign_type) %>%
  dplyr::slice_head(n = 5) %>%
  dplyr::ungroup() %>%
  dplyr::mutate(total_sales = fmt_dollar(total_sales)) %>%
  gt::gt() %>%
  gt::tab_header(title = "Top Age Groups by Sales (per Type)") %>%
  gt::cols_label(
    campaign_type = "Type",
    age = "Age",
    total_sales = "Total Sales"
  )
Top Age Groups by Sales (per Type)
Type Age Total Sales
Type A 45-54 $372,125.58
Type A 35-44 $273,108.86
Type A 25-34 $174,059.06
Type A 55-64 $68,875.91
Type A 65+ $63,902.99
Type B 45-54 $200,751.84
Type B 35-44 $169,966.62
Type B 25-34 $127,509.78
Type B 55-64 $32,324.52
Type B 19-24 $21,793.90
Type C 45-54 $85,153.41
Type C 35-44 $77,717.01
Type C 25-34 $43,232.42
Type C 19-24 $13,942.66
Type C 55-64 $11,208.53
top_household_sizes %>%
  dplyr::group_by(campaign_type) %>%
  dplyr::slice_head(n = 5) %>%
  dplyr::ungroup() %>%
  dplyr::mutate(total_sales = fmt_dollar(total_sales)) %>%
  gt::gt() %>%
  gt::tab_header(title = "Top Household Sizes by Sales (per Type)") %>%
  gt::cols_label(
    campaign_type = "Type",
    household_size = "HH Size",
    total_sales = "Total Sales"
  )
Top Household Sizes by Sales (per Type)
Type HH Size Total Sales
Type A 2 $396,354.56
Type A 1 $292,911.91
Type A 3 $142,691.98
Type A 5+ $95,988.81
Type A 4 $70,928.29
Type B 2 $209,143.35
Type B 1 $144,288.05
Type B 3 $93,420.64
Type B 4 $62,645.21
Type B 5+ $62,569.83
Type C 2 $82,239.28
Type C 1 $59,430.19
Type C 3 $49,471.91
Type C 4 $25,279.38
Type C 5+ $24,719.39

Overlap vs Sales Per Day

What it shows: Whether campaign crowding correlates with sales/day.

ggplot(campaign_metrics, aes(x = overlap_count, y = sales_per_day)) +
  geom_point(alpha = 0.6) +
  geom_smooth(method = "lm", se = FALSE) +
  facet_wrap(~ campaign_type) +
  labs(title = "Campaign Overlap vs. Sales Per Day (Unadjusted)",
       x = "Number of Overlapping Campaigns", y = "Sales Per Day ($)") +
  theme_minimal(base_size = 12)

Incremental Uplift by Type (baseline-adjusted)

What it shows: Incremental dollars per HH per day vs a 30-day baseline.

ggplot(uplift_summary, aes(x = reorder(campaign_type, avg_uplift_per_day),
                           y = avg_uplift_per_day, fill = campaign_type)) +
  geom_col() +
  coord_flip() +
  scale_fill_manual(values = camp_cols) +
  scale_y_continuous(labels = scales::dollar_format()) +
  labs(
    title = "Baseline-Adjusted Incremental Uplift by Campaign Type",
    x = "Campaign Type", y = "Avg Uplift ($/HH/day)", fill = "Type"
  ) +
  theme_minimal(base_size = 12)

uplift_summary %>%
  dplyr::mutate(
    households = scales::comma(households),
    avg_uplift_per_day = fmt_dollar(avg_uplift_per_day),
    med_uplift_per_day = fmt_dollar(med_uplift_per_day)
  ) %>%
  gt::gt() %>%
  gt::tab_header(title = "Incremental Uplift Summary") %>%
  gt::cols_label(
    campaign_type = "Type",
    households = "Households",
    avg_uplift_per_day = "Avg Uplift ($/HH/day)",
    med_uplift_per_day = "Median Uplift ($/HH/day)"
  )
Incremental Uplift Summary
Type Households Avg Uplift ($/HH/day) Median Uplift ($/HH/day)
Type A 1,465 $0.04 $0.08
Type B 968 -$0.70 -$0.23
Type C 393 -$2.41 -$1.86

Top Responding Segments (by uplift)

What it shows: Best age × income cells to target next.

top_segments %>%
  dplyr::group_by(campaign_type) %>%
  dplyr::slice_head(n = 5) %>%
  dplyr::ungroup() %>%
  dplyr::mutate(avg_uplift_pd = fmt_dollar(avg_uplift_pd),
                hh = scales::comma(hh)) %>%
  gt::gt() %>%
  gt::tab_header(title = "Top Segments by Uplift (per Type)") %>%
  gt::cols_label(
    campaign_type = "Type",
    age = "Age",
    income = "Income",
    avg_uplift_pd = "Avg Uplift ($/HH/day)",
    hh = "HHs"
  ) %>%
  gt::tab_source_note(gt::md("**Note.** Baseline = 30 days pre-campaign; observational (not causal)."))
Top Segments by Uplift (per Type)
Type Age Income Avg Uplift ($/HH/day) HHs
Type A 25-34 200-249K $9.07 1
Type A 35-44 175-199K $7.04 1
Type A 55-64 100-124K $3.66 5
Type A 45-54 250K+ $3.63 4
Type A 19-24 250K+ $2.74 2
Type B 65+ 125-149K $7.26 1
Type B 55-64 Under 15K $2.81 3
Type B 55-64 175-199K $2.31 2
Type B 19-24 200-249K $2.24 1
Type B 25-34 125-149K $2.20 6
Type C 25-34 125-149K $12.98 2
Type C 19-24 Under 15K $2.88 4
Type C 45-54 175-199K $2.20 4
Type C 35-44 150-174K $2.13 2
Type C 25-34 50-74K $0.62 15
Note. Baseline = 30 days pre-campaign; observational (not causal).
Data Snapshot & QC
Transactions date range: 2017-01-01 — 2018-01-01
Dataset Rows (records)
transactions 1,469,307
campaigns 6,589
campaign_descriptions 27
coupon_redemptions 2,102
products 92,331
demographics 801

Summary, Implications, and Limitations

What we found (one paragraph). One campaign type clearly delivers the highest baseline-adjusted uplift. A few age × income groups pop as strong responders. We also see that heavy campaign overlap can dilute sales/day, and coupon depth matters (bigger isn’t always better).

So what? (implications). Shift budget toward the winning type, target the top uplift segments, and limit overlaps (≤1 at a time). Tune coupon values by campaign type instead of using a one-size-fits-all discount.

Our recommendation. Run a 60-day pilot scaling the winning type into the top segments with controlled overlap and calibrated coupon depth. Use holdouts so we can measure true lift cleanly.

Limitations (keeping it honest). This isn’t a randomized experiment; a 30-day baseline can still miss seasonality; selection/overlap effects can sneak in; and we observe redemptions, not actual campaign exposure. That’s why we recommend a pilot with holdouts next.

Next steps. Do the pilot (HH-level randomization), consider geo-tests/MMM for media influence, tighten segment definitions if needed, and optimize coupon value by type/category based on response curves.

CEO Recommendation & Illustrative Impact

Recommendation: Prioritize Type Type A (highest baseline-adjusted uplift) and scale into top uplift segments. Cap overlaps (≤1); calibrate coupon depth by type.

gt::gt(data.frame(
  `Recommended Type` = as.character(best_type),
  `Avg Uplift ($/HH/day)` = fmt_dollar(best_lift),
  `Households` = scales::comma(N),
  `Days` = D,
  `Estimated Incremental Revenue` = fmt_dollar(est_incremental)
)) %>%
  gt::tab_header(title = "Illustrative Impact of Scaling the Winning Campaign")
Illustrative Impact of Scaling the Winning Campaign
Recommended.Type Avg.Uplift....HH.day. Households Days Estimated.Incremental.Revenue
Type A $0.04 100,000 60 $216,798.81