1 Introduction

Regork (a national grocery chain) invests heavily in promotions: in-store displays, mailers, and coupons. Promotions are intended to increase sales, but discounts can also erode revenue and profit if they do not generate enough incremental volume.

This report answers one focused business question:

Which customer segments respond profitably to promotions (campaigns/coupons), and where are we over-discounting with little incremental lift?

1.1 Why the CEO should care

Promotions are a major, controllable lever for growth. If discounting is not paying back in incremental revenue (and especially if it is not paying back in margin), then:

  • revenue growth becomes “expensive,”
  • profitability suffers, and
  • the organization may be training customers to buy only when items are discounted.

Finding which segments (and discount tools) are most efficient lets Regork reallocate promotional spend toward higher-return activity.

1.2 What I did (data + methodology)

Data used (2 required tables):

  • transactions: customer purchases (household, basket, product, quantities, net sales, discount components, week)
  • promotions: product-store-week display/mailer placements

Key engineering:

  • Created a gross sales proxy = sales_value + total_discount (since list price isn’t directly provided).
  • Flagged promotion exposure per product-store-week (promo_any, promo_display, promo_mailer).
  • Built behavioral segments:
    • Value segment = quartiles of household net sales
    • Promo-prone segment = quartiles of household share of spend on promoted items

Incremental lift approach (interpretable and robust):

  • For promotions, estimated within-product-store changes over time using fixed-effect regressions:
    • controls for persistent differences by product-store
    • controls for common week-level shocks (seasonality)
  • For coupons, estimated association between coupon dollars and gross basket value using basket-level fixed effects.

Important: The dataset does not include item-level margins/COGS, so “profitability” is assessed using a revenue-based ROI proxy. I highlight limitations and how to improve the measurement.

1.3 How the analysis helps Regork

This analysis produces:

  1. A clear diagnostic: promotions appear to be over-discounting across all value segments (negative ROI proxy).
  2. A targeting insight: couponing correlates with larger baskets and looks more efficient than broad promotions (under a revenue proxy).
  3. Actionable next steps: reduce low-return promo exposure, reallocate spend toward targeted couponing, and implement holdout testing to measure true incrementality and profit.

2 Packages / Libraries

This report uses:

Messages/warnings from package loading are suppressed for readability and reproducibility.

3 Data Overview

inventory <- readr::read_csv(file.path(tbl_dir, "data_inventory.csv"), show_col_types = FALSE)
render_table(inventory, caption = "Data inventory (row/column counts from the analysis pipeline)")
Data inventory (row/column counts from the analysis pipeline)
table rows cols
transactions 1460438 19
promotions 20927744 6
products_from_get_data 0 0
demographics_from_get_data 0 0
tx_joined 1460438 23

Notes on available tables.
In this course distribution of Complete Journey, only transactions and promotions were available (no product hierarchy table, no demographics table). As a result:

4 Exploratory Data Analysis (EDA) and Findings

4.2 2) Segment profiles: who drives sales and who consumes discounts?

4.2.1 Value segments (households split into quartiles by total net sales)

kpi_value <- readr::read_csv(file.path(tbl_dir, "segment_kpis_value_segment.csv"), show_col_types = FALSE) %>%
  mutate(
    net_sales = dollar(net_sales),
    gross_sales = dollar(gross_sales),
    discount = dollar(discount),
    discount_rate = percent(discount_rate, accuracy = 0.1),
    promo_net_sales = dollar(promo_net_sales),
    promo_share_net = percent(promo_share_net, accuracy = 0.1),
    coupon_discount = dollar(coupon_discount)
  )

render_table(kpi_value, caption = "Segment KPIs by Value Segment (quartiles of household net sales)")
Segment KPIs by Value Segment (quartiles of household net sales)
value_segment households trips net_sales gross_sales discount discount_rate promo_net_sales promo_share_net coupon_discount
Value Q4 (Top) 617 80978 $2,846,046 $3,328,250 $482,204 14.5% $530,990 18.7% $16,622.95
Value Q3 617 43415 $1,110,066 $1,316,928 $206,861 15.7% $229,408 20.7% $5,492.73
Value Q2 617 22480 $501,210 $600,736 $99,526 16.6% $113,511 22.6% $1,693.64
Value Q1 (Low) 618 8645 $138,705 $166,121 $27,416 16.5% $30,609 22.1% $548.93

Interpretation:

  • Top value households (Value Q4) drive the majority of revenue, as expected.
  • Discount rates are slightly lower for high-value customers than for mid/low value groups.
  • Mid/low segments have higher promo share of spend, suggesting they are more promotion-reliant.

4.2.2 Promo-prone segments (households split into quartiles by promo share of spend)

kpi_promo <- readr::read_csv(file.path(tbl_dir, "segment_kpis_promo_prone_segment.csv"), show_col_types = FALSE) %>%
  mutate(
    net_sales = dollar(net_sales),
    gross_sales = dollar(gross_sales),
    discount = dollar(discount),
    discount_rate = percent(discount_rate, accuracy = 0.1),
    promo_net_sales = dollar(promo_net_sales),
    promo_share_net = percent(promo_share_net, accuracy = 0.1),
    coupon_discount = dollar(coupon_discount)
  )

render_table(kpi_promo, caption = "Segment KPIs by Promo-Prone Segment (quartiles of promo share)")
Segment KPIs by Promo-Prone Segment (quartiles of promo share)
promo_prone_segment households trips net_sales gross_sales discount discount_rate promo_net_sales promo_share_net coupon_discount
Promo Med-Low 617 44928 $1,390,030 $1,606,986 $216,955 13.5% $234,294 16.9% $6,621.02
Promo Med-High 617 43062 $1,215,543 $1,455,331 $239,789 16.5% $278,263 22.9% $6,398.46
Promo Low 618 36104 $1,163,959 $1,299,016 $135,057 10.4% $122,910 10.6% $4,095.98
Promo High 617 31424 $826,495 $1,050,701 $224,206 21.3% $269,049 32.6% $7,242.79

Key pattern:
The Promo High segment has:

  • the highest discount rate, and
  • the highest promo share of spend,
    while contributing a smaller share of overall net sales than the mid segments.

This is a classic sign of a segment that may be price conditioned (buying mainly when discounted).

4.3 3) Are promotions paying back? (Promo ROI proxy by Value Segment)

Because margins are not available, I use a revenue-based ROI proxy:

\[ \text{ROI proxy} = \frac{\text{Gross lift} - \text{Discount lift}}{\text{Discount lift}} \]

  • If ROI proxy > 0: gross lift exceeds discount cost (good, on a revenue basis)
  • If ROI proxy < 0: discount cost exceeds gross lift (over-discounting risk)
promo_lift <- readr::read_csv(file.path(tbl_dir, "promo_lift_by_value_segment.csv"), show_col_types = FALSE) %>%
  mutate(
    net_lift_promo_any = round(net_lift_promo_any, 4),
    discount_lift_promo_any = round(discount_lift_promo_any, 4),
    roi_proxy = round(roi_proxy, 4)
  )

render_table(promo_lift, caption = "Estimated promo lift and ROI proxy (promotion vs non-promotion) by Value Segment")
Estimated promo lift and ROI proxy (promotion vs non-promotion) by Value Segment
value_segment net_lift_promo_any discount_lift_promo_any roi_proxy
Value Q1 (Low) -0.2432 0.7873 -0.3088
Value Q2 -0.1716 0.7751 -0.2213
Value Q3 -0.2179 0.7650 -0.2848
Value Q4 (Top) -0.1621 0.8734 -0.1855
knitr::include_graphics(file.path(fig_dir, "promo_roi_proxy_value_segment.png"))

Finding: ROI proxy is negative for every value segment.

  • The worst segment is Value Q1 (Low) (most negative ROI proxy).
  • Value Q4 (Top) is “least bad,” but still negative.

CEO implication (revenue proxy): current broad promotions look like over-discounting, especially for low-value households. If margins are thin (typical grocery), the profit impact is likely even worse.

4.4 4) Coupons: do they appear more efficient than broad promotions?

Coupon “efficiency” is evaluated at the basket level as:

  • estimated gross lift per $1 coupon discount
  • net return proxy = (gross lift per $1) − 1
coupon_eff <- readr::read_csv(file.path(tbl_dir, "coupon_efficiency_by_segment.csv"), show_col_types = FALSE) %>%
  mutate(
    coef_gross_per_1_coupon = round(coef_gross_per_1_coupon, 3),
    net_return_per_1_coupon = round(net_return_per_1_coupon, 3)
  )

render_table(coupon_eff, caption = "Coupon efficiency (revenue proxy) by segment")
Coupon efficiency (revenue proxy) by segment
segment_var segment_level coef_gross_per_1_coupon net_return_per_1_coupon
value_segment Value Q3 9.371 8.371
value_segment Value Q1 (Low) 4.791 3.791
value_segment Value Q4 (Top) 11.206 10.206
value_segment Value Q2 11.728 10.728
promo_prone_segment Promo Med-Low 11.198 10.198
promo_prone_segment Promo Low 12.461 11.461
promo_prone_segment Promo High 9.004 8.004
promo_prone_segment Promo Med-High 11.131 10.131
coupon_highlights <- coupon_eff %>%
  group_by(segment_var) %>%
  arrange(desc(net_return_per_1_coupon)) %>%
  slice_head(n = 1) %>%
  ungroup()

render_table(coupon_highlights, caption = "Highest coupon net-return segments (within each segmentation)")
Highest coupon net-return segments (within each segmentation)
segment_var segment_level coef_gross_per_1_coupon net_return_per_1_coupon
promo_prone_segment Promo Low 12.461 11.461
value_segment Value Q2 11.728 10.728

Interpretation (cautious):

  • All segments show positive coupon net-return proxies.
  • The lowest coupon net-return is in Value Q1 (Low).
  • The Promo Low segment has the highest coupon net-return proxy, which is consistent with coupons being more incremental for customers who are not already discount-conditioned.

Caution: Coupon usage is not randomly assigned. This result should be treated as a prioritization signal for testing, not a definitive causal estimate.

4.5 5) “Over-discounting” flags: where should we stop subsidizing?

promo_flags <- readr::read_csv(file.path(tbl_dir, "promo_overdiscount_flags.csv"), show_col_types = FALSE) %>%
  mutate(roi_proxy = round(roi_proxy, 4))

coupon_flags <- readr::read_csv(file.path(tbl_dir, "coupon_overdiscount_flags.csv"), show_col_types = FALSE) %>%
  mutate(
    coef_gross_per_1_coupon = round(coef_gross_per_1_coupon, 3),
    net_return_per_1_coupon = round(net_return_per_1_coupon, 3)
  )

render_table(promo_flags, caption = "Promotion over-discount flags (ROI proxy ≤ 0)")
Promotion over-discount flags (ROI proxy ≤ 0)
value_segment net_lift_promo_any discount_lift_promo_any roi_proxy over_discount_flag
Value Q1 (Low) -0.2431593 0.7873239 -0.3088 TRUE
Value Q2 -0.1715567 0.7751448 -0.2213 TRUE
Value Q3 -0.2179016 0.7650251 -0.2848 TRUE
Value Q4 (Top) -0.1620533 0.8733747 -0.1855 TRUE
render_table(coupon_flags, caption = "Coupon over-discount flags (net return < 0)")
Coupon over-discount flags (net return
segment_var segment_level coef_gross_per_1_coupon net_return_per_1_coupon over_discount_flag
value_segment Value Q3 9.371 8.371 FALSE
value_segment Value Q1 (Low) 4.791 3.791 FALSE
value_segment Value Q4 (Top) 11.206 10.206 FALSE
value_segment Value Q2 11.728 10.728 FALSE
promo_prone_segment Promo Med-Low 11.198 10.198 FALSE
promo_prone_segment Promo Low 12.461 11.461 FALSE
promo_prone_segment Promo High 9.004 8.004 FALSE
promo_prone_segment Promo Med-High 11.131 10.131 FALSE

Bottom line:

  • Promotions: flagged as over-discounting in every value segment.
  • Coupons: no segments flagged under the revenue proxy metric.

5 Recommendations to the CEO

Based on these results, the most defensible strategy is not “more promotions,” but “smarter promotions.” Specifically:

5.1 1) Reduce broad promotions where ROI proxy is most negative

  • Start with Value Q1 (Low) and Promo High households.
  • Pilot reducing promotion depth/frequency on products heavily purchased by these segments.

5.2 2) Reallocate spend toward targeted coupons (with guardrails)

  • Coupons appear more efficient than broad promotions in this dataset.
  • Guardrails to prevent waste:
    • minimum basket thresholds,
    • limit stacking,
    • personalized offers for non–promo-conditioned customers.

5.3 3) Make it causal: implement holdout testing

To move from “correlation with fixed effects” to decision-grade evidence:

  • Create randomized holdouts (no promo / reduced promo) by store-week or customer-week
  • Measure incremental net revenue and incremental margin
  • Update targeting based on measured incrementality

6 Summary and Limitations

6.1 Summary

  • Promotions show negative ROI proxy across all value segments → likely over-discounting on a revenue basis.
  • Promo-prone households show much higher discount rates → suggests discount conditioning.
  • Coupons correlate with larger baskets and show positive net-return proxies across segments.

6.2 Limitations (what could improve)

  1. No margin/COGS data. Revenue-based ROI can still be unprofitable if gross margins are thin.
  2. Selection bias. Promotions/coupons are not randomly assigned.
  3. Limited product hierarchy. Without a products table, “where” analysis is constrained to product_id rather than department/category.
  4. Short/long-term effects. Promotions can shift future behavior (stockpiling, brand switching), not captured here.

6.3 How to build on this work

  • Add product hierarchy (department/commodity) + margin to compute true profit lift.
  • Run holdout experiments for promotion and coupon strategies.
  • Extend segmentation beyond behavioral quartiles (demographics, life stage, store region) if available.

7 Appendix A: Reproducibility Notes (Optional)

The report is designed to knit quickly by loading precomputed outputs in outputs/.
To reproduce outputs from raw data, run the full analysis script first (the one that: loads get_transactions() and get_promotions(), aggregates promotions, creates segments, runs fixed-effect models, and writes the tables/figures).

8 Appendix B: Full Analysis Code (Not Executed During Knit)

# ============================
# Full analysis script used to generate the outputs/ folder
# (shown for reproducibility; not executed when knitting)
# ============================

options(stringsAsFactors = FALSE)
set.seed(123)

pkgs <- c(
  "completejourney",
  "data.table",
  "janitor",
  "lubridate",
  "fixest",
  "ggplot2",
  "dplyr",
  "readr",
  "scales"
)

to_install <- pkgs[!vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)]
if (length(to_install) > 0) install.packages(to_install, repos = "https://cloud.r-project.org")

suppressPackageStartupMessages({
  library(completejourney)
  library(data.table)
  library(janitor)
  library(lubridate)
  library(fixest)
  library(ggplot2)
  library(dplyr)
  library(readr)
  library(scales)
})

dir.create("outputs", showWarnings = FALSE)
dir.create("outputs/tables", showWarnings = FALSE)
dir.create("outputs/figures", showWarnings = FALSE)
dir.create("outputs/models", showWarnings = FALSE)

# 1) Load data
transactions_raw <- completejourney::get_transactions()
promotions_raw   <- completejourney::get_promotions()

# 2) Clean transactions
tx <- transactions_raw %>%
  janitor::clean_names() %>%
  mutate(
    week = as.integer(week),
    transaction_timestamp = suppressWarnings(lubridate::ymd_hms(transaction_timestamp, tz = "UTC")),
    quantity = as.numeric(quantity),
    sales_value = as.numeric(sales_value),
    retail_disc = as.numeric(retail_disc),
    coupon_disc = as.numeric(coupon_disc),
    coupon_match_disc = as.numeric(coupon_match_disc)
  ) %>%
  filter(!is.na(household_id), !is.na(product_id), !is.na(store_id), !is.na(week)) %>%
  filter(!is.na(quantity), quantity > 0) %>%
  mutate(
    retail_disc_amt       = abs(coalesce(retail_disc, 0)),
    coupon_disc_amt       = abs(coalesce(coupon_disc, 0)),
    coupon_match_disc_amt = abs(coalesce(coupon_match_disc, 0)),
    coupon_total_amt      = coupon_disc_amt + coupon_match_disc_amt,
    total_disc_amt        = retail_disc_amt + coupon_total_amt,
    gross_sales_est       = sales_value + total_disc_amt,
    used_coupon           = as.integer(coupon_total_amt > 0),
    any_discount          = as.integer(total_disc_amt > 0)
  )

# 3) Promotions aggregation with data.table (fast)
prom_dt <- as.data.table(promotions_raw)
setnames(prom_dt, names(prom_dt), janitor::make_clean_names(names(prom_dt)))

need_cols <- c("product_id", "store_id", "week", "display_location", "mailer_location")
prom_dt <- prom_dt[, ..need_cols]
prom_dt[, week := as.integer(week)]
prom_dt[, display_location := as.character(display_location)]
prom_dt[, mailer_location  := as.character(mailer_location)]

prom_dt[, promo_display := fifelse(is.na(display_location) | display_location %in% c("0","NONE","none","","NA"), 0L, 1L)]
prom_dt[, promo_mailer  := fifelse(is.na(mailer_location)  | mailer_location  %in% c("0","NONE","none","","NA"), 0L, 1L)]

promotions <- prom_dt[
  ,
  .(
    promo_display = as.integer(max(promo_display)),
    promo_mailer  = as.integer(max(promo_mailer)),
    promo_any     = as.integer(max(promo_display | promo_mailer))
  ),
  by = .(product_id, store_id, week)
]

rm(promotions_raw, prom_dt); gc()

# Join promotions -> transactions
tx <- as.data.table(tx)
setDT(promotions)
setkey(tx, product_id, store_id, week)
setkey(promotions, product_id, store_id, week)

tx <- promotions[tx]  # left join
tx[is.na(promo_display), promo_display := 0L]
tx[is.na(promo_mailer),  promo_mailer  := 0L]
tx[is.na(promo_any),     promo_any     := 0L]

# 4) Household segments
hh <- tx[
  ,
  .(
    total_net_sales   = sum(sales_value, na.rm = TRUE),
    total_gross_sales = sum(gross_sales_est, na.rm = TRUE),
    total_discount    = sum(total_disc_amt, na.rm = TRUE),
    coupon_discount   = sum(coupon_total_amt, na.rm = TRUE),
    promo_net_sales   = sum(fifelse(promo_any == 1L, sales_value, 0), na.rm = TRUE),
    trips             = uniqueN(basket_id),
    weeks_active      = uniqueN(week),
    units             = sum(quantity, na.rm = TRUE)
  ),
  by = .(household_id)
]

hh[, discount_rate := fifelse(total_gross_sales > 0, total_discount / total_gross_sales, NA_real_)]
hh[, promo_share_net := fifelse(total_net_sales > 0, promo_net_sales / total_net_sales, NA_real_)]

hh[, value_quartile := dplyr::ntile(total_net_sales, 4)]
hh[, promo_quartile := dplyr::ntile(fifelse(is.na(promo_share_net), 0, promo_share_net), 4)]

hh[, value_segment := factor(value_quartile,
                             levels = 1:4,
                             labels = c("Value Q1 (Low)", "Value Q2", "Value Q3", "Value Q4 (Top)"))]
hh[, promo_prone_segment := factor(promo_quartile,
                                   levels = 1:4,
                                   labels = c("Promo Low", "Promo Med-Low", "Promo Med-High", "Promo High"))]

fwrite(hh, "outputs/tables/household_segments.csv")

# Attach segments
setkey(hh, household_id)
setkey(tx, household_id)
tx <- hh[tx]

# 5) Segment KPI tables (fixed)
seg_kpis <- function(segcol) {
  stopifnot(segcol %in% names(tx))
  tx[
    ,
    .(
      households = uniqueN(household_id),
      trips      = uniqueN(basket_id),
      net_sales  = sum(sales_value, na.rm = TRUE),
      gross_sales = sum(gross_sales_est, na.rm = TRUE),
      discount   = sum(total_disc_amt, na.rm = TRUE),
      discount_rate = ifelse(sum(gross_sales_est, na.rm = TRUE) > 0,
                             sum(total_disc_amt, na.rm = TRUE) / sum(gross_sales_est, na.rm = TRUE),
                             NA_real_),
      promo_net_sales = sum(fifelse(promo_any == 1L, sales_value, 0), na.rm = TRUE),
      promo_share_net = ifelse(sum(sales_value, na.rm = TRUE) > 0,
                               sum(fifelse(promo_any == 1L, sales_value, 0), na.rm = TRUE) / sum(sales_value, na.rm = TRUE),
                               NA_real_),
      coupon_discount = sum(coupon_total_amt, na.rm = TRUE)
    ),
    by = segcol
  ][order(-net_sales)]
}

k1 <- seg_kpis("value_segment"); fwrite(k1, "outputs/tables/segment_kpis_value_segment.csv")
k2 <- seg_kpis("promo_prone_segment"); fwrite(k2, "outputs/tables/segment_kpis_promo_prone_segment.csv")

# 6) Promo lift by value segment (product-store FE + week FE)
psw <- tx[
  ,
  .(
    gross_sales = sum(gross_sales_est, na.rm = TRUE),
    discount    = sum(total_disc_amt, na.rm = TRUE),
    promo_any   = as.integer(max(promo_any))
  ),
  by = .(value_segment, product_id, store_id, week)
]

promo_lift <- list()
for (seg in levels(psw$value_segment)) {
  d <- psw[value_segment == seg]

  m_g_any <- feols(gross_sales ~ promo_any | product_id^store_id + week, data = d, cluster = ~store_id)
  m_d_any <- feols(discount    ~ promo_any | product_id^store_id + week, data = d, cluster = ~store_id)

  net_lift <- as.numeric(coef(m_g_any)["promo_any"]) - as.numeric(coef(m_d_any)["promo_any"])
  disc_lift <- as.numeric(coef(m_d_any)["promo_any"])
  roi <- ifelse(abs(disc_lift) > 1e-9, net_lift / disc_lift, NA_real_)

  promo_lift[[seg]] <- data.frame(
    value_segment = seg,
    net_lift_promo_any = net_lift,
    discount_lift_promo_any = disc_lift,
    roi_proxy = roi
  )

  saveRDS(m_g_any, paste0("outputs/models/promo_", make.names(seg), "_gross_any.rds"))
  saveRDS(m_d_any, paste0("outputs/models/promo_", make.names(seg), "_disc_any.rds"))
}

promo_lift_df <- do.call(rbind, promo_lift)
write_csv(promo_lift_df, "outputs/tables/promo_lift_by_value_segment.csv")

p_roi <- ggplot(promo_lift_df, aes(x = value_segment, y = roi_proxy)) +
  geom_col() +
  labs(
    title = "Promo ROI Proxy by Value Segment",
    subtitle = "ROI proxy = (gross lift - discount lift) / discount lift (revenue proxy, not margin)",
    x = NULL, y = "ROI proxy"
  )
ggsave("outputs/figures/promo_roi_proxy_value_segment.png", p_roi, width = 9, height = 4.5, dpi = 160)

# 7) Coupon efficiency by segment (safe column names)
basket <- tx[
  ,
  .(
    gross_sales = sum(gross_sales_est, na.rm = TRUE),
    coupon_disc = sum(coupon_total_amt, na.rm = TRUE)
  ),
  by = .(household_id, basket_id, store_id, week, value_segment, promo_prone_segment)
]

coupon_eff <- function(segcol) {
  out <- list()
  seg_levels <- unique(basket[[segcol]])
  for (s in seg_levels) {
    d <- basket[get(segcol) == s]
    m <- feols(gross_sales ~ coupon_disc | household_id + week + store_id,
               data = d, cluster = ~household_id)
    b <- as.numeric(coef(m)["coupon_disc"])

    out[[as.character(s)]] <- data.frame(
      segment_var = segcol,
      segment_level = as.character(s),
      coef_gross_per_1_coupon = b,
      net_return_per_1_coupon = b - 1
    )
  }
  rbindlist(out, fill = TRUE)
}

coupon_df <- rbind(
  coupon_eff("value_segment"),
  coupon_eff("promo_prone_segment")
)
fwrite(coupon_df, "outputs/tables/coupon_efficiency_by_segment.csv")

# 8) Weekly summary + plots (label_number_si2 replacement)
label_number_si2 <- function(...) {
  scales::label_number(scale_cut = scales::cut_si(""), ...)
}

weekly <- tx[
  ,
  .(
    net_sales = sum(sales_value, na.rm = TRUE),
    gross_sales = sum(gross_sales_est, na.rm = TRUE),
    discount = sum(total_disc_amt, na.rm = TRUE),
    discount_rate = sum(total_disc_amt, na.rm = TRUE) / sum(gross_sales_est, na.rm = TRUE),
    promo_net_sales = sum(fifelse(promo_any == 1L, sales_value, 0), na.rm = TRUE),
    promo_share_net = sum(fifelse(promo_any == 1L, sales_value, 0), na.rm = TRUE) / sum(sales_value, na.rm = TRUE)
  ),
  by = .(week)
][order(week)]

fwrite(weekly, "outputs/tables/weekly_summary.csv")

p_week_sales <- ggplot(as.data.frame(weekly), aes(x = week, y = net_sales)) +
  geom_line() +
  labs(title = "Weekly Net Sales", x = "Week", y = "Net sales ($)") +
  scale_y_continuous(labels = label_number_si2())
ggsave("outputs/figures/weekly_net_sales.png", p_week_sales, width = 9, height = 4, dpi = 160)

p_week_disc <- ggplot(as.data.frame(weekly), aes(x = week, y = discount_rate)) +
  geom_line() +
  labs(title = "Weekly Discount Rate (Discount / Gross Sales)", x = "Week", y = "Discount rate") +
  scale_y_continuous(labels = scales::percent_format(accuracy = 1))
ggsave("outputs/figures/weekly_discount_rate.png", p_week_disc, width = 9, height = 4, dpi = 160)

# Over-discount flags
promo_flags <- promo_lift_df %>%
  mutate(over_discount_flag = (discount_lift_promo_any > 0 & roi_proxy <= 0))
write_csv(promo_flags, "outputs/tables/promo_overdiscount_flags.csv")

coupon_flags <- as.data.table(coupon_df)
coupon_flags[, over_discount_flag := (net_return_per_1_coupon < 0)]
fwrite(coupon_flags, "outputs/tables/coupon_overdiscount_flags.csv")