Instructions

Many of the questions on this homework are deliberately open-ended. They invite you to apply what we covered in class — RFM construction, breakeven decision rules, segment-cell sample size, lift and gains, the cost of acting on noisy estimates — together with the strategic reasoning you have built across the rest of the programme. A strong answer is not one that recites the lab; it is one that shows you thinking like an analyst making a recommendation a marketing director could act on. Treat the homework as a reflection exercise: the goal is to consolidate what we have learned and generalise it to settings we did not cover.

This homework extends the work you did in the RFM lab. It is built around a new dataset — a transaction-level export from GourmetBox, an artisan food-and-wine retailer — which you will turn into RFM segments yourself. The lab’s StyleHive customer-level table is not used here.

  • Part A (Questions 1–4, 40 marks): Reflective questions. Each answer should be concise and specific — generic textbook responses earn fewer marks than answers anchored in the scenario or your own analysis.
  • Part B (Questions 5–6, 60 marks): Coding questions.
  • Bonus (optional, up to 10 marks): Question 7 extends the lab into segment-migration analysis — a topic we did not cover in class. Marks count on top of the 100 above; a complete homework without it can still earn the full 100.
  • All code should run from top to bottom in a fresh R session. Submit the completed .Rmd and the knitted output (pdf or html)
  • Word counts are suggested guides, not binding limits — write tightly enough that every sentence earns its place, but do not pad to a target or trim a strong point to fit.

A note on AI tools. You may use AI tools to learn syntax or check your understanding. The questions are designed so that copy-pasted AI answers will not get you far: the qualitative questions reward specific, scenario-grounded reasoning, and the coding questions ask you to interpret your own numbers. Markers can tell when an answer doesn’t match the candidate’s code output.

# Setup — run this block first.
rm(list = ls())
options(scipen = 100, digits = 3)

library(dplyr)
library(ggplot2)
library(readr)
library(tidyr)

# Read the homework data (transactions and customer file).
transactions <- read.csv("data/gourmetbox_transactions.csv",
                          stringsAsFactors = FALSE)
transactions$order_date <- as.Date(transactions$order_date)

customers <- read.csv("data/gourmetbox_customers.csv",
                       stringsAsFactors = FALSE)
customers$signup_date <- as.Date(customers$signup_date)

Part A: Reflective Questions [40 marks]


Question 1: Apply RFM to a non-retail context [10 marks]

Pick one of the three scenarios below — none of them were covered in class. Stay with that single scenario for parts (a)–(c).

  • (i) Subscription news platform. A UK news outlet runs a £12/month digital subscription. Activity data: which articles each subscriber reads, how long they spend on each, which sections they prefer, which days they log in.
  • (ii) Donor management for a UK charity. A medium-sized charity tracks donors over years. Activity data: each donation (date, amount, appeal it responded to), each event RSVP, gift-aid status, communication preferences. Donors are not “customers” — they have no transactional purchase relationship.
  • (iii) B2B SaaS feature usage. A B2B SaaS company sells annual subscriptions to teams. Activity data: per-user feature events (which features used, how often, by whom inside the customer team), team-level seat counts, support tickets opened. Buyers are accounts, but engagement is per-user.

For your chosen scenario:

(a) Define R, F, and M operationally: which raw fields would you use, over what aggregation window, and as of what “snapshot” date? State the units.

(b) Identify two specific ways the StyleHive-style RFM framework would mislead in this scenario. Use specifics of your scenario, not generic complaints about RFM.

(c) Propose one modification — an extra dimension, an alternative metric, or a different segmentation rule — that addresses one of the failure modes from (b). State the trade-off your modification introduces (e.g., “this gives finer segmentation but doubles data requirements”).

(a)

For the chosen scenario of a (i) Subscription News Platform, the snapshot date is set to 3 May 2026 to represent the point at which targeting decisions are made in TODAY. The aggregation window covers the 90 days prior to this snapshot, a timeframe selected because it effectively captures recent engagement behavior while avoiding the noise associated with very short windows or the outdated signals found in longer horizons.

Recency (R) is measured by the number of days since a user’s last login or article read prior to the snapshot date (i.e. 3 May 2026). This metric is calculated using raw data fields for the “last login” and “last article read dates”, measured in units of “days”. A lower recency value indicates a more recently engaged user, which suggests they are more likely to maintain their news subscription.

Frequency (F) tracks the number of active days or articles read within the 90-day aggregation window. Derived from “login events” and “article read logs” raw fields, this count helps interpret user habits. A higher frequency signifies habitual usage and suggests that repeated readers possess high brand loyalty.

In a standard model, Monetary (M) would measure the average or total spend per transaction, which in this context refers to subscription fees. The raw fields used for this calculation include the “subscription fee” and the “number of subscription months”, measured in “GBP (£)”. However, because the subscription fee is fixed at £12 per month, average spend per transaction (M) is CONSTANT. Therefore, monetary value can be replaced with a duration-based metric (D), specifically the total time spent reading articles over the past 90 days. Using raw fields like “time spent per article” and “session duration”, this metric captures deeper user engagement, where a higher duration indicates a stronger perceived value from the subscription.

(b)

The main issue is that Monetary (M) doesn’t really tell us anything in a subscription model. Unlike the retail environment of “StyleHive” where spending varies significantly (some people spend way more than others), everyone here just pays a flat £12 a month. Since the price is the same for everyone, it breaks the RFM rule that “big spenders” are your most valuable customers. Hence, we end up with a model that only looks at “Recency” and “Frequency”. This means a “hardcore daily reader” and “rare reader” look exactly the same on paper because they both pay £12 a month, even though the rare reader is way more likely to cancel soon as they rarely use the subscription. If we stick to this, we’ll end up overvaluing people who aren’t actually engaged and making bad marketing calls.

Furthermore, using a “count-only” method for “frequency (F)” can misrepresent the quality of user engagement. While the StyleHive framework measures frequency through “transaction counts” to capture purchase habits, applying this to a news platform by counting logins or article reads ignores the depth of the interaction. A user who logs in frequently just to skim headlines may receive a high frequency score, whereas a user who visits less often but spends a lot of time on reading deep-dive articles appears less valuable under this metric. This approach may overvalue superficial activity and undervalue meaningful consumption, ultimately resulting in incorrect segmentation and flawed targeting strategies.

(c)

Since the money part doesn’t work, it is suggested to replace Monetary (M) by Engagement Duration (D). This would track the total time someone actually spends reading articles over the 90-day window. It’s a move we see in the real world—like how “Spotify” looks at listening time (Duration) instead of just the subscription fee to see how much someone actually likes the service (as mentioned in the lecture, Spotify is R,F,D). Duration is a much better way to see “behavioral intensity,” basically showing if a user actually cares about the content or is just paying for a subscription they don’t use. This fixes the issue I mentioned earlier where the monetary value was the same for everyone. By using time spent, the model can finally tell the difference between a high-value user who spends hours deep-diving into news and a low-value user who barely clicks anything. This makes the churn risk much easier to spot.

However, it is messier to track. The big trade-off here is that the data gets a lot more complicated. Subscription fees are very clean —you either paid £12 or you didn’t. Tracking duration means we need to monitor things like session time or how far someone scrolls, which can be noisy. For example, if someone leaves a tab open while they go make lunch, it might look like they’re “highly engaged” when they aren’t even in the room. It’s definitely more accurate for engagement, but the data is just harder to keep perfect.


Question 2: Diagnose a campaign that under-delivered [10 marks]

GourmetBox ran an RFM-targeted email campaign last month. The team’s pre-launch deck projected a campaign-level ROMI of ~1.5×. The actual ROMI came in at 0.7×. The post-mortem report contains the table below, plus the operational facts beneath it.

Segment Predicted resp. rate Actual resp. rate n_mailed n_responded
111 (best) 38% 22% 412 91
112 31% 18% 510 92
121 28% 25% 488 122
211 26% 14% 605 85
212 23% 12% 712 85
221 21% 11% 690 76
311 (mid) 19% 7% 803 56
511 (worst) 8% 6% 945 57

Operational facts (from the post-mortem):

  • The campaign was sent on a Saturday at 10am, the team’s first-ever weekend send (previous campaigns were Tuesday-Thursday).
  • 12% of recipient email addresses bounced (i.e., did not deliver). Bounces are counted in n_mailed but generated no responses.
  • Predicted rates came from the previous quarter’s RFM segment rates; the lookback window used to score customers was 6 months (the lab default is 12).
  • The product offered was a £10-off coupon on orders over £40 — a stricter minimum than the company’s typical £30 minimum.
  • The coupon expired in 5 days (versus the team’s usual 14-day window).

(a) Identify three plausible reasons for the gap between predicted and actual ROMI. Cite specific numbers from the table or facts above for each reason.

(b) For each reason, propose one corrective action that is testable in the next campaign.

(c) Rank your three reasons by your confidence that they were the primary driver of the underperformance, and explain in one sentence why your top reason wins.

(a)

The first big problem is that the data being used is both outdated and poorly selected. The predicted response rates are based on the previous quarter’s performance and a short 6-month lookback window, which doesn’t really show stable customer habits. While using OLD RFM segment rates can be helpful, applying them directly assumes that customer behavior and market conditions never change, which just is NOT TRUE in the real world (Environment is keep changing day by day). Also, because the aggregation window was shortened to 6 months—instead of the usual 12 months used in the lab, the model over-emphasizes recent activity. This means a customer who had a temporary spike in activity might get misclassified as “high-value,” which inflates the predicted response rate. All of this creates a systematic overestimation because the inputs and expected outputs aren’t aligned with how customers are actually behaving, leading to that gap in ROMI.

Second, the response rates are skewed because of delivery failures. According to the “operational facts,” 12% of emails bounced, yet they were still counted in the “n_mailed” total. Since these people never saw the offer, they had zero chance to respond. This inflates the denominator, making the campaign look less effective than it actually was and dragging down the observed ROMI.

Finally, the timing and the actual deal were just worse. Sending it at 10am on a Saturday is a big shift from the usual Tuesday–Thursday window, which likely hurt open rates and affect the response rates. The offer itself also had more friction: the minimum spend was increased to £40 (instead of £30) and the deadline was cut to just 5 days instead of 14. This explains why even the “111” segment—the best customers—saw their response rate tank from a predicted 38% to just 22%.

(b)

To address the issue of outdated and mis-specified data inputs, the firm should both improve how customers are scored and how response rates are estimated. First, aggregation window should be extended (e.g., to 12 months) to better capture stable behavioural patterns and reduce noise from short-term activity spikes. Second, we need to update our response rate estimates using more recent data instead of solely looking at what happened last quarter. These estimates should also be validated using a holdout group to measure actual campaign impact and ensure predictions reflect true incremental response. Finally, we should also keep running A/B tests to see how behavior changes and keep our segments up to date. Making these changes will make our targeting much more accurate and help us get a better handle on our real ROMI.

To fix the delivery issue, the team should clean the email list and exclude invalid addresses before deployment. This includes removing hard bounces, verifying emails, and so on. The next campaign can test this by comparing a “cleaned list” group versus a “raw list” group, measuring improvements in effective response rate (responses divided by successfully delivered emails). This ensures that performance metrics reflect TRUE customer behaviour rather than technical inefficiencies.

To improve campaign effectiveness, the team should optimise timing and offer design through controlled experiments. For example, test weekday (Tuesday–Thursday) versus weekend sends, and compare different offer structures (e.g., £10 off for £30 purchase vs £10 off for £40 purchase, and 5-day vs 14-day expiry). Running A/B tests helps figure out exactly how much the timing or the actual discount affects response rates. This lets the company find the best mix to get the most sales without losing money.

(c)

Ranking:

1-Outdated and mis-specified data inputs;

2-Weaker campaign design and timing;

3-Delivery inefficiency due to bounce rates.

The data issue is the primary driver because it causes systematic overestimation across all segments, whereas campaign design and delivery issues reduce performance but do not fully explain the consistent gap between predicted and actual response rates.


Question 3: Critique an analyst’s report [10 marks]

A junior analyst at GourmetBox has produced the following report for the marketing director. Read carefully — there are deliberate methodological errors. Your job is to catch them.

Memo: RFM rollout recommendation

To: Marketing Director From: Analyst

“I built an RFM model on Q1 data using customers who had opened at least one marketing email in the previous 6 months — these are our most engaged customers and the most relevant for evaluation. I scored each customer and assigned them to one of 125 segments using independent quintiles. I then computed the response rate per segment on the same Q1 data. The mean response rate across all 125 segments is 17.4%, well above the breakeven of 8%, so I recommend mailing every segment. The ROMI estimate from this analysis is 1.46× — a strong return.

I also recommend prioritising the top-performing segments in the next campaign. Segments 113, 152, and 313 had response rates of 88%, 75%, and 71% respectively — these are clearly the strongest cells and should receive repeat investment. (Segment 113 contained 4 customers, 152 had 4, and 313 had 7.)

For the monetary cutoffs, I used the median order value within each segment (rather than the mean) to make the segments more representative. The breakeven of 8% was computed using AOV = £45 × 35% margin minus £6 coupon value, divided by £2 cost per coupon.”

(a) Identify four distinct methodological errors in this report. For each, explain why it is a problem and propose a specific correction.

(b) Of the four errors, which is most likely to lead to real money lost if the recommendation is acted on? Defend in one sentence.

(a)

  1. The first error is severe selection bias caused by restricting the sample to customers who “opened at least one marketing email”. The analyst builds the RFM model using only customers who had opened an email in the past six months, which excludes less engaged/ inactive users. This creates a biased sample that over-represents highly responsive and active customers, inflating estimated response rates. As a result, the reported mean response rate of 17.4% is NOT representative of the full customer base and will overestimate expected campaign performance. This directly explains why the analyst recommends mailing every segment.

Correction for (1): Build the RFM model using the entire eligible customer base, not just prior openers, so that response rates reflect true population behaviour.

  1. The second error is that the model is both trained and evaluated on the same Q1 data. The analyst builds the RFM segments using Q1 data and then computes response rates on that “same Q1 dataset” as mentioned in the memo, which leads to overfitting. This approach inflates performance because the model is effectively capturing patterns specific to that dataset rather than learning behaviour that generalises to future campaigns. As a result, the estimated ROMI of 1.46× is likely overly optimistic and unreliable.

Correction for (2): To fix this, the model should be tested on different data rather than the same Q1 data used to build it. For example, the analyst could build the model using Q1 data and then check how well it performs on a later period (such as Q2) or on a separate group of customers.

  1. The third error is the recommendation to “mail every segment” based on an average response rate exceeding breakeven. Using the overall mean response rate (17.4%) to justify mailing all segments ignores the fundamental principle of RFM targeting: different segments have different profitability levels. Some segments—especially low-value ones—may still fall below the breakeven threshold of 8%, even if the overall average is above it. This leads to inefficient resource allocation and potentially negative returns in weaker segments.

Correction for (3): Evaluate response rates and profitability at the segment level, and only target segments whose expected response rates exceed breakeven, rather than relying on an aggregate average.

  1. The fourth error is drawing strong conclusions from extremely small segment sizes, leading to unreliable estimates.The analyst highlights segments 113 (88%), 152 (75%), and 313 (71%) as top performers, yet these segments contain only 4, 4, and 7 customers respectively. Such small sample sizes produce highly volatile and statistically unreliable response rates, meaning the observed performance is likely due to random variation rather than true underlying behaviour. Prioritising these segments risks overinvesting in noise rather than signal.

Correction for (4): Impose a minimum sample size threshold (e.g., at least 50–100 customers per segment) or apply statistical smoothing techniques before making decisions based on segment performance.

(b)

Using the average response rate to justify mailing all segments is most likely to lead to real money lost because it leads to contacting segments whose response rates fall below the breakeven threshold, where the cost per contact (including send fee and discount leakage) exceeds the expected profit per converter, resulting in systematic financial losses.


Question 4: Memo to the CMO [10 marks]

The CMO has just returned from a conference and has emailed the analytics team:

“After this week’s keynotes I’m convinced we should retire the segment-rule targeting we use today and replace it with a per-customer ML model. The conference vendors said they can build one for us in six months. Please prepare a one-page recommendation.”

Write a memo of around 250 words to the CMO. You must:

  • Take a position: RFM-only, ML-only, or a hybrid (one position — don’t hedge).
  • Cite at least three specific concepts from the lecture or lab to support your case. Acceptable examples: breakeven-as-decision-rule, segment-cell sample-size, lift / gains, empirical-Bayes shrinkage of segment rates, the cost of complex-model maintenance versus a simple rule. Do not rely on generic “ML is more accurate” or “RFM is simpler” without specifics.
  • Identify one risk of your recommendation and one early signal the team should watch for if your recommendation is wrong.

I recommend adopting a hybrid approach (RFM + ML) rather than fully replacing our current segment-rule targeting with a pure per-customer ML model.

First, targeting must remain anchored on the breakeven-as-decision rule, which determines whether contacting a customer is profitable. As shown in our analysis, a campaign should only target customer segments whose expected conversion rate exceeds the breakeven threshold (i.e. 8% in this case). RFM segmentation is well-suited for this because it provides stable, interpretable segment-level response rates that can be directly compared to breakeven. In contrast, a pure ML model optimises predicted probability, not profitability, and may target customers who are likely to respond but still generate negative returns after costs.

Second, ML should be used to improve performance through lift and gains, not to replace the segmentation structure. The lecture emphasises that a valuable model produces a “fat” gains curve, which basically means the model is great at finding the most likely buyers and putting them at the very top of the list. While RFM already does a good job of grouping customers, ML can make it even sharper. It can rank people within those high-performing RFM groups so we only spend our limited budget on the absolute best leads.The correct test is whether ML delivers incremental lift over the RFM baseline, not whether it is more complex.

Third, RFM is great because it’s stable. Since it groups people into segments, the numbers are more reliable than looking at each person individually. We can even use “shrinkage” techniques to fix the math for small groups so the estimates aren’t so jumpy.

Fourth, we need to make sure we’re actually causing people to buy, which is why we use holdout groups. As shown in the lab, these groups help us see if the campaign actually made a difference or if people were going to buy anyway. ML models are good at guessing who might buy, but they don’t prove the “why.” Without these tests, we might think the model is doing a better job than it actually is.

Risk: The main risk of a hybrid approach is that it might limit how much ML can improve performance, because we are still restricting decisions using RFM segments. This means we might miss some good customers that ML could identify outside those segments, so we don’t fully benefit from using ML. An early sign that this approach is not working well is if using ML does not actually improve results. For example, if customers selected using ML do not show higher response rates, or if ROMI does not increase compared to using RFM alone, then ML is not adding value and the approach may need to be changed.


Part B: Coding & Analysis [60 marks]


Question 5: Build RFM from raw transactions [30 marks]

You have two files:

  • gourmetbox_transactions.csv — one row per order line, with order_id, customer_id, order_date, order_value, is_return, channel, device.
  • gourmetbox_customers.csv — one row per customer, with customer_id, signup_date, first_order_days, acquisition_channel, device, email_opens_6m, campaign, converted.

The transactions span roughly two years and include returns. Your job is to construct the RFM table that the rest of the homework relies on. Several modelling decisions are yours to make and defend.

5.1 Choose your snapshot date and lookback window

The maximum order date in the data is 2026-04-28.

In the chunk below:

  • Choose a snapshot date for “today”. It can be the data’s maximum date, the day after, or any value you can defend.
  • Choose a lookback window for Frequency and Monetary (e.g., 90 days, 6 months, 12 months).
  • Choose a treatment of returns (is_return == 1): exclude returns from purchase counts? subtract returned values from monetary? something else?

Then write a short justification (3–5 sentences) below the chunk explaining each choice. Your three choices together should be internally consistent.

snapshot_date <- as.Date("2026-04-29")
lookback_days <- 365  #12 months 
return_policy <- "Exclude returns from purchase counts and monetary value"

Justification: The snapshot date was set to “2026-04-29”, ensuring a complete view of all transactions up to the maximum order date. A 365-day (12 months) lookback window was selected to account for the seasonality of the food-and-wine industry, effectively capturing annual cycles such as holidays (e.g., Christmas) and gift-giving (e.g., Thanksgiving) periods. To maintain the integrity of “successful demand,” returns were excluded from both frequency and monetary calculations, whcih means that these metrics reflect only completed orders and actual retained revenue.This allows more accurate estimates.

5.2 Aggregate transactions to one row per customer

Build a customer-level table with columns customer_id, recency, frequency, monetary. Use dplyr — no custom helpers. Show head() and dim() of the result.

Edge case: customers in customers.csv who do not have a non-return purchase in your lookback window. Decide whether to drop them, mark them with NA, or assign them a default (e.g., recency = lookback_days, frequency = 0, monetary = 0). State your choice in a comment.

rfm <- customers %>%
  select(customer_id, campaign, converted) %>%
  left_join(          #keep all customers, even if they have no purchase in the lookback window.
    transactions %>%
      mutate(order_date = as.Date(order_date)) %>%
      filter(
        is_return == 0,
        order_date >= snapshot_date - lookback_days,
        order_date < snapshot_date
      ) %>%
     
     group_by(customer_id) %>%
      summarise(
        recency   = as.numeric(snapshot_date - max(order_date)),
        frequency = n_distinct(order_id),
        monetary  = sum(order_value),
        .groups = "drop"
      ),
    by = "customer_id"
  ) %>%

  # Edge case choice:
  # Customers with no non-return purchase in the lookback window are kept.
  # They receive recency = lookback_days, frequency = 0, monetary = 0.
  mutate(
    recency   = if_else(is.na(recency), lookback_days, recency),
    frequency = if_else(is.na(frequency), 0L, frequency),
    monetary  = if_else(is.na(monetary), 0, monetary)
  )

head(rfm)
##   customer_id campaign converted recency frequency monetary
## 1           1        1         0     117         4      211
## 2           2        1         0     365         0        0
## 3           3        0        NA       7         3      149
## 4           4        1         0     365         0        0
## 5           5        1         0      14         7      199
## 6           6        1         0      37         6      141
dim(rfm)
## [1] 12000     6
# The above process is to is to convert transaction-level data (raw data) into a customer-level RFM table by calculating Recency, Frequency, and Monetary value for each customer within the chosen lookback window (365 days), while also handling customers with no purchases. 

5.3 Independent-quintile RFM segmentation

Apply independent quintile segmentation using ntile() (the same idiom as RFM_lab_student.Rmd lines 41–44 — ntile(recency, 5) for R, and 6 - ntile(...) for F and M so that 1 = best on every dimension). Concatenate into an RFM_indep code (e.g., “111”, “315”).

rfm <- rfm %>%
  mutate(
    Rgroup = ntile(recency, 5),
    Fgroup = 6 - ntile(frequency, 5),
    Mgroup = 6 - ntile(monetary, 5),
    RFM_indep = paste0(Rgroup, Fgroup, Mgroup)
  )

head(rfm)
##   customer_id campaign converted recency frequency monetary Rgroup Fgroup
## 1           1        1         0     117         4      211      3      3
## 2           2        1         0     365         0        0      4      5
## 3           3        0        NA       7         3      149      1      3
## 4           4        1         0     365         0        0      4      5
## 5           5        1         0      14         7      199      1      2
## 6           6        1         0      37         6      141      2      2
##   Mgroup RFM_indep
## 1      2       332
## 2      5       455
## 3      3       133
## 4      5       455
## 5      3       123
## 6      3       223
table(rfm$Rgroup)
## 
##    1    2    3    4    5 
## 2400 2400 2400 2400 2400
table(rfm$Fgroup)
## 
##    1    2    3    4    5 
## 2400 2400 2400 2400 2400
table(rfm$Mgroup)
## 
##    1    2    3    4    5 
## 2400 2400 2400 2400 2400
length(unique(rfm$RFM_indep))
## [1] 48

5.4 Compare two operationalisations of Frequency

Build a second version of frequency: orders in the last 90 days (rather than your lookback window from 5.1). Re-bin into quintiles using ntile(). Call this column Fgroup_90.

Build a contingency table comparing Fgroup (from your 5.1 lookback) and Fgroup_90. How many customers move between top quintile (1) on one definition and bottom three quintiles (3, 4, 5) on the other — a “definition disagreement”?

Discuss: which definition would you use if GourmetBox’s marketing director asked for “frequent buyers”, and why? Are there business questions where the other definition would be more appropriate?

# Original Fgroup used 365 days (12 months) as the lookback window
# Following will creates another frequency measure using only the last 90 days (as provided in the question)
freq_90 <- transactions %>%
  mutate(order_date = as.Date(order_date)) %>%
  filter(
    is_return == 0,
    order_date >= snapshot_date - 90,
    order_date < snapshot_date
  ) %>%
  group_by(customer_id) %>%
  summarise(
    frequency_90 = n_distinct(order_id),
    .groups = "drop"
  )

# Remove old 90-day columns first, in case this chunk is re-run
rfm <- rfm %>%
  select(-any_of(c("frequency_90", "Fgroup_90"))) %>%
  left_join(freq_90, by = "customer_id") %>%
  mutate(
    frequency_90 = if_else(is.na(frequency_90), 0L, frequency_90),
    Fgroup_90 = 6 - ntile(frequency_90, 5)
  )

freq_compare <- table(rfm$Fgroup, rfm$Fgroup_90)
freq_compare
##    
##        1    2    3    4    5
##   1 1555  567  201   49   28
##   2  641  921  494  190  154
##   3  204  644  778  369  405
##   4    0  268  692  713  727
##   5    0    0  235 1079 1086
definition_disagreement <- rfm %>%
  summarise(
    n_disagree = sum(
      (Fgroup == 1 & Fgroup_90 %in% c(3, 4, 5)) |
      (Fgroup_90 == 1 & Fgroup %in% c(3, 4, 5))
    )
  )

definition_disagreement
##   n_disagree
## 1        482

Discussion (3–5 sentences): The contingency table shows that many customers are classified into similar frequency groups under both the 12-month and 90-day definitions, especially along the diagonal. However, there are still a number of customers who move between very different groups. In particular, 482 customers (201 + 49 + 28 + 204) fall into the top quintile in one definition but into the bottom three quintiles in the other, which shows a clear definition disagreement. This suggests that the choice of time window has a noticeable impact on how “frequent buyers” are identified.

If GourmetBox’s marketing director asked for “frequent buyers”, I would use the 12-month (365-day) definition. This is because it captures customers’ long-term purchasing behaviour and reflects whether they have a consistent buying habit over time. It is also more stable and less affected by short-term fluctuations.This is also suitable for the food and wine industry where purchases could be hugely impacted by seasonality.

However, the 90-day definition may be more appropriate for certain business questions. For example, if the purpose is to identify “recently active” customers or to target a short-term promotion, 90-day measure is more useful because it reflects current engagement situation.

Question 6: Validate, stress-test, and target [30 marks]

The customer file includes a campaign flag and a converted outcome for the customers who were emailed (campaign == 1). Use the campaigned customers’ actual conversions to evaluate the RFM-based targeting rule, then stress-test the targeting decision against changes in the economic parameters.

6.1 Build and read the gains chart

Build a cumulative gains chart on the campaigned customers (campaign == 1) by adapting the code in Appendix A of RFM_lab_student.Rmd (“The lift / gains code, for after class”). You will need to:

  1. Compute segment-level response rates first — group the campaigned customers by RFM_indep and take the mean of converted. This is the homework’s equivalent of the lab’s respRFM_df table.
  2. Follow the appendix’s five-step score → sort → chunk → count → cumulate recipe to build a lift_table. Use RFM_indep wherever the lab uses RFM_seq.
  3. Plot proportion of customers targeted (x) vs. proportion of converters captured (y), with the 45° random-targeting baseline as a reference line.
respRFM_df <- rfm %>%
  filter(campaign == 1) %>%
  group_by(RFM_indep) %>%
  summarise(
    n_resp    = sum(converted),
    n_mail    = n(),
    resp_rate = mean(converted),
    .groups = "drop"
  )

scored <- rfm %>%
  filter(campaign == 1) %>%
  left_join(respRFM_df %>% select(RFM_indep, resp_rate), by = "RFM_indep") %>%
  arrange(desc(resp_rate)) %>%
  mutate(decile = ntile(desc(resp_rate), 10))

decile_summary <- scored %>%
  group_by(decile) %>%
  summarise(
    n_customers = n(),
    n_buyers    = sum(converted),
    response    = round(mean(converted), 4),
    .groups = "drop"
  )

total_buyers    <- sum(scored$converted)
total_customers <- nrow(scored)
overall_rate    <- total_buyers / total_customers

lift_table <- decile_summary %>%
  mutate(
    cum_customers = cumsum(n_customers),
    cum_buyers    = cumsum(n_buyers),
    cum_lift      = cum_buyers / cum_customers / overall_rate,
    cum_gains     = cum_buyers / total_buyers,
    pct_customers = cum_customers / total_customers
  )

# Plot cumulative gains chart
ggplot(lift_table, aes(x = pct_customers, y = cum_gains)) +
  geom_line(linewidth = 1.2) +
  geom_point(size = 2) +
  geom_abline(intercept = 0, slope = 1, linetype = "dashed") +
  scale_x_continuous(labels = scales::percent) +
  scale_y_continuous(labels = scales::percent) +
  labs(
    title = "Cumulative Gains Chart",
    x = "Proportion of customers targeted",
    y = "Proportion of converters captured"
  ) +
  theme_minimal()

Write your interpretation (~150 words suggested) covering:

  • What does the curve look like — does the model give meaningful lift over random targeting, and where (which deciles) is the lift strongest?
  • At what targeting fraction does the marginal gain flatten out, and what does that suggest for how deep into the list GourmetBox should mail?

The cumulative gains curve sits clearly above the 45-degree random-targeting baseline, showing that the RFM model delivers a meaningful lift. This indicates that the model is effective at ranking customers by their likelihood to convert. The lift is most pronounced in the top deciles, especially within the first 20% – 40% of customers, where the curve increases very steeply. For instance, targeting roughly the top 30% of customers captures about two-thirds of all converters, which is hugely better than random targeting.

After you reach about 50% - 60% of the list, the curve starts to flatten out. This means that there are diminishing marginal returns: for every extra person we target after that point, we get fewer and fewer new sales. It basically becomes less worth the effort and the efficicency is reducing.

To conclude, GourmetBox should stick to the top-ranked segments, probably the best 40% to 60% of customers. Targeting beyond this range may lead to higher costs without a proportional increase in conversions.

6.2 Breakeven targeting and ROMI

Use the homework’s economic parameters:

  • Cost per coupon sent: £2.00
  • AOV: £45.00
  • Gross margin: 35%
  • Coupon face value (deducted from margin per response): £6.00

Compute the breakeven response rate. Identify the segments above breakeven on the campaigned data. Report total profit and ROMI across those segments (using the campaigned customers’ actual conversions, not predictions).

cost          <- 2.00
aov           <- 45.00
gross_margin  <- 0.35
coupon_value  <- 6.00

profit_per_converter <- gross_margin * aov - coupon_value
breakeven <- cost / profit_per_converter

cat(sprintf("Profit per converter: £%.2f\n", profit_per_converter))
## Profit per converter: £9.75
cat(sprintf("Breakeven response rate: %.4f (%.2f%%)\n", breakeven, breakeven * 100))
## Breakeven response rate: 0.2051 (20.51%)
segment_profit <- respRFM_df %>%
  mutate(
    target = resp_rate >= breakeven,
    revenue_from_converters = n_resp * profit_per_converter,
    contact_cost = n_mail * cost,
    profit = revenue_from_converters - contact_cost
  )

target_segments <- segment_profit %>%
  filter(target == TRUE) %>%
  arrange(desc(resp_rate))

target_segments
## # A tibble: 18 × 8
##    RFM_indep n_resp n_mail resp_rate target revenue_from_converters contact_cost
##    <chr>      <int>  <int>     <dbl> <lgl>                    <dbl>        <dbl>
##  1 242            1      1     1     TRUE                      9.75            2
##  2 112           33     78     0.423 TRUE                    322.            156
##  3 111          312    741     0.421 TRUE                   3042            1482
##  4 122          142    375     0.379 TRUE                   1384.            750
##  5 134           10     27     0.370 TRUE                     97.5            54
##  6 121           26     71     0.366 TRUE                    253.            142
##  7 212           27     78     0.346 TRUE                    263.            156
##  8 133           61    191     0.319 TRUE                    595.            382
##  9 211          152    478     0.318 TRUE                   1482             956
## 10 234           16     53     0.302 TRUE                    156             106
## 11 123           18     61     0.295 TRUE                    175.            122
## 12 144           17     61     0.279 TRUE                    166.            122
## 13 232           20     75     0.267 TRUE                    195             150
## 14 321           13     54     0.241 TRUE                    127.            108
## 15 222           95    396     0.240 TRUE                    926.            792
## 16 311           48    203     0.236 TRUE                    468             406
## 17 221           21     89     0.236 TRUE                    205.            178
## 18 132           11     49     0.224 TRUE                    107.             98
## # ℹ 1 more variable: profit <dbl>
target_summary <- target_segments %>%
  summarise(
    n_segments = n(),
    n_customers = sum(n_mail),
    n_converters = sum(n_resp),
    total_profit = sum(profit),
    total_cost = sum(contact_cost),
    ROMI = total_profit / total_cost
  )

target_summary
## # A tibble: 1 × 6
##   n_segments n_customers n_converters total_profit total_cost  ROMI
##        <int>       <int>        <int>        <dbl>      <dbl> <dbl>
## 1         18        3081         1023        3812.       6162 0.619

6.3 Sensitivity to cost and margin assumptions

Build a 2 × 2 scenario grid: cost ∈ {−20%, +20%} × (gross_margin × AOV − coupon_value) ∈ {−20%, +20%}. For each of the four scenarios, compute:

  • the breakeven response rate
  • the number of segments that pass breakeven
  • the total profit and ROMI under that scenario

Tabulate the four results. (Tip: a tidyr::expand_grid() followed by a mutate() with the formulas keeps this tidy.)

scenario_grid <- expand_grid(
  cost_change = c(-0.20, 0.20),
  margin_change = c(-0.20, 0.20)
) %>%
  mutate(
    scenario = paste0(
      "Cost ", if_else(cost_change < 0, "-20%", "+20%"),
      ", Margin ", if_else(margin_change < 0, "-20%", "+20%")
    ),
    scenario_cost = cost * (1 + cost_change),
    scenario_profit_per_converter = profit_per_converter * (1 + margin_change),
    scenario_breakeven = scenario_cost / scenario_profit_per_converter
  )

scenario_results <- scenario_grid %>%
  rowwise() %>%
  mutate(
    n_segments_pass = sum(respRFM_df$resp_rate >= scenario_breakeven),
    total_profit = sum(
      (respRFM_df$n_resp * scenario_profit_per_converter -
         respRFM_df$n_mail * scenario_cost)[respRFM_df$resp_rate >= scenario_breakeven]
    ),
    total_cost = sum(
      (respRFM_df$n_mail * scenario_cost)[respRFM_df$resp_rate >= scenario_breakeven]
    ),
    ROMI = total_profit / total_cost
  ) %>%
  ungroup() %>%
  select(
    scenario,
    scenario_cost,
    scenario_profit_per_converter,
    scenario_breakeven,
    n_segments_pass,
    total_profit,
    ROMI
  )

scenario_results
## # A tibble: 4 × 7
##   scenario               scenario_cost scenario_profit_per_…¹ scenario_breakeven
##   <chr>                          <dbl>                  <dbl>              <dbl>
## 1 Cost -20%, Margin -20%           1.6                    7.8              0.205
## 2 Cost -20%, Margin +20%           1.6                   11.7              0.137
## 3 Cost +20%, Margin -20%           2.4                    7.8              0.308
## 4 Cost +20%, Margin +20%           2.4                   11.7              0.205
## # ℹ abbreviated name: ¹​scenario_profit_per_converter
## # ℹ 3 more variables: n_segments_pass <int>, total_profit <dbl>, ROMI <dbl>
print(scenario_results, width = Inf)
## # A tibble: 4 × 7
##   scenario               scenario_cost scenario_profit_per_converter
##   <chr>                          <dbl>                         <dbl>
## 1 Cost -20%, Margin -20%           1.6                           7.8
## 2 Cost -20%, Margin +20%           1.6                          11.7
## 3 Cost +20%, Margin -20%           2.4                           7.8
## 4 Cost +20%, Margin +20%           2.4                          11.7
##   scenario_breakeven n_segments_pass total_profit  ROMI
##                <dbl>           <int>        <dbl> <dbl>
## 1              0.205              18        3050. 0.619
## 2              0.137              25        7299. 1.23 
## 3              0.308               9        1063. 0.217
## 4              0.205              18        4575. 0.619

6.4 Identify robust target segments

A “robust” segment is one that is profitable (above its scenario-specific breakeven) under all four scenarios from 6.3. Report:

  • The list of robust segments and their campaigned-customer response rates.
  • The list of “fragile” segments — profitable in 1–3 of the 4 scenarios but not all.
  • A one-paragraph recommendation: should GourmetBox commit to mailing only the robust list, or include some fragile segments? What would you watch for in the next campaign to decide?
segment_scenario <- respRFM_df %>%
  crossing(scenario_grid) %>%
  mutate(
    profitable = resp_rate >= scenario_breakeven
  )

robust_segments <- segment_scenario %>%
  group_by(RFM_indep, resp_rate) %>%
  summarise(
    profitable_scenarios = sum(profitable),
    .groups = "drop"
  ) %>%
  filter(profitable_scenarios == 4) %>%
  arrange(desc(resp_rate))

robust_segments
## # A tibble: 9 × 3
##   RFM_indep resp_rate profitable_scenarios
##   <chr>         <dbl>                <int>
## 1 242           1                        4
## 2 112           0.423                    4
## 3 111           0.421                    4
## 4 122           0.379                    4
## 5 134           0.370                    4
## 6 121           0.366                    4
## 7 212           0.346                    4
## 8 133           0.319                    4
## 9 211           0.318                    4
fragile_segments <- segment_scenario %>%
  group_by(RFM_indep, resp_rate) %>%
  summarise(
    profitable_scenarios = sum(profitable),
    .groups = "drop"
  ) %>%
  filter(profitable_scenarios >= 1, profitable_scenarios <= 3) %>%
  arrange(desc(profitable_scenarios), desc(resp_rate))

fragile_segments
## # A tibble: 16 × 3
##    RFM_indep resp_rate profitable_scenarios
##    <chr>         <dbl>                <int>
##  1 234           0.302                    3
##  2 123           0.295                    3
##  3 144           0.279                    3
##  4 232           0.267                    3
##  5 321           0.241                    3
##  6 222           0.240                    3
##  7 311           0.236                    3
##  8 221           0.236                    3
##  9 132           0.224                    3
## 10 223           0.203                    1
## 11 412           0.2                      1
## 12 421           0.176                    1
## 13 233           0.172                    1
## 14 244           0.166                    1
## 15 243           0.162                    1
## 16 332           0.152                    1

Recommendation: GourmetBox should mainly focus on the robust RFM segments since they stay profitable across all four scenarios, so they are actually the safest group to target. It might still be okay to include some fragile segments, but only if they have really high response rates and they only fail in the worst-case situations. I wouldn’t suggest sending to all fragile segments though, because their profit can change a lot if costs or margins change a bit. For the next campaign, GourmetBox should watch several things like actual response rate (are people really buying?), how much the coupons are really costing, and the margin per order (How much profit is left after everything is paid?). Also need to check if those fragile segments are still making profit after costs. This can help decide whether to keep using them, reduce them, or just stop targeting them later on.


Bonus (Optional)

Question 7: Segment migration and intervention design [up to 10 bonus marks]

This question is optional and goes beyond what we covered in class. The lab and lecture treated RFM as a single-snapshot tool; here you extend it to study how customers move between segments over time. Bonus marks are added on top of your Q1–Q6 total — there is no penalty for skipping this question, and a complete homework without it can still earn the full 100.

The transaction data spans roughly two years, so you can compute RFM at two snapshot dates and analyse the migration. Restrict this question to customers with first_order_days >= 365 (so everyone has history at both snapshots).

7.1 Earlier-snapshot RFM

Construct a second RFM table as if “today” were 180 days before your snapshot from Q5.1. Use the same lookback window and return policy you used in 5.1. Apply independent quintile segmentation (same ntile() idiom).

Customers who appear in your earlier-snapshot must have at least one purchase in the earlier-snapshot’s lookback window — drop the rest with a comment in the code.

earlier_snapshot <- snapshot_date - 180

eligible_customers <- customers %>%
  filter(first_order_days >= 365) %>%
  select(customer_id)

rfm_earlier <- transactions %>%
  mutate(order_date = as.Date(order_date)) %>%
  semi_join(eligible_customers, by = "customer_id") %>%
  filter(
    is_return == 0,
    order_date >= earlier_snapshot - lookback_days,
    order_date < earlier_snapshot
  ) %>%

  group_by(customer_id) %>%
  summarise(
    recency = as.numeric(earlier_snapshot - max(order_date)),
    frequency = n_distinct(order_id),
    monetary = sum(order_value),
    .groups = "drop"
  ) %>%


  mutate(
    Rgroup_earlier = ntile(recency, 5),
    Fgroup_earlier = 6 - ntile(frequency, 5),
    Mgroup_earlier = 6 - ntile(monetary, 5),
    RFM_earlier = paste0(Rgroup_earlier, Fgroup_earlier, Mgroup_earlier)
  )

head(rfm_earlier)
## # A tibble: 6 × 8
##   customer_id recency frequency monetary Rgroup_earlier Fgroup_earlier
##         <int>   <dbl>     <int>    <dbl>          <int>          <dbl>
## 1           1      40         4    172.               2              4
## 2           3     117         3    127.               4              4
## 3           4     210         2     81.2              5              5
## 4           5     174         3     90.1              4              4
## 5           6      57         5    172.               3              3
## 6           7      26         3    104.               2              4
## # ℹ 2 more variables: Mgroup_earlier <dbl>, RFM_earlier <chr>
dim(rfm_earlier)
## [1] 8334    8

7.2 Segment migration matrix

Build the full migration matrix between the earlier and current segments as a long data frame — one row per (earlier_segment, current_segment) pair, with a column giving the number of customers in that cell. Keep this full table; Q7.3 will use it.

Then print the top 10 migration cells by customer count, sorted from largest to smallest. A simple arrange(desc(n)) %>% slice_head(n = 10) is enough.

migration_full <- rfm_earlier %>%
  inner_join(
    rfm %>%
      semi_join(eligible_customers, by = "customer_id") %>%
      select(
        customer_id,
        RFM_current = RFM_indep,
        current_monetary = monetary
      ),
    by = "customer_id"
  ) %>%
  count(RFM_earlier, RFM_current, name = "n")

top10_migration <- migration_full %>%
  arrange(desc(n)) %>%
  slice_head(n = 10)

knitr::kable(top10_migration)
RFM_earlier RFM_current n
555 555 620
111 111 275
211 111 196
111 211 178
444 444 131
211 211 120
455 444 105
544 555 104
111 311 95
311 111 91

7.3 Identify the three most consequential migrations

A migration is consequential if it represents both (i) a meaningful number of customers and (ii) a meaningful change in expected value (e.g., high-value customers moving toward a “lost” segment costs more than low-value customers doing the same).

Define your own metric for “consequential” (e.g., customer count × current monetary, or count × the response-rate gap between the two segments). Defend your metric briefly. Then identify your top three migrations and report the customer count and your metric value for each.

migration_value <- rfm_earlier %>%
  inner_join(
    rfm %>%
      semi_join(eligible_customers, by = "customer_id") %>%
      select(
        customer_id,
        RFM_current = RFM_indep,
        current_monetary = monetary
      ),
    by = "customer_id"
  ) %>%
  group_by(RFM_earlier, RFM_current) %>%
  summarise(
    n = n(),
    avg_current_monetary = mean(current_monetary, na.rm = TRUE),
    consequential_score = n * avg_current_monetary,
    .groups = "drop"
  ) %>%
  arrange(desc(consequential_score))

top3_migrations <- migration_value %>%
  slice_head(n = 3)

knitr::kable(top3_migrations)
RFM_earlier RFM_current n avg_current_monetary consequential_score
111 111 275 732 201184
211 111 196 715 140223
111 211 178 712 126793

Metric definition and rationale (3–5 sentences): I define a consequential migration by multiplying customer count by their current spending. This metric is helpful because it looks at both the volume of the movement and the actual money involved. It helps show that a huge group of people moving might not matter much if they barely spend anything, whereas a smaller group of big spenders moving could be a much bigger deal.

7.4 Design an intervention campaign

Pick one of your top-three migrations from 7.3. Propose an intervention campaign (~150 words suggested) that includes:

  • Target segment definition (which migration are you intervening on?)
  • Channel and message (one sentence each)
  • Assumed response-rate uplift relative to a do-nothing baseline (with a brief justification — what is your anchor?)
  • Breakeven check (does the assumed rate clear the homework’s £2 / £9.75 breakeven?)
  • Expected ROMI

Include any code below for the breakeven and ROMI calculation.

selected_migration <- top3_migrations %>%
  filter(RFM_earlier == "111", RFM_current == "211") # I chose 111 -> 211 customers 

target_n <- selected_migration$n

cost <- 2
profit_per_converter <- 9.75

breakeven_rate <- cost / profit_per_converter

assumed_response_rate <- 0.30 
# I assume a 30% response rate because these customers were previously highly engaged (111) and should respond better than average customers to a personalised retention campaign

expected_converters <- target_n * assumed_response_rate
campaign_cost <- target_n * cost
expected_profit <- expected_converters * profit_per_converter - campaign_cost
expected_ROMI <- expected_profit / campaign_cost

intervention_results <- tibble(
  target_migration = paste0(
    selected_migration$RFM_earlier,
    " to ",
    selected_migration$RFM_current
  ),
  target_n = target_n,
  breakeven_rate = breakeven_rate,
  assumed_response_rate = assumed_response_rate,
  expected_converters = expected_converters,
  campaign_cost = campaign_cost,
  expected_profit = expected_profit,
  expected_ROMI = expected_ROMI
)

knitr::kable(intervention_results)
target_migration target_n breakeven_rate assumed_response_rate expected_converters campaign_cost expected_profit expected_ROMI
111 to 211 178 0.205 0.3 53.4 356 165 0.462

Intervention proposal (~150 words suggested):

I’m targeting on customers moving from 111 to 211 because they used to be our best shoppers but are now starting to slip. This group is still very important for the business, and if we step in now, we can stop them from becoming low-value customers and increase retention rate (which helps a lot with the CLV).

I’ll use email for this because it’s a relatively cheaper way and allows us to send personalised messages. The goal is to bring them back with a reminder message: “We miss you! Here is a special reward for your favorite GourmetBox items”. We can uses what we already know they like to make the message feel more personal.

I’m expecting a 30% response rate since these people were very active in the past and are usually more likely to buy than others. Since the breakeven point is roughly 20.5% (£2 / £9.75), this 30% estimate good enough for us to stay safe. If things go as planned, the campaign should bring good profit and strong ROMI, making it a solid plan to keep our customers.

7.5 When does migration analysis mislead?

In ~100 words, name two situations in which migration analysis is more useful than static RFM, and two situations in which it could mislead.

Migration analysis is better than just looking at a single RFM snapshot when we want to see if customers are getting better or worse over time. It’s great for preventing people from churning, because you can catch someone moving from a strong group to a weaker one (e.g., 111 -> 211 from above example) and message them before they stop buying entirely.

However, some situations may be misleading. First, seasonality, if your business is seasonal, a customer might look like they are “declining” just because it’s the off-season, not because they’ve actually lost interest. It can also be confusing for people who don’t buy on a regular schedule, we might think they are turning to a worse segment when they’re actually just taking a normal break between orders.


Submission checklist

Before submitting, verify: