Causal Machine Learning

Author

Richmond Silvanus Baye

Published

June 20, 2025

Introduction

This tutorial introduces explainable AI through the lens of causal machine learning (Causal ML). While most machine learning models focus purely on prediction, Causal ML aims to understand the why behind an outcome by estimating causal effects. For example, it asks whether effects differ across individuals or groups (heterogeneity) and what would have happened under a different exposure or policy (the counterfactual).

Causal ML sits at the intersection of machine learning and causal inference. It not only helps us predict outcomes but also allows us to simulate “what-if” scenarios and individual effects.

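If you’d like to explore Causal ML in more depth, here are some excellent resources:

CausalML Library Documentation: https://causalml.readthedocs.io/
The Book of Why: https://causalml-book.org/
Nature Medicine Article on Causal ML: https://www.nature.com/articles/s41591-024-02902-1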

Use cases

Causal ML is used in a wide range of domains, including e-commerce, digital marketing, finance, and healthcare.

In health economics, Causal ML can assess the effects of policies on population outcomes.
In drug development, it enables individualized treatment effect estimation to support personalized clinical decision-making (patient voice).

In this tutorial, we’ll focus on an e-commerce-inspired use case from the ride-hailing industry (Uber).

Case Study: Causal Effect of Surge Pricing Opt-Out

Question:

What is the causal effect of opting out of surge pricing alerts on the number of rides taken during peak hours?

Problem

We cannot run a traditional A/B test by randomly forcing some users to accept surge pricing and others to opt out. Opt-out behavior is self-selected, meaning users decide based on their own characteristics:

Riders who are price-insensitive or have urgent travel needs may accept surge pricing.
Riders who are price-sensitive or in non-urgent situations are more likely to opt out.

This introduces selection bias, making a naive comparison between the opt-out and non-opt-out groups unreliable: the two groups differ in ways that also affect how much they ride.

Solution

Suppose Uber previously conducted an experiment that randomly assigned users to different versions of the opt-out prompt:

“Are you sure?” (confirmation step) [less easy]
“One-click toggle” (simplified opt-out) [easier]

Some versions made it easier to opt out than others. These prompt variants create random variation in the likelihood of opting out and can serve as an instrumental variable (IV):

They affect opt-out behavior (relevance assumption), but
They do not directly affect ride volume (exclusion restriction assumption).
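
The instrumental-variable idea above can be illustrated with a minimal sketch. It assumes a made-up data-generating process (the `price_sensitive` confounder, the take-up probabilities, and the true effect of -1.5 are all illustrative choices, not Uber data) and compares a naive group comparison with the simple Wald ratio, which divides the instrument's effect on the outcome by its effect on the treatment.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical unobserved confounder: price-sensitive riders opt out more
# often and also ride less, regardless of the alerts.
price_sensitive = rng.binomial(1, 0.5, size=n)

# Randomly assigned instrument (e.g. being shown the easier opt-out prompt).
z = rng.binomial(1, 0.5, size=n)

# Treatment take-up depends on the instrument AND on the confounder.
p_opt_out = 0.2 + 0.4 * z + 0.3 * price_sensitive
t = rng.binomial(1, p_opt_out)

# Outcome: the assumed true effect of opting out is -1.5 rides.
true_effect = -1.5
y = 5.0 + true_effect * t - 2.0 * price_sensitive + rng.normal(0, 1, size=n)

# Naive comparison of opt-out vs. non-opt-out riders (biased by self-selection).
naive = y[t == 1].mean() - y[t == 0].mean()

# Wald / IV estimate: effect of Z on Y divided by effect of Z on T.
reduced_form = y[z == 1].mean() - y[z == 0].mean()
first_stage = t[z == 1].mean() - t[z == 0].mean()
wald = reduced_form / first_stage

print(f"naive difference: {naive:.2f}")
print(f"Wald IV estimate: {wald:.2f} (true effect {true_effect})")
```

In this sketch the naive difference overstates the drop in rides because price-sensitive riders are over-represented among those who opt out, while the Wald ratio uses only the variation created by the random prompt assignment.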

We can use the Intent-To-Treat Doubly Robust Instrumental Variable (DRIV) estimator from Microsoft’s EconML library to estimate this causal relationship. DRIV combines machine learning with causal inference to estimate the causal effect of opting out of surge pricing alerts on ride volume during peak hours. The intent-to-treat variant fits our setting, in which the prompt variant is randomly assigned but opting out remains the rider’s own choice, so not every rider shown the easier prompt actually opts out.

Let’s begin by loading the packages. We will use Python for this exercise.

Code

import shap
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline
import warnings 
warnings.filterwarnings("ignore")
import os
import sys

Code

#ML packages
import lightgbm as lgb
from sklearn.utils import resample
from sklearn.preprocessing import PolynomialFeatures

# EconML
from econml.iv.dr import IntentToTreatDRIV
from econml.iv.dr import LinearIntentToTreatDRIV
from econml.cate_interpreter import SingleTreeCateInterpreter, SingleTreePolicyInterpreter

Note that EconML must be installed before it can be imported:

pip install econml

Data Simulation

Because we do not have data readily available for this, we can create our own synthetic data of 100,000 observations using the code below.

Code

#Set a seed for reproducibility
np.random.seed(32)

# Sample size 
n = 100000

#Simulated data
data = pd.DataFrame({
    "rides_peak_pre" : np.random.poisson(3, size = n)
    , "total_rides_pre" : np.random.poisson(10, size=n)
    , "avg_fare_pre" : np.round(np.random.normal(15, 5, size=n), 2)
    , "user_city" : np.random.choice(["New York", "Chicago", "San Fransisco", "Boston", "Austin", "Boulder"], size = n)
    , "user_device" : np.random.choice(["IOS", "Andriod"], size =n)
    , "is_biz_account" : np.random.choice([0, 1], size = n, p = [0.8, 0.2])
    , "prompt_variant_easy" : np.random.choice([0, 1], size = n) #Instrument
})

# Simulate treatment take-up driven by the instrument, with noise

data["opted_out"] = np.where(
    (data["prompt_variant_easy"] ==1) & (np.random.rand(n)< 0.7), 1
    , np.where((data["prompt_variant_easy"] == 0) & (np.random.rand(n)< 0.3), 1,0)
)

# Simulate post-period peak rides with a built-in treatment effect
data["rides_peak_post"] = (
    data["rides_peak_pre"]
    + np.random.normal(0, 1, size=n)
    + 1.5 * (1 - data["opted_out"])  # users who did not opt out take 1.5 more peak rides
).round().astype(int)

# Ensure there are no negative ride counts
data["rides_peak_post"] = data["rides_peak_post"].clip(lower=0)

# Create directory if it doesn't exist
os.makedirs('analysis', exist_ok=True)

# Save to CSV
data.to_csv('analysis/synthetic_data.csv', index=False)

# Display the first few rows of the data
data.head()

	rides_peak_pre	total_rides_pre	avg_fare_pre	user_city	user_device	is_biz_account	prompt_variant_easy	opted_out	rides_peak_post
0	6	13	18.60	Austin	IOS	1	1	1	7
1	3	13	21.62	Chicago	IOS	0	0	0	4
2	4	13	19.51	Chicago	Andriod	0	1	0	6
3	1	9	16.94	Austin	Andriod	0	1	1	4
4	3	12	16.00	New York	Andriod	0	1	1	3
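
Before modeling, it can help to confirm that the instrument actually shifts the treatment (the relevance assumption). The sketch below is an illustration under stated assumptions: `first_stage_summary` is a hypothetical helper, and `demo` is a small stand-in frame built with the same 0.7 / 0.3 take-up rule as the simulation rather than the tutorial's exact `data` object.

```python
import numpy as np
import pandas as pd


def first_stage_summary(df, instrument="prompt_variant_easy", treatment="opted_out"):
    """Return the opt-out rate per instrument arm and the gap between arms."""
    rates = df.groupby(instrument)[treatment].mean()
    return rates, rates.loc[1] - rates.loc[0]


# Stand-in data: random prompt assignment, then take-up with probability
# 0.7 under the easy prompt and 0.3 otherwise.
rng = np.random.default_rng(32)
n = 50_000
demo = pd.DataFrame({"prompt_variant_easy": rng.integers(0, 2, size=n)})
take_up_prob = np.where(demo["prompt_variant_easy"] == 1, 0.7, 0.3)
demo["opted_out"] = (rng.random(n) < take_up_prob).astype(int)

rates, gap = first_stage_summary(demo)
print(rates)
print(f"first-stage gap: {gap:.3f}")
```

The same helper could be called on the tutorial's `data` frame; a gap near 0.4 would indicate a strong first stage.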

Exploratory Data Analysis

With our data generated, we can explore the distribution of the features and implement the model.

Code

# Set up the subplot grid
fig, axes = plt.subplots(2, 2, figsize=(8, 6))

# Plot 1: Distribution of Post-Peak Rides by Opt-Out Status
sns.histplot(data, x="rides_peak_post", hue="opted_out", multiple="stack", ax=axes[0, 0], palette="CMRmap")
axes[0, 0].set_title("Post-Peak Rides by Opt-Out Status")
axes[0, 0].set_xlabel("Rides During Peak (Post)")
axes[0, 0].set_ylabel("Count")

# Plot 2: Boxplot of Average Fare by Opt-Out Status
sns.boxplot(data=data, x="opted_out", y="avg_fare_pre", ax=axes[0, 1], palette="CMRmap")
axes[0, 1].set_title("Average Fare (Pre) by Opt-Out Status")
axes[0, 1].set_xlabel("Opted Out")
axes[0, 1].set_ylabel("Average Fare (Pre)")

# Plot 3: Opt-Out Rate by City
city_opt_out = data.groupby("user_city")["opted_out"].mean().reset_index()
sns.barplot(data=city_opt_out, x="user_city", y="opted_out", ax=axes[1, 0], palette="CMRmap")
axes[1, 0].set_title("Opt-Out Rate by City")
axes[1, 0].set_xlabel("City")
axes[1, 0].set_ylabel("Opt-Out Rate")
axes[1, 0].set_xticklabels(axes[1,0].get_xticklabels(), fontsize=6)

# Plot 4: Scatter Plot of Pre vs. Post Peak Rides colored by Opt-Out
sns.scatterplot(data=data, x="rides_peak_pre", y="rides_peak_post", hue="opted_out", alpha=0.7, ax=axes[1, 1], palette="CMRmap")
axes[1, 1].set_title("Pre vs Post Peak Rides")
axes[1, 1].set_xlabel("Pre Peak Rides")
axes[1, 1].set_ylabel("Post Peak Rides")

plt.tight_layout()
plt.show()

Key observations from the data:

Post-peak rides are higher among users who did not opt out of surge pricing alerts.
Average fares are similar across opt-out groups, suggesting fare levels alone may not drive opt-out behavior.
Opt-out rates vary slightly by city, with no city showing extreme deviation.
There is a positive relationship between pre- and post-peak rides, and users who did not opt out tend to take more post-period rides at any given pre-period level.

Causal Effect with EconML

Having explored key features of the data, we can now estimate the causal effect.

Because the model requires numeric inputs, we first one-hot encode the categorical features with pd.get_dummies.

Code

# One-hot encode categorical variables
data = pd.get_dummies(data, columns=["user_city", "user_device"])
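
To see what the encoding step produces, here is a small sketch on a toy frame. The three rows are invented for illustration (category values are spelled as in the simulated data), and `dtype=int` is an optional choice made here so the dummies print as 0/1; the tutorial's call does not set it.

```python
import pandas as pd

# Toy frame with the same column names as the tutorial data
toy = pd.DataFrame({
    "user_city": ["Austin", "Boston", "Austin"],
    "user_device": ["IOS", "Andriod", "IOS"],
    "avg_fare_pre": [18.6, 21.6, 19.5],
})

# Each listed categorical column is replaced by one indicator column per level
encoded = pd.get_dummies(toy, columns=["user_city", "user_device"], dtype=int)
print(encoded.columns.tolist())
print(encoded)
```

Each category level becomes its own indicator column named `<column>_<level>`, which is why the feature names shown later in the decision tree carry prefixes such as `user_city_`.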

Next, we define our key variables for analysis. The instrumental variable (Z) is prompt_variant_easy, the randomly assigned prompt that nudges riders toward opting out. The treatment variable (T) is opted_out, indicating whether a rider chose to opt out of surge pricing alerts. Our outcome variable (Y) is rides_peak_post, which captures ride activity during peak hours after the intervention.

We also identify several covariates that could influence both the treatment decision and the outcome. These include the customer’s city, mobile operating system, average fare, pre-period ride counts, and whether the account is associated with a corporate (business) profile. To account for these factors, we include them as control variables (X) in our analysis.

Code

# Define the instrument, treatment, outcome
Z = data['prompt_variant_easy']  # Instrument
T = data['opted_out']  # Treatment
Y = data['rides_peak_post']  # Outcome

# Define features excluding the instrument, treatment, and outcome
X = data.drop(columns=['prompt_variant_easy', 'opted_out', 'rides_peak_post'])

In our data-generating process, the treatment enters the outcome through the term

\[ \text{treatment\_effect} = 1.5 \times (1 - \text{opted\_out}) \]

This term adds 1.5 rides if the rider did not opt out and 0 if the rider opted out, so the true causal effect of opting out is -1.5 rides (before rounding and clipping at zero). This is the quantity we want to recover from the data.
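
Written in potential-outcome notation, the simulation implies a constant effect of opting out. The symbols \(Y_i(t)\), \(\varepsilon_i\), \(T\), and \(Z\) are introduced here for illustration, and the rounding and clipping steps are ignored:

\[ Y_i(t) = \text{rides\_peak\_pre}_i + \varepsilon_i + 1.5 \times (1 - t), \qquad Y_i(1) - Y_i(0) = -1.5 \]

Under the same simplification, the opt-out probabilities of 0.7 and 0.3 used in the simulation give the Wald ratio

\[ \frac{E[Y \mid Z = 1] - E[Y \mid Z = 0]}{E[T \mid Z = 1] - E[T \mid Z = 0]} = \frac{-1.5 \times (0.7 - 0.3)}{0.7 - 0.3} = -1.5 \]

so an instrumental-variable estimate close to -1.5 is what a well-behaved estimator should return on this data.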

To do that, we first define the nuisance models and the model for the treatment effect.

Code

# Define nuisance models
lgb_T_XZ_params = {
    'objective': 'binary',
    'metric': 'auc',
    'learning_rate': 0.1,
    'num_leaves': 30,
    'max_depth': 5,
    'verbosity' : -1
}

lgb_Y_X_params = {
    'metric': 'rmse',
    'learning_rate': 0.1,
    'num_leaves': 30,
    'max_depth': 5,
    'verbosity' : -1
}

model_T_XZ = lgb.LGBMClassifier(**lgb_T_XZ_params)
model_Y_X = lgb.LGBMRegressor(**lgb_Y_X_params)
flexible_model_effect = lgb.LGBMRegressor(**lgb_Y_X_params)

Having defined the nuisance models without assuming a linear functional form, we use LightGBM to flexibly estimate the nuisance functions. This non-parametric approach allows us to capture complex relationships between covariates and the treatment or outcome. With these models in place, we proceed to train the causal model using the IntentToTreatDRIV estimator from the EconML library.

Code

# Train EconML model using IntentToTreatDRIV
model = IntentToTreatDRIV(
    model_y_xw=model_Y_X,
    model_t_xwz=model_T_XZ,
    flexible_model_effect=flexible_model_effect
)

Code

# Fit the model
model.fit(Y, T, Z=Z, X=X)

<econml.iv.dr._dr.IntentToTreatDRIV at 0x14a149fd0>

Code

# Get the causal effect
causal_effect = model.effect(X)
print("Causal Effect of Opting Out on Rides During Peak Hours:", causal_effect.mean())

Causal Effect of Opting Out on Rides During Peak Hours: -1.4725080945552658

Code

model.effect(X[:8])

array([-1.37491246, -1.29706647, -1.45915156, -1.48616581, -1.60759469,
       -1.2136639 , -1.36240566, -1.32523829])
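
A quick way to read individual estimates like these is to summarize their spread and sign. The sketch below copies the eight values printed above into an array; the summary statistics chosen are illustrative, not part of the EconML output.

```python
import numpy as np

# Individual effect estimates copied from the output above
effects = np.array([-1.37491246, -1.29706647, -1.45915156, -1.48616581,
                    -1.60759469, -1.2136639, -1.36240566, -1.32523829])

summary = {
    "mean": effects.mean(),
    "std": effects.std(ddof=1),       # sample standard deviation
    "min": effects.min(),
    "max": effects.max(),
    "share_negative": (effects < 0).mean(),
}

for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

The same summary applied to `model.effect(X)` for the full sample would show how tightly the individual estimates cluster around the average effect.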

Our estimated causal effect is negative (about -1.47, close to the -1.5 built into the simulation), indicating that opting out of surge pricing alerts results in fewer rides during peak hours. For a business aiming to scale, this finding highlights the behavioral influence of price visibility. The alerts likely nudge users to engage more during peak periods, either by making them aware of dynamic pricing or by prompting time-sensitive decisions. This suggests a strategic opportunity: for users who opt out, alternative incentives such as loyalty points or discounts on future rides could be deployed to maintain or boost peak-hour engagement.

Heterogeneous Treatment Effect & Policy

One might ask whether this effect holds uniformly across all customers. To explore this, we run a heterogeneous treatment effect analysis to inform targeted policy recommendations. Specifically, we use the SingleTreeCateInterpreter to fit a shallow decision tree to the estimated treatment effects. On average, the treatment effect is -1.473, indicating an overall reduction in peak-hour rides from opting out. The decision tree then splits users into segments:

Users shaded in dark red experienced the most strongly negative effects, suggesting that opting out of surge pricing alerts reduced their peak-hour ride activity the most.
Users in green, typically those with high prior ride frequency, responded positively to the alerts, showing increased ride activity.

These findings suggest that the intervention is beneficial primarily for a narrow, high-usage segment, but could be counterproductive for the broader user base. A one-size-fits-all approach may therefore reduce overall engagement, highlighting the need for more targeted communication strategies. Keep in mind that the simulated effect is the same -1.5 for every rider, so the differences between leaves here illustrate the method rather than real segments.

Code

# Use SingleTreeCateInterpreter to interpret the treatment effects

intrp = SingleTreeCateInterpreter(max_depth=2, min_samples_leaf=10)
intrp.interpret(model, X)

# Plot the decision tree
plt.figure(figsize=(15, 8))
intrp.plot(feature_names=X.columns, fontsize=11)
plt.show()
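
Conceptually, the interpreter step amounts to fitting a shallow regression tree with the estimated effects as the target. The sketch below shows that idea with scikit-learn directly; it is not EconML's implementation, and the features and effect estimates are hypothetical stand-ins (a constant effect of -1.5 plus noise, mirroring the simulation).

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor, export_text

rng = np.random.default_rng(1)
n = 5_000

# Hypothetical features standing in for two of the tutorial's covariates
features = np.column_stack([
    rng.poisson(3, size=n),      # stand-in for rides_peak_pre
    rng.normal(15, 5, size=n),   # stand-in for avg_fare_pre
])

# Hypothetical per-user effect estimates: constant effect plus estimation noise
estimated_effects = -1.5 + rng.normal(0, 0.1, size=n)

# Shallow tree with the effect estimates as the regression target
tree = DecisionTreeRegressor(max_depth=2, min_samples_leaf=10, random_state=0)
tree.fit(features, estimated_effects)
print(export_text(tree, feature_names=["rides_peak_pre", "avg_fare_pre"]))

# Average estimated effect within each leaf (segment)
leaf_ids = tree.apply(features)
leaf_means = np.array([estimated_effects[leaf_ids == leaf].mean()
                       for leaf in np.unique(leaf_ids)])
print("leaf means:", np.round(leaf_means, 2))
```

Because the effect in this sketch is constant, every leaf mean stays close to -1.5. A tree will still produce splits on noise, so leaf differences are worth checking against their size and sample counts before acting on them.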

Conclusion

Our analysis estimates that opting out of surge pricing alerts lowers peak-hour rides by about 1.47 on average, with the decision tree suggesting heterogeneity across user segments. While high-frequency riders tend to respond positively, the broader user base shows little to no benefit, and in some cases, reduced engagement. These insights suggest that a one-size-fits-all alert strategy may be sub-optimal. Instead, targeted interventions informed by user behavior and engagement patterns are likely to yield greater impact and efficiency.
