This analysis provides the client, a large mobile data provider, with insights into the effectiveness of a recent retargeting campaign. I address four key areas in my approach to the questions of interest.

1. Average incremental effect (ITT) of retargeting on spending

Key findings

The ITT of the retargeting campaign on spending was an average increase of 0.129 dollars (12.9 cents) per customer.

It was not unwise for the company to engage in retargeting, as the average incremental spending exceeded the average cost. However, the ITT effect is small: net of the expected cost of 10 cents per customer, the campaign yielded an increase of only 2.9 cents per customer on average. Moreover, the 95% confidence interval for the ITT (0.043 to 0.215 dollars) extends below the 0.10 dollar cost, so the data cannot rule out that the average incremental spending was less than the cost of retargeting.

Support for findings

The difference in mean spend between the treatment and control groups (ITT) is 0.129096 dollars. In other words, assigning a customer to retargeting increases that customer's spending by 0.129 dollars on average.

To obtain a confidence interval for the ITT and to test the null hypothesis that the two group means are equal, I used the t.test function:


    Welch Two Sample t-test

data:  spend by W
t = -2.9362, df = 599088, p-value = 0.003323
alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
95 percent confidence interval:
 -0.21527117 -0.04292076
sample estimates:
mean in group 0 mean in group 1 
       2.450313        2.579409 

Because t.test reports the difference as group 0 (control) minus group 1 (treatment), the interval is negative. With the sign flipped, it confirms a positive average effect of targeting on spending of between 0.043 and 0.215 dollars.
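As a cross-check on the interval above, the following minimal Python sketch (the report's own output came from R's t.test) backs out the standard error from the reported t statistic and rebuilds the 95% confidence interval. It assumes that, with roughly 600,000 degrees of freedom, the normal quantile is an adequate stand-in for the t quantile; the variable names are illustrative.

```python
from statistics import NormalDist

# Values copied from the t.test output above.
mean_control = 2.450313   # mean in group 0 (not targeted)
mean_treated = 2.579409   # mean in group 1 (targeted)
t_stat = -2.9362          # reported for group 0 minus group 1

# ITT expressed as treated minus control, so a positive value means more spending.
itt = mean_treated - mean_control

# Back out the standard error of the difference from the reported t statistic.
std_err = abs((mean_control - mean_treated) / t_stat)

# Assumption: with df near 600,000 the t quantile is effectively the normal quantile.
z = NormalDist().inv_cdf(0.975)

ci_low = itt - z * std_err
ci_high = itt + z * std_err
print(f"ITT = {itt:.6f}, 95% CI = ({ci_low:.4f}, {ci_high:.4f})")
```

Up to rounding of the reported t statistic, the bounds should agree with the interval printed by t.test once the sign is flipped.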

Taking into account the total number of ad exposures and the corresponding bidding cost, the expected cost of retargeting is 0.10 dollars per customer. Subtracting this from the ITT of the retargeting campaign on spending (+0.129 dollars) gives a net profit of 0.029 dollars per customer.
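The net-profit arithmetic can be written out as a short Python sketch. It treats the 0.10 dollar cost as a known constant and, as the report does, treats incremental spending as profit; both are assumptions of this illustration. Shifting the confidence bounds by the cost shows how much uncertainty remains around the net figure.

```python
# Figures taken from the report: ITT and its 95% confidence bounds, in dollars.
itt = 0.129096
ci_low, ci_high = 0.04292076, 0.21527117

# Expected retargeting cost per customer (assumed fixed and known).
cost = 0.10

# Net effect per customer, and the same shift applied to the interval bounds.
net = itt - cost
net_low = ci_low - cost
net_high = ci_high - cost

print(f"net = {net:.3f}, interval = ({net_low:.3f}, {net_high:.3f})")
print("interval for the net effect includes zero:", net_low < 0 < net_high)
```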

2. Linear regression model (Lasso), CATE variation

Key findings

The mean predicted customer-level incremental effect (CATE) of retargeting is 0.2167 dollars. I note that this is larger than the earlier ITT estimate of 0.129 and lies slightly outside its 95% confidence interval (upper bound 0.215).

The CATE varies widely across customers: the predicted effects range from -89.42 to 1482.56, and their standard deviation is high. Graphing the total profit when targeting the top \(n\) percent of customers confirms that the model is predictive of behavior out-of-sample (see section 3).

Support for findings

Despite its somewhat high mean CATE, I chose the Lasso for its better out-of-sample predictions. Notably, estimating the linear model with treatment interactions by Lasso rather than OLS yields larger profits.

I first created a matrix that includes all URL features and the targeting indicator, W, and stored the outcome, spending, in a column y. I then estimated the Lasso using cross-validation. The Lasso selected 128 columns, i.e., 128 coefficients are non-zero.

I then predicted the incremental targeting effects (CATE) in the validation sample.

     Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
 -89.4209   -0.0762    0.0000    0.2167    0.0966 1482.5587 
[1] 8.653854

The mean predicted CATE, 0.2167, is larger than my earlier estimate of the ITT, 0.129. Compared with the OLS predictions, the distribution of the predicted incremental effects based on the Lasso is visibly less dispersed and has fewer negative values. This reflects the Lasso's ability to limit overfitting.

Before creating the score, I added a tiny amount of noise to the predictions to increase the number of distinct predicted values.
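To make the prediction step concrete: in a linear model with treatment interactions, a customer's predicted CATE is the predicted spending with W = 1 minus the predicted spending with W = 0, which reduces to the treatment coefficient plus the interaction terms. The Python sketch below illustrates this with made-up coefficients; the function names, the three features, and all numbers are hypothetical and are not the fitted Lasso from this analysis.

```python
def predict_spend(x, w, intercept, beta, tau, gamma):
    """Linear model with treatment interactions:
    y = intercept + sum_j beta[j]*x[j] + w * (tau + sum_j gamma[j]*x[j])."""
    main = intercept + sum(b * xj for b, xj in zip(beta, x))
    treat = tau + sum(g * xj for g, xj in zip(gamma, x))
    return main + w * treat


def predict_cate(x, intercept, beta, tau, gamma):
    """Predicted incremental effect: prediction if targeted minus if not targeted."""
    return (predict_spend(x, 1, intercept, beta, tau, gamma)
            - predict_spend(x, 0, intercept, beta, tau, gamma))


# Hypothetical coefficients for three URL features (illustration only).
intercept = 2.0
beta = [0.5, 0.0, 1.2]     # main effects of the features
tau = 0.05                 # coefficient on the targeting indicator W
gamma = [0.3, 0.0, -0.1]   # feature-by-W interaction coefficients

x = [1, 0, 1]              # one customer's feature vector
cate = predict_cate(x, intercept, beta, tau, gamma)
print(round(cate, 4))
```

Features whose interaction coefficient is shrunk to zero contribute nothing to the CATE, which is one way a sparse model can produce many predictions at or near zero.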

Note: the score (group) is based on cate, the predicted incremental effect on spending. To validate this prediction, I calculated diff_spend, the difference in mean spending between customers who were targeted and customers who were not targeted, for each score level. diff_spend is measured in incremental dollars. This is akin to separately calculating an ATE for each group (score).

The lift chart reveals that the model can distinguish between customers with larger and smaller incremental effects.
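The scoring and validation steps described above can be sketched in Python as follows. This is an illustration under assumptions, not the code used for the report: the function name lift_table, the jitter size, and the toy data are hypothetical. It jitters the predictions to break ties, assigns each customer to a score group by rank, and computes diff_spend within each group.

```python
import random
from statistics import fmean


def lift_table(cate, spend, W, n_groups=5, seed=0):
    """Return {score: diff_spend}; score 1 holds the lowest predicted effects."""
    rng = random.Random(seed)
    n = len(cate)
    # Tiny jitter breaks ties among identical predictions (e.g. many exact zeros),
    # so customers can be split into equally sized score groups.
    jittered = [c + rng.uniform(-1e-9, 1e-9) for c in cate]
    order = sorted(range(n), key=lambda i: jittered[i])
    score = [0] * n
    for rank, i in enumerate(order):
        score[i] = rank * n_groups // n + 1
    table = {}
    for g in range(1, n_groups + 1):
        treated = [spend[i] for i in range(n) if score[i] == g and W[i] == 1]
        control = [spend[i] for i in range(n) if score[i] == g and W[i] == 0]
        # diff_spend: within-group difference in means, akin to a per-score ATE.
        table[g] = fmean(treated) - fmean(control) if treated and control else None
    return table


# Hypothetical toy data: the last four customers have a higher predicted effect.
cate = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0]
W = [0, 1, 0, 1, 0, 1, 0, 1]
spend = [2.0, 2.0, 2.0, 2.0, 2.0, 3.0, 2.0, 3.0]
table = lift_table(cate, spend, W, n_groups=2)
print(table)
```

A model that ranks customers well should show diff_spend rising with the score, which is the pattern a lift chart makes visible.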

3. Optimal profit-maximizing personalized targeting policy

Key findings

Under the optimal, profit-maximizing personalized targeting policy, total profits are 0.979 million dollars (per one million customers), and 15.18% of all customers are targeted.

Targeting none of the customers yields 0.893 million dollars, whereas targeting all customers results in a profit of 0.836 million dollars. The optimal targeting policy clearly improves over these two alternatives. Personalized targeting based on the incremental value of targeting works in this application.

Support for findings

I evaluated the profits based on an optimal targeting strategy that is predicted using the Lasso incremental targeting effects.

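I compared these profits to no targeting and blanket targeting, i.e., targeting none or all of the customers, respectively. Per one million customers, profits are 0.893 million dollars with no targeting and 0.836 million dollars with blanket targeting, against 0.979 million dollars under the optimal policy. The optimal targeting policy clearly improves over these two alternatives: personalized targeting based on the incremental value of targeting works in this application.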
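One common way to evaluate a targeting policy on data from a randomized experiment is sketched below in Python. The report does not spell out its evaluation formula, so this is an assumption: each customer whose random assignment happens to match the policy's decision is counted, reweighted by the inverse of the assignment probability. The names policy_profit and optimal_policy, the 50% treatment probability, and the toy data are all hypothetical.

```python
def optimal_policy(cate, cost):
    """Target a customer only if the predicted incremental spending exceeds the cost."""
    return [1 if c > cost else 0 for c in cate]


def policy_profit(spend, W, target, cost, p_treat):
    """Estimated total profit of a policy, using only customers whose random
    assignment W matches the policy decision, reweighted by assignment probability."""
    total = 0.0
    for y, w, d in zip(spend, W, target):
        if d == 1 and w == 1:
            total += (y - cost) / p_treat          # targeted under both
        elif d == 0 and w == 0:
            total += y / (1.0 - p_treat)           # untargeted under both
    return total


# Hypothetical toy data: two high-CATE and two low-CATE customers.
cate = [1.0, 1.0, 0.0, 0.0]
W = [0, 1, 0, 1]
spend = [2.0, 2.5, 2.0, 1.6]
cost, p_treat = 0.10, 0.5

profit_opt = policy_profit(spend, W, optimal_policy(cate, cost), cost, p_treat)
profit_none = policy_profit(spend, W, [0] * 4, cost, p_treat)
profit_all = policy_profit(spend, W, [1] * 4, cost, p_treat)
print(profit_opt, profit_none, profit_all)
```

In this toy example the personalized policy beats both targeting no one and targeting everyone, the same ordering as in the reported results.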

I then calculated the total profit when targeting the top \(n\) percent of customers, using cate_lasso as the score or ranking variable.

The initial steep increase in profits as \(n\) rises from 0 to 0.1 indicates that the out-of-sample prediction of the incremental targeting effects is fairly accurate. The profit curve declines markedly for \(n\) above 0.3, where the profit contribution of targeting an additional customer is less than the targeting cost.
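The profit curve can be sketched in Python as follows: rank customers by predicted CATE, target the top fraction, and evaluate each resulting policy. The evaluation by inverse-probability weighting is an assumption of this sketch rather than a description of the report's code, and the names, the 50% treatment probability, and the data are hypothetical.

```python
def policy_profit(spend, W, target, cost, p_treat):
    """Inverse-probability-weighted profit of a targeting policy on randomized data."""
    total = 0.0
    for y, w, d in zip(spend, W, target):
        if d == 1 and w == 1:
            total += (y - cost) / p_treat
        elif d == 0 and w == 0:
            total += y / (1.0 - p_treat)
    return total


def profit_curve(cate, spend, W, cost, p_treat, fractions):
    """Profit when targeting the top fraction of customers ranked by predicted CATE."""
    n = len(cate)
    order = sorted(range(n), key=lambda i: cate[i], reverse=True)
    curve = []
    for frac in fractions:
        chosen = set(order[:round(frac * n)])      # highest-ranked customers
        target = [1 if i in chosen else 0 for i in range(n)]
        curve.append((frac, policy_profit(spend, W, target, cost, p_treat)))
    return curve


# Hypothetical toy data: two high-CATE and two low-CATE customers.
cate = [1.0, 1.0, 0.0, 0.0]
W = [0, 1, 0, 1]
spend = [2.0, 2.5, 2.0, 1.6]

curve = profit_curve(cate, spend, W, cost=0.10, p_treat=0.5, fractions=[0.0, 0.5, 1.0])
for frac, profit in curve:
    print(f"top {frac:.0%}: {profit:.2f}")
```

In the toy data, profit rises while the added customers have incremental spending above the cost and falls once they do not, which is the shape described above.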

4. Spending differences based on ad exposure

Key findings

The difference in mean spend between customers in the treatment group who were exposed to an ad and customers who were not exposed is 1.524 dollars. By contrast, the ITT effect of the retargeting campaign on spending was an increase of 0.129 dollars on average, far smaller than this difference.

The crux of the issue is that being assigned to the treatment group does not necessarily entail that a customer is exposed to an ad. The 1.524 dollar difference therefore indicates only that customers who can be exposed spend more than customers who cannot be exposed to an ad; it is not a causal effect of exposure.

Support for findings

Being assigned to the treatment group does not necessarily entail that a customer is exposed to an ad. Rather, treated customers are entered into the ad auction and are exposed to an ad only if the bid is won. It is incorrect to read the spending difference as causal, because exposure is not randomly assigned: high-value customers can be exposed, but low-value customers do not see ads.

Mean spending for customers exposed to ads, mean spending for customers not exposed to ads, and the difference, respectively:

[1] 3.677777
[1] 2.153885
[1] 1.523892

The difference in mean spend between customers who were exposed to an ad and customers who were not exposed is 1.523892 dollars. This difference simply indicates that customers who can be exposed spend more than customers who cannot be exposed to an ad. Ad exposure is confounded with customer type and spending level.

The ITT of the retargeting campaign on spending was an increase of 0.129 dollars on average, far smaller than the difference calculated above. This discrepancy makes sense given the strong confounding.
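The confounding argument can be illustrated with a small simulation in Python. Everything in it is a made-up assumption for illustration (the share of high-value customers, the exposure probabilities, the baseline spending levels, and a true exposure effect of 0.5 dollars); none of it is estimated from the campaign data. Exposure is made more likely for high-value customers, and the naive exposed-versus-unexposed comparison is then set against the ITT.

```python
import random
from statistics import fmean

rng = random.Random(42)
rows = []
for _ in range(100_000):
    high_value = rng.random() < 0.5              # hypothetical customer type
    w = 1 if rng.random() < 0.5 else 0           # random assignment to treatment
    p_exposed = 0.5 if high_value else 0.05      # high-value customers win more auctions
    exposed = 1 if (w == 1 and rng.random() < p_exposed) else 0
    baseline = 4.0 if high_value else 2.0        # high-value customers spend more anyway
    spend = baseline + 0.5 * exposed + rng.gauss(0.0, 1.0)
    rows.append((w, exposed, spend))

# ITT: compare by random assignment, which is unrelated to customer type.
itt = (fmean(s for w, e, s in rows if w == 1)
       - fmean(s for w, e, s in rows if w == 0))

# Naive comparison: exposed vs. not exposed within the treatment group.
naive_diff = (fmean(s for w, e, s in rows if w == 1 and e == 1)
              - fmean(s for w, e, s in rows if w == 1 and e == 0))

print(f"ITT = {itt:.3f}, naive exposed-vs-unexposed difference = {naive_diff:.3f}")
```

Under these assumptions the naive difference comes out well above both the ITT and the true 0.5 dollar exposure effect, because the exposed group is dominated by customers who would have spent more regardless.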