This document includes the suggested solutions of the final project. For the rubrics, please see Canvas.
load("Final_Project.RData")Question 1
Question 1a
# to specify the constraint matrix with 4 parameters
ui <- diag(4) # an identity matrix (4x4)
ci <- rep(0,4) # an all-zero vector of length 4
# para: the starting values; a vector with 4 elements corresponding to (r,alpha,a,b).
# the results usually do not differ much across starting values.
# you can change this to other values or test different values.
para <- rep(.01,4)
# to run the constrained optimization
results <- constrOptim(para, lik_bgnbd, NULL, ui, ci)
# to get the estimated parameters in the order of (r,alpha,a,b)
para <- results$par
names(para) <- c("r","alpha","a","b")
para

##          r      alpha          a          b 
## 0.05731699 0.43525166 0.40168860 3.43605575
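It is good practice to check that the optimizer converged before using the estimates. Like optim, constrOptim reports a convergence code:

# 0 indicates successful convergence
results$convergence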
Question 1b
To do the prediction, we can use the function pred_Y and set the time t = 60.
# so we set t = 60
t <- 60
# a loop procedure
# you can also use "apply" by changing a function
# a vector to store the results
Yhat <- rep(0,2074)
for (i in 1:2074) {
  H <- CLV_data[i,] # get the i-th user
  Yhat[i] <- as.numeric(pred_Y(para,H,t)) # store the result for the i-th user
}

With Yhat, we first visualize the distribution with a histogram.
hist(Yhat, breaks = 50)

To get a closer look at the distribution of Yhat, we next compute the descriptive statistics.
desc_Yhat <- c(mean(Yhat),median(Yhat),sd(Yhat))
names(desc_Yhat) <- c("Mean","Median","Std. Dev.")
desc_Yhat

##      Mean    Median  Std. Dev. 
## 3.0010238 0.1037716 15.0097306
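As noted in the comments above, the loop can equivalently be replaced by sapply; a minimal sketch using the same para, CLV_data, and t:

Yhat_alt <- sapply(1:2074, function(i) as.numeric(pred_Y(para, CLV_data[i,], t)))
all.equal(Yhat, Yhat_alt) # should be TRUE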
Question 1c
This is an open question. Please see Canvas for the grading criteria. One possible answer could be:
- The distribution is right-skewed, meaning most clicks come from a few users while most users have (almost) no clicks.
- For the recommender system, this provides useful benchmarks: 1) whether this distribution is what the company expects, and 2) how effective the recommender system is, given that the mass of users generate very few clicks.
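A quick numerical check of this claim (output not shown):

mean(Yhat) > median(Yhat) # TRUE is consistent with right skew
mean(Yhat < 1)            # share of users with less than one predicted click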
Question 2
Question 2a
As Cards is a list, we use lapply to apply the function estimate_bass to each element of the list.
# estimate the bass parameters
bass.pars <- lapply(Cards, estimate_bass)
# transform the parameters to a data frame
bass.pars <- t(matrix(unlist(bass.pars), 3, 2538))
bass.pars <- as.data.frame(bass.pars)
colnames(bass.pars) <- c("p","q","M")We can next obtain the histgram and description stats of the bass
pars. We first build a function to produce the results and then we can
apply it to the three parameters in bass.pars.
# the first input argument is the vector x
# the second is the name used in the titles and labels
descr_bass <- function(x, name.x) {
  # hist() draws the histogram as a side effect
  hist(x, breaks = 50,
       main = paste("Histogram of", name.x))
  # return the descriptive statistics
  desc <- c(mean(x), median(x), sd(x))
  names(desc) <- paste(c("Mean","Median","Std. Dev."),
                       "of", name.x)
  desc
}

The descriptive statistics and histogram of p:
descr_bass(bass.pars$p,"p")## Mean of p Median of p Std. Dev. of p
## 0.02208286 0.01251350 0.02931045
The descriptive statistics and histogram of q:
descr_bass(bass.pars$q,"q")## Mean of q Median of q Std. Dev. of q
## 0.5398106 0.1442026 4.4683900
The descriptive statistics and histogram of M:
descr_bass(bass.pars$M,"M")## Mean of M Median of M Std. Dev. of M
## 10114.3572 253.8988 57296.7644
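The three calls above can also be issued in one step with Map, which loops over the columns of bass.pars together with their names:

stats_all <- Map(descr_bass, bass.pars, colnames(bass.pars))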
Question 2b
You can build the model in different ways; the objective is to beat the benchmark. I'll show here the benchmark model, a simple linear regression. Note that with a simple linear regression you cannot include talkId or contentId as IVs, as each would expand into too many dummy variables. For your own model, you can use these variables: for example, a LASSO-based regression can include contentId as an extra IV and shrink some of its levels to zero (see the sketch at the end of this part). For the specific grading criteria, please check Canvas.
First, let’s merge the two datasets bass.pars and
Cards_features:
Cards_features <- cbind(Cards_features, bass.pars[,1:2])
head(Cards_features)

## mlogId publishTime type contentId talkId gender registeredMonthCnt
## 1 GCGCHCMCHCPC 168 0 45 38 female 48
## 2 GCGCKCKCOCNC 168 0 45 361 male 48
## 3 GCHCJCMCKCNC 168 0 6 21 male 32
## 4 GCHCLCNCGCMC 168 0 45 346 male 37
## 5 GCHCLCPCOCKC 168 0 45 332 female 43
## 6 GCHCMCLCOCPC 168 0 45 38 female 30
## follows followeds creatorType level p q
## 1 5 2204 1 8 0.008308272 0.03773790
## 2 20 21 1 7 0.005352587 0.03415290
## 3 9 49 1 8 0.017085047 0.17263693
## 4 22 47 1 8 0.012667936 0.07551139
## 5 26 43 1 8 0.012376498 0.06145323
## 6 18 105 1 9 0.071706785 0.01644253
Next, we run a simple linear regression on the first 2,000 observations (the training sample):
benchmark_p <- lm(log(p) ~ publishTime + type + gender + registeredMonthCnt +
follows + followeds + creatorType + level, Cards_features[1:2000,])
benchmark_q <- lm(log(q) ~ publishTime + type + gender + registeredMonthCnt +
follows + followeds + creatorType + level, Cards_features[1:2000,])

To show the MSE of the models on the test sample:
# predict on the hold-out sample and compute the MSE
pred_p <- predict(benchmark_p, newdata = Cards_features[-c(1:2000),])
mse_benchmark_p <- mean((pred_p - log(Cards_features$p[-c(1:2000)]))^2)
pred_q <- predict(benchmark_q, newdata = Cards_features[-c(1:2000),])
mse_benchmark_q <- mean((pred_q - log(Cards_features$q[-c(1:2000)]))^2)
mse_benchmark <- c(mse_benchmark_p, mse_benchmark_q)
names(mse_benchmark) <- c("p", "q")
mse_benchmark

##         p        q 
## 18.963256 1.239371
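For reference, a minimal sketch of the LASSO alternative mentioned above, assuming the glmnet package is available (any penalized-regression package would work similarly):

library(glmnet)
# design matrix including contentId dummies; the L1 penalty can shrink
# the coefficients of uninformative levels to zero
X <- model.matrix(~ publishTime + type + gender + registeredMonthCnt +
  follows + followeds + creatorType + level + factor(contentId), Cards_features)[,-1]
lasso_p <- cv.glmnet(X[1:2000,], log(Cards_features$p[1:2000]), alpha = 1)
pred_lasso_p <- predict(lasso_p, newx = X[-c(1:2000),], s = "lambda.min")
mean((pred_lasso_p - log(Cards_features$p[-c(1:2000)]))^2) # compare with mse_benchmark["p"]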
Question 2c
This is an open question. Please find the specific grading criteria on Canvas.
Question 3
Question 3a
First, we run the Fuzzy RDD analysis following the three steps outlined in the assignment.
To begin the analysis, run a simple linear regression of first_screen on assigned_first_screen and match_scores and their interaction.
mdl_1 <- lm(first_screen ~ assigned_first_screen*match_scores, Impressions)
summary(mdl_1)

##
## Call:
## lm(formula = first_screen ~ assigned_first_screen * match_scores,
## data = Impressions)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.1553 -0.3740 0.1544 0.2635 1.1897
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.476421 0.003037 156.86 <2e-16 ***
## assigned_first_screen 0.148263 0.006889 21.52 <2e-16 ***
## match_scores 0.261324 0.005159 50.65 <2e-16 ***
## assigned_first_screen:match_scores -0.130881 0.006476 -20.21 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.4202 on 49996 degrees of freedom
## Multiple R-squared: 0.2059, Adjusted R-squared: 0.2058
## F-statistic: 4321 on 3 and 49996 DF, p-value: < 2.2e-16
# get the predicted value of first_screen
first_screen_hat <- predict(mdl_1, type = "response")With first_screen_hat, we run a second (logistic)
regression with first_screen_hat and
match_scores as the IVs.
# bind first_screen_hat to Impressions
Impressions$first_screen_hat <- first_screen_hat
mdl_2 <- glm(isClick ~ first_screen_hat + match_scores, family = "binomial", Impressions)
summary(mdl_2)

##
## Call:
## glm(formula = isClick ~ first_screen_hat + match_scores, family = "binomial",
## data = Impressions)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -0.7292 -0.3292 -0.2997 -0.2365 3.2539
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -4.0892 0.2686 -15.223 < 2e-16 ***
## first_screen_hat 2.6564 0.5572 4.768 1.86e-06 ***
## match_scores -1.0531 0.1345 -7.828 4.96e-15 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 17922 on 49999 degrees of freedom
## Residual deviance: 17532 on 49997 degrees of freedom
## AIC: 17538
##
## Number of Fisher Scoring iterations: 6
From the regression results, we conclude that the recommender system indeed has some value: cards placed on the first screen are more likely to be clicked than cards that are not.
To interpret the results, you need to transform the estimated coefficients into more interpretable quantities. You can use, for example, the "log odds ratio" or the "average marginal effect".
For a logistic regression, the coefficient is the log odds ratio. Here, being recommended to the first screen increases the log odds of a click by 2.6564; equivalently, the odds of a card being clicked are multiplied by exp(2.6564) ≈ 14.2.
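The implied odds ratio can be computed directly from the fitted model:

# exponentiate the coefficient of first_screen_hat to get the odds ratio
exp(coef(mdl_2)["first_screen_hat"])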
For the average marginal effect, the definition is: \[ AME = \frac{1}{N}\sum_i\dfrac{\partial P(isClick_i)}{\partial FirstScreen_i} = \frac{1}{N}\sum_i\dfrac{\exp(V_i)\times\beta_{firstscreen}}{(1+\exp(V_i))^2} \]
where \(V_i = Intercept + \beta_{firstscreen}\times FirstScreen_i + \beta_{matchingscores}\times MatchingScores_i\).
# get the linear index V_i from the model
# (for a glm, predict() returns the linear predictor by default)
V <- predict(mdl_2)
# compute the average marginal effect; 2.6564 is the estimated
# coefficient of first_screen_hat from the summary above
average_marginal_effect <- mean(exp(V)/((1+exp(V))^2))*2.6564
average_marginal_effect

## [1] 0.1099258
The average marginal effect is the expected effect of first_screen on the probability of being clicked. On average, being recommended to the first screen makes a card about 11 percentage points more likely to be clicked.
Lastly, the above are the numerical interpretations. The empirical interpretation is more subtle, due to the setup and assumptions of the model. There are two important points to emphasize.
- First, as stated in the question, we have already removed observations and focus on a local sample whose matching scores are bounded in a reasonable range. Therefore, the effect is a "local" effect, or a "sample average treatment effect" at best. This part is the same as the standard sharp RDD discussed in class.
- Second, essentially, we use assigned_first_screen as the instrument for first_screen to estimate the local effect. Due to the assignment issues, not all cards with high enough matching scores that should have been assigned to the first screen were actually assigned to it; this is why assigned_first_screen and first_screen are not the same. This is the well-known "non-compliance" problem in experimental studies. The identified local effect is then even more "local": assuming there are no "defiers", we can only identify the effect for "compliers" (cards that should have been assigned to the first screen and actually were).
In conclusion, the identified effect is at best the sample average treatment effect for compliers.
Question 3b
This is an open question. Here I include one set of exemplary answers to the questions. Please see Canvas for the specific grading criteria.
- One possible design is to use a "random" recommendation system as the control group, where cards are selected at random and recommended to people. The treatment group receives the default recommendations made by the company, so there is no disruption for treated users.
- The counterfactual is the average number of clicks the cards would have received, had they been recommended to people at random.
- One possible and common way to do the assignment is to run the experiment for a certain time interval and, for each visitor to the app, draw a Bernoulli random variable (1 or 0) to assign the visitor to treatment or control, as sketched below.
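A minimal sketch of this assignment step, where visitor_ids is a hypothetical vector of identifiers of visitors arriving during the experiment window (not part of the project data):

set.seed(1) # for reproducibility
# 1 = treatment (default recommender), 0 = control (random cards)
treated <- rbinom(length(visitor_ids), size = 1, prob = 0.5)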