load("Assignment_1.Rdata")
ls()
## [1] "CLV_data" "Conjoint_data" "lik_bgnbd" "pred_Y"
This question uses “Conjoint_data”:
head(Conjoint_data)
## # A tibble: 6 x 8
## respondent_id profile form noapply disinfect bio price choice
## <dbl> <dbl> <fct> <fct> <fct> <fct> <fct> <dbl>
## 1 1 1 Concentrate 200 times Yes No 35 cents 1
## 2 1 2 Powder 200 times Yes No 35 cents 0
## 3 1 3 Premix 100 times Yes Yes 49 cents 1
## 4 1 4 Powder 200 times Yes Yes 49 cents 0
## 5 1 5 Powder 50 times Yes No 79 cents 0
## 6 1 6 Concentrate 200 times No Yes 79 cents 0
We first have to set the baselines for different attributes, using “relevel(.)”. You may also use “mapply” or other packages for this purpose.
# to set baselines for all attributes
Conjoint_data$form <- relevel(Conjoint_data$form,ref = "Powder")
Conjoint_data$noapply <- relevel(Conjoint_data$noapply,ref = "200 times")
Conjoint_data$disinfect <- relevel(Conjoint_data$disinfect,ref = "No")
Conjoint_data$bio <- relevel(Conjoint_data$bio,ref = "No")
Conjoint_data$price <- relevel(Conjoint_data$price,ref = "35 cents")
# double check if the baselines are set as wanted
lapply(Conjoint_data[,3:7],contrasts)
## $form
## Concentrate Premix
## Powder 0 0
## Concentrate 1 0
## Premix 0 1
##
## $noapply
## 100 times 50 times
## 200 times 0 0
## 100 times 1 0
## 50 times 0 1
##
## $disinfect
## Yes
## No 0
## Yes 1
##
## $bio
## Yes
## No 0
## Yes 1
##
## $price
## 49 cents 79 cents
## 35 cents 0 0
## 49 cents 1 0
## 79 cents 0 1
We then run a logistic regression with “Conjoint_data” using “choice” as the binary outcome.
mdl <- glm(choice ~ . - respondent_id - profile, family = "binomial", data = Conjoint_data)
summary(mdl)
##
## Call:
## glm(formula = choice ~ . - respondent_id - profile, family = "binomial",
## data = Conjoint_data)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.1680 -0.7913 0.3276 0.8173 2.1374
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 1.1357 0.4662 2.436 0.014859 *
## formConcentrate 0.6482 0.3293 1.968 0.049035 *
## formPremix -0.2519 0.3347 -0.753 0.451638
## noapply100 times -0.2335 0.3457 -0.675 0.499416
## noapply50 times -0.5904 0.3601 -1.639 0.101111
## disinfectYes 1.1143 0.2998 3.717 0.000202 ***
## bioYes 0.2762 0.2862 0.965 0.334477
## price49 cents -1.3828 0.3609 -3.831 0.000128 ***
## price79 cents -3.0606 0.3810 -8.034 9.46e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 449.25 on 329 degrees of freedom
## Residual deviance: 348.95 on 321 degrees of freedom
## AIC: 366.95
##
## Number of Fisher Scoring iterations: 4
For this question, the coefficients and stanard errors should be consistent with the suggested solutions. From the results, we have 8 coefficients with {“formConcentrate”, “formPremix”, “noapply100 times”, “noapply50 times”, “disinfectYes”, bioYes“,”price49 cents“,”price79 cents"}, with each coefficient \(0.5\) point and each standard error also \(0.25\) point, so the total is \(8\times0.5+8\times0.25=6\).
To calculate the partworths, we must apply the 3 rules:
# create a list of partworths with each attribute as an element
# initialize all values to zeros
partworths <- lapply(mdl$xlevels,function(x){x <- rep(0,length(x))})
change_names <- function (x,y) {
names(x) <- y
return(x)
}
partworths <- mapply(change_names,partworths,mdl$xlevels)
# set partworths of significant levels to original coefficients
partworths$form["Concentrate"] <- mdl$coefficients["formConcentrate"]
partworths$disinfect["Yes"] <- mdl$coefficients["disinfectYes"]
partworths$price[c("49 cents","79 cents")] <- mdl$coefficients[c("price49 cents","price79 cents")]
partworths
## $form
## Powder Concentrate Premix
## 0.0000000 0.6481925 0.0000000
##
## $noapply
## 200 times 100 times 50 times
## 0 0 0
##
## $disinfect
## No Yes
## 0.000000 1.114257
##
## $bio
## No Yes
## 0 0
##
## $price
## 35 cents 49 cents 79 cents
## 0.000000 -1.382753 -3.060619
For this question, all partworths should be same as the suggested solutions. We have in total 13 partworths, with each 0.45 point.
The most preferred product should give consumers the highest utility. We focus on levels that have the maximum partworth for different attributes.
| Attributes | Levels |
|---|---|
| Form | Concentrate |
| No of Apps | Any level |
| Disinfect | Yes |
| Biodegradable | Any level |
| Price | 35 cents |
For this question, there are 5 attributes, the point of each attribute is 0.6, so in total \(5\times0.6=3\). Note for no. of apps and biodegradable, the answer should be “any level is fine” or “consumers do not care about this attribute.”
This question uses “CLV_data”, as well as two functions “lik_bgnbd” and “pred_Y” from Case II. You also need to use the library “hypergeo” for prediction of logins.
# to specify the constraint matrix
ui <- diag(4)
ci <- rep(0,4)
# para: the starting values; a vector with 4 elements corresponding to (r,alpha,a,b).
# you can change this to other values or test different values.
para <- rep(.01,4)
# to run the constrained optimization
results <- constrOptim(para, lik_bgnbd, NULL, ui, ci)
# to get the estimated parameters in the order of (r,alpha,a,b)
para <- results$par
names(para) <- c("r","alpha","a","b")
para
## r alpha a b
## 0.2324974 8.9573195 1.1935137 3.1910102
For this question, each parameter is \(2\) points, so for \({r,\alpha,a,b}\), in total \(2\times4=8\) points. Note: depending on the inital values of para, the estimated values could be somewhat different. For grading, please try to replicate the analysis with the initial values of “para” set in the assignment.
# To predict the future values of Gamer 1-5 in 30 days
Yhat <- rep(0,5)
names(Yhat) <- paste("Gamer",as.character(1:5),sep = "_")
for (i in 1:5) {
Yhat[i] <- pred_Y(para,CLV_data[i,],30)
}
Yhat
## Gamer_1 Gamer_2 Gamer_3 Gamer_4 Gamer_5
## 0.17780954 0.24341699 0.08669631 0.75265953 0.08925523
For this question, each parameter is \(1.2\) points, so for \(5\) gamers, in total \(1.2\times5=6\) points. Note: depending on the estimted values of paras, the predicted consumer values could be somewhat different. For grading, please try to recalculate with the estimated values of paras in Question 2a.
An exemplary answer:
Another source of consumer values is social influence, as gamers’ activities are influenced by other gamers. This is because most online games have some elements which incentivize or exploit social influence between gamers. To measure social influence, we can rely on the social network between gamers and apply social network analysis to identify influencers. For example, we may focus on the degree centrality of gamers (i.e., no. of friends), as previous research (e.g., Chen et al. 2017) finds that degree centrality is a good measure of people’s influence on social networks.
For grading, please consider the following aspects:
1. Clarity of the proposed source of consumer value (1 point): the proposed consumer value should be clearly explained.
2. Validity of the proposed source of consumer value (1.5 point): the proposed consumer value should indirectly influence the active levels of gamers.
3. Clarity of the proposed method (1 point): the proposed method should be clearly described and explained.
4. Validity of the proposed method (1.5 point): the proposed method can measure the proposed source of consumer value, if applied empirically.
5. Feasbility of the proposed method (1 point): the proposed method and the data needed for the method are feasible in practices.