Ref:

1 Simple Intro

Example (from ref 1)
有兩個mp3，分別為A: 16MB,運送時間1天；B: 64MB,運送時間1週，偏好哪一個?
- Preference for a combination of attributes reveals the ‘part-worth utilities’ of individual attributes.
- attr. 1: memory
- attr. 2: delivery
如果選擇A，表示更強調短運送時間；選擇B則更強調大記憶體
Relative importance (from ref 3)
1. Calculate the range of preference within each attribute for each individual.
2. Calculate the importance ratio of each attribute for each individual.
3. Calculate the average importance across respondents.
part-worth (from ref 3)
1. Calculate the average preference across individuals for each level.
2. Scale the partworth utility to set $\mu=0$
Use linear regression (from ref 1)
\[Ranking=\mu+\beta_1Attr_1+\beta_2Attr_2\]
- $\beta$表示part-worth，所有類別的加總須為0
- if $Attr_i$ 只有兩類，則\[relative\:importance=\pm\beta_i/\sum \pm\beta_i\] 假設$\beta_1=2$，$\beta_2=0.5$，則$\sum \pm\beta_i=4+1=5$
  memory的relative importance$=4/5=80\%$
  delivery的relative importance$=1/5=20\%$

2 Preprocess

2.1 Read data

# read in conjoint survey profiles with respondent ranks
conjoint.data.frame <- read.csv("mobile_services_ranking.csv")
head(conjoint.data.frame)
##           brand startup monthly  service       retail       apple       samsung
## 1        "AT&T"  "$100"  "$100"  "4G NO"  "Retail NO"  "Apple NO"  "Samsung NO"
## 2     "Verizon"  "$300"  "$100"  "4G NO" "Retail YES" "Apple YES" "Samsung YES"
## 3 "US Cellular"  "$400"  "$200"  "4G NO"  "Retail NO"  "Apple NO" "Samsung YES"
## 4     "Verizon"  "$400"  "$400" "4G YES" "Retail YES"  "Apple NO"  "Samsung NO"
## 5     "Verizon"  "$200"  "$300"  "4G NO"  "Retail NO"  "Apple NO" "Samsung YES"
## 6     "Verizon"  "$100"  "$200" "4G YES"  "Retail NO" "Apple YES"  "Samsung NO"
##        google ranking
## 1  "Nexus NO"      11
## 2  "Nexus NO"      12
## 3  "Nexus NO"       9
## 4  "Nexus NO"       2
## 5 "Nexus YES"       8
## 6 "Nexus YES"      13

2.2 Fit linear model

# main effects model specification
main.effects.model <- {ranking ~ brand + startup + monthly + service + 
  retail + apple + samsung + google}

# fit linear regression model using main effects only (no interaction terms)
main.effects.model.fit <- lm(main.effects.model, data=conjoint.data.frame)
print(summary(main.effects.model.fit)) 
## 
## Call:
## lm(formula = main.effects.model, data = conjoint.data.frame)
## 
## Residuals:
##      1      2      3      4      5      6      7      8      9     10     11 
## -0.125  0.125  0.125 -0.125 -0.125  0.125 -0.125  0.125  0.125 -0.125 -0.125 
##     12     13     14     15     16 
## -0.125  0.125  0.125  0.125 -0.125 
## 
## Coefficients:
##                        Estimate Std. Error t value Pr(>|t|)  
## (Intercept)           1.112e+01  4.841e-01  22.980   0.0277 *
## brand"T-Mobile"      -2.500e-01  3.536e-01  -0.707   0.6082  
## brand"US Cellular"   -6.629e-16  3.536e-01   0.000   1.0000  
## brand"Verizon"        2.500e-01  3.536e-01   0.707   0.6082  
## startup"$200"        -7.500e-01  3.536e-01  -2.121   0.2804  
## startup"$300"        -7.500e-01  3.536e-01  -2.121   0.2804  
## startup"$400"        -1.500e+00  3.536e-01  -4.243   0.1474  
## monthly"$200"        -3.000e+00  3.536e-01  -8.485   0.0747 .
## monthly"$300"        -6.250e+00  3.536e-01 -17.678   0.0360 *
## monthly"$400"        -1.075e+01  3.536e-01 -30.406   0.0209 *
## service"4G YES"       3.500e+00  2.500e-01  14.000   0.0454 *
## retail"Retail YES"   -5.000e-01  2.500e-01  -2.000   0.2952  
## apple"Apple YES"     -5.000e-01  2.500e-01  -2.000   0.2952  
## samsung"Samsung YES"  2.250e+00  2.500e-01   9.000   0.0704 .
## google"Nexus YES"     1.500e+00  2.500e-01   6.000   0.1051  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.5 on 1 degrees of freedom
## Multiple R-squared:  0.9993, Adjusted R-squared:  0.989 
## F-statistic: 97.07 on 14 and 1 DF,  p-value: 0.0794
sum(coef(main.effects.model.fit)[-1])
## [1] -16.75

2.3 Store result

# save key list elements of the fitted model as needed for conjoint measures
conjoint.results <- 
  main.effects.model.fit[c("contrasts","xlevels","coefficients")]

conjoint.results$attributes <- names(conjoint.results$contrasts)
conjoint.results
## $contrasts
## $contrasts$brand
## [1] "contr.treatment"
## 
## $contrasts$startup
## [1] "contr.treatment"
## 
## $contrasts$monthly
## [1] "contr.treatment"
## 
## $contrasts$service
## [1] "contr.treatment"
## 
## $contrasts$retail
## [1] "contr.treatment"
## 
## $contrasts$apple
## [1] "contr.treatment"
## 
## $contrasts$samsung
## [1] "contr.treatment"
## 
## $contrasts$google
## [1] "contr.treatment"
## 
## 
## $xlevels
## $xlevels$brand
## [1] "\"AT&T\""        "\"T-Mobile\""    "\"US Cellular\"" "\"Verizon\""    
## 
## $xlevels$startup
## [1] "\"$100\"" "\"$200\"" "\"$300\"" "\"$400\""
## 
## $xlevels$monthly
## [1] "\"$100\"" "\"$200\"" "\"$300\"" "\"$400\""
## 
## $xlevels$service
## [1] "\"4G NO\""  "\"4G YES\""
## 
## $xlevels$retail
## [1] "\"Retail NO\""  "\"Retail YES\""
## 
## $xlevels$apple
## [1] "\"Apple NO\""  "\"Apple YES\""
## 
## $xlevels$samsung
## [1] "\"Samsung NO\""  "\"Samsung YES\""
## 
## $xlevels$google
## [1] "\"Nexus NO\""  "\"Nexus YES\""
## 
## 
## $coefficients
##          (Intercept)      brand"T-Mobile"   brand"US Cellular" 
##         1.112500e+01        -2.500000e-01        -6.628732e-16 
##       brand"Verizon"        startup"$200"        startup"$300" 
##         2.500000e-01        -7.500000e-01        -7.500000e-01 
##        startup"$400"        monthly"$200"        monthly"$300" 
##        -1.500000e+00        -3.000000e+00        -6.250000e+00 
##        monthly"$400"      service"4G YES"   retail"Retail YES" 
##        -1.075000e+01         3.500000e+00        -5.000000e-01 
##     apple"Apple YES" samsung"Samsung YES"    google"Nexus YES" 
##        -5.000000e-01         2.250000e+00         1.500000e+00 
## 
## $attributes
## [1] "brand"   "startup" "monthly" "service" "retail"  "apple"   "samsung"
## [8] "google"

3 Compute part-worth

# compute and store part-worths in the conjoint.results list structure
part.worths <- conjoint.results$xlevels  # list of same structure as xlevels
end.index.for.coefficient <- 1  # initialize skipping the intercept
part.worth.vector <- NULL # used for accumulation of part worths
for(index.for.attribute in seq(along=conjoint.results$contrasts)) {
  nlevels <- length(unlist(conjoint.results$xlevels[index.for.attribute]))
  begin.index.for.coefficient <- end.index.for.coefficient + 1
  end.index.for.coefficient <- begin.index.for.coefficient + nlevels -2
  last.part.worth <- -sum(conjoint.results$coefficients[
    begin.index.for.coefficient:end.index.for.coefficient])
  part.worths[index.for.attribute] <- 
    list(as.numeric(c(conjoint.results$coefficients[
      begin.index.for.coefficient:end.index.for.coefficient],
      last.part.worth)))
  part.worth.vector <- 
    c(part.worth.vector,unlist(part.worths[index.for.attribute]))    
  } 
conjoint.results$part.worths <- part.worths
part.worths
## $brand
## [1] -2.500000e-01 -6.628732e-16  2.500000e-01 -2.140445e-15
## 
## $startup
## [1] -0.75 -0.75 -1.50  3.00
## 
## $monthly
## [1]  -3.00  -6.25 -10.75  20.00
## 
## $service
## [1]  3.5 -3.5
## 
## $retail
## [1] -0.5  0.5
## 
## $apple
## [1] -0.5  0.5
## 
## $samsung
## [1]  2.25 -2.25
## 
## $google
## [1]  1.5 -1.5

seq(along=conjoint.results$contrasts)

seq(c(3,4))
## [1] 1 2
seq_along(c(3,4))
## [1] 1 2
seq(along=c(3,4))
## [1] 1 2

seq(5)
## [1] 1 2 3 4 5
seq_along(5)
## [1] 1
seq(along=5)
## [1] 1

last.part.worth: $-\sum \hat{\beta}$，除了base類別以外的類別的係數，例如顏色分紅黃藍，base為紅色，則$\sum \hat{\beta}=$黃色加藍色的係數
part.worths: 所有類別的part-worth，加總為0

3.1 Standardized

標準化，讓每個類別的基準值位於0

# compute standardized part-worths
standardize <- function(x) {(x - mean(x)) / sd(x)}
conjoint.results$standardized.part.worths <- 
  lapply(conjoint.results$part.worths,standardize)
conjoint.results$standardized.part.worths
## $brand
## [1] -1.224745e+00 -3.247402e-15  1.224745e+00 -1.048600e-14
## 
## $startup
## [1] -0.3692745 -0.3692745 -0.7385489  1.4770979
## 
## $monthly
## [1] -0.2188703 -0.4559797 -0.7842851  1.4591351
## 
## $service
## [1]  0.7071068 -0.7071068
## 
## $retail
## [1] -0.7071068  0.7071068
## 
## $apple
## [1] -0.7071068  0.7071068
## 
## $samsung
## [1]  0.7071068 -0.7071068
## 
## $google
## [1]  0.7071068 -0.7071068

4 Compute relative importance

計算各特徵的全距並加總

# compute and store part-worth ranges for each attribute 
part.worth.ranges <- conjoint.results$contrasts
for(index.for.attribute in seq(along=conjoint.results$contrasts)) 
  part.worth.ranges[index.for.attribute] <- 
  dist(range(conjoint.results$part.worths[index.for.attribute]))
conjoint.results$part.worth.ranges <- part.worth.ranges

sum.part.worth.ranges <- sum(as.numeric(conjoint.results$part.worth.ranges))

計算各特徵的全距占所有特徵的比例

# compute and store importance values for each attribute 
attribute.importance <- conjoint.results$contrasts
for(index.for.attribute in seq(along=conjoint.results$contrasts)) 
  attribute.importance[index.for.attribute] <- 
  (as.numeric(part.worth.ranges[index.for.attribute])/sum.part.worth.ranges) * 100
conjoint.results$attribute.importance <- attribute.importance
attribute.importance
## $brand
## [1] 0.9569378
## 
## $startup
## [1] 8.61244
## 
## $monthly
## [1] 58.85167
## 
## $service
## [1] 13.39713
## 
## $retail
## [1] 1.913876
## 
## $apple
## [1] 1.913876
## 
## $samsung
## [1] 8.61244
## 
## $google
## [1] 5.741627

5 Output

5.1 事先設置

排序 & $R^2$

# data frame for ordering attribute names
attribute.name <- names(conjoint.results$contrasts)
attribute.importance <- as.numeric(attribute.importance)
temp.frame <- data.frame(attribute.name,attribute.importance)
conjoint.results$ordered.attributes <- 
  as.character(temp.frame[sort.list(
  temp.frame$attribute.importance,decreasing = TRUE),"attribute.name"])
# respondent internal consistency added to list structure
conjoint.results$internal.consistency <- summary(main.effects.model.fit)$r.squared

呈現位數

print.digits <- 2
# user-defined function for printing conjoint measures
if (print.digits == 2) 
  pretty.print <- function(x) {sprintf("%1.2f",round(x,digits = 2))} 
if (print.digits == 3) 
  pretty.print <- function(x) {sprintf("%1.3f",round(x,digits = 3))}

顯示欄位名稱 (圖形用的)

# user-defined function for plotting descriptive attribute names 
effect.name.map <- function(effect.name) { 
  if(effect.name=="brand") return("Mobile Service Provider")
  if(effect.name=="startup") return("Start-up Cost")
  if(effect.name=="monthly") return("Monthly Cost")
  if(effect.name=="service") return("Offers 4G Service")
  if(effect.name=="retail") return("Has Nearby Retail Store")
  if(effect.name=="apple") return("Sells Apple Products")
  if(effect.name=="samsung") return("Sells Samsung Products")
  if(effect.name=="google") return("Sells Google/Nexus Products")
}

options設定 (好像沒差?)

# # set up sum contrasts for effects coding as needed for conjoint analysis
# options(contrasts=c("contr.sum","contr.poly"))

5.2 Report

# report conjoint measures to console 
# use pretty.print to provide nicely formatted output
for(k in seq(along=conjoint.results$ordered.attributes)) {
  cat("\n","\n")
  cat(conjoint.results$ordered.attributes[k],"Levels: ",
  unlist(conjoint.results$xlevels[conjoint.results$ordered.attributes[k]]))
  
  cat("\n"," Part-Worths:  ")
  cat(pretty.print(unlist(conjoint.results$part.worths
    [conjoint.results$ordered.attributes[k]])))
    
  cat("\n"," Standardized Part-Worths:  ")
  cat(pretty.print(unlist(conjoint.results$standardized.part.worths
    [conjoint.results$ordered.attributes[k]])))  
    
  cat("\n"," Attribute Importance:  ")
  cat(pretty.print(unlist(conjoint.results$attribute.importance
    [conjoint.results$ordered.attributes[k]])))
  }

## 
##  
## monthly Levels:  "$100" "$200" "$300" "$400"
##   Part-Worths:  -3.00 -6.25 -10.75 20.00
##   Standardized Part-Worths:  -0.22 -0.46 -0.78 1.46
##   Attribute Importance:  58.85
##  
## service Levels:  "4G NO" "4G YES"
##   Part-Worths:  3.50 -3.50
##   Standardized Part-Worths:  0.71 -0.71
##   Attribute Importance:  13.40
##  
## startup Levels:  "$100" "$200" "$300" "$400"
##   Part-Worths:  -0.75 -0.75 -1.50 3.00
##   Standardized Part-Worths:  -0.37 -0.37 -0.74 1.48
##   Attribute Importance:  8.61
##  
## samsung Levels:  "Samsung NO" "Samsung YES"
##   Part-Worths:  2.25 -2.25
##   Standardized Part-Worths:  0.71 -0.71
##   Attribute Importance:  8.61
##  
## google Levels:  "Nexus NO" "Nexus YES"
##   Part-Worths:  1.50 -1.50
##   Standardized Part-Worths:  0.71 -0.71
##   Attribute Importance:  5.74
##  
## apple Levels:  "Apple NO" "Apple YES"
##   Part-Worths:  -0.50 0.50
##   Standardized Part-Worths:  -0.71 0.71
##   Attribute Importance:  1.91
##  
## retail Levels:  "Retail NO" "Retail YES"
##   Part-Worths:  -0.50 0.50
##   Standardized Part-Worths:  -0.71 0.71
##   Attribute Importance:  1.91
##  
## brand Levels:  "AT&T" "T-Mobile" "US Cellular" "Verizon"
##   Part-Worths:  -0.25 -0.00 0.25 -0.00
##   Standardized Part-Worths:  -1.22 -0.00 1.22 -0.00
##   Attribute Importance:  0.96

5.3 Plot

source("spine_chart.R")
spine.chart(conjoint.results)

6 Self-learning tips

6.1 Enter new ranking

Enter your own rankings for the product profiles and generate conjoint measures of attribute importance and level part-worths.

# get data
df <- conjoint.data.frame
set.seed(0920)
df$ranking <- sample(1:16)
# head(df,3); head(conjoint.data.frame,3)

# fit model
ft <- lm(ranking~., data=df)

# get `conjoint.results`
new.conjoint.results <- get_result(ft)
# report
show_report(new.conjoint.results)

## 
##  
## brand Levels:  "AT&T" "T-Mobile" "US Cellular" "Verizon"
##   Part-Worths:  -1.00 -2.00 -4.00 7.00
##   Standardized Part-Worths:  -0.21 -0.41 -0.83 1.45
##   Attribute Importance:  26.35
##  
## startup Levels:  "$100" "$200" "$300" "$400"
##   Part-Worths:  -1.75 1.25 5.50 -5.00
##   Standardized Part-Worths:  -0.39 0.28 1.23 -1.12
##   Attribute Importance:  25.15
##  
## monthly Levels:  "$100" "$200" "$300" "$400"
##   Part-Worths:  -2.75 -3.00 3.75 2.00
##   Standardized Part-Worths:  -0.81 -0.88 1.10 0.59
##   Attribute Importance:  16.17
##  
## retail Levels:  "Retail NO" "Retail YES"
##   Part-Worths:  -2.25 2.25
##   Standardized Part-Worths:  -0.71 0.71
##   Attribute Importance:  10.78
##  
## samsung Levels:  "Samsung NO" "Samsung YES"
##   Part-Worths:  1.50 -1.50
##   Standardized Part-Worths:  0.71 -0.71
##   Attribute Importance:  7.19
##  
## apple Levels:  "Apple NO" "Apple YES"
##   Part-Worths:  -1.25 1.25
##   Standardized Part-Worths:  -0.71 0.71
##   Attribute Importance:  5.99
##  
## service Levels:  "4G NO" "4G YES"
##   Part-Worths:  -1.00 1.00
##   Standardized Part-Worths:  -0.71 0.71
##   Attribute Importance:  4.79
##  
## google Levels:  "Nexus NO" "Nexus YES"
##   Part-Worths:  -0.75 0.75
##   Standardized Part-Worths:  -0.71 0.71
##   Attribute Importance:  3.59

# plot
par(mfrow = c(1, 2))
spine.chart(new.conjoint.results)
title('New ranking')
spine.chart(conjoint.results)
title('Origin')

6.1.1 Conclusion

隨機生成ranking後，最重要的變為電信商，但是重要性僅26.35%

6.2 Add interaction

Beyond a linear main-effects model. See if you can build a model with interaction effects for service provider attributes.

head(conjoint.data.frame,3)
##           brand startup monthly service       retail       apple       samsung
## 1        "AT&T"  "$100"  "$100" "4G NO"  "Retail NO"  "Apple NO"  "Samsung NO"
## 2     "Verizon"  "$300"  "$100" "4G NO" "Retail YES" "Apple YES" "Samsung YES"
## 3 "US Cellular"  "$400"  "$200" "4G NO"  "Retail NO"  "Apple NO" "Samsung YES"
##       google ranking
## 1 "Nexus NO"      11
## 2 "Nexus NO"      12
## 3 "Nexus NO"       9
# fit model
ft_interaction <- lm(ranking~. + samsung*google, data=conjoint.data.frame)
# ft_interaction
summary(ft_interaction)
## 
## Call:
## lm(formula = ranking ~ . + samsung * google, data = conjoint.data.frame)
## 
## Residuals:
## ALL 16 residuals are 0: no residual degrees of freedom!
## 
## Coefficients:
##                                          Estimate Std. Error t value Pr(>|t|)
## (Intercept)                             1.100e+01        NaN     NaN      NaN
## brand"T-Mobile"                        -2.500e-01        NaN     NaN      NaN
## brand"US Cellular"                     -6.459e-16        NaN     NaN      NaN
## brand"Verizon"                          2.500e-01        NaN     NaN      NaN
## startup"$200"                          -7.500e-01        NaN     NaN      NaN
## startup"$300"                          -7.500e-01        NaN     NaN      NaN
## startup"$400"                          -1.500e+00        NaN     NaN      NaN
## monthly"$200"                          -3.000e+00        NaN     NaN      NaN
## monthly"$300"                          -6.250e+00        NaN     NaN      NaN
## monthly"$400"                          -1.075e+01        NaN     NaN      NaN
## service"4G YES"                         3.500e+00        NaN     NaN      NaN
## retail"Retail YES"                     -5.000e-01        NaN     NaN      NaN
## apple"Apple YES"                       -5.000e-01        NaN     NaN      NaN
## samsung"Samsung YES"                    2.500e+00        NaN     NaN      NaN
## google"Nexus YES"                       1.750e+00        NaN     NaN      NaN
## samsung"Samsung YES":google"Nexus YES" -5.000e-01        NaN     NaN      NaN
## 
## Residual standard error: NaN on 0 degrees of freedom
## Multiple R-squared:      1,  Adjusted R-squared:    NaN 
## F-statistic:   NaN on 15 and 0 DF,  p-value: NA

# get `conjoint.results`
interaction.conjoint.results <- get_result(ft_interaction)
# report
show_report(interaction.conjoint.results)

## 
##  
## monthly Levels:  "$100" "$200" "$300" "$400"
##   Part-Worths:  -3.00 -6.25 -10.75 20.00
##   Standardized Part-Worths:  -0.22 -0.46 -0.78 1.46
##   Attribute Importance:  57.75
##  
## service Levels:  "4G NO" "4G YES"
##   Part-Worths:  3.50 -3.50
##   Standardized Part-Worths:  0.71 -0.71
##   Attribute Importance:  13.15
##  
## samsung Levels:  "Samsung NO" "Samsung YES"
##   Part-Worths:  2.50 -2.50
##   Standardized Part-Worths:  0.71 -0.71
##   Attribute Importance:  9.39
##  
## startup Levels:  "$100" "$200" "$300" "$400"
##   Part-Worths:  -0.75 -0.75 -1.50 3.00
##   Standardized Part-Worths:  -0.37 -0.37 -0.74 1.48
##   Attribute Importance:  8.45
##  
## google Levels:  "Nexus NO" "Nexus YES"
##   Part-Worths:  1.75 -1.75
##   Standardized Part-Worths:  0.71 -0.71
##   Attribute Importance:  6.57
##  
## apple Levels:  "Apple NO" "Apple YES"
##   Part-Worths:  -0.50 0.50
##   Standardized Part-Worths:  -0.71 0.71
##   Attribute Importance:  1.88
##  
## retail Levels:  "Retail NO" "Retail YES"
##   Part-Worths:  -0.50 0.50
##   Standardized Part-Worths:  -0.71 0.71
##   Attribute Importance:  1.88
##  
## brand Levels:  "AT&T" "T-Mobile" "US Cellular" "Verizon"
##   Part-Worths:  -0.25 -0.00 0.25 -0.00
##   Standardized Part-Worths:  -1.22 -0.00 1.22 -0.00
##   Attribute Importance:  0.94

# plot
par(mfrow = c(1, 2))
spine.chart(interaction.conjoint.results)
title('Add interaction')
spine.chart(conjoint.results)
title('Origin')

6.2.1 Conclusion

加入samsung和google的交互作用項
- samsung的係數$2.25\to2.5$
- google的係數$1.5\to1.75$
- samsung*google的係數為$-0.5$，表示當samsung和google都有代理時會降低喜好的排名

6.3 Rewrite spine_chart.R

See if you can rewrite the source script spine_chart.R to generalize the spine chart with more flexibility

Conjoint Analysis

2023-09-20

1 Simple Intro

2 Preprocess

2.1 Read data

2.2 Fit linear model

2.3 Store result

3 Compute part-worth

3.1 Standardized

4 Compute relative importance

5 Output

5.1 事先設置

5.2 Report

5.3 Plot

6 Self-learning tips

6.1 Enter new ranking

6.1.1 Conclusion

6.2 Add interaction

6.2.1 Conclusion

6.3 Rewrite spine_chart.R