Conjoint analysis is a survey-based technique for
identifying how customers value the individual attributes that make up a
product. Products and services are bundles of features that
customers consider jointly, so every purchase decision involves
trade-offs. Conjoint analysis helps companies and business
owners determine the relative importance of different features and their
monetary value, which in turn helps them identify the optimal price for
their products and services.
Basic Setup
In this example, we will see how to design a survey that collects the necessary information, how to analyze and quantify the importance of the various features of a product, and how to interpret the results. But first, let’s begin by setting up the R environment and loading the required libraries.
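options(scipen = 999)   # print numbers in fixed rather than scientific notation
options(digits = 3)
# install pacman only if it is missing, then let it install and load the rest
if (!requireNamespace('pacman', quietly = TRUE)) install.packages('pacman')
pacman::p_load(tidyverse, conjoint, DoE.base, knitr, kableExtra)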
Experiment Design
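I want to know which sneaker features and styles stand out most to fans.
### Experiment design for the conjoint questions
# Identify the number of questions (cards) required; factors and levels are defined below.
# Note: nlevels = c(7,2,2) yields 28 runs. It does not mirror the two 7-level
# factors defined below and is used here only to set the number of cards.
NumberOfQuestions = nrow(oa.design(nlevels = c(7,2,2)))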
creating full factorial with 28 runs ...
# create dummy data
NBAstar_branded_sneaker = expand.grid(
mj_nostalgia = c('fashionable','performance','sleek','style','hightop','reputation','none_of_the_above'),
lj_enhancements= c('performance_based',
'modern','technology','stability','durability','lowtop','none'))
# combinations to ask about in the survey
selectedComb = caFactorialDesign(
data = NBAstar_branded_sneaker,
type = 'fractional',
cards = NumberOfQuestions)
# print the selected combinations
kable(selectedComb) %>%
kable_styling(bootstrap_options = c('striped','hover','condensed','responsive'))
row | mj_nostalgia | lj_enhancements |
---|---|---|
4 | style | performance_based |
5 | hightop | performance_based |
6 | reputation | performance_based |
7 | none_of_the_above | performance_based |
8 | fashionable | modern |
9 | performance | modern |
12 | hightop | modern |
13 | reputation | modern |
15 | fashionable | technology |
17 | sleek | technology |
18 | style | technology |
21 | none_of_the_above | technology |
23 | performance | stability |
24 | sleek | stability |
27 | reputation | stability |
28 | none_of_the_above | stability |
29 | fashionable | durability |
31 | sleek | durability |
32 | style | durability |
34 | reputation | durability |
37 | performance | lowtop |
38 | sleek | lowtop |
39 | style | lowtop |
40 | hightop | lowtop |
43 | fashionable | none |
44 | performance | none |
47 | hightop | none |
49 | none_of_the_above | none |
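The leading number in each row of the table is the row index of that card in the full 49-row grid produced by expand.grid. As a quick sanity check, the base-R sketch below rebuilds the grid, picks out those 28 rows, and counts how often each level appears. The names `grid`, `asked`, `design` and `held_out` are made up for this illustration.

```r
# Rebuild the full 7 x 7 grid in the same order as expand.grid() above
grid <- expand.grid(
  mj_nostalgia = c('fashionable', 'performance', 'sleek', 'style', 'hightop',
                   'reputation', 'none_of_the_above'),
  lj_enhancements = c('performance_based', 'modern', 'technology', 'stability',
                      'durability', 'lowtop', 'none'))

# Row numbers copied from the table of selected cards
asked <- c(4, 5, 6, 7, 8, 9, 12, 13, 15, 17, 18, 21, 23, 24, 27, 28,
           29, 31, 32, 34, 37, 38, 39, 40, 43, 44, 47, 49)

design   <- grid[asked, ]    # the 28 cards shown to respondents
held_out <- grid[-asked, ]   # the combinations that were never asked

table(design$mj_nostalgia)      # how often each nostalgia level is shown
table(design$lj_enhancements)   # how often each enhancement level is shown
nrow(held_out)                  # number of combinations left to predict
```

In this design every level of both factors appears on exactly four cards, so no level is over- or under-represented in the survey, and 21 combinations remain to be predicted.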
Predicting Responses for other combinations
Using a fractional factorial design, we selected 28 of the 49 possible combinations to include in the customer survey, which saves time, money and effort. Survey takers were asked whether or not they liked each combination, and their responses were recorded for all 28 combinations. We will use these 28 responses to train a logistic regression model and then use it to predict the responses for the remaining combinations.
# add response column
selectedComb$Response =
c(0,0,1,1,1,1,1,0,0,0,0,1,1,1,1,1,0,0,0,0,1,1,1,1,1,0,0,0)
selectedComb$Response
[1] 0 0 1 1 1 1 1 0 0 0 0 1 1 1 1 1 0 0 0 0 1 1 1 1 1 0 0 0
# logistic regression
logisticmodel=glm(Response ~ factor(mj_nostalgia) + factor(lj_enhancements),
family=binomial(link='logit'), data=selectedComb)
logisticmodel
Call: glm(formula = Response ~ factor(mj_nostalgia) + factor(lj_enhancements),
family = binomial(link = "logit"), data = selectedComb)
Coefficients:
(Intercept) factor(mj_nostalgia)performance
37.17375406612627131 -18.66970866954126862
factor(mj_nostalgia)sleek factor(mj_nostalgia)style
-24.18860483890588853 -56.42542531667980654
factor(mj_nostalgia)hightop factor(mj_nostalgia)reputation
-37.17375406612629263 -37.17375406612627131
factor(mj_nostalgia)none_of_the_above factor(lj_enhancements)modern
0.00000000000000367 0.00000000000000819
factor(lj_enhancements)technology factor(lj_enhancements)stability
-37.17375406612627131 19.79288025159280195
factor(lj_enhancements)durability factor(lj_enhancements)lowtop
-56.83058479194890822 39.00537632897358975
factor(lj_enhancements)none
-37.17375406612627842
Degrees of Freedom: 27 Total (i.e. Null); 15 Residual
Null Deviance: 38.7
Residual Deviance: 11.1 AIC: 37.1
# predict
NBAstar_branded_sneaker$response = ifelse(predict(logisticmodel,NBAstar_branded_sneaker,type="response") > 0.5,1,0)
NBAstar_branded_sneaker$response
[1] 1 1 1 0 0 0 1 1 1 1 0 0 1 1 0 0 0 0 0 0 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 1 1 1 1 1 1 1
[43] 0 0 0 0 0 0 0
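The prediction step turns fitted probabilities into 0/1 responses with a 0.5 cut-off. Because the logit link maps a linear predictor of zero to a probability of exactly 0.5, this is the same as checking the sign of the linear predictor. A minimal base-R sketch with made-up log-odds values (`eta` is hypothetical and not taken from the model above):

```r
# Hypothetical linear-predictor values (log-odds), for illustration only
eta <- c(-37.2, -1.5, 0, 0.4, 37.2)

# Inverse logit: p = 1 / (1 + exp(-eta)); this is what type = "response" returns
p <- plogis(eta)

# Classify with the 0.5 probability cut-off ...
class_from_p <- ifelse(p > 0.5, 1, 0)

# ... or, equivalently, by the sign of the linear predictor
class_from_eta <- ifelse(eta > 0, 1, 0)

identical(class_from_p, class_from_eta)
```

A combination whose probability sits exactly on 0.5 is classified as 0 under either rule, since both comparisons are strict.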
Conjoint Analysis
Now that we have all the data we need, let’s run a conjoint analysis and summarize the importance of the various features for customers.
Call:
lm(formula = frml)
Residuals:
Min 1Q Median 3Q Max
-0,5102 -0,0816 -0,0816 0,2041 0,6327
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0,4898 0,0405 12,08 0,000000000000031 ***
factor(x$mj_nostalgia)1 0,0816 0,0993 0,82 0,4164
factor(x$mj_nostalgia)2 0,0816 0,0993 0,82 0,4164
factor(x$mj_nostalgia)3 0,0816 0,0993 0,82 0,4164
factor(x$mj_nostalgia)4 -0,2041 0,0993 -2,06 0,0471 *
factor(x$mj_nostalgia)5 -0,2041 0,0993 -2,06 0,0471 *
factor(x$mj_nostalgia)6 -0,0612 0,0993 -0,62 0,5413
factor(x$lj_enhancements)1 0,0816 0,0993 0,82 0,4164
factor(x$lj_enhancements)2 0,2245 0,0993 2,26 0,0299 *
factor(x$lj_enhancements)3 -0,3469 0,0993 -3,49 0,0013 **
factor(x$lj_enhancements)4 0,5102 0,0993 5,14 0,000009820851537 ***
factor(x$lj_enhancements)5 -0,4898 0,0993 -4,93 0,000018414331608 ***
factor(x$lj_enhancements)6 0,5102 0,0993 5,14 0,000009820851537 ***
---
Signif. codes: 0 ‘***’ 0,001 ‘**’ 0,01 ‘*’ 0,05 ‘.’ 0,1 ‘ ’ 1
Residual standard error: 0,284 on 36 degrees of freedom
Multiple R-squared: 0,763, Adjusted R-squared: 0,684
F-statistic: 9,68 on 12 and 36 DF, p-value: 0,0000000514
[1] "Part worths (utilities) of levels (model parameters for whole sample):"
[1] "Average importance of factors (attributes):"
[1] 30 70
[1] Sum of average importance: 100
[1] "Chart of average factors importance"
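The code that produced this output is not shown. The estimates can, however, be reproduced with a plain linear model that uses sum-to-zero (effects) coding, fitted to the 49 predicted responses from the previous step. The sketch below is a base-R reconstruction under that assumption, not necessarily the call used here; `grid` and `fit` are illustrative names.

```r
# Full 7 x 7 grid, in the same order as expand.grid() earlier
grid <- expand.grid(
  mj_nostalgia = c('fashionable', 'performance', 'sleek', 'style', 'hightop',
                   'reputation', 'none_of_the_above'),
  lj_enhancements = c('performance_based', 'modern', 'technology', 'stability',
                      'durability', 'lowtop', 'none'))

# Predicted 0/1 responses for all 49 combinations, copied from the output above
grid$response <- c(1, 1, 1, 0, 0, 0, 1,
                   1, 1, 1, 0, 0, 1, 1,
                   0, 0, 0, 0, 0, 0, 1,
                   rep(1, 7), rep(0, 7), rep(1, 7), rep(0, 7))

# Linear model with sum-to-zero contrasts, so coefficients are part-worths
fit <- lm(response ~ mj_nostalgia + lj_enhancements, data = grid,
          contrasts = list(mj_nostalgia = "contr.sum",
                           lj_enhancements = "contr.sum"))

round(coef(fit), 4)   # compare with the Estimate column above
```

With this coding the intercept is the overall share of liked combinations (24 of 49, about 0.49), and each coefficient is the deviation of one level's liking rate from that average. The seventh level of each factor, which is not printed, is the negative sum of the other six.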
Feature Importance
The feature importance plot
shows that buyers of sneakers (specifically Nike brand sneakers
associated with NBA stars Michael Jordan and LeBron James) care primarily
about enhancements, which account for about 70% of the average
importance, while nostalgia accounts for the remaining 30%.
# Feature Importance
Importance = data.frame(Feature = c('mj_nostalgia','lj_enhancements'),
Importance=caImportance(y=Survey,
x=NBAstar_branded_sneaker[,1:2]))
Importance
ggplot(Importance,aes(reorder(Feature,-Importance),Importance))+
geom_bar(stat='identity',width=.8,fill='purple3')+
ggtitle('Importance of different features')+
xlab('')+theme_classic()+coord_flip()
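The importance scores can also be checked by hand from the coefficients printed earlier: an attribute's importance is the range of its part-worths divided by the sum of the ranges over all attributes. A minimal sketch, assuming the rounded estimates from the regression output and recovering the unprinted seventh level of each factor from the sum-to-zero constraint:

```r
# Rounded part-worths for levels 1 to 6, copied from the regression output
mj <- c(0.0816, 0.0816, 0.0816, -0.2041, -0.2041, -0.0612)
lj <- c(0.0816, 0.2245, -0.3469, 0.5102, -0.4898, 0.5102)

# Effects coding: the seventh part-worth is minus the sum of the other six
mj <- c(mj, -sum(mj))
lj <- c(lj, -sum(lj))

# Range of part-worths within each attribute
ranges <- c(mj_nostalgia    = diff(range(mj)),
            lj_enhancements = diff(range(lj)))

# Importance = each range as a percentage of the total range
importance <- round(100 * ranges / sum(ranges))
importance
```

This gives 30 for mj_nostalgia and 70 for lj_enhancements, in line with the importance values reported in the conjoint output.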
Feature Utilities
For NBA player-endorsed
sneakers, we now know that enhancements are the key factor, although
nostalgia also carries some importance. But which aspect of enhancement
gives consumers the strongest reason to buy the sneakers? We can
answer this by comparing the levels of each feature one at a time.
#summarize utilities
util = data.frame(Utilities = t(data.frame(
caPartUtilities(y = Survey, x = NBAstar_branded_sneaker[,1:2], z = levels))))
util$levels = row.names(util)
util$levels
[1] "intercept" "fashionable" "performance" "sleek"
[5] "style" "hightop" "reputation" "none_of_the_above"
[9] "performance_based" "modern" "technology" "stability"
[13] "durability" "lowtop" "none"
util %>%
arrange(desc(Utilities))
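Part-worths are additive: under this model the predicted utility of a sneaker concept is the intercept plus the part-worth of its nostalgia level plus the part-worth of its enhancement level. The sketch below scores all 49 concepts with the rounded estimates printed earlier. It assumes that the numbered coefficients follow the level order used in expand.grid, and `profiles` and `best` are illustrative names.

```r
intercept <- 0.4898

# Rounded part-worths from the regression output (levels 1 to 6 of each factor)
mj <- c(fashionable = 0.0816, performance = 0.0816, sleek = 0.0816,
        style = -0.2041, hightop = -0.2041, reputation = -0.0612)
lj <- c(performance_based = 0.0816, modern = 0.2245, technology = -0.3469,
        stability = 0.5102, durability = -0.4898, lowtop = 0.5102)

# Seventh level of each factor from the sum-to-zero constraint
mj["none_of_the_above"] <- -sum(mj)
lj["none"] <- -sum(lj)

# Score every combination: intercept + nostalgia part-worth + enhancement part-worth
profiles <- expand.grid(mj_nostalgia = names(mj), lj_enhancements = names(lj),
                        stringsAsFactors = FALSE)
profiles$utility <- intercept +
  unname(mj[profiles$mj_nostalgia]) +
  unname(lj[profiles$lj_enhancements])

# Highest-scoring concept (first one in grid order if several are tied)
best <- profiles[which.max(profiles$utility), ]
best
```

Stability and lowtop have the same part-worth, so the top spot is tied between them; which.max simply reports the first of the tied concepts.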
# Utilities of all seven enhancement levels
ggplot(data = util[which(util$levels %in% c('performance_based',
'modern','technology','stability','durability','lowtop','none')),], aes(x = levels, y = Utilities, fill = Utilities)) +
geom_bar(stat='identity') +
ggtitle("Utilities of NBA Endorsed Sneaker 'Enhancements' aspects") + xlab("") + coord_flip() +
geom_hline(yintercept = 0, lty = 2, col = 'black', linewidth = .7) + scale_fill_gradient(high = 'pink2', low = 'black') +
# theme_classic() goes first so it does not overwrite the title and legend settings
theme_classic() + theme(plot.title = element_text(size = 14, vjust = 1), legend.position = 'bottom')
# Utilities of the four enhancement levels with positive part-worths
ggplot(data = util[which(util$levels %in% c('performance_based',
'modern','stability','lowtop')),], aes(x = levels, y = Utilities, fill = Utilities)) +
geom_bar(stat='identity') +
ggtitle("Utilities of NBA Endorsed Sneaker 'Enhancements' aspects") + xlab("") + coord_flip() +
geom_hline(yintercept = 0, lty = 2, col = 'black', linewidth = .7) + scale_fill_gradient(high = 'pink2', low = 'black') +
theme_classic() + theme(plot.title = element_text(size = 14, vjust = 1), legend.position = 'bottom')
Findings
After comparing the levels of each feature
in our utility summary, it is clear from the survey results that
respondents choose NBA-endorsed sneakers based more
on enhancements than on nostalgia. We also
found that the two aspects that contribute most to the decision to
purchase such sneakers are stability and a low-top design, each with a
part-worth utility of about 0.51, followed by modern at 0.22 and
performance_based at 0.08. Durability, none and
technology, by contrast, have negative part-worths, which means they
reduce the appeal of a sneaker for potential
customers.
Insights:
To take advantage of the popularity of these sneakers, their fan base and today’s stars who endorse them, such as LeBron James, the company should continue to focus heavily on stability and a low-cut design and, secondarily, on a modern look and a performance_based fit suited to game play.