Suppose your firm has introduced a new product to the market, e.g. a new yoghurt brand. Which features maximize customers' utility? After all, companies invest time and money in product research, trying to pinpoint exactly which products or services consumers desire. Once you know what your customers desire, your product development strategies can be aimed at meeting or exceeding what they value.
Conjoint analysis is a statistical marketing research technique that helps businesses measure what their consumers value most about their products and services. It helps identify the combination of features and price that produces maximum utility. By comparing different combinations you can also predict future demand for the product.
As an example, let's say that we are researching the attributes that are most influential to a consumer when purchasing a TV. See the attributes below:
The example I will use from here on is an internet service provider, e.g. Zuku, Safcom, etc. I want to understand which bundle options, defined by different features and prices, are most influential to customers. Each bundle option is a combination of levels of these features. The data is simulated at random and is not based on any market information. An online questionnaire I have scripted using XLSForm can be accessed here and filled in.
The first thing is to load the required packages and generate all possible combinations of product features, i.e. the full factorial design. The number of combinations equals the product of the number of levels of each attribute, in our case \(3 \times 3 \times 2 \times 2 \times 3 = 108\). We will then choose the subset of combinations needed for the research.
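The count can be checked without any add-on package. The following is a minimal sketch that uses base R's `expand.grid()` as a stand-in for `gen.factorial()`; the object names `attribute_levels` and `full_design` are illustrative only, and the level labels are the ones assigned further down.

```r
# Full factorial design: one row per combination of attribute levels.
attribute_levels <- list(
  Bundles       = c("10GBs", "12GBs", "15GBs"),
  Days_valid    = c("7days", "10days", "12days"),
  Free_Whatsapp = c("No", "Yes"),
  Free_Youtube  = c("No", "Yes"),
  Price         = c("1000", "1500", "2000")
)

full_design <- expand.grid(attribute_levels, stringsAsFactors = TRUE)

# Number of profiles = product of the number of levels per attribute.
n_profiles <- prod(lengths(attribute_levels))
n_profiles         # 108
nrow(full_design)  # 108
```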
library(conjoint)
library(dplyr)
library(AlgDesign)
library(ggplot2)
library(knitr)
library(kableExtra)
## Full factorial design: every combination of attribute levels
combined_attributes <- gen.factorial(c(3, 3, 2, 2, 3),
                                     varNames = c("Bundles", "Days_valid", "Free_Whatsapp", "Free_Youtube", "Price"),
                                     factors = "all")
## Label the levels of each attribute
combined_attributes <- combined_attributes %>%
  mutate(Bundles = factor(Bundles, levels = c(1, 2, 3), labels = c("10GBs", "12GBs", "15GBs")),
         Days_valid = factor(Days_valid, levels = c(1, 2, 3), labels = c("7days", "10days", "12days")),
         Free_Whatsapp = factor(Free_Whatsapp, levels = c(1, 2), labels = c("No", "Yes")),
         Free_Youtube = factor(Free_Youtube, levels = c(1, 2), labels = c("No", "Yes")),
         Price = factor(Price, levels = c(1, 2, 3), labels = c("1000", "1500", "2000")))
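## Level names, listed in the same order as the factor levels above
## (Free_Whatsapp and Free_Youtube are coded 1 = "No", 2 = "Yes")
levels <- c("10GBs","12GBs","15GBs","7days","10days","12days","No","Yes","No","Yes","1000","1500","2000")
levels <- data.frame(levels)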
NB: It is impractical to ask respondents to rate all 108 combinations, so we need to reduce them to a manageable subset. Rather than picking rows purely at random, optFederov() searches for a subset that is D-optimal for a main-effects model; setting the seed makes that search reproducible. I have reduced the design to 9 combinations, displayed in the table below.
## Reduce the number of combinations with a D-optimal design
set.seed(7654321)
few_combinations <- optFederov(~ ., data = combined_attributes, nTrials = 9)
few_combinations <- few_combinations$design
few_combinations %>%
  kable("html") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"),
                font_size = 12)
| Row | Bundles | Days_valid | Free_Whatsapp | Free_Youtube | Price |
|---|---|---|---|---|---|
| 4 | 10GBs | 10days | No | No | 1000 |
| 18 | 15GBs | 12days | Yes | No | 1000 |
| 28 | 10GBs | 7days | Yes | Yes | 1000 |
| 38 | 12GBs | 7days | No | No | 1500 |
| 49 | 10GBs | 10days | Yes | No | 1500 |
| 60 | 15GBs | 10days | No | Yes | 1500 |
| 84 | 15GBs | 7days | Yes | No | 2000 |
| 97 | 10GBs | 12days | No | Yes | 2000 |
| 104 | 12GBs | 10days | Yes | Yes | 2000 |
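Nine profiles is also the smallest number that can support a main-effects model here: there is one intercept plus (3-1) + (3-1) + (2-1) + (2-1) + (3-1) = 8 level effects, so 9 parameters in total. The sketch below re-enters the nine rows of the table above by hand (base R only; the object names are illustrative) and checks that the resulting design matrix has full column rank. It also tabulates how often each level appears, which shows that the design is not perfectly balanced.

```r
# The nine selected profiles, copied from the table above.
design <- data.frame(
  Bundles       = c("10GBs", "15GBs", "10GBs", "12GBs", "10GBs", "15GBs", "15GBs", "10GBs", "12GBs"),
  Days_valid    = c("10days", "12days", "7days", "7days", "10days", "10days", "7days", "12days", "10days"),
  Free_Whatsapp = c("No", "Yes", "Yes", "No", "Yes", "No", "Yes", "No", "Yes"),
  Free_Youtube  = c("No", "No", "Yes", "No", "No", "Yes", "No", "Yes", "Yes"),
  Price         = c("1000", "1000", "1000", "1500", "1500", "1500", "2000", "2000", "2000"),
  stringsAsFactors = TRUE
)

# Main-effects model matrix: intercept plus (levels - 1) columns per attribute.
X <- model.matrix(~ ., data = design)
n_parameters <- ncol(X)
design_rank  <- qr(X)$rank

n_parameters  # number of parameters to estimate
design_rank   # equals n_parameters when every main effect is estimable

# How often each level appears across the nine profiles.
level_counts <- lapply(design, table)
level_counts
```

Because the number of profiles equals the number of parameters, a single respondent's nine ratings leave no residual degrees of freedom; the pooled model over all respondents is what provides them.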
We now simulate the ratings that respondents give to each combination. In this example the data comes from 100 respondents; only the first 10 are displayed.
## Simulate the ratings
n <- 100  ## Number of respondents interviewed
profile_data <- data.frame(Respondent = factor(seq_len(n)))
## One column of random ratings (1 to 9) per combination
for (run in 1:9) {
  profile_data[, paste0("rating.combination", run)] <- sample(1:9, n, replace = TRUE)
}
head(profile_data, 10) %>%
  kable("html") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"),
                font_size = 10)
| Respondent | rating.combination1 | rating.combination2 | rating.combination3 | rating.combination4 | rating.combination5 | rating.combination6 | rating.combination7 | rating.combination8 | rating.combination9 |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 4 | 4 | 8 | 3 | 9 | 3 | 8 | 8 | 4 |
| 2 | 1 | 1 | 1 | 4 | 4 | 5 | 5 | 2 | 1 |
| 3 | 2 | 5 | 3 | 2 | 2 | 3 | 5 | 5 | 2 |
| 4 | 9 | 7 | 6 | 4 | 2 | 5 | 2 | 1 | 7 |
| 5 | 9 | 8 | 1 | 8 | 2 | 2 | 1 | 4 | 1 |
| 6 | 8 | 2 | 8 | 2 | 1 | 6 | 4 | 2 | 5 |
| 7 | 9 | 3 | 6 | 3 | 8 | 8 | 6 | 8 | 9 |
| 8 | 2 | 8 | 2 | 3 | 3 | 3 | 6 | 4 | 8 |
| 9 | 7 | 8 | 1 | 7 | 6 | 9 | 9 | 2 | 6 |
| 10 | 3 | 5 | 2 | 5 | 7 | 4 | 8 | 9 | 7 |
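Because every rating is drawn uniformly from 1 to 9 regardless of the profile, the expected rating is 5 for every combination, so no attribute should show a real effect in the analysis. A compact base R sketch of the same kind of simulation follows; the seed value and the object names `ratings` and `sim_data` are arbitrary choices for this illustration, so the numbers will not match the table above.

```r
set.seed(123)  # arbitrary seed, used only to make this sketch reproducible

n <- 100         # number of respondents
n_profiles <- 9  # number of combinations rated by each respondent

# Draw all ratings at once: one row per respondent, one column per combination.
ratings <- matrix(sample(1:9, n * n_profiles, replace = TRUE),
                  nrow = n, ncol = n_profiles)
colnames(ratings) <- paste0("rating.combination", seq_len(n_profiles))

sim_data <- data.frame(Respondent = factor(seq_len(n)), ratings)
dim(sim_data)

# Expected rating under a uniform draw from 1 to 9.
expected_rating <- mean(1:9)
expected_rating

# Observed column means scatter around the expected rating.
colMeans(ratings)
```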
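Next we run the conjoint analysis and summarize the importance of the various features. For ease of interpretation, the importance of each feature is also shown in a bar graph. From the bar graph below, titled 'Importance of different features', the type of bundle has the highest average importance (25.66%), although Price (24.60%) and Days_valid (24.45%) follow very closely. Free Youtube has the lowest importance (11.33%), so it does not feature prominently when customers make trade-offs. Keep in mind that the ratings were simulated at random: none of the coefficients is statistically significant and the overall F-test has a p-value of 0.6874, so these differences are noise rather than real preferences.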
fit=Conjoint(y=profile_data[,2:10],x=few_combinations,z=levels)
##
## Call:
## lm(formula = frml)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4,27 -2,07 0,06 2,10 4,26
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4,981000 0,094921 52,475 <2e-16 ***
## factor(x$Bundles)1 -0,005333 0,126562 -0,042 0,966
## factor(x$Bundles)2 0,152667 0,152682 1,000 0,318
## factor(x$Days_valid)1 0,092667 0,129907 0,713 0,476
## factor(x$Days_valid)2 -0,141333 0,126562 -1,117 0,264
## factor(x$Free_Whatsapp)1 -0,096000 0,094921 -1,011 0,312
## factor(x$Free_Youtube)1 -0,047000 0,093211 -0,504 0,614
## factor(x$Price)1 0,048667 0,137398 0,354 0,723
## factor(x$Price)2 0,186667 0,146467 1,274 0,203
## ---
## Signif. codes: 0 '***' 0,001 '**' 0,01 '*' 0,05 '.' 0,1 ' ' 1
##
## Residual standard error: 2,537 on 891 degrees of freedom
## Multiple R-squared: 0,00629, Adjusted R-squared: -0,002632
## F-statistic: 0,705 on 8 and 891 DF, p-value: 0,6874
## [1] "Part worths (utilities) of levels (model parameters for whole sample):"
## levnms utls
## 1 intercept 4,981
## 2 10GBs -0,0053
## 3 12GBs 0,1527
## 4 15GBs -0,1473
## 5 7days 0,0927
## 6 10days -0,1413
## 7 12days 0,0487
## 8 No -0,096
## 9 Yes 0,096
## 10 No -0,047
## 11 Yes 0,047
## 12 1000 0,0487
## 13 1500 0,1867
## 14 2000 -0,2353
## [1] "Average importance of factors (attributes):"
## [1] 25,66 24,45 13,96 11,33 24,60
## [1] Sum of average importance: 100
## [1] "Chart of average factors importance"
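The coefficient table lists only the first k-1 levels of each attribute, while the part-worth list shows all k. That pattern is what sum-to-zero (effects) coding produces: the utility of the last level is minus the sum of the others. The arithmetic below uses the coefficients printed above; that `Conjoint()` uses this coding is my reading of the output rather than something checked against the package source.

```r
# Coefficients copied from the printed lm() summary (decimal commas written as points).
coefs <- list(
  Bundles    = c(`10GBs` = -0.005333, `12GBs` = 0.152667),
  Days_valid = c(`7days` = 0.092667, `10days` = -0.141333),
  Price      = c(`1000` = 0.048667, `1500` = 0.186667)
)

# Name of the level that is omitted from the coefficient table for each attribute.
last_level <- c(Bundles = "15GBs", Days_valid = "12days", Price = "2000")

# Under sum-to-zero coding the omitted level equals minus the sum of the listed ones.
part_worths <- Map(function(b, nm) c(b, setNames(-sum(b), nm)), coefs, last_level)

lapply(part_worths, round, 4)
```

Rounded to four decimals, the recovered values agree with the 15GBs, 12days and 2000 rows of the part-worth list.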
## Bar chart of average attribute importance
Importance <- data.frame(
  Feature = c("Bundles", "Days_valid", "Free_Whatsapp", "Free_Youtube", "Price"),
  Importance = caImportance(y = profile_data[, 2:10], x = few_combinations)
)
ggplot(data = Importance, aes(x = reorder(Feature, -Importance), y = Importance)) +
  geom_col(fill = "skyblue2", width = 0.7) +
  ggtitle("Importance of different features") +
  xlab("") +
  ylab("Average importance (%)")
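Attribute importance is conventionally computed as the range of an attribute's part-worths divided by the sum of the ranges across all attributes. The sketch below applies that formula to the whole-sample part-worths printed earlier. It is not expected to reproduce the `caImportance()` figures, because, as far as I understand the package, those are averaged over per-respondent models rather than taken from the pooled model; treat this as an illustration of the formula only.

```r
# Whole-sample part-worths copied from the printed output (decimal commas as points).
part_worths <- list(
  Bundles       = c(`10GBs` = -0.0053, `12GBs` = 0.1527, `15GBs` = -0.1473),
  Days_valid    = c(`7days` = 0.0927, `10days` = -0.1413, `12days` = 0.0487),
  Free_Whatsapp = c(No = -0.096, Yes = 0.096),
  Free_Youtube  = c(No = -0.047, Yes = 0.047),
  Price         = c(`1000` = 0.0487, `1500` = 0.1867, `2000` = -0.2353)
)

# Range of utilities within each attribute (best level minus worst level).
ranges <- sapply(part_worths, function(u) diff(range(u)))

# Importance = share of each attribute's range in the total, as a percentage.
importance <- 100 * ranges / sum(ranges)
round(importance, 2)
```

With random ratings the ranking from the pooled model differs from the averaged one, which is another sign that neither should be over-interpreted here.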
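We now ask: what is the best combination, i.e. the one that maximizes utility? For that we turn to the part-worth utilities of each level. In each of the 5 bar plots below, the level with the highest positive bar is the preferred one. Here that gives the 12GBs bundle, 7 days of validity, free WhatsApp, free YouTube and a price of 1500, although with randomly simulated ratings these preferences are noise.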
util = caUtilities(y=profile_data[,2:10],x=few_combinations,z =levels)
##
## Call:
## lm(formula = frml)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4,27 -2,07 0,06 2,10 4,26
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4,981000 0,094921 52,475 <2e-16 ***
## factor(x$Bundles)1 -0,005333 0,126562 -0,042 0,966
## factor(x$Bundles)2 0,152667 0,152682 1,000 0,318
## factor(x$Days_valid)1 0,092667 0,129907 0,713 0,476
## factor(x$Days_valid)2 -0,141333 0,126562 -1,117 0,264
## factor(x$Free_Whatsapp)1 -0,096000 0,094921 -1,011 0,312
## factor(x$Free_Youtube)1 -0,047000 0,093211 -0,504 0,614
## factor(x$Price)1 0,048667 0,137398 0,354 0,723
## factor(x$Price)2 0,186667 0,146467 1,274 0,203
## ---
## Signif. codes: 0 '***' 0,001 '**' 0,01 '*' 0,05 '.' 0,1 ' ' 1
##
## Residual standard error: 2,537 on 891 degrees of freedom
## Multiple R-squared: 0,00629, Adjusted R-squared: -0,002632
## F-statistic: 0,705 on 8 and 891 DF, p-value: 0,6874
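## Split the utilities by attribute (element 1 is the intercept)
bundle_utility <- util[2:4]
valid_days <- util[5:7]
Free.Whatsapp <- util[8:9]
Free.Youtube <- util[10:11]
price <- util[12:14]
## Label each level; the order follows the factor levels (No before Yes)
names(bundle_utility) <- c("10GBs", "12GBs", "15GBs")
names(valid_days) <- c("7days", "10days", "12days")
names(Free.Whatsapp) <- c("No", "Yes")
names(Free.Youtube) <- c("No", "Yes")
names(price) <- c("1000", "1500", "2000")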
barplot(bundle_utility,col="skyblue2",main="Bundle type")
barplot(valid_days,col="brown",main="Valid days")
barplot(Free.Whatsapp,col="grey",main="Free WhatsApp")
barplot(Free.Youtube,col="orange",main="Free Youtube")
barplot(price,col="yellow",main="Price tags")
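Picking the highest part-worth per attribute can also be done programmatically instead of reading it off the plots. A minimal base R sketch follows, using the whole-sample utilities printed earlier and labelling the Yes/No levels in the same order as the bar plots (No first, then Yes); the object names are illustrative.

```r
# Intercept and whole-sample part-worths copied from the printed output.
intercept <- 4.981
part_worths <- list(
  Bundles       = c(`10GBs` = -0.0053, `12GBs` = 0.1527, `15GBs` = -0.1473),
  Days_valid    = c(`7days` = 0.0927, `10days` = -0.1413, `12days` = 0.0487),
  Free_Whatsapp = c(No = -0.096, Yes = 0.096),
  Free_Youtube  = c(No = -0.047, Yes = 0.047),
  Price         = c(`1000` = 0.0487, `1500` = 0.1867, `2000` = -0.2353)
)

# Level with the highest utility within each attribute.
best_levels <- sapply(part_worths, function(u) names(u)[which.max(u)])
best_levels

# Predicted rating of that profile: intercept plus the best part-worth of each attribute.
best_utility <- intercept + sum(sapply(part_worths, max))
best_utility
```

The predicted rating of the best profile is only about half a point above the intercept, which is consistent with ratings that carry no real preference signal.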