This script demonstrates how to design a conjoint survey in which 40 respondents each answer 5 questions, choosing one of two alternatives in each question.
Based on: https://campus.datacamp.com/courses/marketing-analytics-in-r-choice-modeling/
respondent = Subject
question = Trial
alternative = Alt
Each alternative has 3 attributes: OP (factor), build (factor), and price (numeric).
# 40 respondents x 5 trials x 2 alternatives = 400 rows
df <- data.frame(Subject = rep(1:40, each = 10),
                 Trial = rep(1:5, each = 2, times = 40),
                 Alt = rep(1:2, times = 200))
df$OP <- NA
df$build <- NA
df$price <- NA
str(df)
## 'data.frame': 400 obs. of 6 variables:
## $ Subject: int 1 1 1 1 1 1 1 1 1 1 ...
## $ Trial : int 1 1 2 2 3 3 4 4 5 5 ...
## $ Alt : int 1 2 1 2 1 2 1 2 1 2 ...
## $ OP : logi NA NA NA NA NA NA ...
## $ build : logi NA NA NA NA NA NA ...
## $ price : logi NA NA NA NA NA NA ...
Create a data frame of all possible combinations of attribute levels:
attribs <- list(OP = c("SocInf", "Econ", "Politics"),
                build = c("Kant", "Grib", "Sed"),
                price = seq(from = 50, to = 200, by = 10))
all_comb <- expand.grid(attribs)
nrow(all_comb)
## [1] 144
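The row count can be cross-checked against the number of levels per attribute. A minimal sketch, assuming the same `attribs` list as above (`n_levels` is an illustrative name, not part of the original script):

```r
attribs <- list(OP = c("SocInf", "Econ", "Politics"),
                build = c("Kant", "Grib", "Sed"),
                price = seq(from = 50, to = 200, by = 10))

# Number of levels per attribute: OP has 3, build has 3, price has 16
n_levels <- lengths(attribs)
n_levels

# The full factorial is the product of the level counts: 3 * 3 * 16 = 144
prod(n_levels)
```

Note that `price` has 16 levels, not 15, because `seq()` includes both endpoints (50 and 200).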
For each respondent, sample 10 random profiles from the full set of combinations (5 questions with 2 alternatives each) and write their attributes into the data frame:
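for (i in 1:40) {  # loop over the 40 respondents (not over the 144 profiles)
# draw 5 * 2 = 10 distinct profiles; sample() draws without replacement by default
rand_rows <- sample(1:nrow(all_comb), size = 5*2)
rand_alts <- all_comb[rand_rows, ]
# columns 4:6 are OP, build and price
df[df$Subject == i, 4:6] <- rand_alts
}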
str(df)
## 'data.frame': 400 obs. of 6 variables:
## $ Subject: int 1 1 1 1 1 1 1 1 1 1 ...
## $ Trial : int 1 1 2 2 3 3 4 4 5 5 ...
## $ Alt : int 1 2 1 2 1 2 1 2 1 2 ...
## $ OP : int 2 1 1 1 2 2 2 1 2 1 ...
## $ build : int 2 3 2 3 1 2 1 3 3 1 ...
## $ price : num 160 200 190 90 90 80 110 70 170 200 ...
head(df)
## Subject Trial Alt OP build price
## 1 1 1 1 2 2 160
## 2 1 1 2 1 3 200
## 3 1 2 1 1 2 190
## 4 1 2 2 1 3 90
## 5 1 3 1 2 1 90
## 6 1 3 2 2 2 80
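# OP and build were stored as integer codes, so convert them back to factors
df$OP <- as.factor(df$OP)
df$build <- as.factor(df$build)
# relabel the codes; the order must match the order of the levels in attribs
levels(df$OP) <- c("SocInf", "Econ", "Politics")
levels(df$build) <- c("Kant", "Grib", "Sed")
df$Choice <- NA  # add a column for answers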
head(df, 11)
## Subject Trial Alt OP build price Choice
## 1 1 1 1 Econ Grib 160 NA
## 2 1 1 2 SocInf Sed 200 NA
## 3 1 2 1 SocInf Grib 190 NA
## 4 1 2 2 SocInf Sed 90 NA
## 5 1 3 1 Econ Kant 90 NA
## 6 1 3 2 Econ Grib 80 NA
## 7 1 4 1 Econ Kant 110 NA
## 8 1 4 2 SocInf Sed 70 NA
## 9 1 5 1 Econ Sed 170 NA
## 10 1 5 2 SocInf Kant 200 NA
## 11 2 1 1 SocInf Grib 140 NA
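The loop above writes into columns that were created as logical `NA`, which is why OP and build end up as integer codes and have to be relabeled afterwards. One possible alternative, sketched here under stated assumptions, is to build one small data frame per respondent and stack them, so the factors keep their labels throughout. The names `design`, `n_subj`, `n_trial`, `n_alt` and the seed value are illustrative choices, not part of the original script.

```r
set.seed(42)  # assumption: arbitrary seed, only to make the draw reproducible

attribs <- list(OP = c("SocInf", "Econ", "Politics"),
                build = c("Kant", "Grib", "Sed"),
                price = seq(from = 50, to = 200, by = 10))
all_comb <- expand.grid(attribs, stringsAsFactors = TRUE)

n_subj <- 40   # respondents
n_trial <- 5   # questions per respondent
n_alt <- 2     # alternatives per question

# For each respondent, draw 10 distinct profiles and attach the survey indices
per_subject <- lapply(seq_len(n_subj), function(s) {
  rows <- sample(nrow(all_comb), size = n_trial * n_alt)
  data.frame(Subject = s,
             Trial = rep(seq_len(n_trial), each = n_alt),
             Alt = rep(seq_len(n_alt), times = n_trial),
             all_comb[rows, ],
             row.names = NULL)
})

# Stack the 40 blocks into one 400-row design and add the answer column
design <- do.call(rbind, per_subject)
design$Choice <- NA

str(design)
```

Because the attribute columns come straight from `expand.grid()`, `OP` and `build` stay factors with the original level order, and no manual relabeling step is needed.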
Things to consider:
The recommended minimum number of product profiles in a survey is the product of the level counts of the two attributes with the most levels. Here, price has 16 levels (50 to 200 in steps of 10) and OP and build have 3 each, so the minimum is 16*3 = 48.
Since 48 profiles are too many for one respondent's set of trials (6-8 is reasonable), a larger sample might be required. Here, there are 144 possible profiles and each respondent picks 1 out of 2 alternatives per question. We need at least 144/2 = 72 respondents + 1 for the error term. In other words, 40 (as in the example above) is not enough.
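The arithmetic behind these considerations can be reproduced from the attribute list. Note that `price` as defined above has 16 levels (50 to 200 in steps of 10), so the product of the two largest level counts is 48. This is only a sketch of the rule of thumb stated here, not a formal sample-size calculation; the variable names are illustrative.

```r
attribs <- list(OP = c("SocInf", "Econ", "Politics"),
                build = c("Kant", "Grib", "Sed"),
                price = seq(from = 50, to = 200, by = 10))
n_levels <- lengths(attribs)

# Rule of thumb: product of the two largest level counts
top_two <- sort(n_levels, decreasing = TRUE)[1:2]
min_profiles <- prod(top_two)        # 16 * 3 = 48

# Full factorial and the respondent count derived from it in the text
full_factorial <- prod(n_levels)     # 3 * 3 * 16 = 144
alts_per_question <- 2
min_respondents <- full_factorial / alts_per_question + 1   # 72 + 1 = 73

min_profiles
min_respondents
min_respondents > 40   # TRUE: the 40 respondents used above fall short
```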
Further reading:
Struhl S. (2017) Artificial Intelligence Marketing and Predicting Consumer Choice. Kogan Page. Chapter 5. Plain language, critical remarks, easy to read, highly useful; based on Excel.
Chapman C. and McDonnell Feit E. (2017) Presentation slides: http://r-marketing.r-forge.r-project.org/slides/EARL-London-Sept2017/EARL-ConjointR-20170914.html#/ Overlaps with and precedes the DataCamp course.
Chapman C. and McDonnell Feit E. (2015) R for Marketing Research and Analytics. Springer. Chapter 13. https://link.springer.com/book/10.1007/978-3-319-14436-8 Maths and rigorous definitions; worth consulting if you need to design a conjoint survey; based on R.