#Perform conjoint analyses using R and RStudio (not MEXL)
#• Conjoint analysis for all respondents (see the code below)
#• Provide part-worth utilities of each level for each variable
## 1 intercept 63,8822
## 2 less than 2 miles 9,0987
## 3 within 2-5 miles -1,6801
## 4 within 5-10 miles -7,4186
## 5 very large assortment 4,5071
## 6 large assortment 2,2726
## 7 limited assortment -6,7797
## 8 office furniture 5,1926
## 9 no furniture -5,1926
## 10 no computers -5,9053
## 11 software only -4,377
## 12 software and computers 10,2822
#• Interpret the results
#Stores are most likely to be successful if they are less than 2 miles away, carry a very large assortment, sell office furniture, and offer both software and computers. The least preferred stores are 5-10 miles away, with a limited assortment, no furniture, and no computers. Stores that offer software only are also unpopular.
#• Relative importance of attributes
#The average relative importance of each attribute (out of 100) is 27.17 for location, 25.42 for office supplies, 17.65 for furniture, and 29.77 for computers.
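# A rough cross-check of the relative importances (a sketch, not the package's own calculation):
# an attribute's importance can be approximated as the range of its part-worths divided by
# the sum of the ranges over all attributes. The names `pw_all`, `pw_ranges` and
# `importance_from_aggregate` are illustrative only. The figures reported above appear to be
# averages of per-respondent importances, so this aggregate version is not expected to match
# them exactly.
pw_all <- list(
  location      = c(9.0987, -1.6801, -7.4186),
  office.supply = c(4.5071, 2.2726, -6.7797),
  furniture     = c(5.1926, -5.1926),
  computers     = c(-5.9053, -4.3770, 10.2822)
)
# Range (best minus worst level) within each attribute
pw_ranges <- sapply(pw_all, function(u) max(u) - min(u))
# Share of each range in the total, on a 0-100 scale
importance_from_aggregate <- round(100 * pw_ranges / sum(pw_ranges), 2)
print(importance_from_aggregate)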
#• What is the most preferred one?
#The most preferred store is less than 2 miles away, has a very large assortment, sells office furniture, and offers both software and computers.
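# A minimal sketch of how the predicted rating of this most preferred store can be
# obtained from the whole-sample part-worths: add the intercept to the utility of the
# chosen level of each attribute. The names `intercept_all`, `best_levels` and
# `best_total` are illustrative, and the additive form is an assumption based on the
# main-effects model printed by Conjoint().
intercept_all <- 63.8822
best_levels <- c(
  "less than 2 miles"      = 9.0987,
  "very large assortment"  = 4.5071,
  "office furniture"       = 5.1926,
  "software and computers" = 10.2822
)
# Predicted utility of the profile built from the best level of every attribute
best_total <- intercept_all + sum(best_levels)
print(best_total)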
#I had trouble getting the utility plots to knit, and I would like to know how to do this if possible. They would not all fit into the plots in RStudio and they would
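# One possible way to get the utility plots into a knitted document (a sketch, not tested
# against this exam's setup): draw the whole-sample part-worths with base R barplot(),
# one panel per attribute, and give the chunk more room with the knitr chunk options
# fig.width and fig.height, for example {r, fig.width=10, fig.height=8}.
# `utilities_by_attribute` is an illustrative name; the values are copied from the
# Conjoint() output.
utilities_by_attribute <- list(
  location      = c("less than 2 miles" = 9.0987, "within 2-5 miles" = -1.6801,
                    "within 5-10 miles" = -7.4186),
  office.supply = c("very large assortment" = 4.5071, "large assortment" = 2.2726,
                    "limited assortment" = -6.7797),
  furniture     = c("office furniture" = 5.1926, "no furniture" = -5.1926),
  computers     = c("no computers" = -5.9053, "software only" = -4.3770,
                    "software and computers" = 10.2822)
)
# 2 x 2 grid of panels, with a wide bottom margin for the rotated level labels
old_par <- par(mfrow = c(2, 2), mar = c(9, 4, 2, 1))
for (attribute in names(utilities_by_attribute)) {
  barplot(utilities_by_attribute[[attribute]], main = attribute,
          ylab = "part-worth utility", las = 2, cex.names = 0.8)
}
par(old_par)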
setwd("~/Desktop/Exam")
library(conjoint)
# Declare research variables: location, office supply, furniture, and computers
# Include levels of attributes
office<-expand.grid(
  location=c("less than 2 miles","within 2-5 miles","within 5-10 miles"),
  office.supply=c("very large assortment","large assortment","limited assortment"),
  furniture=c("office furniture","no furniture"),
  computers=c("no computers","software only","software and computers"))
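# A small sketch of why a fractional design is needed: the full factorial contains every
# combination of levels, i.e. 3 x 3 x 2 x 3 profiles, far more than the 16 bundles that
# respondents rated in the preference data. `full_design` and `n_full` are illustrative names.
full_design <- expand.grid(
  location      = c("less than 2 miles", "within 2-5 miles", "within 5-10 miles"),
  office.supply = c("very large assortment", "large assortment", "limited assortment"),
  furniture     = c("office furniture", "no furniture"),
  computers     = c("no computers", "software only", "software and computers")
)
# Number of profiles in the full factorial design
n_full <- nrow(full_design)
print(n_full)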
# Fractional Factorial Design: Very Important #########
officefactdesign<-caFactorialDesign(data=office,type="fractional")
officefactdesign
## encoding variable levels of the fractional design
prof=caEncodedDesign(design=officefactdesign)
prof
###########################
# data loading
# semi-colon separated data: using read.csv2
preferences=read.csv2("office_preferences.csv", header=TRUE)
profiles=read.csv2("office_profiles.csv", header=TRUE)
levelnames=read.csv2("office_levels.csv", header=TRUE)
simulations=read.csv2("office_simulations.csv", header=TRUE)
print(head(preferences))
## Bundle.1.Bundle.2.Bundle.3.Bundle.4.Bundle.5.Bundle.6.Bundle.7.Bundle.8.Bundle.9.Bundle.10.Bundle.11.Bundle.12.Bundle.13.Bundle.14.Bundle.15.Bundle.16
## 1 90,50,50,80,85,40,40,90,30,60,60,30,30,90,80,40
## 2 50,55,95,50,50,40,40,85,75,35,20,25,25,85,35,45
## 3 40,60,90,60,45,35,45,85,80,55,40,45,50,85,50,35
## 4 75,80,60,70,90,65,60,85,85,70,55,60,90,85,70,65
## 5 90,80,70,70,80,75,50,75,80,75,50,60,80,75,50,75
## 6 95,70,70,80,80,50,50,70,40,60,55,50,60,70,50,55
# comma-separated data: using read.csv (read.csv2 above kept each row as a single column, so this version is the one used below)
preferences2=read.csv("office_preferences.csv", header=TRUE)
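# A short illustration of why the read.csv2() output above shows one wide column:
# read.csv2() expects ";" as the field separator and "," as the decimal mark, so a
# comma-separated line is kept as a single field. The text below reuses the first three
# bundle ratings of the first two respondents; `sample_text`, `as_csv2` and `as_csv`
# are illustrative names.
sample_text <- "Bundle.1,Bundle.2,Bundle.3\n90,50,50\n50,55,95"
as_csv2 <- read.csv2(text = sample_text, header = TRUE)  # separator ";" -> one column
as_csv  <- read.csv(text = sample_text, header = TRUE)   # separator "," -> three columns
print(ncol(as_csv2))
print(ncol(as_csv))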
# Measurement of part-worths utilities (all respondents):
partutilities=caPartUtilities(y=preferences2,x=profiles,z=levelnames)
print(head(partutilities))
## intercept less than 2 miles within 2-5 miles within 5-10 miles
## [1,] 58.802 11.603 2.186 -13.789
## [2,] 53.789 13.576 -1.072 -12.504
## [3,] 58.542 5.400 -2.945 -2.455
## [4,] 71.680 0.865 2.987 -3.853
## [5,] 71.771 8.770 -3.578 -5.191
## [6,] 63.789 17.630 -4.315 -13.315
## very large assortment large assortment limited assortment office furniture
## [1,] 0.039 2.501 -2.541 21.216
## [2,] 0.108 4.379 -4.487 2.095
## [3,] -2.545 1.695 0.850 1.824
## [4,] 12.499 -1.607 -10.891 2.635
## [5,] 12.691 2.953 -15.645 1.622
## [6,] 5.919 2.622 -8.541 9.257
## no furniture no computers software only software and computers
## [1,] -21.216 -2.474 -2.552 5.026
## [2,] -2.095 -15.866 -14.518 30.384
## [3,] -1.824 -20.729 -6.042 26.771
## [4,] -2.635 -7.025 0.299 6.725
## [5,] -1.622 0.885 -4.271 3.385
## [6,] -9.257 0.384 0.482 -0.866
# Measurement of total utilities (all respondents):
totalutilities=caTotalUtilities(y=preferences2,x=profiles)
print(head(totalutilities))
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
## [1,] 89.187 49.139 51.674 82.154 79.692 39.799 39.721 89.732 28.862 66.178
## [2,] 53.702 55.131 91.167 44.673 40.402 39.136 40.484 89.575 69.683 33.240
## [3,] 42.492 57.770 89.738 53.074 48.834 34.738 49.426 85.887 78.488 53.564
## [4,] 80.654 68.602 65.744 75.995 90.100 63.400 70.724 82.420 84.416 69.154
## [5,] 95.739 77.601 66.660 68.497 78.235 70.410 65.253 76.153 81.035 66.883
## [6,] 96.979 75.266 62.755 71.835 75.132 53.224 53.321 70.487 46.271 62.835
## [,11] [,12] [,13] [,14] [,15] [,16]
## [1,] 61.214 23.746 37.259 89.732 77.111 39.799
## [2,] 23.026 29.051 36.213 89.575 35.807 39.136
## [3,] 38.032 49.916 45.186 85.887 52.230 34.738
## [4,] 52.546 63.884 84.830 82.420 66.710 63.400
## [5,] 53.442 63.640 74.992 76.153 49.899 70.410
## [6,] 51.574 44.321 56.618 70.487 60.671 53.224
# Using the Conjoint function for all respondents
Conjoint(y=preferences2,x=profiles,z=levelnames)
##
## Call:
## lm(formula = frml)
##
## Residuals:
## Min 1Q Median 3Q Max
## -37,140 -9,949 0,931 11,623 36,623
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 63,8822 1,0076 63,400 < 2e-16 ***
## factor(x$locations)1 9,0987 1,5412 5,904 9,28e-09 ***
## factor(x$locations)2 -1,6801 1,2184 -1,379 0,168883
## factor(x$office.supply)1 4,5071 1,3604 3,313 0,001032 **
## factor(x$office.supply)2 2,2726 1,2184 1,865 0,063083 .
## factor(x$furniture)1 5,1926 0,8802 5,899 9,50e-09 ***
## factor(x$computers)1 -5,9053 1,3292 -4,443 1,24e-05 ***
## factor(x$computers)2 -4,3770 1,1549 -3,790 0,000181 ***
## ---
## Signif. codes: 0 '***' 0,001 '**' 0,01 '*' 0,05 '.' 0,1 ' ' 1
##
## Residual standard error: 15,14 on 312 degrees of freedom
## Multiple R-squared: 0,3098, Adjusted R-squared: 0,2943
## F-statistic: 20,01 on 7 and 312 DF, p-value: < 2,2e-16
## [1] "Part worths (utilities) of levels (model parameters for whole sample):"
## levnms utls
## 1 intercept 63,8822
## 2 less than 2 miles 9,0987
## 3 within 2-5 miles -1,6801
## 4 within 5-10 miles -7,4186
## 5 very large assortment 4,5071
## 6 large assortment 2,2726
## 7 limited assortment -6,7797
## 8 office furniture 5,1926
## 9 no furniture -5,1926
## 10 no computers -5,9053
## 11 software only -4,377
## 12 software and computers 10,2822
## [1] "Average importance of factors (attributes):"
## [1] 27,17 25,42 17,65 29,77
## [1] Sum of average importance: 100,01
## [1] "Chart of average factors importance"
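# A sketch of how the coefficient table relates to the part-worth list above: the
# regression reports one coefficient fewer than the number of levels per attribute, and
# the part-worth of the remaining level equals minus the sum of the reported ones (the
# printed values are consistent with sum-to-zero coding; that reading is an assumption,
# not taken from the package documentation). `coefs_all` and `last_level` are
# illustrative names.
coefs_all <- list(
  locations     = c(9.0987, -1.6801),
  office.supply = c(4.5071, 2.2726),
  furniture     = 5.1926,
  computers     = c(-5.9053, -4.3770)
)
# Part-worth of the omitted (last) level of each attribute
last_level <- sapply(coefs_all, function(b) -sum(b))
print(round(last_level, 4))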
### Segmentation of respondents
### ---------------------------
### Segmentation using k-means method - the default division into 2 segments:
segments<-caSegmentation(preferences2,profiles)
print(segments$seg)
## K-means clustering with 2 clusters of sizes 6, 14
##
## Cluster means:
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
## 1 49.16067 55.97833 89.86100 47.13733 44.41050 40.43417 45.83117 84.65700
## 2 88.61014 69.84336 63.33207 73.06993 77.43086 58.92364 58.79393 77.93179
## [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16]
## 1 74.99150 41.50500 32.47117 40.19883 43.10433 84.65700 43.50100 40.43417
## 2 62.23286 67.28614 56.04250 53.00986 63.15471 77.93179 61.69679 58.92364
##
## Clustering vector:
## [1] 2 1 1 2 2 2 2 1 1 2 2 2 2 1 1 2 2 2 2 2
##
## Within cluster sum of squares by cluster:
## [1] 3748.626 28113.003
## (between_SS / total_SS = 49.5 %)
##
## Available components:
##
## [1] "cluster" "centers" "totss" "withinss" "tot.withinss"
## [6] "betweenss" "size" "iter" "ifault"
## Length Class Mode
## segm 9 kmeans list
## util 320 -none- numeric
## sclu 20 -none- numeric
## Loading required package: fpc
## Loading required package: broom
## Loading required package: ggplot2
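# A sketch of how the clustering vector printed above maps respondents to segments.
# The vector is copied from the k-means output; `cluster`, `segment_sizes` and `members`
# are illustrative names. Because k-means starts from random centers, cluster labels
# and memberships can change between runs unless set.seed() is called first.
cluster <- c(2, 1, 1, 2, 2, 2, 2, 1, 1, 2, 2, 2, 2, 1, 1, 2, 2, 2, 2, 2)
# Number of respondents per segment
segment_sizes <- table(cluster)
# Respondent row numbers belonging to each segment
members <- split(seq_along(cluster), cluster)
print(segment_sizes)
print(members)
# With the real data, the rows for one segment could then be selected directly, e.g.
# preferences2[cluster == 1, ], assuming one row per respondent in the same order.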
Cluster 1
# data loading
# semi-colon separated data: using read.csv2
preferences3=read.csv2("office_preferences11.csv", header=TRUE)
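# Note: this prints the full-sample object `preferences`, not `preferences3`, so the rows shown below are not limited to cluster 1
print(head(preferences))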
## Bundle.1.Bundle.2.Bundle.3.Bundle.4.Bundle.5.Bundle.6.Bundle.7.Bundle.8.Bundle.9.Bundle.10.Bundle.11.Bundle.12.Bundle.13.Bundle.14.Bundle.15.Bundle.16
## 1 90,50,50,80,85,40,40,90,30,60,60,30,30,90,80,40
## 2 50,55,95,50,50,40,40,85,75,35,20,25,25,85,35,45
## 3 40,60,90,60,45,35,45,85,80,55,40,45,50,85,50,35
## 4 75,80,60,70,90,65,60,85,85,70,55,60,90,85,70,65
## 5 90,80,70,70,80,75,50,75,80,75,50,60,80,75,50,75
## 6 95,70,70,80,80,50,50,70,40,60,55,50,60,70,50,55
# comma-separated data: using read.csv
preferences4=read.csv("office_preferences11.csv", header=TRUE)
# Measurement of part-worths utilities (cluster 1 respondents):
partutilities=caPartUtilities(y=preferences4,x=profiles,z=levelnames)
print(head(partutilities))
## intercept less than 2 miles within 2-5 miles within 5-10 miles
## [1,] 53.789 13.576 -1.072 -12.504
## [2,] 58.542 5.400 -2.945 -2.455
## [3,] 52.786 9.196 -2.903 -6.292
## [4,] 58.672 8.130 -1.832 -6.298
## [5,] 55.495 13.559 -2.019 -11.540
## [6,] 60.169 1.993 1.742 -3.735
## very large assortment large assortment limited assortment office furniture
## [1,] 0.108 4.379 -4.487 2.095
## [2,] -2.545 1.695 0.850 1.824
## [3,] 1.501 3.268 -4.768 0.946
## [4,] -0.577 1.051 -0.474 0.405
## [5,] 1.748 6.134 -7.882 2.162
## [6,] -3.870 -3.799 7.669 -3.514
## no furniture no computers software only software and computers
## [1,] -2.095 -15.866 -14.518 30.384
## [2,] -1.824 -20.729 -6.042 26.771
## [3,] -0.946 -12.878 -14.245 27.122
## [4,] -0.405 -18.477 -5.547 24.023
## [5,] -2.162 -14.023 -16.953 30.977
## [6,] 3.514 -14.655 -6.940 21.595
# Measurement of total utilities (cluster 1 respondents):
totalutilities=caTotalUtilities(y=preferences4,x=profiles)
print(head(totalutilities))
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
## [1,] 53.702 55.131 91.167 44.673 40.402 39.136 40.484 89.575 69.683 33.240
## [2,] 42.492 57.770 89.738 53.074 48.834 34.738 49.426 85.887 78.488 53.564
## [3,] 51.551 50.059 83.390 39.852 38.085 39.327 37.960 81.219 74.171 36.463
## [4,] 48.154 61.900 89.946 52.749 51.121 39.009 51.938 82.319 75.415 48.283
## [5,] 58.941 56.073 89.986 44.818 40.433 43.424 40.494 92.748 74.518 35.298
## [6,] 40.124 54.937 94.939 47.658 47.588 46.971 54.685 76.194 77.674 42.182
## [,11] [,12] [,13] [,14] [,15] [,16]
## [1,] 23.026 29.051 36.213 89.575 35.807 39.136
## [2,] 38.032 49.916 45.186 85.887 52.230 34.738
## [3,] 29.794 34.571 36.193 81.219 31.816 39.327
## [4,] 33.829 47.473 50.310 82.319 51.225 39.009
## [5,] 24.211 30.973 36.109 92.748 30.802 43.424
## [6,] 45.935 49.209 54.615 76.194 59.126 46.971
# Using the Conjoint function for cluster 1 respondents
Conjoint(y=preferences4,x=profiles,z=levelnames)
##
## Call:
## lm(formula = frml)
##
## Residuals:
## Min 1Q Median 3Q Max
## -23,1042 -5,5334 0,3429 5,0412 17,8625
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 56,5755 1,0284 55,015 < 2e-16 ***
## factor(x$locations)1 8,6422 1,5730 5,494 3,77e-07 ***
## factor(x$locations)2 -1,5049 1,2435 -1,210 0,2294
## factor(x$office.supply)1 -0,6058 1,3885 -0,436 0,6637
## factor(x$office.supply)2 2,1212 1,2435 1,706 0,0916 .
## factor(x$furniture)1 0,6532 0,8984 0,727 0,4691
## factor(x$computers)1 -16,1046 1,3566 -11,871 < 2e-16 ***
## factor(x$computers)2 -10,7075 1,1787 -9,084 2,75e-14 ***
## ---
## Signif. codes: 0 '***' 0,001 '**' 0,01 '*' 0,05 '.' 0,1 ' ' 1
##
## Residual standard error: 8,466 on 88 degrees of freedom
## Multiple R-squared: 0,833, Adjusted R-squared: 0,8198
## F-statistic: 62,73 on 7 and 88 DF, p-value: < 2,2e-16
## [1] "Part worths (utilities) of levels (model parameters for whole sample):"
## levnms utls
## 1 intercept 56,5755
## 2 less than 2 miles 8,6422
## 3 within 2-5 miles -1,5049
## 4 within 5-10 miles -7,1373
## 5 very large assortment -0,6058
## 6 large assortment 2,1212
## 7 limited assortment -1,5155
## 8 office furniture 0,6532
## 9 no furniture -0,6532
## 10 no computers -16,1046
## 11 software only -10,7075
## 12 software and computers 26,8121
## [1] "Average importance of factors (attributes):"
## [1] 21,34 11,03 5,19 62,43
## [1] Sum of average importance: 99,99
## [1] "Chart of average factors importance"
Cluster 2
# data loading
# semi-colon separated data: using read.csv2
preferences5=read.csv2("office_preferences22.csv", header=TRUE)
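# Note: this prints the full-sample object `preferences`, not `preferences5`, so the rows shown below are not limited to cluster 2
print(head(preferences))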
## Bundle.1.Bundle.2.Bundle.3.Bundle.4.Bundle.5.Bundle.6.Bundle.7.Bundle.8.Bundle.9.Bundle.10.Bundle.11.Bundle.12.Bundle.13.Bundle.14.Bundle.15.Bundle.16
## 1 90,50,50,80,85,40,40,90,30,60,60,30,30,90,80,40
## 2 50,55,95,50,50,40,40,85,75,35,20,25,25,85,35,45
## 3 40,60,90,60,45,35,45,85,80,55,40,45,50,85,50,35
## 4 75,80,60,70,90,65,60,85,85,70,55,60,90,85,70,65
## 5 90,80,70,70,80,75,50,75,80,75,50,60,80,75,50,75
## 6 95,70,70,80,80,50,50,70,40,60,55,50,60,70,50,55
# comma-separated data: using read.csv
preferences6=read.csv("office_preferences22.csv", header=TRUE)
# Measurement of part-worths utilities (cluster 2 respondents):
partutilities=caPartUtilities(y=preferences6,x=profiles,z=levelnames)
print(head(partutilities))
## intercept less than 2 miles within 2-5 miles within 5-10 miles
## [1,] 58.802 11.603 2.186 -13.789
## [2,] 71.680 0.865 2.987 -3.853
## [3,] 71.771 8.770 -3.578 -5.191
## [4,] 63.789 17.630 -4.315 -13.315
## [5,] 61.354 13.376 -2.013 -11.363
## [6,] 73.620 2.281 1.128 -3.409
## very large assortment large assortment limited assortment office furniture
## [1,] 0.039 2.501 -2.541 21.216
## [2,] 12.499 -1.607 -10.891 2.635
## [3,] 12.691 2.953 -15.645 1.622
## [4,] 5.919 2.622 -8.541 9.257
## [5,] -1.137 3.888 -2.751 19.189
## [6,] 8.618 -0.764 -7.854 2.703
## no furniture no computers software only software and computers
## [1,] -21.216 -2.474 -2.552 5.026
## [2,] -2.635 -7.025 0.299 6.725
## [3,] -1.622 0.885 -4.271 3.385
## [4,] -9.257 0.384 0.482 -0.866
## [5,] -19.189 0.677 -3.854 3.177
## [6,] -2.703 -4.961 -0.078 5.039
# Measurement of total utilities (cluster 2 respondents):
totalutilities=caTotalUtilities(y=preferences6,x=profiles)
print(head(totalutilities))
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
## [1,] 89.187 49.139 51.674 82.154 79.692 39.799 39.721 89.732 28.862 66.178
## [2,] 80.654 68.602 65.744 75.995 90.100 63.400 70.724 82.420 84.416 69.154
## [3,] 95.739 77.601 66.660 68.497 78.235 70.410 65.253 76.153 81.035 66.883
## [4,] 96.979 75.266 62.755 71.835 75.132 53.224 53.321 70.487 46.271 62.835
## [5,] 93.459 55.574 55.967 78.564 73.539 44.717 40.186 85.595 32.842 69.215
## [6,] 82.260 72.356 70.383 76.609 85.990 66.321 71.204 81.726 81.164 72.071
## [,11] [,12] [,13] [,14] [,15] [,16]
## [1,] 61.214 23.746 37.259 89.732 77.111 39.799
## [2,] 52.546 63.884 84.830 82.420 66.710 63.400
## [3,] 53.442 63.640 74.992 76.153 49.899 70.410
## [4,] 51.574 44.321 56.618 70.487 60.671 53.224
## [5,] 67.107 30.836 35.160 85.595 71.926 44.717
## [6,] 60.098 66.666 80.585 81.726 69.519 66.321
# Using the Conjoint function for cluster 2 respondents
Conjoint(y=preferences6,x=profiles,z=levelnames)
##
## Call:
## lm(formula = frml)
##
## Residuals:
## Min 1Q Median 3Q Max
## -38,155 -7,932 1,930 7,068 31,076
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 67,0136 0,9978 67,159 < 2e-16 ***
## factor(x$locations)1 9,2943 1,5263 6,089 5,12e-09 ***
## factor(x$locations)2 -1,7552 1,2066 -1,455 0,147
## factor(x$office.supply)1 6,6983 1,3473 4,972 1,35e-06 ***
## factor(x$office.supply)2 2,3374 1,2066 1,937 0,054 .
## factor(x$furniture)1 7,1380 0,8717 8,189 2,29e-14 ***
## factor(x$computers)1 -1,5341 1,3163 -1,165 0,245
## factor(x$computers)2 -1,6639 1,1437 -1,455 0,147
## ---
## Signif. codes: 0 '***' 0,001 '**' 0,01 '*' 0,05 '.' 0,1 ' ' 1
##
## Residual standard error: 12,55 on 216 degrees of freedom
## Multiple R-squared: 0,3741, Adjusted R-squared: 0,3538
## F-statistic: 18,44 on 7 and 216 DF, p-value: < 2,2e-16
## [1] "Part worths (utilities) of levels (model parameters for whole sample):"
## levnms utls
## 1 intercept 67,0136
## 2 less than 2 miles 9,2943
## 3 within 2-5 miles -1,7552
## 4 within 5-10 miles -7,5391
## 5 very large assortment 6,6983
## 6 large assortment 2,3374
## 7 limited assortment -9,0358
## 8 office furniture 7,138
## 9 no furniture -7,138
## 10 no computers -1,5341
## 11 software only -1,6639
## 12 software and computers 3,198
## [1] "Average importance of factors (attributes):"
## [1] 29,66 31,58 22,99 15,77
## [1] Sum of average importance: 100
## [1] "Chart of average factors importance"
#• Develop two segments - I did this based on the segment vectors given.
#o Run conjoint analysis for each segment
#o Provide the results and interpretation for each segment (basically, you should do the same things as the above)
#Cluster 1:
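#• Provide part-worth utilities of each level for each variable
## 1 intercept 56,5755
## 2 less than 2 miles 8,6422
## 3 within 2-5 miles -1,5049
## 4 within 5-10 miles -7,1373
## 5 very large assortment -0,6058
## 6 large assortment 2,1212
## 7 limited assortment -1,5155
## 8 office furniture 0,6532
## 9 no furniture -0,6532
## 10 no computers -16,1046
## 11 software only -10,7075
## 12 software and computers 26,8121
#• Interpret the results
#The most preferred levels in cluster 1 are less than 2 miles, large assortment, office furniture, and software and computers. The least preferred are within 5-10 miles, limited assortment, no furniture, and no computers.
#• Relative importance of attributes
#Location is at 21.34, office supplies at 11.03, furniture at 5.19, and computers at 62.43.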
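# A sketch of how the most preferred level of each attribute can be read off the
# cluster 1 part-worths programmatically (values copied from the cluster 1 Conjoint()
# output; `cluster1_pw` and `best_rows` are illustrative names). It shows, for example,
# that cluster 1 rates "large assortment" above "very large assortment".
cluster1_pw <- data.frame(
  attribute = c(rep("location", 3), rep("office.supply", 3),
                rep("furniture", 2), rep("computers", 3)),
  level = c("less than 2 miles", "within 2-5 miles", "within 5-10 miles",
            "very large assortment", "large assortment", "limited assortment",
            "office furniture", "no furniture",
            "no computers", "software only", "software and computers"),
  utility = c(8.6422, -1.5049, -7.1373,
              -0.6058, 2.1212, -1.5155,
              0.6532, -0.6532,
              -16.1046, -10.7075, 26.8121)
)
# Keep the row with the highest utility within each attribute
best_rows <- do.call(rbind, lapply(split(cluster1_pw, cluster1_pw$attribute),
                                   function(d) d[which.max(d$utility), ]))
print(best_rows)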
#Cluster 2:
#• Provide part-worth utilities of each level for each variable
## 2 less than 2 miles 9,2943
## 3 within 2-5 miles -1,7552
## 4 within 5-10 miles -7,5391
## 5 very large assortment 6,6983
## 6 large assortment 2,3374
## 7 limited assortment -9,0358
## 8 office furniture 7,138
## 9 no furniture -7,138
## 10 no computers -1,5341
## 11 software only -1,6639
## 12 software and computers 3,198
#• Interpret the results
#The most preferred store in cluster 2 is less than 2 miles away, with a very large assortment of office supplies, office furniture, and both software and computers. The least preferred levels are within 5-10 miles, limited assortment, no furniture, and software only (with no computers a close second).
#• Relative importance of attributes
#Location is at 29.66, office supplies at 31.58, furniture at 22.99, and computers at 15.77.
#• What is the most preferred one?
#The most preferred store is less than 2 miles away, with a very large assortment of office supplies, office furniture, and both software and computers.
#I would choose segment 2: it is the larger cluster (14 of 20 respondents), more than twice the size of segment 1 (6 respondents). Computers and software matter far less to segment 2 (importance 15.77) than to segment 1 (62.43), and I imagine that in this time period computers and software were not easy to stock and were probably more expensive to obtain than office supplies. I think computers and software should still be available, but to a lesser extent, which would save the company money. Segment 2 is more focused on office furniture and a very large assortment of office supplies, but since a large assortment is also relatively popular with this cluster, the company could likely find a middle ground that is not off-putting to segment 1. Both segments can probably be served; the store just would not need to stock as many computers and software titles (but it should stock some).
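# A sketch comparing the two segments numerically, using only figures reported above:
# the attribute importances of each cluster and the cluster sizes from the k-means
# output. Weighting the segment importances by segment size should roughly reproduce
# the whole-sample importances (27.17, 25.42, 17.65, 29.77), which is a useful
# consistency check. `importance`, `sizes`, `pooled` and `top_attribute` are
# illustrative names.
importance <- rbind(
  cluster1 = c(21.34, 11.03, 5.19, 62.43),
  cluster2 = c(29.66, 31.58, 22.99, 15.77)
)
colnames(importance) <- c("location", "office.supply", "furniture", "computers")
sizes <- c(cluster1 = 6, cluster2 = 14)
# Size-weighted average importance across the two segments
# (a length-2 vector recycles down the columns, so row i is multiplied by sizes[i])
pooled <- colSums(importance * sizes) / sum(sizes)
# Most important attribute within each segment
top_attribute <- apply(importance, 1, function(r) names(r)[which.max(r)])
print(round(pooled, 2))
print(top_attribute)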