Case Background:
CMMi has been a pioneer in the multimedia solutions industry over the past 12 years. They are now a nationally recognized leader servicing medium to large businesses as a “go-to” source for high-end multimedia applications. Although the multimedia solutions market is booming, CMMi has not seen much growth over the past few years. They are able to acquire new customers; however, many of them simply do not return.
Problem & Implications
The fact that CMMi is lagging behind overall market growth implies that they are steadily losing market share to competitors that are outpacing them.
Solution Approach
The client is CMMi's Marketing Manager, Noreen Kopf. She would like to develop a strategy to improve customer satisfaction and retention in order to reallocate the budget of customer satisfaction programs accordingly.
The approach to solving this problem is two-fold. First, we must leverage the provided survey data in order to evaluate CMMi's performance relative to their top competitor, Knights & Schoening. This can be done by computing an average performance score for each identified driver of customer satisfaction. Likewise, we will also compute an average score of the “overall satisfaction” attribute for both brands: CMMi vs. Knights & Schoening. A visual representation of the results will be provided below.
The second part of the analysis will focus on determining the relative importance of each customer satisfaction driver via linear regression. The results of this analysis, combined with CMMi's performance scores, will provide Ms. Kopf with the insights necessary to focus her customer satisfaction and retention efforts on the appropriate drivers. A corresponding quadrant analysis with budget allocation recommendations will be provided below.
#brand 0 = CMMi
#brand 1 = Knights & Schoening
dir()
## [1] "custsatcase.csv" "figure" "hw1.html"
## [4] "hw1.md" "hw1.Rmd" "marketing_hw1.Rproj"
## [7] "q.jpg"
ratings = read.csv("custsatcase.csv", header=TRUE, sep=",")
head(ratings)
## customer.id brand overall_satisfaction information longevity performance
## 1 1 1 1 5 1 1
## 2 3 1 2 5 1 2
## 3 5 1 1 3 1 1
## 4 10 1 5 5 2 1
## 5 15 1 1 5 1 1
## 6 20 1 1 5 1 3
## features price ease_of_use reliability documentation operating_costs
## 1 4 1 1 3 2 2
## 2 5 2 1 2 4 3
## 3 4 1 1 3 5 3
## 4 2 5 3 1 5 3
## 5 5 1 4 3 4 5
## 6 4 3 3 1 5 3
## delivery.time_orders response.time_support expertise_support brand_dummy
## 1 5 5 1 0
## 2 4 5 3 0
## 3 3 3 1 0
## 4 3 1 1 0
## 5 4 1 4 0
## 6 2 5 3 0
#removing unnecessary columns
library(dplyr)
ratings = select(ratings, -c(customer.id, brand))
#head(ratings)
#calculating mean for each column
perf = aggregate(x=ratings, by=list(ratings$brand_dummy), FUN=mean, simplify=TRUE)
perf
## Group.1 overall_satisfaction information longevity performance features
## 1 0 2.120 3.585 1.810 2.39 4.03
## 2 1 4.755 3.610 4.225 3.65 2.86
## price ease_of_use reliability documentation operating_costs
## 1 1.745 2.185 1.725 3.990 3.98
## 2 1.540 4.100 1.840 2.015 3.81
## delivery.time_orders response.time_support expertise_support brand_dummy
## 1 3.525 3.445 2.275 0
## 2 1.675 1.695 3.670 1
#visualize
par(mar=c(10, 4.1, 4.1, 2.1))
viz = as.matrix(perf)
barplot(viz[, 2:14], beside=TRUE, ylim=c(0,5), main="Average Performance Scores", col=c("blue", "red"), las=3, cex.lab = 1.5, cex.main = 1.5, cex.axis = 1.5, legend=viz[,1], ylab="Customer Ratings")
#move subtitle
mtext(text="0 = CMMi, 1 = Knights & Schoening", side=3, line=3)
Performance Analysis Insights
As evidenced by the results, Knights & Schoening is outperforming CMMi by a substantial margin across several customer satisfaction drivers. As a result, Knights & Schoening touts an “overall satisfaction” score more than twice that of CMMi (4.755 vs. 2.120). In order to remedy the issue, we'll need to focus on the customer satisfaction drivers that CMMi is underperforming on relative to their primary competitor. Namely, they are:
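The drivers on which CMMi trails can be read off the table of group means above. The following is a minimal sketch, assuming the means printed by `aggregate()` are re-entered by hand; the vectors `cmmi`, `ks`, and `gap` are illustrative names and not objects from the original analysis.

```r
# Brand means copied from the aggregate() output above
# (brand_dummy 0 = CMMi, 1 = Knights & Schoening)
drivers = c("information", "longevity", "performance", "features", "price",
            "ease_of_use", "reliability", "documentation", "operating_costs",
            "delivery.time_orders", "response.time_support", "expertise_support")
cmmi = c(3.585, 1.810, 2.39, 4.03, 1.745, 2.185,
         1.725, 3.990, 3.98, 3.525, 3.445, 2.275)
ks   = c(3.610, 4.225, 3.65, 2.86, 1.540, 4.100,
         1.840, 2.015, 3.81, 1.675, 1.695, 3.670)
names(cmmi) = drivers
names(ks) = drivers

#positive gap = the competitor scores higher than CMMi on that driver
gap = ks - cmmi
sort(gap[gap > 0], decreasing = TRUE)
```

With these numbers, the largest shortfalls are on longevity, ease of use, expertise (support), and performance; information and reliability trail by only a small margin.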
#head(ratings)
#attach(ratings) is not needed here because lm() is given data=ratings
overall_importance = lm(overall_satisfaction ~., data=ratings)
summary(overall_importance)
##
## Call:
## lm(formula = overall_satisfaction ~ ., data = ratings)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.8558 -0.5593 -0.0052 0.4451 2.5374
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -2.08981 0.34342 -6.085 2.80e-09 ***
## information 0.07736 0.02910 2.658 0.00818 **
## longevity 0.06443 0.03657 1.762 0.07892 .
## performance 0.20560 0.02996 6.862 2.71e-11 ***
## features 0.03818 0.03021 1.264 0.20704
## price 0.22675 0.03747 6.052 3.40e-09 ***
## ease_of_use 0.15173 0.03219 4.713 3.41e-06 ***
## reliability 0.05988 0.03681 1.627 0.10459
## documentation 0.17961 0.03242 5.539 5.62e-08 ***
## operating_costs 0.06131 0.03150 1.946 0.05234 .
## delivery.time_orders 0.30243 0.03353 9.019 < 2e-16 ***
## response.time_support 0.04098 0.03140 1.305 0.19262
## expertise_support 0.07571 0.02923 2.590 0.00995 **
## brand_dummy 2.90288 0.17885 16.231 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.7993 on 386 degrees of freedom
## Multiple R-squared: 0.7771, Adjusted R-squared: 0.7696
## F-statistic: 103.5 on 13 and 386 DF, p-value: < 2.2e-16
#confidence intervals for coefficient estimates
confint(overall_importance, level=.90)
## 5 % 95 %
## (Intercept) -2.6560472875 -1.52357367
## information 0.0293804185 0.12533916
## longevity 0.0041270156 0.12472633
## performance 0.1561998886 0.25499914
## features -0.0116286070 0.08798770
## price 0.1649718426 0.28853340
## ease_of_use 0.0986507833 0.20480281
## reliability -0.0008091214 0.12057658
## documentation 0.1261522377 0.23307556
## operating_costs 0.0093708383 0.11324458
## delivery.time_orders 0.2471389169 0.35771179
## response.time_support -0.0107899027 0.09274346
## expertise_support 0.0275154015 0.12389789
## brand_dummy 2.6079913009 3.19776144
The above “pooled” model estimates the relative importance of each customer satisfaction driver by regressing the overall satisfaction score on all drivers while controlling for the effect of brand. According to the Adjusted R-squared metric, approximately 77% of the variation in the overall satisfaction score can be explained by the model. However, no statistically significant linear dependence of the mean of “overall satisfaction” on the following satisfaction drivers was detected at the 10% level: features, reliability, and response time (support). These are the drivers whose 90% confidence intervals above include zero.
For all other customer satisfaction drivers, we can be at least 90% confident that the corresponding feature is related to our output variable of interest, overall satisfaction. We can interpret the coefficient estimates as follows: holding all else constant, a one-point increase in the rating of a customer satisfaction driver changes the conditional mean of “overall satisfaction” by the corresponding coefficient estimate, and the 90% confidence interval gives a plausible range for the true effect.
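As a worked example of this interpretation, the 90% interval for a single driver can be rebuilt from its estimate and standard error. This is a sketch that assumes the values for `delivery.time_orders` are copied by hand from the `summary()` output above; the variable names are illustrative.

```r
# Estimate and standard error for delivery.time_orders, copied from summary() above
est = 0.30243
se = 0.03353
df_resid = 386  #residual degrees of freedom reported by summary()

#a 90% interval leaves 5% in each tail, hence the 0.95 quantile of the t distribution
t_crit = qt(0.95, df = df_resid)
ci = c(lower = est - t_crit * se, upper = est + t_crit * se)
round(ci, 3)
```

Up to rounding of the copied inputs, this agrees with the `confint()` row for `delivery.time_orders`: a one-point increase in that rating is associated with a rise in mean overall satisfaction of roughly 0.25 to 0.36 points, all else held constant.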
#visualize coefficient estimates
par(mar=c(10, 4.1, 4.1, 2.1))
d = overall_importance$coefficients
d = as.matrix(d)
#dim(d)
barplot(d[2:13,], las=3, cex.lab = 1.5, cex.main = 1.5, col="dark green", ylim=c(0,.35), ylab="Coefficient Estimates (Importance)", main="Relative Importance of Customer Satisfaction Drivers")
#model w/ different coefficients across competitors
head(ratings)
## overall_satisfaction information longevity performance features price
## 1 1 5 1 1 4 1
## 2 2 5 1 2 5 2
## 3 1 3 1 1 4 1
## 4 5 5 2 1 2 5
## 5 1 5 1 1 5 1
## 6 1 5 1 3 4 3
## ease_of_use reliability documentation operating_costs
## 1 1 3 2 2
## 2 1 2 4 3
## 3 1 3 5 3
## 4 3 1 5 3
## 5 4 3 4 5
## 6 3 1 5 3
## delivery.time_orders response.time_support expertise_support brand_dummy
## 1 5 5 1 0
## 2 4 5 3 0
## 3 3 3 1 0
## 4 3 1 1 0
## 5 4 1 4 0
## 6 2 5 3 0
sep_importance = lm(overall_satisfaction ~ information + longevity + performance + features + price + ease_of_use + reliability + documentation + operating_costs + delivery.time_orders + response.time_support + expertise_support + information:brand_dummy + longevity:brand_dummy +
performance:brand_dummy + features:brand_dummy + price:brand_dummy + ease_of_use:brand_dummy + reliability:brand_dummy + documentation:brand_dummy + operating_costs:brand_dummy + delivery.time_orders:brand_dummy + response.time_support:brand_dummy + expertise_support:brand_dummy + brand_dummy, data = ratings)
summary(sep_importance)
##
## Call:
## lm(formula = overall_satisfaction ~ information + longevity +
## performance + features + price + ease_of_use + reliability +
## documentation + operating_costs + delivery.time_orders +
## response.time_support + expertise_support + information:brand_dummy +
## longevity:brand_dummy + performance:brand_dummy + features:brand_dummy +
## price:brand_dummy + ease_of_use:brand_dummy + reliability:brand_dummy +
## documentation:brand_dummy + operating_costs:brand_dummy +
## delivery.time_orders:brand_dummy + response.time_support:brand_dummy +
## expertise_support:brand_dummy + brand_dummy, data = ratings)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.4895 -0.3091 0.0349 0.3920 2.1914
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -3.44962 0.42468 -8.123 6.70e-15 ***
## information 0.03018 0.03776 0.799 0.42461
## longevity 0.12206 0.04427 2.757 0.00611 **
## performance 0.27808 0.03751 7.414 8.27e-13 ***
## features 0.03487 0.04146 0.841 0.40080
## price 0.37907 0.04529 8.370 1.16e-15 ***
## ease_of_use 0.24383 0.03954 6.167 1.81e-09 ***
## reliability 0.11723 0.04822 2.431 0.01552 *
## documentation 0.24278 0.04151 5.849 1.08e-08 ***
## operating_costs 0.03283 0.04096 0.801 0.42337
## delivery.time_orders 0.44251 0.03850 11.494 < 2e-16 ***
## response.time_support 0.06907 0.03460 1.996 0.04666 *
## expertise_support 0.06229 0.03712 1.678 0.09418 .
## brand_dummy 5.89620 0.60910 9.680 < 2e-16 ***
## information:brand_dummy 0.09105 0.05256 1.732 0.08407 .
## longevity:brand_dummy -0.09855 0.06598 -1.494 0.13612
## performance:brand_dummy -0.18244 0.05403 -3.377 0.00081 ***
## features:brand_dummy 0.02170 0.05458 0.398 0.69122
## price:brand_dummy -0.38125 0.06763 -5.637 3.41e-08 ***
## ease_of_use:brand_dummy -0.16092 0.05771 -2.788 0.00557 **
## reliability:brand_dummy -0.12715 0.06697 -1.899 0.05837 .
## documentation:brand_dummy -0.15761 0.05819 -2.708 0.00707 **
## operating_costs:brand_dummy 0.04877 0.05658 0.862 0.38921
## delivery.time_orders:brand_dummy -0.39438 0.06288 -6.272 9.85e-10 ***
## response.time_support:brand_dummy -0.01442 0.05966 -0.242 0.80911
## expertise_support:brand_dummy 0.01576 0.05236 0.301 0.76355
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.707 on 374 degrees of freedom
## Multiple R-squared: 0.831, Adjusted R-squared: 0.8197
## F-statistic: 73.58 on 25 and 374 DF, p-value: < 2.2e-16
#anova
anova(overall_importance, sep_importance, test="F")
## Analysis of Variance Table
##
## Model 1: overall_satisfaction ~ information + longevity + performance +
## features + price + ease_of_use + reliability + documentation +
## operating_costs + delivery.time_orders + response.time_support +
## expertise_support + brand_dummy
## Model 2: overall_satisfaction ~ information + longevity + performance +
## features + price + ease_of_use + reliability + documentation +
## operating_costs + delivery.time_orders + response.time_support +
## expertise_support + information:brand_dummy + longevity:brand_dummy +
## performance:brand_dummy + features:brand_dummy + price:brand_dummy +
## ease_of_use:brand_dummy + reliability:brand_dummy + documentation:brand_dummy +
## operating_costs:brand_dummy + delivery.time_orders:brand_dummy +
## response.time_support:brand_dummy + expertise_support:brand_dummy +
## brand_dummy
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 386 246.63
## 2 374 186.94 12 59.689 9.9511 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The separate importance model adds an interaction between the brand dummy and every customer satisfaction driver, so each driver's coefficient is allowed to differ by brand (0 = CMMi, 1 = K&S). The main-effect estimates are the slopes for CMMi, and each interaction estimate is the amount by which the corresponding slope changes for Knights & Schoening. In looking at the results, we can see that the coefficient estimate for the brand dummy itself (5.90) is very large relative to all of the other coefficient estimates. This implies that brand = 1, Knights & Schoening, carries a large positive shift in the intercept of the overall satisfaction score. By comparison, the negative and statistically significant interaction estimates (performance, price, ease of use, documentation, and delivery time) imply that performance on these customer satisfaction drivers is less important for Knights & Schoening than for CMMi. It's also worth noting that this model explains approximately 82% of the variation in the overall satisfaction score, as evidenced by the Adjusted R-squared value.
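To make the interaction terms concrete, the brand-specific slopes can be computed from the reported estimates. This is a sketch under the assumption that the coefficients are copied by hand from `summary(sep_importance)`; only the five drivers with interactions significant at the 1% level are shown, and `main`, `inter`, and `slopes` are illustrative names.

```r
# Main effects (slopes for CMMi, brand_dummy = 0) from summary(sep_importance)
main = c(performance = 0.27808, price = 0.37907, ease_of_use = 0.24383,
         documentation = 0.24278, delivery.time_orders = 0.44251)
# Matching driver:brand_dummy interaction estimates
inter = c(performance = -0.18244, price = -0.38125, ease_of_use = -0.16092,
          documentation = -0.15761, delivery.time_orders = -0.39438)

#for brand_dummy = 1 the slope is the main effect plus the interaction term
slopes = rbind(CMMi = main, KS = main + inter)
round(slopes, 3)
```

Under these estimates every implied slope for Knights & Schoening is below 0.1, and the price slope is essentially zero, while the CMMi slopes range from about 0.24 to 0.44.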
The ANOVA F-test above (F = 9.95 on 12 and 374 degrees of freedom, p < 2.2e-16) indicates that the separate importance model fits significantly better than the pooled model. From this, we can conclude that there are indeed differences in the importance of the customer satisfaction drivers between the two brands. We can make final recommendations based on the combination of performance and importance information for CMMi.
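The F statistic for this nested-model comparison can be checked by hand from the residual sums of squares. This is a sketch assuming the values are copied from the `anova()` table above; the variable names are illustrative.

```r
# Residual sums of squares and residual degrees of freedom from the anova() table above
rss_pooled = 246.63
df_pooled = 386
rss_sep = 186.94
df_sep = 374

#number of extra parameters in the larger model (the 12 interaction terms)
q = df_pooled - df_sep

#F = (drop in RSS per added parameter) / (residual variance of the larger model)
f_stat = ((rss_pooled - rss_sep) / q) / (rss_sep / df_sep)
p_value = pf(f_stat, q, df_sep, lower.tail = FALSE)
round(f_stat, 2)
```

The result matches the reported F of about 9.95 up to rounding of the copied sums of squares.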
In order to provide relevant recommendations to Ms. Kopf, we must think about both the importance of customer satisfaction drivers to CMMi and their performance on them relative to their competitor, Knights and Schoening. Quadrant Analysis provides us with the perfect framework in which to conduct this analysis and frame our recommendations.
Disinvest (Low Importance, High Performance)
Ms. Kopf will want to disinvest in areas where the importance of the satisfaction driver is low, but CMMi's performance on this metric is high. The following customer satisfaction drivers correspond to this quadrant: Information, Features, Operating Costs, and Response Time Support.
Increase Investment (High Importance, Low Performance)
Ms. Kopf will instead want to reallocate the money previously dedicated to the above customer satisfaction drivers to those that have a high level of importance but on which CMMi's performance is low. The following customer satisfaction drivers correspond to this quadrant: Performance, Price (cutting costs to decrease price and maintain margin), and Ease of Use.
Maintain Investment (High Importance, High Performance & Low Importance, Low Performance)
In terms of reallocating the customer satisfaction programs budget, Ms. Kopf will want to maintain her investment in areas where the importance of satisfaction drivers is high and CMMi's performance on these metrics is also high. Likewise, she will also want to maintain her investment in areas where the importance of satisfaction drivers is low and CMMi's performance on these metrics is also low. Thus, CMMi should maintain their investment in all remaining customer satisfaction drivers.
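The quadrant assignment can be sketched in a few lines of R. The cutoffs below are assumptions made for illustration, since the report does not state how the quadrants were drawn: importance is taken from the CMMi main effects of the separate importance model and counted as high above 0.2, and performance is taken from CMMi's mean ratings and counted as high above 3, the midpoint of the 1 to 5 scale.

```r
drivers = c("information", "longevity", "performance", "features", "price",
            "ease_of_use", "reliability", "documentation", "operating_costs",
            "delivery.time_orders", "response.time_support", "expertise_support")
# Importance: CMMi main effects copied from summary(sep_importance)
importance = c(0.03018, 0.12206, 0.27808, 0.03487, 0.37907, 0.24383,
               0.11723, 0.24278, 0.03283, 0.44251, 0.06907, 0.06229)
# Performance: CMMi mean ratings copied from the aggregate() output
performance = c(3.585, 1.810, 2.39, 4.03, 1.745, 2.185,
                1.725, 3.990, 3.98, 3.525, 3.445, 2.275)

#ASSUMED cutoffs (not stated in the report)
imp_cut = 0.2
perf_cut = 3

high_imp = importance > imp_cut
high_perf = performance > perf_cut
action = ifelse(high_imp & !high_perf, "Increase investment",
         ifelse(!high_imp & high_perf, "Disinvest", "Maintain investment"))
quadrants = split(drivers, action)
quadrants
```

With these assumed cutoffs the sketch reproduces the groupings listed above: performance, price, and ease of use fall under increased investment; information, features, operating costs, and response time (support) fall under disinvestment; the remaining five drivers are maintained. Different cutoffs would move drivers between quadrants, so the thresholds should be agreed on before budget decisions are made.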