This case study identifies and quantifies high-impact opportunities in customer service to improve the likelihood that customers recommend the company, building directly on the findings of Predicting Customer Detractors (Part 1): Analyzing Contextual Factors via Logistic Regression. Where Part 1 identified which service contexts had the most detractors (e.g., chat for incorrect items), this project investigates the root causes of dissatisfaction through thematic analysis of open-ended feedback and simulates the potential business impact of addressing them.
The analysis includes:
- Data simulation and cleaning.
- Exploratory analysis with descriptive statistics and visualizations.
- Logistic regression modeling.
- Simulation-based predictions using bootstrapping.
Although based on a real-world project, all data, variables, and insights presented here have been simulated to ensure confidentiality.
We begin by loading the necessary R packages for data manipulation, visualization, and statistical modeling.
# Data handling
library(dplyr) # Data manipulation
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(openxlsx) # Open Excel files
# Visualization
library(ggplot2) # General plotting
library(vcd) # Mosaic plots
## Loading required package: grid
# Statistical analysis
library(psych) # Descriptive statistics
##
## Attaching package: 'psych'
## The following objects are masked from 'package:ggplot2':
##
## %+%, alpha
library(glmnet) # Penalized logistic regression
## Loading required package: Matrix
## Loaded glmnet 4.1-10
library(boot) # Bootstrapping procedures
##
## Attaching package: 'boot'
## The following object is masked from 'package:psych':
##
## logit
library(MASS) # Simulate coefficients from multivariate normal distributions
##
## Attaching package: 'MASS'
## The following object is masked from 'package:dplyr':
##
## select
In the previous study Predicting Customer Detractors (Part 1), we identified several contextual factors in customer service that were associated with higher levels of detractors, i.e., customers less likely to recommend our service.
The key findings are summarized in the chart below:
# Import the image of the mosaic graph about the predicted detractors across contextual factors
knitr::include_graphics("Predicted.Detractors.png")
As shown in the figure, customers attended via chat for incorrect item issues had the highest probability of becoming detractors (i.e., providing a score of 6 or lower in the Net Promoter Score).
Based on this finding, stakeholders and the research team agreed to conduct a follow-up to compare the customers’ complaints in cases attended by chat for incorrect item issues versus cases attended by chat for other issues. By holding the contact channel constant (chat), we avoid confounding the results with chat’s general negative impact.
This project builds on the dataset from Predicting Customer Detractors (Part 1): Analyzing Contextual Factors via Logistic Regression, which contained information about contact reasons, contact methods, countries, and NPS scores of customers who contacted customer service.
We apply the code below to retrieve this information (please refer to Part 1 for full details on the dataset structure).
# Set seed for reproducibility
set.seed(1000)
# Define dataset size
data.n <- 40000 # Large enough to allow subgroup analysis despite unbalanced category probabilities
# Initialize dataframe with IDs
data <- data.frame(id = factor(1:data.n))
# Simulate the date of the evaluation
data$date <- sample(seq(from = as.Date("2024-01-01"), to = as.Date("2024-07-31"), by = "day"),
size = data.n,
replace = TRUE)
# ------ 1. Demographic Information and Contextual Variables
# Simulate age with a log-normal distribution, capped between 15 and 90
data$age <- round(rlnorm(n = data.n, meanlog = log(40), sdlog = log(1.4)))
data$age[data$age > 90] <- 90
data$age[data$age < 15] <- 15
# Simulate gender
data$gender <- factor(sample(c("man", "woman"),
prob = c(0.5, 0.5),
replace = TRUE, size = data.n))
# Simulate reason for contacting
data$reason <- factor(sample(c("incorrect_item", "delay", "status", "cancellation", "other"),
prob = c(0.20, 0.30, 0.20, 0.10, 0.20),
replace = TRUE, size = data.n))
# Simulate contact method
data$method <- factor(sample(c("chat", "web_form", "email", "phone"),
prob = c(0.40, 0.10, 0.15, 0.35),
replace = TRUE, size = data.n))
# Simulate country (two fictional countries for anonymity)
data$country <- factor(sample(c("Drinkland", "Eatland"),
prob = c(0.5, 0.5),
replace = TRUE, size = data.n))
data$country <- factor(data$country, levels = c("Eatland", "Drinkland"))
# ------ 2. Customer Complaints
# These variables represent underlying complaints that were not available at this stage of the investigation. We use them only to simulate NPS and satisfaction scores and store them in a separate dataframe to reflect their unavailability. They become visible only after thematic coding in the subsamples of interest (see next section).
## Create a dataset to store the complaints in connection to the case id
data.complaints <- data.frame(id = data$id)
## Complaints about connection issues
# Base probability of a connection complaint: 10%
prob_connection <- rep(0.10, data.n)
# In Drinkland, probability is doubled
prob_connection[data$country == "Drinkland"] <- prob_connection[data$country == "Drinkland"] * 2
# Ensure probability values remain within [0, 1]
prob_connection[prob_connection > 1] <- 1
prob_connection[prob_connection < 0] <- 0
# Generate binary variable based on probabilities
data.complaints$complain.connect <- as.integer(rbinom(n = data.n, size = 1, prob = prob_connection))
## Complaints about slow response
# Base probability of a slow response complaint: 5%
prob_slow_response <- rep(0.05, data.n)
# Email contacts: 6x higher likelihood
prob_slow_response[data$method == "email"] <- prob_slow_response[data$method == "email"] * 6
# Chat and web_form: 3x higher likelihood
prob_slow_response[data$method %in% c("chat", "web_form")] <- prob_slow_response[data$method %in% c("chat", "web_form")] * 3
# Ensure probability values remain within [0, 1]
prob_slow_response[prob_slow_response > 1] <- 1
prob_slow_response[prob_slow_response < 0] <- 0
# Generate binary variable
data.complaints$complain.sp.respond <- as.integer(rbinom(n = data.n, size = 1, prob = prob_slow_response))
## Complaints about slow resolution
# Base probability of a slow resolution complaint: 10%
prob_slow_resolution <- rep(0.10, data.n)
# If contact is via chat and reason is incorrect_item → 5x higher likelihood
is_chat_incorrect <- data$method == "chat" & data$reason == "incorrect_item"
prob_slow_resolution[is_chat_incorrect] <- prob_slow_resolution[is_chat_incorrect] * 5
# Ensure probability values remain within [0, 1]
prob_slow_resolution[prob_slow_resolution > 1] <- 1
prob_slow_resolution[prob_slow_resolution < 0] <- 0
# Generate binary variable
data.complaints$complain.sp.solve <- as.integer(rbinom(n = data.n, size = 1, prob = prob_slow_resolution))
## Complaints about having to repeat information
# Base probability of a repetition complaint: 2%
prob_repeat_info <- rep(0.02, data.n)
# If contact reason is cancellation → 10x higher likelihood
prob_repeat_info[data$reason == "cancellation"] <- prob_repeat_info[data$reason == "cancellation"] * 10
# Ensure probability values remain within [0, 1]
prob_repeat_info[prob_repeat_info > 1] <- 1
prob_repeat_info[prob_repeat_info < 0] <- 0
# Generate binary variable
data.complaints$complain.repeat <- as.integer(rbinom(n = data.n, size = 1, prob = prob_repeat_info))
## Other types of complaints (uniform distribution)
# Constant probability of 20% for other unspecified complaints
data.complaints$complain.other <- as.integer(rbinom(n = data.n, size = 1, prob = 0.20))
# ------ 3. NPS scores
# Simulate base NPS scores assuming no complaints (mean = 9, sd = 2)
data$nps.score <- rnorm(data.n, mean = 9, sd = 2)
# Apply fixed penalties based on complaints
# -3 for connection issues
# -4 for slow response
# -6 for slow resolution
# -2 for repeated information
# -1 for other complaints
data$nps.score <- data$nps.score -
3 * data.complaints$complain.connect -
4 * data.complaints$complain.sp.respond -
6 * data.complaints$complain.sp.solve -
2 * data.complaints$complain.repeat -
1 * data.complaints$complain.other
# Ensure scores stay within the 0–10 range
data$nps.score[data$nps.score < 0] <- 0
data$nps.score[data$nps.score > 10] <- 10
# Convert to integer values (using floor)
data$nps.score <- floor(data$nps.score)
# Add placeholder for open-text comments
data$open.comment <- rep("bla bla", data.n)
## Create NPS segments based on score values
data$nps.segment <- rep(NA_character_, data.n) # Initialize empty character variable
data$nps.segment[data$nps.score <= 6] <- "detractor"
data$nps.segment[data$nps.score > 6 & data$nps.score < 9] <- "passive"
data$nps.segment[data$nps.score >= 9] <- "promoter" # Assign segment based on score range
data$nps.segment <- factor(data$nps.segment, levels = c("detractor", "passive", "promoter")) # Convert to factor
## Create a binary variable for detractors
data$detractor <- NA_real_
data$detractor[data$nps.segment == "detractor"] <- 1
data$detractor[data$nps.segment != "detractor"] <- 0
data$detractor <- as.integer(data$detractor)
# ------ 4. Satisfaction scores (1 to 5 scale)
# Generate satisfaction scores based on NPS
# We use a normal distribution centered at 3, with some variability (sd = 0.5)
# and add a scaled component based on NPS to simulate a positive relationship.
data$satisfaction <- floor(
rnorm(data.n, mean = 3, sd = 0.5) +
0.9 * (1 + as.numeric(scale(data$nps.score)))
)
# Ensure scores stay within the 1–5 range
data$satisfaction[data$satisfaction < 1] <- 1
data$satisfaction[data$satisfaction > 5] <- 5
# Set satisfaction to missing for Drinkland
data$satisfaction[data$country == "Drinkland"] <- NA_real_
# Set satisfaction to missing for email and web_form contacts
data$satisfaction[data$method %in% c("email", "web_form")] <- NA_real_
# Introduce 10% additional random missingness among remaining valid values
na_randomizer <- sample(
c(0, 1),
size = sum(!is.na(data$satisfaction)),
replace = TRUE,
prob = c(0.1, 0.9)
) # Create a random vector (0 = missing, 1 = keep) for non-NA entries
data$satisfaction[which(!is.na(data$satisfaction))[na_randomizer == 0]] <- NA_real_ # Apply randomizer
# ------ 5. Check the data was created as expected
summary(data)
## id date age gender
## 1 : 1 Min. :2024-01-01 Min. :15.00 man :19983
## 2 : 1 1st Qu.:2024-02-23 1st Qu.:32.00 woman:20017
## 3 : 1 Median :2024-04-16 Median :40.00
## 4 : 1 Mean :2024-04-16 Mean :42.26
## 5 : 1 3rd Qu.:2024-06-09 3rd Qu.:50.00
## 6 : 1 Max. :2024-07-31 Max. :90.00
## (Other):39994
## reason method country nps.score
## cancellation : 4018 chat :15893 Eatland :19856 Min. : 0.000
## delay :11911 email : 6091 Drinkland:20144 1st Qu.: 4.000
## incorrect_item: 8083 phone :14007 Median : 7.000
## other : 7987 web_form: 4009 Mean : 6.364
## status : 8001 3rd Qu.: 9.000
## Max. :10.000
##
## open.comment nps.segment detractor satisfaction
## Length:40000 detractor:17966 Min. :0.0000 Min. :1.000
## Class :character passive :10278 1st Qu.:0.0000 1st Qu.:3.000
## Mode :character promoter :11756 Median :0.0000 Median :4.000
## Mean :0.4491 Mean :3.466
## 3rd Qu.:1.0000 3rd Qu.:4.000
## Max. :1.0000 Max. :5.000
## NA's :26541
The simulated dataset contains the following key variables:
- A variable “id” to identify each of the 40,000 customers.
- A variable “date” indicating the day on which each customer completed the survey.
- Three contextual factors about the customer service interaction: “reason”, “method”, and “country”.
- Three variables indicating the likelihood to recommend the service: “nps.score”, “nps.segment”, and the binary “detractor”.
- A variable, “open.comment”, containing open feedback (to simplify, we have simulated each comment as “bla bla”).
- Additional variables (age, gender, satisfaction) were simulated as part of the original dataset from Part 1, but are not used in this study.
Due to the time-consuming nature of manual qualitative coding, we first checked the sample sizes of the two conditions of interest to ensure a feasible workload.
# Check the number of cases contacting through different contact methods across contact reasons and countries.
with(data[data$method == "chat",], table(method, reason, country))
## , , country = Eatland
##
## reason
## method cancellation delay incorrect_item other status
## chat 822 2333 1638 1554 1586
## email 0 0 0 0 0
## phone 0 0 0 0 0
## web_form 0 0 0 0 0
##
## , , country = Drinkland
##
## reason
## method cancellation delay incorrect_item other status
## chat 839 2356 1631 1538 1596
## email 0 0 0 0 0
## phone 0 0 0 0 0
## web_form 0 0 0 0 0
This cross-tabulation shows that:
- There are more than 1,600 cases per country for chat contacts regarding incorrect item issues.
- There are more than 6,000 cases per country for chat contacts regarding other types of issues.
Coding all cases was not feasible. To maintain rigor while ensuring efficiency, we randomly selected 300 cases for each condition.
# Specify the sample sizes for both samples
sample.n <- 300
# Sample 1: Cases of chat attending incorrect items
subset_chat_incorrect <- data[data$reason == "incorrect_item" & data$method == "chat", ]
subset1.rows <- nrow(subset_chat_incorrect)
sample_chat_incorrect <- subset_chat_incorrect[sample(1:subset1.rows, sample.n), ]
sample_chat_incorrect$condition <- "chat_incorrect_item"
# Sample 2: Cases of chat attending other issues
subset_chat_other <- data[!(data$reason == "incorrect_item") & data$method == "chat", ]
subset2.rows <- nrow(subset_chat_other)
sample_chat_other <- subset_chat_other[sample(1:subset2.rows, sample.n), ]
sample_chat_other$condition <- "chat_other_issues"
# Bind together the two samples in a dataset
data.ch.i <- bind_rows(sample_chat_incorrect, sample_chat_other)
data.ch.i$condition <- factor(data.ch.i$condition, levels = c("chat_other_issues", "chat_incorrect_item"))
# Check the final sample
summary(data.ch.i)
## id date age gender
## 514 : 1 Min. :2024-01-01 Min. :15.00 man :296
## 570 : 1 1st Qu.:2024-02-25 1st Qu.:32.00 woman:304
## 705 : 1 Median :2024-04-21 Median :41.00
## 719 : 1 Mean :2024-04-18 Mean :42.17
## 755 : 1 3rd Qu.:2024-06-10 3rd Qu.:51.00
## 824 : 1 Max. :2024-07-31 Max. :90.00
## (Other):594
## reason method country nps.score
## cancellation : 37 chat :600 Eatland :301 Min. : 0.000
## delay :116 email : 0 Drinkland:299 1st Qu.: 2.000
## incorrect_item:300 phone : 0 Median : 6.000
## other : 81 web_form: 0 Mean : 5.478
## status : 66 3rd Qu.: 9.000
## Max. :10.000
##
## open.comment nps.segment detractor satisfaction
## Length:600 detractor:324 Min. :0.00 Min. :1.000
## Class :character passive :118 1st Qu.:0.00 1st Qu.:2.000
## Mode :character promoter :158 Median :1.00 Median :3.000
## Mean :0.54 Mean :3.151
## 3rd Qu.:1.00 3rd Qu.:4.000
## Max. :1.00 Max. :5.000
## NA's :329
## condition
## chat_other_issues :300
## chat_incorrect_item:300
##
##
##
##
##
We confirm that the simulated dataset provides 300 cases per condition, with distributions consistent with the original sample.
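As an additional sanity check, one could also compare raw detractor rates in the subsample against the corresponding subset of the full data (a sketch; exact values vary with the seed):
# Detractor rate per condition in the 600-case subsample
with(data.ch.i, tapply(detractor, condition, mean))
# Detractor rate for chat / incorrect_item cases in the full dataset
mean(data$detractor[data$method == "chat" & data$reason == "incorrect_item"])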
To validate the representativeness of this reduced dataset, we fit a logistic regression model predicting the likelihood of detractors from condition (chat cases for incorrect items vs. chat cases for other issues), controlling for country. The goal is to confirm whether the reduced dataset reproduces the same patterns observed in Predicting Customer Detractors (Part 1): Analyzing Contextual Factors via Logistic Regression. If these results do not align, the follow-up analysis would lack validity.
To obtain predictive metrics from the model, we apply a reusable function from my toolkit for logistic regression.
#####
# Fit the binomial model predicting detractors from condition and country
#####
rep.binomial.model.c <- glm(detractor ~ condition + country,
data = data.ch.i, family = "binomial")
summary(rep.binomial.model.c)
##
## Call:
## glm(formula = detractor ~ condition + country, family = "binomial",
## data = data.ch.i)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -0.3627 0.1443 -2.514 0.0119 *
## conditionchat_incorrect_item 0.8477 0.1678 5.050 4.41e-07 ***
## countryDrinkland 0.2146 0.1678 1.279 0.2008
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 827.93 on 599 degrees of freedom
## Residual deviance: 800.31 on 597 degrees of freedom
## AIC: 806.31
##
## Number of Fisher Scoring iterations: 4
#####
#Function to obtain key interpretation metrics for a binomial logistic model
#####
f.metrics.binomial <- function(my.model, my.matrix){
#--------1. Preliminary Checks:
# Making sure the model is glm binomial
if (!inherits(my.model, "glm") || my.model$family$family != "binomial") {
stop("The function requires a binomial model.")
}
#--------2. Odds Ratios
# Obtaining the coefficients and confidence intervals
coefs <- coef(my.model)
ci <- confint(my.model)
intercept <- coefs[1]
# Exponentiation of the coefficients and their CIs to obtain odds ratios
or <- exp(coefs)
or_ci <- exp(ci)
#--------3. Relative Probabilities (Risk Ratios)
# Calculating the base probability from the intercept
p0 <- or[1] / (1 + or[1])
# Obtaining relative probabilities from risk ratios and base probability
rp <- or / ((1 - p0) + (p0 * or))
rp_ci_low <- or_ci[, 1] / ((1 - p0) + (p0 * or_ci[, 1]))
rp_ci_high <- or_ci[, 2] / ((1 - p0) + (p0 * or_ci[, 2]))
#--------4. Predicted Proportions
# Extract log-odds and their standard errors for the individual presence of the factors (using the specified matrix)
pred <- predict(my.model, newdata = my.matrix, type = "link", se.fit = TRUE)
#Calculate confidence intervals in log-odds
lower_logit <- pred$fit - 1.96 * pred$se.fit
upper_logit <- pred$fit + 1.96 * pred$se.fit
#Specify a function to transform log-odds to probabilities
f.inv_logit <- function(x) {1 / (1 + exp(-x))}
#Apply the function to the log-odds and the limits of their ci
predicted_prob <- f.inv_logit(pred$fit) # Predicted probabilities
lower_predicted_prob <- f.inv_logit(lower_logit) # Lower limit of CI for predicted probabilities
upper_predicted_prob <- f.inv_logit(upper_logit) # Upper limit of CI for predicted probabilities
#--------5. Differences in Predicted Probabilities via Bootstrap
# Simulate a distribution of log-odds, for example with 10000 cases
set.seed(123)
simulated_log_odds <- as.data.frame(matrix(
rnorm(10000 * length(pred$fit), mean = pred$fit, sd = pred$se.fit),
ncol = length(pred$fit), byrow = TRUE
))
# Transform log-odds to expected probabilities (using the formula specified above)
simulated_probs <- f.inv_logit(simulated_log_odds)
# Calculate differences from the first column, which corresponds to the intercept probabilities
diffs <- apply(simulated_probs, 2, function(col){
col - simulated_probs[[1]]
})
# Calculate the means of the differences and 95% CI with percentiles
prob_diff <- colMeans(diffs)
lower_prob_diff <- apply(diffs, 2, quantile, probs = 0.025)
upper_prob_diff <- apply(diffs, 2, quantile, probs = 0.975)
#-------- 6. Combine Results
results <- data.frame(
Term = names(rp),
Prediction = as.numeric(predicted_prob)*100,
Lower_Prediction = as.numeric(lower_predicted_prob)*100,
Upper_Prediction = as.numeric(upper_predicted_prob)*100,
Predicted_Change = as.numeric(prob_diff)*100,
Lower_Predicted_Change = as.numeric(lower_prob_diff)*100,
Upper_Predicted_Change = as.numeric(upper_prob_diff)*100,
Relative_Probability = as.numeric(rp),
Lower_Relative_Probability = as.numeric(rp_ci_low),
Upper_Relative_Probability = as.numeric(rp_ci_high),
Odds_Ratio = as.numeric(or),
Lower_Odds_Ratio = as.numeric(or_ci[,1]),
Upper_Odds_Ratio = as.numeric(or_ci[,2])
)
# Remove meaningless metrics for intercept
results[1, c("Predicted_Change",
"Lower_Predicted_Change",
"Upper_Predicted_Change",
"Relative_Probability",
"Lower_Relative_Probability",
"Upper_Relative_Probability",
"Odds_Ratio",
"Lower_Odds_Ratio",
"Upper_Odds_Ratio")] <- NA_real_
# Return the results
return(results)
}
#####
# Design Matrix for Predictions
#####
# Extract the factor names from the model, with and without the intercept
f.names.interc <- names(coef(rep.binomial.model.c))
f.names <- f.names.interc[-1]
# Create a matrix with 1s on the diagonal (each case represents one factor set to 1)
my.matrix.df <- as.data.frame(diag(length(f.names.interc)))
my.matrix.df[, 1] <- f.names.interc
names(my.matrix.df) <- c("case", f.names)
# Ensure factor variables match the dataset format: we need to do it for country and condition.
names(my.matrix.df)[names(my.matrix.df) == "countryDrinkland"] <- "country"
my.matrix.df$country[my.matrix.df$country == 1] <- "Drinkland"
my.matrix.df$country[my.matrix.df$country == 0] <- "Eatland"
my.matrix.df$country <- as.factor(my.matrix.df$country)
names(my.matrix.df)[names(my.matrix.df) == "conditionchat_incorrect_item"] <- "condition"
my.matrix.df$condition[my.matrix.df$condition == 1] <- "chat_incorrect_item"
my.matrix.df$condition[my.matrix.df$condition == 0] <- "chat_other_issues"
my.matrix.df$condition <- as.factor(my.matrix.df$condition)
#####
# Calculate the metrics for the model using the function and matrix above
#####
sample.model.metrics.df <- f.metrics.binomial(rep.binomial.model.c, my.matrix.df)
## Waiting for profiling to be done...
# Display the resulting metrics
sample.model.metrics.df
We observe that the predictions for the reduced subsample are consistent with those obtained from the full dataset, with overlapping confidence intervals (see Section 3). This supports the representativeness of the subsample and justifies using it for further exploration of the differences between chat cases involving incorrect item issues and other issues.
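For reference, the odds-ratio-to-risk-ratio conversion applied inside f.metrics.binomial can be reproduced by hand from the coefficients printed above (a sketch; the helper names p0.check and or.check are ours):
# Baseline probability implied by the intercept
p0.check <- plogis(-0.3627) # ~0.41
# Odds ratio for the incorrect-item condition
or.check <- exp(0.8477) # ~2.33
# Risk ratio, matching the Relative_Probability column
or.check / ((1 - p0.check) + p0.check * or.check) # ~1.51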
Having established this, we proceed to code the open comments from customers to identify the most common complaints and opportunities for improvement.
Drawing from prior literature on customer service, we defined five initial categories of complaints to guide the coding of customer comments:
- Connection issues
- Slow response time
- Slow issue resolution
- Repetition of information
- Other types of complaints
Since this is a simulated dataset, we generate these categories directly from the complaints information embedded in the data simulated earlier.
# Merge the complaints information in our subsample
data.ch.i <- merge(data.ch.i, data.complaints, by = "id", all.x = TRUE)
# Check the results
summary(data.ch.i)
## id date age gender
## 514 : 1 Min. :2024-01-01 Min. :15.00 man :296
## 570 : 1 1st Qu.:2024-02-25 1st Qu.:32.00 woman:304
## 705 : 1 Median :2024-04-21 Median :41.00
## 719 : 1 Mean :2024-04-18 Mean :42.17
## 755 : 1 3rd Qu.:2024-06-10 3rd Qu.:51.00
## 824 : 1 Max. :2024-07-31 Max. :90.00
## (Other):594
## reason method country nps.score
## cancellation : 37 chat :600 Eatland :301 Min. : 0.000
## delay :116 email : 0 Drinkland:299 1st Qu.: 2.000
## incorrect_item:300 phone : 0 Median : 6.000
## other : 81 web_form: 0 Mean : 5.478
## status : 66 3rd Qu.: 9.000
## Max. :10.000
##
## open.comment nps.segment detractor satisfaction
## Length:600 detractor:324 Min. :0.00 Min. :1.000
## Class :character passive :118 1st Qu.:0.00 1st Qu.:2.000
## Mode :character promoter :158 Median :1.00 Median :3.000
## Mean :0.54 Mean :3.151
## 3rd Qu.:1.00 3rd Qu.:4.000
## Max. :1.00 Max. :5.000
## NA's :329
## condition complain.connect complain.sp.respond
## chat_other_issues :300 Min. :0.0000 Min. :0.0000
## chat_incorrect_item:300 1st Qu.:0.0000 1st Qu.:0.0000
## Median :0.0000 Median :0.0000
## Mean :0.1617 Mean :0.1467
## 3rd Qu.:0.0000 3rd Qu.:0.0000
## Max. :1.0000 Max. :1.0000
##
## complain.sp.solve complain.repeat complain.other
## Min. :0.000 Min. :0.00 Min. :0.00
## 1st Qu.:0.000 1st Qu.:0.00 1st Qu.:0.00
## Median :0.000 Median :0.00 Median :0.00
## Mean :0.305 Mean :0.03 Mean :0.18
## 3rd Qu.:1.000 3rd Qu.:0.00 3rd Qu.:0.00
## Max. :1.000 Max. :1.00 Max. :1.00
##
The results are simulated as expected, with the following proportions: roughly 16% of cases complaining about connection issues, 15% about slow responses, 31% about slow resolution, 3% about having to repeat information, and 18% about other issues.
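These figures can be verified compactly with column means of the 0/1 complaint indicators (a sketch):
# Proportion of cases flagging each complaint type in the subsample
round(colMeans(data.ch.i[, c("complain.connect", "complain.sp.respond", "complain.sp.solve", "complain.repeat", "complain.other")]), 3)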
To investigate why chat cases handling incorrect item issues have a higher proportion of detractors compared to other issues, we examine the type and frequency of complaints in each condition.
We calculate the percentage of cases reporting each type of complaint across the two conditions and countries, along with 95% confidence intervals. This is done using a reusable function from my R toolkit for percentages with 95% CI across multiple factors.
#####
# Function to calculate percentages and 95% CI across two factors
#####
f.prop.2f <- function(my.data, my.dv.list, my.factor1, my.factor2){
# Initialize an empty list to store the sampled subsets from each stratum.
results <- list()
# Initialize a counter for indexing elements in the 'results' list.
p <- 1
# Obtain the levels of the factors
f1.levels <- unique(my.data[[my.factor1]])
f2.levels <- unique(my.data[[my.factor2]])
#Create a loop for each dv and for each level of the factors
for (dv in my.dv.list){
for (lf1 in f1.levels){
for(lf2 in f2.levels){
# Subset the data for the current factor combination
current.vector <- my.data[my.data[[my.factor1]]==lf1 & my.data[[my.factor2]]==lf2, dv]
current.vector.c <- as.numeric(current.vector[!is.na(current.vector)]) #cleaning missing cases
n <- length(current.vector.c) # number of cases in the vector
s <- length(current.vector.c[current.vector.c == 1]) # number of cases satisfying the condition
# Calculate proportions and 95% CI
if(n > 0){
proportion <- s / n #calculating proportions
binomial.test <- binom.test (s, n) # binomial test to calculate 95% CI of the proportions
ci_lower <- binomial.test$conf.int[1] #extracting the lower limit of the CI
ci_upper <- binomial.test$conf.int[2] #extracting the upper limit of the CI
}else{
proportion <- NA
ci_lower <- NA
ci_upper <- NA
}
# Save results
results [[p]] <- data.frame(dep.variable = dv, factor1 = lf1, factor2 = lf2, percentage = proportion * 100,
ci_lower.perc = ci_lower * 100, ci_upper.perc = ci_upper * 100, stringsAsFactors = FALSE)
p <- p + 1
} # close the loop for factor 2
} # close the loop for factor 1
} # close the loop for the dv
#Combine the results in a dataframe
results.df <- (do.call(rbind, results))
rownames(results.df) <- NULL
#Return the dataframe
return(results.df)
}
#####
# Application of the function to calculate frequency of complaints across conditions and countries
#####
# Save a vector with the names of the dependent variables referring to complaints
dv.variables <- c("complain.connect", "complain.sp.respond", "complain.sp.solve", "complain.repeat", "complain.other")
# Apply the function
prop.df <- f.prop.2f(my.data = data.ch.i, my.dv.list = dv.variables, my.factor1 = "condition", my.factor2 = "country")
# Display the results
prop.df
To better understand how complaints differ across chat cases handling incorrect item issues versus other issues, we visualize the above percentages and their 95% confidence intervals for each complaint type, separated by country.
# Set order of complaint types for consistent plotting
prop.df$dep.variable <- factor(prop.df$dep.variable, levels = c("complain.connect", "complain.repeat", "complain.sp.respond", "complain.sp.solve", "complain.other"))
# Custom titles for facets
custom_titles <- c(
complain.connect = "Connection Problems",
complain.repeat = "Need to Repeat Information",
complain.sp.respond = "Slow Responses",
complain.sp.solve = "Slow Resolution",
complain.other = "Other Complaints"
)
# Plot bar chart with error bars and faceting (using ggplot2)
ggplot(prop.df, aes(x = factor1, y = percentage, fill = factor2)) +
geom_col(position = position_dodge(width = 0.8), width = 0.7) +
geom_errorbar(
aes(ymin = ci_lower.perc, ymax = ci_upper.perc),
position = position_dodge(width = 0.8),
width = 0.2
) +
facet_wrap(~ dep.variable,
scales = "fixed",
axes = "all",
labeller = labeller(dep.variable = custom_titles)) +
labs(
y = "% of Complaints",
fill = "Country",
title = "% of Chat Complaints by Issue Type"
) +
scale_x_discrete(labels = c ("incorrect item", "other issue")) +
ylim(0, 100) +
geom_text(
aes(label = paste0(round(percentage, 0), "%")), # label with percentage rounded
position = position_dodge(width = 0.8), # match the dodge of bars
vjust = -3, # slightly above the bar
size = 2.5 # text size
) +
theme_minimal(base_size = 12) +
theme(
axis.title.x = element_blank(),
legend.position = c(1, 0),
legend.justification = c("right", "bottom"),
legend.title = element_text(size = 10, face = "bold"),
legend.text = element_text(size = 10),
panel.spacing = unit(1, "lines"),
plot.title = element_text(hjust = 0.5, face = "bold"),
strip.text = element_text(face = "bold"),
axis.text.x = element_text(angle = 0, vjust = 0, hjust = 0.5),
plot.margin = margin(10, 5, 10, 5)
)
The graphs suggest that chat attention to incorrect item issues and chat attention to other issues differ mainly in the presence of slow resolution complaints. These are substantially more common for incorrect item issues compared to other issues.
The graphs also suggest that this pattern of results is consistent across countries.
To formally test the observations from the descriptive graphs, we estimate binomial logistic regression models for each type of complaint. These models examine whether the likelihood of each complaint differs between chat cases involving incorrect item issues versus other issues, controlling for country.
Given that our subsamples have similar proportions across countries, multicollinearity is not expected to be an issue.
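A quick cross-tabulation (a sketch) confirms this balance before fitting the models:
# Cell sizes of condition by country in the subsample
with(data.ch.i, table(condition, country))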
##-- Logistic Models for different type of complaints, controlling for country
# Connection complaints
m.connect <- glm(complain.connect ~ condition + country, data = data.ch.i, family = binomial)
summary(m.connect)
##
## Call:
## glm(formula = complain.connect ~ condition + country, family = binomial,
## data = data.ch.i)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -2.3606 0.2315 -10.197 < 2e-16 ***
## conditionchat_incorrect_item 0.2325 0.2262 1.028 0.304
## countryDrinkland 1.0181 0.2394 4.254 2.1e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 530.91 on 599 degrees of freedom
## Residual deviance: 510.38 on 597 degrees of freedom
## AIC: 516.38
##
## Number of Fisher Scoring iterations: 4
# Complaints about needing to repeat information
m.repeat <- glm(complain.repeat ~ condition + country, data = data.ch.i, family = binomial)
summary(m.repeat)
##
## Call:
## glm(formula = complain.repeat ~ condition + country, family = binomial,
## data = data.ch.i)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -3.2611 0.3873 -8.419 <2e-16 ***
## conditionchat_incorrect_item -0.2308 0.4816 -0.479 0.632
## countryDrinkland -0.2240 0.4816 -0.465 0.642
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 161.69 on 599 degrees of freedom
## Residual deviance: 161.24 on 597 degrees of freedom
## AIC: 167.24
##
## Number of Fisher Scoring iterations: 6
# Slow response complaints
m.respond <- glm(complain.sp.respond ~ condition + country, data = data.ch.i, family = binomial)
summary(m.respond)
##
## Call:
## glm(formula = complain.sp.respond ~ condition + country, family = binomial,
## data = data.ch.i)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -1.8760109 0.2055836 -9.125 <2e-16 ***
## conditionchat_incorrect_item 0.0007374 0.2309741 0.003 0.997
## countryDrinkland 0.2213915 0.2316233 0.956 0.339
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 500.26 on 599 degrees of freedom
## Residual deviance: 499.34 on 597 degrees of freedom
## AIC: 505.34
##
## Number of Fisher Scoring iterations: 4
# Slow resolution complaints
m.solve <- glm(complain.sp.solve ~ condition + country, data = data.ch.i, family = binomial)
summary(m.solve)
##
## Call:
## glm(formula = complain.sp.solve ~ condition + country, family = binomial,
## data = data.ch.i)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -2.25502 0.21994 -10.253 <2e-16 ***
## conditionchat_incorrect_item 2.28847 0.22697 10.083 <2e-16 ***
## countryDrinkland 0.04008 0.19888 0.202 0.84
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 738.05 on 599 degrees of freedom
## Residual deviance: 606.25 on 597 degrees of freedom
## AIC: 612.25
##
## Number of Fisher Scoring iterations: 4
# Other complaints
m.other <- glm(complain.other ~ condition + country, data = data.ch.i, family = binomial)
summary(m.other)
##
## Call:
## glm(formula = complain.other ~ condition + country, family = binomial,
## data = data.ch.i)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -1.1153 0.1707 -6.534 6.41e-11 ***
## conditionchat_incorrect_item -0.3229 0.2152 -1.500 0.1335
## countryDrinkland -0.5446 0.2175 -2.504 0.0123 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 565.67 on 599 degrees of freedom
## Residual deviance: 557.05 on 597 degrees of freedom
## AIC: 563.05
##
## Number of Fisher Scoring iterations: 4
The statistical models confirm our initial observation: slow resolution complaints are the primary driver of dissatisfaction, and of the high percentage of detractors, in chat cases attending incorrect item issues.
To identify specific problems and opportunities for improvement, we proceed to examine more granular categories within these complaints through an additional thematic analysis.
To represent the qualitative coding within the slow resolution complaints, we simulate the presence of four specific problems in them:
- Problems uploading photos (“problem.photo”)
- Delays in verifying customer information (“problem.info”)
- Failure to meet promised deadlines (“problem.deadline”)
- Unspecified slow resolution problems (“problem.unspecified”)
We simulate the coding in the steps below:
# 1) Initialize problem variables
vars <- c("problem.photo","problem.info","problem.deadline","problem.unspecified")
for(v in vars) data.ch.i[[v]] <- NA_real_
# 2) Define probability distributions for number of problems per comment
pk_other <- c(`0` = 0.10, `1` = 0.70, `2` = 0.18, `3` = 0.02) # other issues
pk_incorrect<- c(`0` = 0.05, `1` = 0.70, `2` = 0.22, `3` = 0.03) # incorrect item
# 3) Define the candidate problems and their selection weights
probs <- c("photo", "info", "deadline") # candidate problem labels
w_other <- c(photo = 0.05, info = 0.60, deadline = 0.35) # other issues
w_incorrect <- c(photo = 0.70, info = 0.20, deadline = 0.10) # incorrect item
# 4) Function: sample k items without replacement, weighted by probability
sample_weighted_no_replace <- function(items, weights, k){
if(k <= 0) return(character(0))
picked <- character(0)
w <- weights
for(i in seq_len(k)){
p <- w / sum(w)
choice <- sample(items, size = 1, prob = p)
picked <- c(picked, choice)
w[choice] <- 0 # remove the chosen item (sampling without replacement)
}
picked
}
# 5) Assign problems for cases with slow resolution complaints
idx <- which(data.ch.i$complain.sp.solve == 1)
for(i in idx){
cond <- data.ch.i$condition[i]
# Select distribution parameters based on condition
pk <- if(cond == "chat_incorrect_item") pk_incorrect else pk_other
K <- as.integer(sample(as.integer(names(pk)), size = 1, prob = pk))
w <- if(cond == "chat_incorrect_item") w_incorrect else w_other
# Default: all problem variables = 0
data.ch.i$problem.photo[i] <- 0
data.ch.i$problem.info[i] <- 0
data.ch.i$problem.deadline[i] <- 0
data.ch.i$problem.unspecified[i] <- 0
if(K == 0){
# unspecified
data.ch.i$problem.unspecified[i] <- 1
} else {
chosen <- sample_weighted_no_replace(probs, w, K)
if("photo" %in% chosen) data.ch.i$problem.photo[i] <- 1
if("info" %in% chosen) data.ch.i$problem.info[i] <- 1
if("deadline" %in% chosen) data.ch.i$problem.deadline[i] <- 1
}
}
# 6) For cases without slow resolution complaints, set all problem variables to 0
no_idx <- which(data.ch.i$complain.sp.solve == 0)
data.ch.i[no_idx, c("problem.photo","problem.info","problem.deadline","problem.unspecified")] <- 0
# Check the results
summary(data.ch.i)
## id date age gender
## 514 : 1 Min. :2024-01-01 Min. :15.00 man :296
## 570 : 1 1st Qu.:2024-02-25 1st Qu.:32.00 woman:304
## 705 : 1 Median :2024-04-21 Median :41.00
## 719 : 1 Mean :2024-04-18 Mean :42.17
## 755 : 1 3rd Qu.:2024-06-10 3rd Qu.:51.00
## 824 : 1 Max. :2024-07-31 Max. :90.00
## (Other):594
## reason method country nps.score
## cancellation : 37 chat :600 Eatland :301 Min. : 0.000
## delay :116 email : 0 Drinkland:299 1st Qu.: 2.000
## incorrect_item:300 phone : 0 Median : 6.000
## other : 81 web_form: 0 Mean : 5.478
## status : 66 3rd Qu.: 9.000
## Max. :10.000
##
## open.comment nps.segment detractor satisfaction
## Length:600 detractor:324 Min. :0.00 Min. :1.000
## Class :character passive :118 1st Qu.:0.00 1st Qu.:2.000
## Mode :character promoter :158 Median :1.00 Median :3.000
## Mean :0.54 Mean :3.151
## 3rd Qu.:1.00 3rd Qu.:4.000
## Max. :1.00 Max. :5.000
## NA's :329
## condition complain.connect complain.sp.respond
## chat_other_issues :300 Min. :0.0000 Min. :0.0000
## chat_incorrect_item:300 1st Qu.:0.0000 1st Qu.:0.0000
## Median :0.0000 Median :0.0000
## Mean :0.1617 Mean :0.1467
## 3rd Qu.:0.0000 3rd Qu.:0.0000
## Max. :1.0000 Max. :1.0000
##
## complain.sp.solve complain.repeat complain.other problem.photo
## Min. :0.000 Min. :0.00 Min. :0.00 Min. :0.0000
## 1st Qu.:0.000 1st Qu.:0.00 1st Qu.:0.00 1st Qu.:0.0000
## Median :0.000 Median :0.00 Median :0.00 Median :0.0000
## Mean :0.305 Mean :0.03 Mean :0.18 Mean :0.1917
## 3rd Qu.:1.000 3rd Qu.:0.00 3rd Qu.:0.00 3rd Qu.:0.0000
## Max. :1.000 Max. :1.00 Max. :1.00 Max. :1.0000
##
## problem.info problem.deadline problem.unspecified
## Min. :0.0000 Min. :0.00000 Min. :0.00
## 1st Qu.:0.0000 1st Qu.:0.00000 1st Qu.:0.00
## Median :0.0000 Median :0.00000 Median :0.00
## Mean :0.1217 Mean :0.07833 Mean :0.02
## 3rd Qu.:0.0000 3rd Qu.:0.00000 3rd Qu.:0.00
## Max. :1.0000 Max. :1.00000 Max. :1.00
##
We can see the data for the new variables was correctly simulated, with the following frequencies: 19% of cases complaining about problems uploading photos, 12% about slow processes for checking customer information, 8% about failure to meet promised deadlines, and 2% reporting unspecified slow resolution issues.
To investigate whether the higher prevalence of slow resolution complaints in chat cases attending incorrect item issues can be attributed to specific addressable problems, we compare the frequencies of these issues across the two conditions.
We calculated the percentage of cases mentioning each type of problem across conditions and countries using the reusable function from my R toolkit for percentages with 95% CI across multiple factors, already specified in section 5.2.
For comparability with prior analyses, we also calculated the percentage of cases without any slow resolution complaints.
# Create a variable that indicates the absence of slow resolution complaints
data.ch.i$no.complain.sp.solve <- NA_real_
data.ch.i$no.complain.sp.solve[data.ch.i$complain.sp.solve == 1] <- 0
data.ch.i$no.complain.sp.solve[data.ch.i$complain.sp.solve == 0] <- 1
# Specify the list of dependent variables as a vector
dv.problems <- c("no.complain.sp.solve", "problem.photo", "problem.info", "problem.deadline", "problem.unspecified")
# Calculate the percentages using the reusable function
prob.prop.df <- f.prop.2f(my.data = data.ch.i, my.dv.list = dv.problems, my.factor1 = "condition", my.factor2 = "country")
# See the results
prob.prop.df
To facilitate interpretation, we visualize the percentages of each problem type along with their 95% confidence intervals using bar graphs.
# Setting the order of the complaints variables
prob.prop.df$dep.variable <- factor(prob.prop.df$dep.variable, levels = c("no.complain.sp.solve", "problem.photo", "problem.info", "problem.deadline", "problem.unspecified"))
# Titles for each faceted graph representing a type of complaint
custom_titles <- c(
no.complain.sp.solve = "No Complaints",
problem.photo = "Problems Uploading Photos",
problem.info = "Delays in Verifying Info",
problem.deadline = "Unmet Deadlines",
problem.unspecified = "Unspecified Problems"
)
# Plotting the bar graph
ggplot(prob.prop.df, aes(x = factor1, y = percentage, fill = factor2)) +
geom_col(position = position_dodge(width = 0.8), width = 0.7) +
geom_errorbar(
aes(ymin = ci_lower.perc, ymax = ci_upper.perc),
position = position_dodge(width = 0.8),
width = 0.2
) +
facet_wrap(~ dep.variable,
scales = "fixed",
axes = "all",
labeller = labeller(dep.variable = custom_titles)) +
labs(
y = "% of Cases",
fill = "Country",
title = "Problems in Slow Resolution Complaints"
) +
scale_x_discrete(labels = c ("incorrect item", "other issue")) +
ylim(0, 110) + # allow space for percentage labels above bars
geom_text(
aes(label = paste0(round(percentage, 0), "%")), # label with percentage rounded
position = position_dodge(width = 0.8), # match the dodge of bars
vjust = -2, # slightly above the bar
size = 2.5 # text size
) +
theme_minimal(base_size = 12) +
theme(
axis.title.x = element_blank(),
legend.position = c(1, 0),
legend.justification = c("right", "bottom"),
legend.title = element_text(size = 10, face = "bold"),
legend.text = element_text(size = 10),
panel.spacing = unit(1, "lines"),
plot.title = element_text(hjust = 0.5, face = "bold"),
strip.text = element_text(face = "bold"),
axis.text.x = element_text(angle = 0, vjust = 0, hjust = 0.5),
plot.margin = margin(10, 5, 10, 5)
)
Cases attended by chat for incorrect item issues differ not only in having more slow resolution complaints, but also in the nature of these complaints:
Most complaints in this condition concern difficulties uploading photos, which are rare in other issues. This aligns with stakeholder expectations, as customers with incorrect items are asked to provide photos as proof, while this requirement is uncommon for other issues.
Delays in verifying customer information are also more common in this condition.
We continue with formal testing of these differences.
To formally test these differences, we fit binomial logistic regression models predicting each type of slow resolution complaint from condition (incorrect item vs. other issues), controlling for country. Given similar proportions across countries, multicollinearity is not a concern.
##-- Logistic Models for predicting different type of slow resolution complaints
# Problems loading the photo
m.problem.photo <- glm(problem.photo ~ condition + country, data = data.ch.i, family = binomial)
summary(m.problem.photo)
##
## Call:
## glm(formula = problem.photo ~ condition + country, family = binomial,
## data = data.ch.i)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -5.6430 1.0078 -5.599 2.15e-08 ***
## conditionchat_incorrect_item 5.2118 1.0087 5.167 2.38e-07 ***
## countryDrinkland -0.1184 0.2364 -0.501 0.617
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 586.36 on 599 degrees of freedom
## Residual deviance: 411.59 on 597 degrees of freedom
## AIC: 417.59
##
## Number of Fisher Scoring iterations: 8
# Problems of slow information check
m.problem.info <- glm(problem.info ~ condition + country, data = data.ch.i, family = binomial)
summary(m.problem.info)
##
## Call:
## glm(formula = problem.info ~ condition + country, family = binomial,
## data = data.ch.i)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -2.96739 0.29535 -10.047 < 2e-16 ***
## conditionchat_incorrect_item 1.51617 0.30257 5.011 5.42e-07 ***
## countryDrinkland 0.04544 0.25602 0.177 0.859
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 444.28 on 599 degrees of freedom
## Residual deviance: 413.69 on 597 degrees of freedom
## AIC: 419.69
##
## Number of Fisher Scoring iterations: 5
# Problems of unmet deadlines
m.problem.deadline <- glm(problem.deadline ~ condition + country, data = data.ch.i, family = binomial)
summary(m.problem.deadline)
##
## Call:
## glm(formula = problem.deadline ~ condition + country, family = binomial,
## data = data.ch.i)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -3.0490 0.3107 -9.813 <2e-16 ***
## conditionchat_incorrect_item 0.6184 0.3158 1.958 0.0502 .
## countryDrinkland 0.4324 0.3102 1.394 0.1633
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 329.62 on 599 degrees of freedom
## Residual deviance: 323.69 on 597 degrees of freedom
## AIC: 329.69
##
## Number of Fisher Scoring iterations: 5
# Unspecified problems
m.problem.unspecified <- glm(problem.unspecified ~ condition + country, data = data.ch.i, family = binomial)
summary(m.problem.unspecified)
##
## Call:
## glm(formula = problem.unspecified ~ condition + country, family = binomial,
## data = data.ch.i)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -5.3127 0.7709 -6.891 5.53e-12 ***
## conditionchat_incorrect_item 1.1279 0.6731 1.676 0.0938 .
## countryDrinkland 1.1346 0.6731 1.686 0.0919 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 117.65 on 599 degrees of freedom
## Residual deviance: 111.17 on 597 degrees of freedom
## AIC: 117.17
##
## Number of Fisher Scoring iterations: 7
The logistic models confirm the higher presence of photo problems and slow information check problems in chat cases attending incorrect item issues compared to other issues.
To further understand to what extent these problems explain the high proportion of detractors in chat cases attending incorrect item issues, we proceed to model these effects.
To evaluate the impact of these problems on the likelihood to recommend, we fit binomial logistic regression models predicting detractors from the specific slow resolution problems, condition, and country.
Before fitting the models, we build mosaic plots of all considered predictors to visually inspect collinearity and potential interactions. Given the number of predictors, we examine the mosaics in several steps.
## Checking interactions between problems, countries and conditions
# Problems Loading Photos
table.photos <- with(data.ch.i, table(problem.photo, condition, country, detractor))
table.photos
## , , country = Eatland, detractor = 0
##
## condition
## problem.photo chat_other_issues chat_incorrect_item
## 0 88 56
## 1 0 2
##
## , , country = Drinkland, detractor = 0
##
## condition
## problem.photo chat_other_issues chat_incorrect_item
## 0 81 47
## 1 0 2
##
## , , country = Eatland, detractor = 1
##
## condition
## problem.photo chat_other_issues chat_incorrect_item
## 0 62 35
## 1 0 58
##
## , , country = Drinkland, detractor = 1
##
## condition
## problem.photo chat_other_issues chat_incorrect_item
## 0 68 48
## 1 1 52
doubledecker(table.photos)
# Problems of slow information check
table.info <- with(data.ch.i, table(problem.info, condition, country, detractor))
table.info
## , , country = Eatland, detractor = 0
##
## condition
## problem.info chat_other_issues chat_incorrect_item
## 0 88 56
## 1 0 2
##
## , , country = Drinkland, detractor = 0
##
## condition
## problem.info chat_other_issues chat_incorrect_item
## 0 81 47
## 1 0 2
##
## , , country = Eatland, detractor = 1
##
## condition
## problem.info chat_other_issues chat_incorrect_item
## 0 56 65
## 1 6 28
##
## , , country = Drinkland, detractor = 1
##
## condition
## problem.info chat_other_issues chat_incorrect_item
## 0 60 74
## 1 9 26
doubledecker(table.info)
# Problems of failing the deadline
table.deadline <- with(data.ch.i, table(problem.deadline, condition, country, detractor))
table.deadline
## , , country = Eatland, detractor = 0
##
## condition
## problem.deadline chat_other_issues chat_incorrect_item
## 0 88 57
## 1 0 1
##
## , , country = Drinkland, detractor = 0
##
## condition
## problem.deadline chat_other_issues chat_incorrect_item
## 0 81 48
## 1 0 1
##
## , , country = Eatland, detractor = 1
##
## condition
## problem.deadline chat_other_issues chat_incorrect_item
## 0 55 82
## 1 7 11
##
## , , country = Drinkland, detractor = 1
##
## condition
## problem.deadline chat_other_issues chat_incorrect_item
## 0 59 83
## 1 10 17
doubledecker(table.deadline)
## Checking interactions between problems
table.problems <- with(data.ch.i, table(problem.photo, problem.info, problem.deadline, detractor))
table.problems
## , , problem.deadline = 0, detractor = 0
##
## problem.info
## problem.photo 0 1
## 0 270 2
## 1 1 1
##
## , , problem.deadline = 1, detractor = 0
##
## problem.info
## problem.photo 0 1
## 0 0 0
## 1 1 1
##
## , , problem.deadline = 0, detractor = 1
##
## problem.info
## problem.photo 0 1
## 0 159 27
## 1 65 28
##
## , , problem.deadline = 1, detractor = 1
##
## problem.info
## problem.photo 0 1
## 0 20 7
## 1 11 7
doubledecker(table.problems)
We do not identify relevant interactions between the factors used to predict detractors: across conditions and countries, the presence of each problem is consistently associated with most users being detractors. Moreover, this ceiling effect makes it difficult to observe whether the effects of the different problems interact with each other.
We also do not identify any issues of collinearity beyond the expected strong relation between problems loading photos and the incorrect item condition. Although we initially include the condition factor to illustrate how much of its effect is explained by the problems, it will be removed in the final model.
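This relation can be made explicit with a simple cross-tabulation (a sketch):
# Photo problems occur almost exclusively in the incorrect-item condition
with(data.ch.i, table(problem.photo, condition))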
We start by fitting a model that includes all potential predictors to evaluate their individual contributions.
# Specify the model
m.problems.sp.solve <- glm(detractor ~ problem.photo + problem.info + problem.deadline + country + condition, data = data.ch.i, family = binomial)
summary(m.problems.sp.solve)
##
## Call:
## glm(formula = detractor ~ problem.photo + problem.info + problem.deadline +
## country + condition, family = binomial, data = data.ch.i)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -0.6034 0.1575 -3.830 0.000128 ***
## problem.photo 3.4225 0.5406 6.331 2.44e-10 ***
## problem.info 2.4683 0.5490 4.496 6.93e-06 ***
## problem.deadline 2.7492 0.7485 3.673 0.000240 ***
## countryDrinkland 0.3180 0.1922 1.655 0.097987 .
## conditionchat_incorrect_item -0.1455 0.2040 -0.714 0.475449
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 827.93 on 599 degrees of freedom
## Residual deviance: 633.58 on 594 degrees of freedom
## AIC: 645.58
##
## Number of Fisher Scoring iterations: 6
The model indicates that the presence of all the expected problems is associated with a significantly higher probability of users becoming detractors.
Furthermore, once we account for these specific problems, the effect of condition is no longer significant, suggesting that these problems explain the higher presence of detractors in chat cases attending incorrect item issues.
To evaluate the effect of each problem with a more parsimonious model, we fit a model excluding the non-significant factors.
# Specify the model
m.problems.sp.solve2 <- glm(detractor ~ problem.photo + problem.info + problem.deadline, data = data.ch.i, family = "binomial")
# Obtain the results
summary(m.problems.sp.solve2)
##
## Call:
## glm(formula = detractor ~ problem.photo + problem.info + problem.deadline,
## family = "binomial", data = data.ch.i)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -0.49250 0.09938 -4.956 7.21e-07 ***
## problem.photo 3.29055 0.52339 6.287 3.24e-10 ***
## problem.info 2.39322 0.54268 4.410 1.03e-05 ***
## problem.deadline 2.75534 0.74717 3.688 0.000226 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 827.93 on 599 degrees of freedom
## Residual deviance: 636.79 on 596 degrees of freedom
## AIC: 644.79
##
## Number of Fisher Scoring iterations: 6
# Compare the models with and without the country and condition factors
anova(m.problems.sp.solve2 , m.problems.sp.solve, test = "Chisq")
The likelihood ratio test confirms that removing the country and condition variables does not significantly reduce model fit. Therefore, we keep the simplified model that excludes them.
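An AIC comparison points in the same direction (a sketch; the simpler model shows the slightly lower AIC):
# Compare the information criteria of the full and simplified models
AIC(m.problems.sp.solve, m.problems.sp.solve2)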
To interpret the model, we derive meaningful metrics using the reusable function for logistic regression already specified in section 4.2.
#####
# Design Matrix for Predictions
#####
# Extract the factor names from the model, with and without the intercept
f.names.interc <- names(coef(m.problems.sp.solve2)) #specify the new model here
f.names <- f.names.interc[-1]
# Create a matrix with 1s on the diagonal (each case represents one factor set to 1)
my.matrix.df <- as.data.frame(diag(length(f.names.interc)))
my.matrix.df[, 1] <- f.names.interc
names(my.matrix.df) <- c("case", f.names)
#####
# Calculation of the Metrics (probability predictions, risk ratios, odds ratios)
#####
# The function has already been specified in section 4.2
# Apply the function to the final binomial model
m.problems.sp.solve2.metrics.df <- f.metrics.binomial(m.problems.sp.solve2, my.matrix.df)
## Waiting for profiling to be done...
# Display the resulting metrics
m.problems.sp.solve2.metrics.df
We can see that the presence of each of the slow resolution problems is strongly associated with a higher likelihood of detractors. This relationship can be better visualized with a dot-and-whisker plot, where each dot represents the expected percentage of detractors and the whiskers represent the 95% confidence intervals.
# Add descriptive labels for each factor
m.problems.sp.solve2.metrics.df$Label <- NA_character_
m.problems.sp.solve2.metrics.df$Label[m.problems.sp.solve2.metrics.df$Term == "(Intercept)"] <-
"No Specific Problems"
m.problems.sp.solve2.metrics.df$Label[m.problems.sp.solve2.metrics.df$Term == "problem.photo"] <- "Problems Loading Photos"
m.problems.sp.solve2.metrics.df$Label[m.problems.sp.solve2.metrics.df$Term == "problem.info"] <- "Problems Checking Information"
m.problems.sp.solve2.metrics.df$Label[m.problems.sp.solve2.metrics.df$Term == "problem.deadline"] <- "Problems of Failing Deadlines"
# Create the dot-and-whisker plot for the predictions
ggplot(m.problems.sp.solve2.metrics.df, aes(x = Prediction,
y = reorder(Label, Prediction))) +
geom_vline(xintercept = 0, linetype = "dashed", color = "darkgrey") +
geom_errorbarh(aes(xmin = Lower_Prediction, xmax = Upper_Prediction), height = 0.2) +
geom_point(size = 3) +
geom_text(aes(label = paste(round(Prediction,1), "%")),
vjust = -1, size = 3.5) +
labs(
title = "Expected Detractors by Slow Resolution Problems",
x = "% of Detractors",
y = "Stated Problems"
) +
theme_minimal() +
theme(plot.title = element_text(hjust = 0.5, face = "bold"))
## Warning: `geom_errorbarh()` was deprecated in ggplot2 4.0.0.
## ℹ Please use the `orientation` argument of `geom_errorbar()` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
## `height` was translated to `width`.
We can see that when the slow resolution problems are present, nearly all users are expected to become detractors, and this effect is similar across the three problems. In contrast, in the absence of these problems, only around 37% of users are predicted to be detractors.
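The baseline figure can be recovered directly from the intercept of the final model (a sketch):
# Inverse-logit of the intercept: expected proportion of detractors with no problems
plogis(coef(m.problems.sp.solve2)[["(Intercept)"]]) # ~0.379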
We can further investigate the impact of correcting these problems.
To predict the potential reduction in detractors from correcting these problems, we need to consider two sources of uncertainty:
- The uncertainty of the estimated model coefficients.
- The sampling variability of the cases in each subgroup.
Given that these uncertainties may not follow normal distributions, we will use bootstrapping to estimate the variability of the predicted impact.
We define a reusable function to estimate the uncertainty of predicted outcomes for any scenario. The function simulates sets of coefficients from the multivariate normal distribution implied by the fitted model, draws bootstrap samples within each subgroup, and averages the case-level predicted probabilities to build a sampling distribution for each group.
f.bootstrap.predictions <- function (my.n, my.model, my.data, my.factor){
# --- Setup
# Make sure library MASS is installed and loaded
if (!require(MASS, quietly = TRUE)) {
install.packages("MASS")
library(MASS)
}
# Create a list to store the final results and the summary of the results
final.results <- list()
summary.list <- list()
pt <- 1
# Create subsets for subgroups
groups <- unique(my.data [[my.factor]])
# Extract coefficients from the original model
original.coefficients <- coef(my.model)
# Extract the names of the predictors in the original model
predictor.names <- names(original.coefficients)[-1]
# --- 1 Generate Simulated Coefficients
# Extract the covariance matrix
matriz_vcov <- vcov(my.model)
# Generate a matrix where each line is a set of simulated coefficients
simulated_coefficients <- mvrnorm(n = my.n,
mu = original.coefficients,
Sigma = matriz_vcov)
simulated_coefficients.df <- as.data.frame (simulated_coefficients)
# --- 2. Calculate Probabilities with Bootstrap Samples
# - A loop for each subgroup
for (group in groups) {
#Create an empty vector for the sampling distribution
sampling.distribution <- rep(NA, my.n)
# Select the subsample for the subgroup
subsample <- my.data[my.data [[my.factor]] == group, ]
# - A loop for each sample we simulate
for (i in 1:my.n) {
# Select the simulated coefficients for the sample
sample.coefficients <- simulated_coefficients.df [i , ]
# Select a bootstrap sample from our sample
bootstrap.indexes <- sample(x = 1:nrow(subsample),
size = nrow(subsample),
replace = TRUE)
bootstrap.sample <- subsample [bootstrap.indexes, ]
# Create a vector to keep the results of the sample cases
prob.success <- numeric(nrow(bootstrap.sample))
# A loop for each case in the bootstrapped sample
for (c in 1:nrow(bootstrap.sample)){
# Empty vector to keep the effect for the presence of each predictor
predictor.effects.vector <- numeric(length(predictor.names))
# A loop for each predictor
for (p in 1:length(predictor.names)) {
# Extract the coefficient and the data value for this predictor
name <- predictor.names[p] # save the name of the predictor
coef_value <- as.numeric(sample.coefficients[[name]])
data_value <- as.numeric(bootstrap.sample[[name]][c])
predictor.effects.vector[p] <- coef_value * data_value # multiply coefficient by predictor value
} # Close the loop each predictor
# Extract the remaining values of the predictive model
intercept <- as.numeric (sample.coefficients[1])
predictors.effect <- sum (predictor.effects.vector)
# Calculate the probability of each case
prob.success [c] <- 1 / (1 + exp (-1 * (intercept + predictors.effect)))
# Debug check
if (is.na(prob.success[c])) {
print(paste("NA in case ", c, ", sample ", i, "subgroup ", group))
}
} # Close the loop for each case in the bootstrapped sample
# Verification of NAs in the estimation of sample cases
if (any(is.na(prob.success))) {
print(paste("ADVERTENCIA: NAs en prob.success para iteración", i))
print(summary(prob.success))
}
# Calculate the average probability of success in the bootstrapped sample
proportion.sample <- mean(prob.success, na.rm = TRUE)
# Store the sample probability average in a vector for the sampling distribution
sampling.distribution [i] <- proportion.sample
} #Close the loop for each sample
# Save results of all the samples
estimate.probability <- median (sampling.distribution, na.rm = TRUE)
lower.ci.estimate <- quantile(sampling.distribution, probs = 0.025, na.rm = TRUE)
upper.ci.estimate <- quantile(sampling.distribution, probs = 0.975, na.rm = TRUE)
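# (Percentile method: the 2.5% and 97.5% quantiles of the bootstrap sampling distribution form the 95% CI)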
# Create a name for the sublist of results for each subgroup
name.group.results <- paste0(group, "results")
# Store the results
final.results [[name.group.results]] <- list(
estimate = estimate.probability,
lower.ci = lower.ci.estimate,
upper.ci = upper.ci.estimate,
bootstrap.sampling.dist = sampling.distribution
)
# Create a one-row data frame to build a summary of results
summary.list [[pt]] <- data.frame(group.factor = paste0 (group),
estimate = estimate.probability,
lower.ci = lower.ci.estimate,
upper.ci = upper.ci.estimate,
stringsAsFactors = FALSE)
pt <- pt + 1
} #Close the loop for each subgroup
# Add the summary to the final results as a dataframe
summary.df <- do.call (rbind, summary.list)
rownames(summary.df) <- NULL
final.results [["summary.df"]] <- summary.df
# Return the final results
return (final.results)
}
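A note on performance: the nested per-case and per-predictor loops above are written for readability, but they are slow when my.n = 5000. As a sketch (under the assumption that all predictors are numeric 0/1 columns), the two inner loops could be replaced by a single matrix product:
# Sketch only, not part of the function above
X <- as.matrix(bootstrap.sample[, predictor.names]) # cases x predictors
beta <- unlist(sample.coefficients[predictor.names]) # simulated slopes for this draw
eta <- unlist(sample.coefficients[1]) + X %*% beta # linear predictor for every case at once
prob.success <- as.numeric(plogis(eta)) # same as 1 / (1 + exp(-eta))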
Now we are ready to apply the bootstrapping function to scenarios where specific problems are corrected. We start with photo issues, as they are the most common slow resolution problem.
We compare the current probability of being a detractor with the expected probability if photo issues were fully corrected.
#####
# 1. Obtaining estimations for the present situation
#####
# Apply the bootstrapping function to the present data
sim.present <- f.bootstrap.predictions (my.n = 5000,
my.model = m.problems.sp.solve2,
my.data = data.ch.i,
my.factor = "condition")
# Extract a dataframe indicating the present moment
present.df <- data.frame(moment = rep("present", nrow(sim.present$summary.df)))
present.df <- cbind(present.df, sim.present$summary.df)
#####
# 2. Obtaining estimations for the corrected scenario
#####
# Represent the problem photo as totally eliminated
data.corrected <- data.ch.i
data.corrected$problem.photo <- 0
# Apply the bootstrapping function
sim.photo.corrected <- f.bootstrap.predictions (my.n = 5000,
my.model = m.problems.sp.solve2,
my.data = data.corrected,
my.factor = "condition")
# Extract a dataframe indicating the moment
photo.corrected.df <- data.frame(moment = rep("expected if photo issues are fixed", nrow(sim.photo.corrected$summary.df)))
photo.corrected.df <- cbind(photo.corrected.df, sim.photo.corrected$summary.df)
#####
# 3. Save the results in a dataframe
#####
# Bind the results for the present and the photo correction simulations
impact.photo.correction.df <- rbind(present.df, photo.corrected.df)
# Express in percentages
impact.photo.correction.df [, c(3:5)] <- impact.photo.correction.df [, c(3:5)] * 100
impact.photo.correction.df
To better illustrate the comparison between the present and the corrected scenario, we can use a bar graph.
# Setting the order of the complaints variables
impact.photo.correction.df$group.factor <- factor(impact.photo.correction.df$group.factor, levels = c("chat_incorrect_item", "chat_other_issues"))
impact.photo.correction.df$moment <- factor(impact.photo.correction.df$moment, levels = c("present", "expected if photo issues are fixed"))
# Titles for each faceted graph representing a type of complaint
custom_titles <- c(
chat_incorrect_item = "Incorrect Item Issues",
chat_other_issues = "Other Issues"
)
# Plotting the bar graph
ggplot(impact.photo.correction.df, aes(x = moment, y = estimate)) +
geom_col(position = position_dodge(width = 0.8), width = 0.7, fill = "grey80") +
geom_errorbar(
aes(ymin = lower.ci, ymax = upper.ci),
position = position_dodge(width = 0.8),
width = 0.2
) +
facet_wrap(~ group.factor,
scales = "fixed",
axes = "all",
labeller = labeller(group.factor = custom_titles)) +
labs(
y = "% of Detractors",
title = "Impact of Correcting Photo Issues in Chat"
) +
scale_x_discrete(labels = c ("Current", "After\nCorrection")) +
ylim(0, 100) +
geom_text(
aes(label = paste0(round(estimate, 0), "%")), # label with percentage rounded
position = position_dodge(width = 0.8), # match the dodge of bars
vjust = -3, # slightly above the bar
size = 3.5 # text size
) +
theme_minimal(base_size = 12) +
theme(
axis.title.x = element_blank(),
panel.spacing = unit(1, "lines"),
plot.title = element_text(hjust = 0.5, face = "bold"),
strip.text = element_text(face = "bold", size = 11), # the panel titles
axis.text.x = element_text(angle = 0, vjust = 0, hjust = 0.5, size = 11),
plot.margin = margin(10, 5, 10, 5)
)
The impact of correcting photo issues in the chat can be substantial for cases involving incorrect item issues, with an expected reduction in the percentage of detractors from 65% to 51%.
Yet, to estimate the global impact, it is useful to evaluate the reduction in detractors across all customer service cases.
To calculate the overall impact on detractor reduction, we need to weight the results above by the percentage of customer service cases that each scenario represents. In line with Part 1, we will assume that:
- Incorrect item issues attended by chat represent 10% of all customer service cases.
- Other issues attended by chat represent 30% of all customer service cases.
# --- 1. Extract the sampling distributions simulated for each scenario
# Incorrect item issues for the present moment
dist.present.ii <- sim.present[["chat_incorrect_itemresults"]][["bootstrap.sampling.dist"]]
# Incorrect item issues after the correction of the issue
dist.photo.corrected.ii <- sim.photo.corrected[["chat_incorrect_itemresults"]][["bootstrap.sampling.dist"]]
# Other issues for the present moment
dist.present.oi <- sim.present[["chat_other_issuesresults"]][["bootstrap.sampling.dist"]]
# Other issues after the correction of the issue
dist.photo.corrected.oi <- sim.photo.corrected[["chat_other_issuesresults"]][["bootstrap.sampling.dist"]]
# --- 2. Define a function to calculate differences between two distributions
f.distribution.differences <- function(my.vector1, my.vector2) {
# Obtain the difference between the two distributions
distribution.differences <- my.vector1 - my.vector2
# Define the distribution of differences with metrics
dif.estimate <- median (distribution.differences, na.rm = TRUE) *100
lower.ci.estimate <- quantile(distribution.differences, probs = 0.025, na.rm = TRUE) *100
upper.ci.estimate <- quantile(distribution.differences, probs = 0.975, na.rm = TRUE) *100
# Save the results in a dataframe
results.df <- data.frame (dif = dif.estimate, lower.ci = lower.ci.estimate, upper.ci = upper.ci.estimate)
rownames(results.df) <- NULL
return(results.df)
}
# --- Calculate the distribution of differences
# for incorrect item issues
dist.differences.ii <- f.distribution.differences(dist.present.ii, dist.photo.corrected.ii)
# for other issues
dist.differences.oi <- f.distribution.differences(dist.present.oi, dist.photo.corrected.oi)
# total (including uncertainty for other issues)
dist.differences.total.inc <- f.distribution.differences ((dist.present.ii * 0.10 + dist.present.oi * 0.30), (dist.photo.corrected.ii * 0.10 + dist.photo.corrected.oi *0.30))
# Since the reduction of detractors for other issues was not significant, we also calculate the total without this factor, whose variability can add unnecessary error to the predictions
# for the total - excluding uncertainty from other issues
dist.differences.total.ex <- f.distribution.differences ((dist.present.ii * 0.10), (dist.photo.corrected.ii * 0.10))
# --- Bind the results in a single data frame
results.dist.differences <- data.frame (condition = c("within incorrect item issues attended by chat","within other issues attended by chat", "within all cases - including variability from non-significant factors", "within all cases - excluding variability from non-significant factors"))
merged.results <- bind_rows(dist.differences.ii, dist.differences.oi, dist.differences.total.inc, dist.differences.total.ex)
results.dist.differences <- cbind (results.dist.differences, merged.results)
results.dist.differences
The results indicate that we would expect about 1.40 percentage points fewer detractors across all customer service cases after the correction of the issue.
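As a quick sanity check, the rounded point estimates from the bar chart above imply roughly the same figure:
# Sanity check with rounded point estimates (sketch)
# chat incorrect item cases are 10% of all cases and drop from ~65% to ~51% detractors
0.10 * (65 - 51) # ~1.4 percentage points across all customer service cases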
Similarly to the procedure applied for correcting photo issues, we can calculate the expected impact of addressing other types of problems and compare them. Due to space constraints, we simulate how this comparison could look.
# Extract the results for the impact of correcting photo issues
impact.across.corrections <- results.dist.differences [ 4, ]
impact.across.corrections [1,1] <- "Issues loading photos in the chat"
#Introduce made-up results to display
impact.across.corrections [2, ] <- c ("Lag issues in the checking information tool", 1.23, 0.85, 1.55)
impact.across.corrections [3, ] <- c ("Connectivity issues in Drinkland", 0.84, 0.35, 1.35)
impact.across.corrections [4, ] <- c ("Slow email responses", 0.60, 0.22, 1.01)
#Convert values to numeric
impact.across.corrections [, 2:4] <- apply(impact.across.corrections[ ,2:4], 2, as.numeric)
# Visualize the results
rownames(impact.across.corrections) <- NULL
impact.across.corrections
We can also visualize these results in a dot-and-whisker plot:
# Create the dot-and-whisker plot
ggplot(impact.across.corrections, aes(x = dif,
y = reorder(condition, dif))) +
geom_vline(xintercept = 0, linetype = "dashed", color = "darkgrey") +
geom_errorbarh(aes(xmin = lower.ci, xmax = upper.ci), height = 0.2) +
geom_point(size = 3) +
geom_text(aes(label = paste(round(dif, 1), "%")),
vjust = -1, size = 3.5) +
labs(
title = "Expected Impact of Correcting Each Problem",
x = "% of Less Detractors",
y = "Problems"
) +
theme_minimal() +
theme(
plot.title = element_text(face = "bold", hjust = 0.5, margin = margin(b = 40)),
plot.title.position = "plot", # To center the title on the whole area
axis.title.y = element_text(angle = 90, size = 12, face = "bold", margin = margin(r = 20)),
axis.title.x = element_text(angle = 0, size = 12, face = "bold", margin = margin(t = 20))
)
## `height` was translated to `width`.
The comparison shows that correcting photo issues has the largest expected impact on reducing detractors, which supports prioritizing this correction.
However, it is important to note that simulations of future scenarios are inherently uncertain, so they should be followed up with analyses once the corrections are applied.
As an example, we illustrate a follow-up of the effect of correcting the photo loading issue in the chat, assuming it was corrected on October 1st.
To the data we already had for the first semester, we add simulated data for the second semester. The simulation is similar to the one in section 4.1, but applies a progressive reduction in detractors after October 1st to represent the effect of correcting the photo issues.
#####
# Simulation of data for the second semester
#####
# Define dataset size
data.semester2.n <- 40000 # Large enough to allow subgroup analysis despite unbalanced category probabilities
# Initialize dataframe with IDs
data.semester2 <- data.frame(id = factor((data.n + 1):(data.n + data.semester2.n)))
# Simulate the date of the evaluation
data.semester2$date <- sample(seq(from = as.Date("2024-08-01"), to = as.Date("2024-12-31"), by = "day"),
size = data.semester2.n,
replace = TRUE)
# ------ 1. Demographic Information and Contextual Variables
# Simulate age with a log-normal distribution, capped between 15 and 90
data.semester2$age <- round(rlnorm(n = data.semester2.n, meanlog = log(40), sdlog = log(1.4)))
data.semester2$age[data.semester2$age > 90] <- 90
data.semester2$age[data.semester2$age < 15] <- 15
# Simulate gender
data.semester2$gender <- factor(sample(c("man", "woman"),
prob = c(0.5, 0.5),
replace = TRUE, size = data.semester2.n))
# Simulate reason for contacting
data.semester2$reason <- factor(sample(c("incorrect_item", "delay", "status", "cancellation", "other"),
prob = c(0.20, 0.30, 0.20, 0.10, 0.20),
replace = TRUE, size = data.semester2.n))
# Simulate contact method
data.semester2$method <- factor(sample(c("chat", "web_form", "email", "phone"),
prob = c(0.40, 0.10, 0.15, 0.35),
replace = TRUE, size = data.semester2.n))
# Simulate country (two fictional countries for anonymity)
data.semester2$country <- factor(sample(c("Drinkland", "Eatland"),
prob = c(0.5, 0.5),
replace = TRUE, size = data.semester2.n))
data.semester2$country <- factor(data.semester2$country, levels = c("Eatland", "Drinkland")) # match the factor levels used in the first semester
# ------ 2. Customer Complaints
# These variables represent underlying complaints that were not available at this stage of the investigation. We use them only to simulate NPS and satisfaction scores and store them in a separate dataframe to reflect their unavailability.
## Create a dataset to store the complaints in connection to the case id
data.complaints2 <- data.frame(id = data.semester2$id)
## Complaints about connection issues
# Base probability of a connection complaint: 10%
prob_connection <- rep(0.10, data.semester2.n)
# In Drinkland, probability is doubled
prob_connection[data.semester2$country == "Drinkland"] <- prob_connection[data.semester2$country == "Drinkland"] * 2
# Ensure probability values remain within [0, 1]
prob_connection [prob_connection > 1] <- 1
prob_connection [prob_connection < 0] <- 0
# Generate binary variable based on probabilities
data.complaints2$complain.connect <- rbinom(n = data.semester2.n, size = 1, prob = prob_connection)
## Complaints about slow response
# Base probability of a slow response complaint: 5%
prob_slow_response <- rep(0.05, data.semester2.n)
# Email contacts: 6x higher likelihood
prob_slow_response[data.semester2$method == "email"] <- prob_slow_response[data.semester2$method == "email"] * 6
# Chat and web_form: 3x higher likelihood
prob_slow_response[data.semester2$method %in% c("chat", "web_form")] <- prob_slow_response[data.semester2$method %in% c("chat", "web_form")] * 3
# Ensure probability values remain within [0, 1]
prob_slow_response [prob_slow_response > 1] <- 1
prob_slow_response [prob_slow_response < 0] <- 0
# Generate binary variable
data.complaints2$complain.sp.respond <- rbinom(n = data.semester2.n, size = 1, prob = prob_slow_response)
#### SPECIFIC CHANGES FOR SLOW RESOLUTION COMPLAINTS ####
# This part differs from the simulation of the first semester
## --- Complaints about slow resolution
# Specify the base probability of a slow resolution complaint: 10%
prob_slow_resolution <- rep(0.10, data.semester2.n)
# Specify the multiplying factor applied when contact is via chat and the reason is incorrect_item, before the correction day
multiplying_factor <- 5
# Specify the day the correction is implemented
day.correction <- as.Date("2024-10-1")
# Specify the duration of the changing period
days.change <- 25
## Apply a correction factor to the probability so that:
# its value is 1 before the correction date (leaving the elevated probability in place)
# it progressively grows to the multiplying factor over the changing period
# it keeps that value afterwards, so the probability returns to the base level established earlier (.10)
# Create a vector that indicates the days passed after the change was implemented
days.passed <- as.numeric(data.semester2$date - day.correction)
days.passed [days.passed <= 0] <- 0
# Specify the values of the correction factors
correction_factor <- 1 + (days.passed * (multiplying_factor-1))/ days.change
correction_factor [correction_factor > multiplying_factor] <- multiplying_factor
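# Worked example (with multiplying_factor = 5, days.change = 25, base probability 0.10):
# before October 1st: days.passed = 0 -> correction_factor = 1 -> prob = 0.10 * (5/1) = 0.50
# 12 days after: correction_factor = 1 + 12 * 4/25 = 2.92 -> prob = 0.10 * (5/2.92) = 0.17
# 25+ days after: correction_factor = 5 -> prob = 0.10 * (5/5) = 0.10 (base level)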
# Select incorrect item cases attended by chat
is_chat_incorrect <- data.semester2$method == "chat" & data.semester2$reason == "incorrect_item"
# Modify the probability according to the expected parameters
prob_slow_resolution[is_chat_incorrect] <- prob_slow_resolution[is_chat_incorrect] * (multiplying_factor / correction_factor[is_chat_incorrect])
# Ensure probability values remain within [0, 1]
prob_slow_resolution [prob_slow_resolution > 1] <- 1
prob_slow_resolution [prob_slow_resolution < 0] <- 0
# Generate the binary variable for slow resolution complaints
data.complaints2$complain.sp.solve <- rbinom(n = data.semester2.n, size = 1, prob = prob_slow_resolution)
####### END OF SPECIFIC CHANGES FOR SLOW RESOLUTION COMPLAINTS ####
# From here, the simulation is similar to the first semester.
## Complaints about having to repeat information
# Base probability of a repetition complaint: 2%
prob_repeat_info <- rep(0.02, data.semester2.n)
# If contact reason is cancellation → 10x higher likelihood
prob_repeat_info[data.semester2$reason == "cancellation"] <- prob_repeat_info[data.semester2$reason == "cancellation"] * 10
# Ensure probability values remain within [0, 1]
prob_repeat_info [prob_repeat_info > 1] <- 1
prob_repeat_info [prob_repeat_info < 0] <- 0
# Generate binary variable
data.complaints2$complain.repeat <- rbinom(n = data.semester2.n, size = 1, prob = prob_repeat_info)
## Other types of complaints (uniform distribution)
# Constant probability of 20% for other unspecified complaints
data.complaints2$complain.other <- rbinom(n = data.semester2.n, size = 1, prob = 0.20)
# ------ 3. NPS scores
# Simulate base NPS scores assuming no complaints (mean = 9, sd = 2)
data.semester2$nps.score <- rnorm(data.semester2.n, mean = 9, sd = 2)
# Apply fixed penalties based on complaints
# -3 for connection issues
# -4 for slow response
# -6 for slow resolution
# -2 for repeated information
# -1 for other complaints
data.semester2$nps.score <- data.semester2$nps.score -
3 * data.complaints2$complain.connect -
4 * data.complaints2$complain.sp.respond -
6 * data.complaints2$complain.sp.solve -
2 * data.complaints2$complain.repeat -
1 * data.complaints2$complain.other
# Ensure scores stay within the 0–10 range
data.semester2$nps.score[data.semester2$nps.score < 0] <- 0
data.semester2$nps.score[data.semester2$nps.score > 10] <- 10
# Convert to integer values (using floor)
data.semester2$nps.score <- floor(data.semester2$nps.score)
# Add placeholder for open-text comments
data.semester2$open.comment <- rep("bla bla", data.semester2.n)
## Create NPS segments based on score values
data.semester2$nps.segment <- rep(NA_character_, data.semester2.n) # Initialize empty character variable
data.semester2$nps.segment[data.semester2$nps.score <= 6] <- "detractor"
data.semester2$nps.segment[data.semester2$nps.score > 6 & data.semester2$nps.score < 9] <- "passive"
data.semester2$nps.segment[data.semester2$nps.score >= 9] <- "promoter" # Assign segment based on score range
data.semester2$nps.segment <- factor(data.semester2$nps.segment, levels = c("detractor", "passive", "promoter")) # Convert to factor
## Create a binary variable for detractors
data.semester2$detractor <- NA_real_
data.semester2$detractor[data.semester2$nps.segment == "detractor"] <- 1
data.semester2$detractor[data.semester2$nps.segment != "detractor"] <- 0
#------ Satisfaction scores (1 to 5 scale)
# Generate satisfaction scores based on NPS
# We use a normal distribution centered at 3, with some variability (sd = 0.5)
# and add a scaled component based on NPS to simulate a positive relationship.
data.semester2$satisfaction <- floor(
rnorm(data.semester2.n, mean = 3, sd = 0.5) +
0.9 * (1 + as.numeric(scale(data.semester2$nps.score)))
)
# Ensure scores stay within the 1–5 range
data.semester2$satisfaction[data.semester2$satisfaction < 1] <- 1
data.semester2$satisfaction[data.semester2$satisfaction > 5] <- 5
# Check correlation with NPS scores (Spearman method)
with(data.semester2, cor(nps.score, satisfaction, method = "spearman"))
## [1] 0.8270852
# Set satisfaction to missing for Drinkland
data.semester2$satisfaction[data.semester2$country == "Drinkland"] <- NA_real_
# Set satisfaction to missing for email and web_form contacts
data.semester2$satisfaction[data.semester2$method %in% c("email", "web_form")] <- NA_real_
# Introduce 10% additional random missingness among remaining valid values
na_randomizer <- sample(
c(0, 1),
size = sum(!is.na(data.semester2$satisfaction)),
replace = TRUE,
prob = c(0.1, 0.9)
) # Create a random vector (0 = missing, 1 = keep) for non-NA entries
data.semester2$satisfaction[which(!is.na(data.semester2$satisfaction))[na_randomizer == 0]] <- NA_real_ # Apply randomizer
#####
# Saving the data for the whole year
#####
# Merge the databases for the 2 semesters
data.2024 <- bind_rows(data, data.semester2)
#Check the results for the whole dataset
summary(data.2024)
## id date age gender
## 1 : 1 Min. :2024-01-01 Min. :15.0 man :40145
## 2 : 1 1st Qu.:2024-04-16 1st Qu.:32.0 woman:39855
## 3 : 1 Median :2024-07-31 Median :40.0
## 4 : 1 Mean :2024-07-16 Mean :42.2
## 5 : 1 3rd Qu.:2024-10-15 3rd Qu.:50.0
## 6 : 1 Max. :2024-12-31 Max. :90.0
## (Other):79994
## reason method country nps.score
## cancellation : 8016 chat :31883 Eatland :39902 Min. : 0.00
## delay :23972 email :12160 Drinkland:40098 1st Qu.: 4.00
## incorrect_item:16108 phone :28031 Median : 7.00
## other :15939 web_form: 7926 Mean : 6.42
## status :15965 3rd Qu.: 9.00
## Max. :10.00
##
## open.comment nps.segment detractor satisfaction
## Length:80000 detractor:35397 Min. :0.0000 Min. :1.000
## Class :character passive :20857 1st Qu.:0.0000 1st Qu.:3.000
## Mode :character promoter :23746 Median :0.0000 Median :4.000
## Mean :0.4425 Mean :3.465
## 3rd Qu.:1.0000 3rd Qu.:4.000
## Max. :1.0000 Max. :5.000
## NA's :53080
The data is simulated as expected. We can see a slightly lower percentage of detractors for the whole year compared to the first semester alone. Let's analyze this difference in more detail.
As a first exploration of the change in detractors after correcting the photo issue, we calculate the percentage of detractors across the weeks of the year. For this purpose, we use a reusable function from my R toolkit that computes percentages with 95% CIs across the levels of a factor.
# --- 1. Specify the function
f.prop.dicdvl.1f <- function(my.data, my.dv.list, my.factor){
# Initialize an empty list to store the sampled subsets from each stratum.
results <- list()
# Initialize a counter for indexing elements in the 'results' list.
p <- 1
# Obtain the levels of the factor
f.levels <- unique (my.data[[my.factor]])
#Create a loop for each dv and for each level of the factors
for (dv in my.dv.list){
for (lf in f.levels){
# Extract the vector, successes and number of cases
current.vector <- my.data[my.data[[my.factor]]==lf, dv]
current.vector.c <- as.numeric(current.vector[!is.na(current.vector)]) #cleaning missing cases
n <- length(current.vector.c) # number of cases in the vector
s <- length(current.vector.c[current.vector.c == 1]) # number of cases satisfying the condition
# Calculate proportions and their 95% CI
if(n > 0){
proportion <- s / n #calculating proportions
binomial.test <- binom.test (s, n) # binomial test to calculate 95% CI of the proportions
ci_lower <- binomial.test$conf.int[1] #extracting the lower limit of the CI
ci_upper <- binomial.test$conf.int[2] #extracting the upper limit of the CI
}else{
proportion <- NA
ci_lower <- NA
ci_upper <- NA
}
# Save the results in a one-row data frame
results [[p]] <- data.frame(dep.variable = dv, factor = lf, percentage = proportion * 100,
ci_lower.perc = ci_lower * 100, ci_upper.perc = ci_upper * 100, n.cases = n, stringsAsFactors = FALSE)
p <- p + 1
} # close the loop for the factor
} # close the loop for the dv
# Combine the results into a single dataframe
results.df <- (do.call(rbind, results))
rownames(results.df) <- NULL
#Return the dataframe
return(results.df)
}
# --- 2. Apply the function to a factor that specifies the weeks
# Order the data by date to make the weeks easier to recognize
data.2024 <- data.2024[order(data.2024$date), ]
# Group dates in weeks
data.2024$date.week <- as.Date(cut(data.2024$date, "week"))
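# Note: cut(dates, "week") maps each date to the Monday starting its week (e.g., 2024-10-03 -> 2024-09-30)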
# Apply the function to calculate percentages
dv.list = c("detractor")
prop.detractors.by.week <- f.prop.dicdvl.1f (my.data = data.2024,
my.dv.list = dv.list,
my.factor = "date.week")
# Convert the week identifiers back to Date class
prop.detractors.by.week$factor <- as.Date(prop.detractors.by.week$factor)
# Visualize the results
prop.detractors.by.week
We can see that after October 1st there is a downward trend in detractors. This trend becomes clearer when visualized in a line graph that shows the weekly percentages with their 95% confidence intervals, together with a marker of the correction date.
# Mark the correction on the plot (placed at 2024-09-30, the Monday of the correction week, to align with the weekly bins)
date_correction <- as.Date("2024-09-30")
# Create the line graph
ggplot(prop.detractors.by.week, aes(x = factor, y = percentage)) +
geom_line(color = "steelblue", linewidth = 1.2) +
geom_point(aes(), color = "darkblue", alpha = 0.6) +
geom_ribbon(aes(ymin = ci_lower.perc, ymax = ci_upper.perc),
alpha = 0.2, fill = "steelblue") +
# Vertical line to indicate the correction
geom_vline(xintercept = date_correction,
color = "red", linewidth = 1, linetype = "dashed") +
# Annotation of the correction
annotate("text", x = date_correction, y = max(prop.detractors.by.week$percentage, na.rm = TRUE),
label = "Correction of\nphoto issues",
hjust = -0.1, vjust = 1.2, color = "red", fontface = "bold", size = 4) +
scale_x_date(
date_breaks = "1 month",
date_labels = "%b %Y",
expand = expansion(mult = c(0.05, 0.15)) # extra space on the right for the annotation
) +
scale_y_continuous(labels = scales::percent_format(scale = 1)) +
labs(
title = "Weekly Evolution in the Percentage of Detractors",
x = "Week",
y = "Porcentage of Detractors"
) +
theme_minimal() +
theme(axis.text.x = element_text(angle = 45, hjust = 1))
The graph shows that the decrease in detractors after correcting the photo issue is maintained throughout the rest of the period, supporting the claim that this correction affected the likelihood of customers recommending our customer service.
To further validate this claim, we can test its statistical significance.
For this test, we compare the proportion of detractors before and after October 1st, the date of the correction. We first summarize the percentages and their confidence intervals for each period, and then formally evaluate the change with a Chi-square test.
# Create a dichotomous variable for before and after the correction
data.2024$moment.photo.correction <- NA_character_
data.2024$moment.photo.correction [data.2024$date < "2024-10-1"] <- "before"
data.2024$moment.photo.correction [data.2024$date >= "2024-10-1"] <- "after"
data.2024$moment.photo.correction <- factor (data.2024$moment.photo.correction)
# Apply the function to calculate percentages, using detractors as our dv and the moment of the correction as a factor:
dv.list = c("detractor")
prop.detractors.after.photo.cor <- f.prop.dicdvl.1f (my.data = data.2024,
my.dv.list = dv.list,
my.factor = "moment.photo.correction")
# Visualize the results
prop.detractors.after.photo.cor
# Calculate the exact difference in the proportion of detractors
prop.difference.before.after <- prop.detractors.after.photo.cor [1,3] - prop.detractors.after.photo.cor [2,3]
prop.difference.before.after
## [1] 2.442318
# Run a chi-square test to assess the significance of the change
chisq.test(with(data.2024, table(detractor, moment.photo.correction)))
##
## Pearson's Chi-squared test with Yates' continuity correction
##
## data: with(data.2024, table(detractor, moment.photo.correction))
## X-squared = 40.369, df = 1, p-value = 2.103e-10
The results show that the decrease in detractors after correcting the photo issue was statistically significant. Interestingly, the observed reduction (about 2.4 percentage points) was even stronger than our prediction (about 1.4 points).
A plausible explanation is that our earlier prediction was based only on customers who explicitly reported the problem. In practice, however, the correction may have also benefited other customers indirectly affected by the issue, amplifying its positive impact. It is also important to consider that, since we do not have an experimental design, we cannot rule out that other improvements contributed to the reduction of detractors.
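As a complementary check (a sketch, not part of the original analysis), a two-sample proportion test on the same 2x2 table would give an equivalent result plus a 95% confidence interval for the before/after difference:
# Sketch: two-proportion test with a 95% CI for the difference in proportions
tab <- with(data.2024, table(moment.photo.correction, detractor))
prop.test(x = tab[, "1"], n = rowSums(tab))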
Our analysis shows that the correction of the photo issue in chat was followed by a statistically significant 2.4 percentage-point reduction in detractors. This provides managers with a clear, high-ROI business case: targeted investments in resolving specific operational failures can generate significant improvements in customer loyalty and brand reputation.
It is important to recognize some limitations. First, the analyses are based on observational data, so other contextual factors (e.g., seasonality, parallel process improvements) might also contribute to the observed change. Second, some of our predictions were based on self-reported cases, which might underestimate the total number of customers affected. Finally, while we focused on detractors, other loyalty metrics (e.g., promoters, NPS distribution) could provide a more complete picture.
Future work could extend these analyses by:
- Evaluating the impact of other operational fixes in parallel, to prioritize interventions by ROI.
- Using more advanced causal inference techniques (e.g., difference-in-differences, synthetic control) to strengthen the causal attribution of changes.
- Exploring heterogeneity across customer segments, to identify whether some groups benefited more strongly from the correction.
These steps would allow companies not only to validate the robustness of the results, but also to scale the approach into a systematic framework for continuous service improvement.