This case study identifies and quantifies high-impact opportunities in customer service to improve the likelihood that customers recommend the company, building directly on the findings of Predicting Customer Detractors (Part 1): Analyzing Contextual Factors via Logistic Regression. Where Part 1 identified which service contexts had the most detractors (e.g., chat for incorrect items), this project investigates the root causes of dissatisfaction through thematic analysis of open-ended feedback and simulates the potential business impact of addressing them.
The analysis includes:
- Data simulation and cleaning.
- Exploratory analysis with descriptive statistics and visualizations.
- Logistic regression modeling.
- Simulation-based predictions using bootstrapping.
Although based on a real-world project, all data, variables, and insights presented here have been simulated to ensure confidentiality.
We begin by loading the necessary R packages for data manipulation, visualization, and statistical modeling.
# Data handling
library(dplyr) # Data manipulation
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(openxlsx) # Open Excel files
# Visualization
library(ggplot2) # General plotting
library(vcd) # Mosaic plots
## Loading required package: grid
# Statistical analysis
library(psych) # Descriptive statistics
##
## Attaching package: 'psych'
## The following objects are masked from 'package:ggplot2':
##
## %+%, alpha
library(glmnet) # Penalized logistic regression
## Loading required package: Matrix
## Loaded glmnet 4.1-10
library(boot) # Bootstrapping procedures
##
## Attaching package: 'boot'
## The following object is masked from 'package:psych':
##
## logit
library(MASS) # Simulate coefficients from multivariate normal distributions
##
## Attaching package: 'MASS'
## The following object is masked from 'package:dplyr':
##
## select
In the previous study Predicting Customer Detractors (Part 1), we identified several contextual factors in customer service that were associated with higher levels of detractors, i.e., customers less likely to recommend our service.
The key findings are summarized in the chart below:
# Import the image of the mosaic graph about the predicted detractors across contextual factors
knitr::include_graphics("Predicted.Detractors.png")
As shown in the figure, customers attended via chat for incorrect item issues had the highest probability of becoming detractors (i.e., providing a score of 6 or lower in the Net Promoter Score).
Based on this finding, stakeholders and the research team agreed to conduct a follow-up to compare the customers’ complaints in cases attended by chat for incorrect item issues versus cases attended by chat for other issues. By holding the contact channel constant (chat), we avoid confounding the results with chat’s general negative impact.
This project builds on the dataset from Predicting Customer Detractors (Part 1): Analyzing Contextual Factors via Logistic Regression, which contained information about contact reasons, contact methods, countries, and NPS scores of customers who contacted customer service.
We apply the code below to retrieve this information (please refer to Part 1 for full details on the dataset structure).
# Set seed for reproducibility
set.seed(1000)
# Define dataset size
data.n <- 40000 # Large enough to allow subgroup analysis despite unbalanced category probabilities
# Initialize dataframe with IDs
data <- data.frame(id = factor(1:data.n))
# Simulate the date of the evaluation
data$date <- sample(seq(from = as.Date("2024-01-01"), to = as.Date("2024-07-31"), by = "day"),
size = data.n,
replace = TRUE)
# ------ 1. Demographic Information and Contextual Variables
# Simulate age with a log-normal distribution, capped between 15 and 90
data$age <- round(rlnorm(n = data.n, meanlog = log(40), sdlog = log(1.4)))
data$age[data$age > 90] <- 90
data$age[data$age < 15] <- 15
# Simulate gender
data$gender <- factor(sample(c("man", "woman"),
prob = c(0.5, 0.5),
replace = TRUE, size = data.n))
# Simulate reason for contacting
data$reason <- factor(sample(c("incorrect_item", "delay", "status", "cancellation", "other"),
prob = c(0.20, 0.30, 0.20, 0.10, 0.20),
replace = TRUE, size = data.n))
# Simulate contact method
data$method <- factor(sample(c("chat", "web_form", "email", "phone"),
prob = c(0.40, 0.10, 0.15, 0.35),
replace = TRUE, size = data.n))
# Simulate country (two fictional countries for anonymity)
data$country <- factor(sample(c("Drinkland", "Eatland"),
prob = c(0.5, 0.5),
replace = TRUE, size = data.n))
data$country <- factor(data$country, levels = c("Eatland", "Drinkland"))
# ------ 2. Customer Complaints
# These variables represent underlying complaints that were not available at this stage of the investigation. We use them only to simulate NPS and satisfaction scores and store them in a separate dataframe to reflect their unavailability. They become visible only after thematic coding in the subsamples of interest (see next section).
## Create a dataset to store the complaints in connection to the case id
data.complaints <- data.frame(id = data$id)
## Complaints about connection issues
# Base probability of a connection complaint: 10%
prob_connection <- rep(0.10, data.n)
# In Drinkland, probability is doubled
prob_connection[data$country == "Drinkland"] <- prob_connection[data$country == "Drinkland"] * 2
# Ensure probability values remain within [0, 1]
prob_connection[prob_connection > 1] <- 1
prob_connection[prob_connection < 0] <- 0
# Generate binary variable based on probabilities
data.complaints$complain.connect <- as.integer(rbinom(n = data.n, size = 1, prob = prob_connection))
## Complaints about slow response
# Base probability of a slow response complaint: 5%
prob_slow_response <- rep(0.05, data.n)
# Email contacts: 6x higher likelihood
prob_slow_response[data$method == "email"] <- prob_slow_response[data$method == "email"] * 6
# Chat and web_form: 3x higher likelihood
prob_slow_response[data$method %in% c("chat", "web_form")] <- prob_slow_response[data$method %in% c("chat", "web_form")] * 3
# Ensure probability values remain within [0, 1]
prob_slow_response[prob_slow_response > 1] <- 1
prob_slow_response[prob_slow_response < 0] <- 0
# Generate binary variable
data.complaints$complain.sp.respond <- as.integer(rbinom(n = data.n, size = 1, prob = prob_slow_response))
## Complaints about slow resolution
# Base probability of a slow resolution complaint: 10%
prob_slow_resolution <- rep(0.10, data.n)
# If contact is via chat and reason is incorrect_item → 5x higher likelihood
is_chat_incorrect <- data$method == "chat" & data$reason == "incorrect_item"
prob_slow_resolution[is_chat_incorrect] <- prob_slow_resolution[is_chat_incorrect] * 5
# Ensure probability values remain within [0, 1]
prob_slow_resolution[prob_slow_resolution > 1] <- 1
prob_slow_resolution[prob_slow_resolution < 0] <- 0
# Generate binary variable
data.complaints$complain.sp.solve <- as.integer(rbinom(n = data.n, size = 1, prob = prob_slow_resolution))
## Complaints about having to repeat information
# Base probability of a repetition complaint: 2%
prob_repeat_info <- rep(0.02, data.n)
# If contact reason is cancellation → 10x higher likelihood
prob_repeat_info[data$reason == "cancellation"] <- prob_repeat_info[data$reason == "cancellation"] * 10
# Ensure probability values remain within [0, 1]
prob_repeat_info[prob_repeat_info > 1] <- 1
prob_repeat_info[prob_repeat_info < 0] <- 0
# Generate binary variable
data.complaints$complain.repeat <- as.integer(rbinom(n = data.n, size = 1, prob = prob_repeat_info))
## Other types of complaints (uniform distribution)
# Constant probability of 20% for other unspecified complaints
data.complaints$complain.other <- as.integer(rbinom(n = data.n, size = 1, prob = 0.20))
# ------ 3. NPS scores
# Simulate base NPS scores assuming no complaints (mean = 9, sd = 2)
data$nps.score <- rnorm(data.n, mean = 9, sd = 2)
# Apply fixed penalties based on complaints
# -3 for connection issues
# -4 for slow response
# -6 for slow resolution
# -2 for repeated information
# -1 for other complaints
data$nps.score <- data$nps.score -
3 * data.complaints$complain.connect -
4 * data.complaints$complain.sp.respond -
6 * data.complaints$complain.sp.solve -
2 * data.complaints$complain.repeat -
1 * data.complaints$complain.other
# Ensure scores stay within the 0–10 range
data$nps.score[data$nps.score < 0] <- 0
data$nps.score[data$nps.score > 10] <- 10
# Convert to integer values (using floor)
data$nps.score <- floor(data$nps.score)
# Add placeholder for open-text comments
data$open.comment <- rep("bla bla", data.n)
## Create NPS segments based on score values
data$nps.segment <- rep(NA_character_, data.n) # Initialize empty character variable
data$nps.segment[data$nps.score <= 6] <- "detractor"
data$nps.segment[data$nps.score > 6 & data$nps.score < 9] <- "passive"
data$nps.segment[data$nps.score >= 9] <- "promoter" # Assign segment based on score range
data$nps.segment <- factor(data$nps.segment, levels = c("detractor", "passive", "promoter")) # Convert to factor
## Create a binary variable for detractors
data$detractor <- NA_real_
data$detractor[data$nps.segment == "detractor"] <- 1
data$detractor[data$nps.segment != "detractor"] <- 0
data$detractor <- as.integer(data$detractor)
# ------ 4. Satisfaction scores (1 to 5 scale)
# Generate satisfaction scores based on NPS
# We use a normal distribution centered at 3, with some variability (sd = 0.5)
# and add a scaled component based on NPS to simulate a positive relationship.
data$satisfaction <- floor(
rnorm(data.n, mean = 3, sd = 0.5) +
0.9 * (1 + as.numeric(scale(data$nps.score)))
)
# Ensure scores stay within the 1–5 range
data$satisfaction[data$satisfaction < 1] <- 1
data$satisfaction[data$satisfaction > 5] <- 5
# Set satisfaction to missing for Drinkland
data$satisfaction[data$country == "Drinkland"] <- NA_real_
# Set satisfaction to missing for email and web_form contacts
data$satisfaction[data$method %in% c("email", "web_form")] <- NA_real_
# Introduce 10% additional random missingness among remaining valid values
na_randomizer <- sample(
c(0, 1),
size = sum(!is.na(data$satisfaction)),
replace = TRUE,
prob = c(0.1, 0.9)
) # Create a random vector (0 = missing, 1 = keep) for non-NA entries
data$satisfaction[which(!is.na(data$satisfaction))[na_randomizer == 0]] <- NA_real_ # Apply randomizer
# ------ 5. Check the data was created as expected
summary(data)
## id date age gender
## 1 : 1 Min. :2024-01-01 Min. :15.00 man :19983
## 2 : 1 1st Qu.:2024-02-23 1st Qu.:32.00 woman:20017
## 3 : 1 Median :2024-04-16 Median :40.00
## 4 : 1 Mean :2024-04-16 Mean :42.26
## 5 : 1 3rd Qu.:2024-06-09 3rd Qu.:50.00
## 6 : 1 Max. :2024-07-31 Max. :90.00
## (Other):39994
## reason method country nps.score
## cancellation : 4018 chat :15893 Eatland :19856 Min. : 0.000
## delay :11911 email : 6091 Drinkland:20144 1st Qu.: 4.000
## incorrect_item: 8083 phone :14007 Median : 7.000
## other : 7987 web_form: 4009 Mean : 6.364
## status : 8001 3rd Qu.: 9.000
## Max. :10.000
##
## open.comment nps.segment detractor satisfaction
## Length:40000 detractor:17966 Min. :0.0000 Min. :1.000
## Class :character passive :10278 1st Qu.:0.0000 1st Qu.:3.000
## Mode :character promoter :11756 Median :0.0000 Median :4.000
## Mean :0.4491 Mean :3.466
## 3rd Qu.:1.0000 3rd Qu.:4.000
## Max. :1.0000 Max. :5.000
## NA's :26541
The simulated dataset contains the following key variables:
- A variable “id” to identify each of the 40,000 customers.
- A variable “date” indicating the day on which each customer completed the survey.
- Three contextual factors about the customer service interaction: “reason”, “method”, and “country”.
- Three variables indicating the likelihood to recommend the service: “nps.score”, “nps.segment”, and the binary “detractor”.
- A variable, “open.comment”, containing open feedback (to simplify, we have simulated each comment as “bla bla”).
- Additional variables (age, gender, satisfaction) were simulated as part of the original dataset from Part 1, but are not used in this study.
Due to the time-consuming nature of manual qualitative coding, we first checked the sample sizes of the two conditions of interest to ensure a feasible workload.
# Check the number of cases contacting through different contact methods across contact reasons and countries.
with(data[data$method == "chat",], table(method, reason, country))
## , , country = Eatland
##
## reason
## method cancellation delay incorrect_item other status
## chat 822 2333 1638 1554 1586
## email 0 0 0 0 0
## phone 0 0 0 0 0
## web_form 0 0 0 0 0
##
## , , country = Drinkland
##
## reason
## method cancellation delay incorrect_item other status
## chat 839 2356 1631 1538 1596
## email 0 0 0 0 0
## phone 0 0 0 0 0
## web_form 0 0 0 0 0
This cross-tabulation shows that:
- There are more than 1,600 cases per country for chat contacts regarding incorrect item issues.
- There are more than 6,000 cases per country for chat contacts regarding other types of issues.
Coding all cases was not feasible. To maintain rigor while ensuring efficiency, we randomly selected 300 cases for each condition.
# Specify the sample sizes for both samples
sample.n <- 300
# Sample 1: Cases of chat attending incorrect items
subset_chat_incorrect <- data[data$reason == "incorrect_item" & data$method == "chat", ]
subset1.rows <- nrow(subset_chat_incorrect)
sample_chat_incorrect <- subset_chat_incorrect[sample(1:subset1.rows, sample.n), ]
sample_chat_incorrect$condition <- "chat_incorrect_item"
# Sample 2: Cases of chat attending other issues
subset_chat_other <- data[!(data$reason == "incorrect_item") & data$method == "chat", ]
subset2.rows <- nrow(subset_chat_other)
sample_chat_other <- subset_chat_other[sample(1:subset2.rows, sample.n), ]
sample_chat_other$condition <- "chat_other_issues"
# Bind together the two samples in a dataset
data.ch.i <- bind_rows(sample_chat_incorrect, sample_chat_other)
data.ch.i$condition <- factor(data.ch.i$condition, levels = c("chat_other_issues", "chat_incorrect_item"))
# Check the final sample
summary(data.ch.i)
## id date age gender
## 514 : 1 Min. :2024-01-01 Min. :15.00 man :296
## 570 : 1 1st Qu.:2024-02-25 1st Qu.:32.00 woman:304
## 705 : 1 Median :2024-04-21 Median :41.00
## 719 : 1 Mean :2024-04-18 Mean :42.17
## 755 : 1 3rd Qu.:2024-06-10 3rd Qu.:51.00
## 824 : 1 Max. :2024-07-31 Max. :90.00
## (Other):594
## reason method country nps.score
## cancellation : 37 chat :600 Eatland :301 Min. : 0.000
## delay :116 email : 0 Drinkland:299 1st Qu.: 2.000
## incorrect_item:300 phone : 0 Median : 6.000
## other : 81 web_form: 0 Mean : 5.478
## status : 66 3rd Qu.: 9.000
## Max. :10.000
##
## open.comment nps.segment detractor satisfaction
## Length:600 detractor:324 Min. :0.00 Min. :1.000
## Class :character passive :118 1st Qu.:0.00 1st Qu.:2.000
## Mode :character promoter :158 Median :1.00 Median :3.000
## Mean :0.54 Mean :3.151
## 3rd Qu.:1.00 3rd Qu.:4.000
## Max. :1.00 Max. :5.000
## NA's :329
## condition
## chat_other_issues :300
## chat_incorrect_item:300
##
##
##
##
##
We confirm that the simulated dataset provides 300 cases per condition, with distributions consistent with the original sample.
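As an additional sanity check, one could also compare raw detractor rates in the subsample against the corresponding subset of the full data (a sketch; exact values vary with the seed):
# Detractor rate per condition in the 600-case subsample
with(data.ch.i, tapply(detractor, condition, mean))
# Detractor rate for chat / incorrect_item cases in the full dataset
mean(data$detractor[data$method == "chat" & data$reason == "incorrect_item"])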
To validate the representativeness of this reduced dataset, we fit a logistic regression model predicting the likelihood of detractors from condition (chat cases for incorrect items vs. chat cases for other issues), controlling for country. The goal is to confirm whether the reduced dataset reproduces the same patterns observed in Predicting Customer Detractors (Part 1): Analyzing Contextual Factors via Logistic Regression. If these results do not align, the follow-up analysis would lack validity.
To obtain predictive metrics from the model, we apply a reusable function from my toolkit for logistic regression.
#####
# Fit the binomial model predicting detractors from condition and country
#####
rep.binomial.model.c <- glm(detractor ~ condition + country,
data = data.ch.i, family = "binomial")
summary(rep.binomial.model.c)
##
## Call:
## glm(formula = detractor ~ condition + country, family = "binomial",
## data = data.ch.i)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -0.3627 0.1443 -2.514 0.0119 *
## conditionchat_incorrect_item 0.8477 0.1678 5.050 4.41e-07 ***
## countryDrinkland 0.2146 0.1678 1.279 0.2008
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 827.93 on 599 degrees of freedom
## Residual deviance: 800.31 on 597 degrees of freedom
## AIC: 806.31
##
## Number of Fisher Scoring iterations: 4
#####
#Function to obtain key interpretation metrics for a binomial logistic model
#####
f.metrics.binomial <- function(my.model, my.matrix){
#--------1. Preliminary Checks:
# Making sure the model is glm binomial
if (!inherits(my.model, "glm") || my.model$family$family != "binomial") {
stop("The function requires a binomial model.")
}
#--------2. Odds Ratios
# Obtaining the coefficients and confidence intervals
coefs <- coef(my.model)
ci <- confint(my.model)
intercept <- coefs[1]
# Exponentiation of the coefficients and their CIs to obtain odds ratios
or <- exp(coefs)
or_ci <- exp(ci)
#--------3. Relative Probabilities (Risk Ratios)
# Calculating the base probability from the intercept
p0 <- or[1] / (1 + or[1])
# Obtaining relative probabilities from risk ratios and base probability
rp <- or / ((1 - p0) + (p0 * or))
rp_ci_low <- or_ci[, 1] / ((1 - p0) + (p0 * or_ci[, 1]))
rp_ci_high <- or_ci[, 2] / ((1 - p0) + (p0 * or_ci[, 2]))
#--------4. Predicted Proportions
# Extract log-odds and their standard errors for the individual presence of the factors (using the specified matrix)
pred <- predict(my.model, newdata = my.matrix, type = "link", se.fit = TRUE)
#Calculate confidence intervals in log-odds
lower_logit <- pred$fit - 1.96 * pred$se.fit
upper_logit <- pred$fit + 1.96 * pred$se.fit
#Specify a function to transform log-odds to probabilities
f.inv_logit <- function(x) {1 / (1 + exp(-x))}
#Apply the function to the log-odds and the limits of their ci
predicted_prob <- f.inv_logit(pred$fit) # Predicted probabilities
lower_predicted_prob <- f.inv_logit(lower_logit) # Lower limit of CI for predicted probabilities
upper_predicted_prob <- f.inv_logit(upper_logit) # Upper limit of CI for predicted probabilities
#--------5. Differences in Predicted Probabilities via Bootstrap
# Simulate a distribution of log-odds, for example with 10000 cases
set.seed(123)
simulated_log_odds <- as.data.frame(matrix(
rnorm(10000 * length(pred$fit), mean = pred$fit, sd = pred$se.fit),
ncol = length(pred$fit), byrow = TRUE
))
# Transform log-odds to expected probabilities (using the formula specified above)
simulated_probs <- f.inv_logit(simulated_log_odds)
# Calculate differences from the first column, which corresponds to the intercept probabilities
diffs <- apply(simulated_probs, 2, function(col){
col - simulated_probs[[1]]
})
# Calculate the means of the differences and 95% CI with percentiles
prob_diff <- colMeans(diffs)
lower_prob_diff <- apply(diffs, 2, quantile, probs = 0.025)
upper_prob_diff <- apply(diffs, 2, quantile, probs = 0.975)
#-------- 6. Combine Results
results <- data.frame(
Term = names(rp),
Prediction = as.numeric(predicted_prob)*100,
Lower_Prediction = as.numeric(lower_predicted_prob)*100,
Upper_Prediction = as.numeric(upper_predicted_prob)*100,
Predicted_Change = as.numeric(prob_diff)*100,
Lower_Predicted_Change = as.numeric(lower_prob_diff)*100,
Upper_Predicted_Change = as.numeric(upper_prob_diff)*100,
Relative_Probability = as.numeric(rp),
Lower_Relative_Probability = as.numeric(rp_ci_low),
Upper_Relative_Probability = as.numeric(rp_ci_high),
Odds_Ratio = as.numeric(or),
Lower_Odds_Ratio = as.numeric(or_ci[,1]),
Upper_Odds_Ratio = as.numeric(or_ci[,2])
)
# Remove meaningless metrics for intercept
results[1, c("Predicted_Change",
"Lower_Predicted_Change",
"Upper_Predicted_Change",
"Relative_Probability",
"Lower_Relative_Probability",
"Upper_Relative_Probability",
"Odds_Ratio",
"Lower_Odds_Ratio",
"Upper_Odds_Ratio")] <- NA_real_
# Return the results
return(results)
}
#####
# Design Matrix for Predictions
#####
# Extract the factor names from the model, with and without the intercept
f.names.interc <- names(coef(rep.binomial.model.c))
f.names <- f.names.interc[-1]
# Create a matrix with 1s on the diagonal (each case represents one factor set to 1)
my.matrix.df <- as.data.frame(diag(length(f.names.interc)))
my.matrix.df[, 1] <- f.names.interc
names(my.matrix.df) <- c("case", f.names)
# Ensure factor variables match the dataset format: we need to do it for country and condition.
names(my.matrix.df)[names(my.matrix.df) == "countryDrinkland"] <- "country"
my.matrix.df$country[my.matrix.df$country == 1] <- "Drinkland"
my.matrix.df$country[my.matrix.df$country == 0] <- "Eatland"
my.matrix.df$country <- as.factor(my.matrix.df$country)
names(my.matrix.df)[names(my.matrix.df) == "conditionchat_incorrect_item"] <- "condition"
my.matrix.df$condition[my.matrix.df$condition == 1] <- "chat_incorrect_item"
my.matrix.df$condition[my.matrix.df$condition == 0] <- "chat_other_issues"
my.matrix.df$condition <- as.factor(my.matrix.df$condition)
#####
# Calculate the metrics for the model using the function and matrix above
#####
sample.model.metrics.df <- f.metrics.binomial(rep.binomial.model.c, my.matrix.df)
## Waiting for profiling to be done...
# Display the resulting metrics
sample.model.metrics.df
We observe that the predictions for the reduced subsample are consistent with those obtained from the full dataset, with overlapping confidence intervals (see Section 3). This supports the representativeness of the subsample and justifies using it for further exploration of the differences between chat cases involving incorrect item issues and other issues.
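For reference, the odds-ratio-to-risk-ratio conversion applied inside f.metrics.binomial can be reproduced by hand from the coefficients printed above (a sketch; the helper names p0.check and or.check are ours):
# Baseline probability implied by the intercept
p0.check <- plogis(-0.3627) # ~0.41
# Odds ratio for the incorrect-item condition
or.check <- exp(0.8477) # ~2.33
# Risk ratio, matching the Relative_Probability column
or.check / ((1 - p0.check) + p0.check * or.check) # ~1.51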
Having established this, we proceed to code the open comments from customers to identify the most common complaints and opportunities for improvement.
Drawing from prior literature on customer service, we defined five initial categories of complaints to guide the coding of customer comments:
- Connection issues
- Slow response time
- Slow issue resolution
- Repetition of information
- Other types of complaints
Since this is a simulated dataset, we generate these categories directly from the complaints information embedded in the data simulated earlier.
# Merge the complaints information in our subsample
data.ch.i <- merge(data.ch.i, data.complaints, by = "id", all.x = TRUE)
# Check the results
summary(data.ch.i)
## id date age gender
## 514 : 1 Min. :2024-01-01 Min. :15.00 man :296
## 570 : 1 1st Qu.:2024-02-25 1st Qu.:32.00 woman:304
## 705 : 1 Median :2024-04-21 Median :41.00
## 719 : 1 Mean :2024-04-18 Mean :42.17
## 755 : 1 3rd Qu.:2024-06-10 3rd Qu.:51.00
## 824 : 1 Max. :2024-07-31 Max. :90.00
## (Other):594
## reason method country nps.score
## cancellation : 37 chat :600 Eatland :301 Min. : 0.000
## delay :116 email : 0 Drinkland:299 1st Qu.: 2.000
## incorrect_item:300 phone : 0 Median : 6.000
## other : 81 web_form: 0 Mean : 5.478
## status : 66 3rd Qu.: 9.000
## Max. :10.000
##
## open.comment nps.segment detractor satisfaction
## Length:600 detractor:324 Min. :0.00 Min. :1.000
## Class :character passive :118 1st Qu.:0.00 1st Qu.:2.000
## Mode :character promoter :158 Median :1.00 Median :3.000
## Mean :0.54 Mean :3.151
## 3rd Qu.:1.00 3rd Qu.:4.000
## Max. :1.00 Max. :5.000
## NA's :329
## condition complain.connect complain.sp.respond
## chat_other_issues :300 Min. :0.0000 Min. :0.0000
## chat_incorrect_item:300 1st Qu.:0.0000 1st Qu.:0.0000
## Median :0.0000 Median :0.0000
## Mean :0.1617 Mean :0.1467
## 3rd Qu.:0.0000 3rd Qu.:0.0000
## Max. :1.0000 Max. :1.0000
##
## complain.sp.solve complain.repeat complain.other
## Min. :0.000 Min. :0.00 Min. :0.00
## 1st Qu.:0.000 1st Qu.:0.00 1st Qu.:0.00
## Median :0.000 Median :0.00 Median :0.00
## Mean :0.305 Mean :0.03 Mean :0.18
## 3rd Qu.:1.000 3rd Qu.:0.00 3rd Qu.:0.00
## Max. :1.000 Max. :1.00 Max. :1.00
##
The results are simulated as expected, with the following proportions: roughly 16% of cases complaining about connection issues, 15% about slow responses, 31% about slow resolution, 3% about having to repeat information, and 18% about other issues.
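These figures can be verified compactly with column means of the 0/1 complaint indicators (a sketch):
# Proportion of cases flagging each complaint type in the subsample
round(colMeans(data.ch.i[, c("complain.connect", "complain.sp.respond", "complain.sp.solve", "complain.repeat", "complain.other")]), 3)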
To investigate why chat cases handling incorrect item issues have a higher proportion of detractors compared to other issues, we examine the type and frequency of complaints in each condition.
We calculate the percentage of cases reporting each type of complaint across the two conditions and countries, along with 95% confidence intervals. This is done using a reusable function from my R toolkit for percentages with 95% CI across multiple factors.
#####
# Function to calculate percentages and 95% CI across two factors
#####
f.prop.2f <- function(my.data, my.dv.list, my.factor1, my.factor2){
# Initialize an empty list to store the sampled subsets from each stratum.
results <- list()
# Initialize a counter for indexing elements in the 'results' list.
p <- 1
# Obtain the levels of the factors
f1.levels <- unique(my.data[[my.factor1]])
f2.levels <- unique(my.data[[my.factor2]])
#Create a loop for each dv and for each level of the factors
for (dv in my.dv.list){
for (lf1 in f1.levels){
for(lf2 in f2.levels){
# Subset the data for the current factor combination
current.vector <- my.data[my.data[[my.factor1]]==lf1 & my.data[[my.factor2]]==lf2, dv]
current.vector.c <- as.numeric(current.vector[!is.na(current.vector)]) #cleaning missing cases
n <- length(current.vector.c) # number of cases in the vector
s <- length(current.vector.c[current.vector.c == 1]) # number of cases satisfying the condition
# Calculate proportions and 95% CI
if(n > 0){
proportion <- s / n #calculating proportions
binomial.test <- binom.test (s, n) # binomial test to calculate 95% CI of the proportions
ci_lower <- binomial.test$conf.int[1] #extracting the lower limit of the CI
ci_upper <- binomial.test$conf.int[2] #extracting the upper limit of the CI
}else{
proportion <- NA
ci_lower <- NA
ci_upper <- NA
}
# Save results
results [[p]] <- data.frame(dep.variable = dv, factor1 = lf1, factor2 = lf2, percentage = proportion * 100,
ci_lower.perc = ci_lower * 100, ci_upper.perc = ci_upper * 100, stringsAsFactors = FALSE)
p <- p + 1
} # close the loop for factor 2
} # close the loop for factor 1
} # close the loop for the dv
#Combine the results in a dataframe
results.df <- (do.call(rbind, results))
rownames(results.df) <- NULL
#Return the dataframe
return(results.df)
}
#####
# Application of the function to calculate frequency of complaints across conditions and countries
#####
# Save a vector with the names of the dependent variables referring to complaints
dv.variables <- c("complain.connect", "complain.sp.respond", "complain.sp.solve", "complain.repeat", "complain.other")
# Apply the function
prop.df <- f.prop.2f(my.data = data.ch.i, my.dv.list = dv.variables, my.factor1 = "condition", my.factor2 = "country")
# Display the results
prop.df
To better understand how complaints differ across chat cases handling incorrect item issues versus other issues, we visualize the above percentages and their 95% confidence intervals for each complaint type, separated by country.
# Set order of complaint types for consistent plotting
prop.df$dep.variable <- factor(prop.df$dep.variable, levels = c("complain.connect", "complain.repeat", "complain.sp.respond", "complain.sp.solve", "complain.other"))
# Custom titles for facets
custom_titles <- c(
complain.connect = "Connection Problems",
complain.repeat = "Need to Repeat Information",
complain.sp.respond = "Slow Responses",
complain.sp.solve = "Slow Resolution",
complain.other = "Other Complaints"
)
# Plot bar chart with error bars and faceting (using ggplot2)
ggplot(prop.df, aes(x = factor1, y = percentage, fill = factor2)) +
geom_col(position = position_dodge(width = 0.8), width = 0.7) +
geom_errorbar(
aes(ymin = ci_lower.perc, ymax = ci_upper.perc),
position = position_dodge(width = 0.8),
width = 0.2
) +
facet_wrap(~ dep.variable,
scales = "fixed",
axes = "all",
labeller = labeller(dep.variable = custom_titles)) +
labs(
y = "% of Complaints",
fill = "Country",
title = "% of Chat Complaints by Issue Type"
) +
scale_x_discrete(labels = c ("incorrect item", "other issue")) +
ylim(0, 100) +
geom_text(
aes(label = paste0(round(percentage, 0), "%")), # label with percentage rounded
position = position_dodge(width = 0.8), # match the dodge of bars
vjust = -3, # slightly above the bar
size = 2.5 # text size
) +
theme_minimal(base_size = 12) +
theme(
axis.title.x = element_blank(),
legend.position = c(1, 0),
legend.justification = c("right", "bottom"),
legend.title = element_text(size = 10, face = "bold"),
legend.text = element_text(size = 10),
panel.spacing = unit(1, "lines"),
plot.title = element_text(hjust = 0.5, face = "bold"),
strip.text = element_text(face = "bold"),
axis.text.x = element_text(angle = 0, vjust = 0, hjust = 0.5),
plot.margin = margin(10, 5, 10, 5)
)
The graphs suggest that chat attention to incorrect item issues and chat attention to other issues differ mainly in the presence of slow resolution complaints. These are substantially more common for incorrect item issues compared to other issues.
The graphs also suggest that this pattern of results is consistent across countries.
To formally test the observations from the descriptive graphs, we estimate binomial logistic regression models for each type of complaint. These models examine whether the likelihood of each complaint differs between chat cases involving incorrect item issues versus other issues, controlling for country.
Given that our subsamples have similar proportions across countries, multicollinearity is not expected to be an issue.
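A quick cross-tabulation (a sketch) confirms this balance before fitting the models:
# Cell sizes of condition by country in the subsample
with(data.ch.i, table(condition, country))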
##-- Logistic Models for different type of complaints, controlling for country
# Connection complaints
m.connect <- glm(complain.connect ~ condition + country, data = data.ch.i, family = binomial)
summary(m.connect)
##
## Call:
## glm(formula = complain.connect ~ condition + country, family = binomial,
## data = data.ch.i)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -2.3606 0.2315 -10.197 < 2e-16 ***
## conditionchat_incorrect_item 0.2325 0.2262 1.028 0.304
## countryDrinkland 1.0181 0.2394 4.254 2.1e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 530.91 on 599 degrees of freedom
## Residual deviance: 510.38 on 597 degrees of freedom
## AIC: 516.38
##
## Number of Fisher Scoring iterations: 4
# Complaints about needing to repeat information
m.repeat <- glm(complain.repeat ~ condition + country, data = data.ch.i, family = binomial)
summary(m.repeat)
##
## Call:
## glm(formula = complain.repeat ~ condition + country, family = binomial,
## data = data.ch.i)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -3.2611 0.3873 -8.419 <2e-16 ***
## conditionchat_incorrect_item -0.2308 0.4816 -0.479 0.632
## countryDrinkland -0.2240 0.4816 -0.465 0.642
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 161.69 on 599 degrees of freedom
## Residual deviance: 161.24 on 597 degrees of freedom
## AIC: 167.24
##
## Number of Fisher Scoring iterations: 6
# Slow response complaints
m.respond <- glm(complain.sp.respond ~ condition + country, data = data.ch.i, family = binomial)
summary(m.respond)
##
## Call:
## glm(formula = complain.sp.respond ~ condition + country, family = binomial,
## data = data.ch.i)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -1.8760109 0.2055836 -9.125 <2e-16 ***
## conditionchat_incorrect_item 0.0007374 0.2309741 0.003 0.997
## countryDrinkland 0.2213915 0.2316233 0.956 0.339
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 500.26 on 599 degrees of freedom
## Residual deviance: 499.34 on 597 degrees of freedom
## AIC: 505.34
##
## Number of Fisher Scoring iterations: 4
# Slow resolution complaints
m.solve <- glm(complain.sp.solve ~ condition + country, data = data.ch.i, family = binomial)
summary(m.solve)
##
## Call:
## glm(formula = complain.sp.solve ~ condition + country, family = binomial,
## data = data.ch.i)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -2.25502 0.21994 -10.253 <2e-16 ***
## conditionchat_incorrect_item 2.28847 0.22697 10.083 <2e-16 ***
## countryDrinkland 0.04008 0.19888 0.202 0.84
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 738.05 on 599 degrees of freedom
## Residual deviance: 606.25 on 597 degrees of freedom
## AIC: 612.25
##
## Number of Fisher Scoring iterations: 4
# Other complaints
m.other <- glm(complain.other ~ condition + country, data = data.ch.i, family = binomial)
summary(m.other)
##
## Call:
## glm(formula = complain.other ~ condition + country, family = binomial,
## data = data.ch.i)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -1.1153 0.1707 -6.534 6.41e-11 ***
## conditionchat_incorrect_item -0.3229 0.2152 -1.500 0.1335
## countryDrinkland -0.5446 0.2175 -2.504 0.0123 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 565.67 on 599 degrees of freedom
## Residual deviance: 557.05 on 597 degrees of freedom
## AIC: 563.05
##
## Number of Fisher Scoring iterations: 4
The statistical models confirm our initial observation: slow resolution complaints are the primary driver of dissatisfaction, and of the high percentage of detractors, in chat cases attending incorrect item issues.
To identify specific problems and opportunities for improvement, we proceed to examine more granular categories within these complaints through an additional thematic analysis.
To represent the qualitative coding within the slow resolution complaints, we simulate the presence of four specific problems in them:
- Problems uploading photos (“problem.photo”)
- Delays in verifying customer information (“problem.info”)
- Failure to meet promised deadlines (“problem.deadline”)
- Unspecified slow resolution problems (“problem.unspecified”)
We simulate the coding in the steps below:
# 1) Initialize problem variables
vars <- c("problem.photo","problem.info","problem.deadline","problem.unspecified")
for(v in vars) data.ch.i[[v]] <- NA_real_
# 2) Define probability distributions for number of problems per comment
pk_other <- c(`0` = 0.10, `1` = 0.70, `2` = 0.18, `3` = 0.02) # other issues
pk_incorrect<- c(`0` = 0.05, `1` = 0.70, `2` = 0.22, `3` = 0.03) # incorrect item
# 3) Define the candidate problems and their selection weights
probs <- c("photo", "info", "deadline") # candidate problem labels
w_other <- c(photo = 0.05, info = 0.60, deadline = 0.35) # other issues
w_incorrect <- c(photo = 0.70, info = 0.20, deadline = 0.10) # incorrect item
# 4) Function: sample k items without replacement, weighted by probability
sample_weighted_no_replace <- function(items, weights, k){
if(k <= 0) return(character(0))
picked <- character(0)
w <- weights
for(i in seq_len(k)){
p <- w / sum(w)
choice <- sample(items, size = 1, prob = p)
picked <- c(picked, choice)
w[choice] <- 0 # remove the chosen item (sampling without replacement)
}
picked
}
# 5) Assign problems for cases with slow resolution complaints
idx <- which(data.ch.i$complain.sp.solve == 1)
for(i in idx){
cond <- data.ch.i$condition[i]
# Select distribution parameters based on condition
pk <- if(cond == "chat_incorrect_item") pk_incorrect else pk_other
K <- as.integer(sample(as.integer(names(pk)), size = 1, prob = pk))
w <- if(cond == "chat_incorrect_item") w_incorrect else w_other
# Default: all problem variables = 0
data.ch.i$problem.photo[i] <- 0
data.ch.i$problem.info[i] <- 0
data.ch.i$problem.deadline[i] <- 0
data.ch.i$problem.unspecified[i] <- 0
if(K == 0){
# unspecified
data.ch.i$problem.unspecified[i] <- 1
} else {
chosen <- sample_weighted_no_replace(probs, w, K)
if("photo" %in% chosen) data.ch.i$problem.photo[i] <- 1
if("info" %in% chosen) data.ch.i$problem.info[i] <- 1
if("deadline" %in% chosen) data.ch.i$problem.deadline[i] <- 1
}
}
# 6) For cases without slow resolution complaints, set all problem variables to 0
no_idx <- which(data.ch.i$complain.sp.solve == 0)
data.ch.i[no_idx, c("problem.photo","problem.info","problem.deadline","problem.unspecified")] <- 0
# Check the results
summary(data.ch.i)
## id date age gender
## 514 : 1 Min. :2024-01-01 Min. :15.00 man :296
## 570 : 1 1st Qu.:2024-02-25 1st Qu.:32.00 woman:304
## 705 : 1 Median :2024-04-21 Median :41.00
## 719 : 1 Mean :2024-04-18 Mean :42.17
## 755 : 1 3rd Qu.:2024-06-10 3rd Qu.:51.00
## 824 : 1 Max. :2024-07-31 Max. :90.00
## (Other):594
## reason method country nps.score
## cancellation : 37 chat :600 Eatland :301 Min. : 0.000
## delay :116 email : 0 Drinkland:299 1st Qu.: 2.000
## incorrect_item:300 phone : 0 Median : 6.000
## other : 81 web_form: 0 Mean : 5.478
## status : 66 3rd Qu.: 9.000
## Max. :10.000
##
## open.comment nps.segment detractor satisfaction
## Length:600 detractor:324 Min. :0.00 Min. :1.000
## Class :character passive :118 1st Qu.:0.00 1st Qu.:2.000
## Mode :character promoter :158 Median :1.00 Median :3.000
## Mean :0.54 Mean :3.151
## 3rd Qu.:1.00 3rd Qu.:4.000
## Max. :1.00 Max. :5.000
## NA's :329
## condition complain.connect complain.sp.respond
## chat_other_issues :300 Min. :0.0000 Min. :0.0000
## chat_incorrect_item:300 1st Qu.:0.0000 1st Qu.:0.0000
## Median :0.0000 Median :0.0000
## Mean :0.1617 Mean :0.1467
## 3rd Qu.:0.0000 3rd Qu.:0.0000
## Max. :1.0000 Max. :1.0000
##
## complain.sp.solve complain.repeat complain.other problem.photo
## Min. :0.000 Min. :0.00 Min. :0.00 Min. :0.0000
## 1st Qu.:0.000 1st Qu.:0.00 1st Qu.:0.00 1st Qu.:0.0000
## Median :0.000 Median :0.00 Median :0.00 Median :0.0000
## Mean :0.305 Mean :0.03 Mean :0.18 Mean :0.1917
## 3rd Qu.:1.000 3rd Qu.:0.00 3rd Qu.:0.00 3rd Qu.:0.0000
## Max. :1.000 Max. :1.00 Max. :1.00 Max. :1.0000
##
## problem.info problem.deadline problem.unspecified
## Min. :0.0000 Min. :0.00000 Min. :0.00
## 1st Qu.:0.0000 1st Qu.:0.00000 1st Qu.:0.00
## Median :0.0000 Median :0.00000 Median :0.00
## Mean :0.1217 Mean :0.07833 Mean :0.02
## 3rd Qu.:0.0000 3rd Qu.:0.00000 3rd Qu.:0.00
## Max. :1.0000 Max. :1.00000 Max. :1.00
##
We can see the data for the new variables was correctly simulated, with the following frequencies: 19% of cases complaining about problems uploading photos, 12% about slow processes for checking customer information, 8% about failure to meet promised deadlines, and 2% reporting unspecified slow resolution issues.
To investigate whether the higher prevalence of slow resolution complaints in chat cases attending incorrect item issues can be attributed to specific addressable problems, we compare the frequencies of these issues across the two conditions.
We calculated the percentage of cases mentioning each type of problem across conditions and countries using the reusable function from my R toolkit for percentages with 95% CI across multiple factors, already specified in section 5.2.
For comparability with prior analyses, we also calculated the percentage of cases without any slow resolution complaints.
# Create a variable that indicates the absence of slow resolution complaints
data.ch.i$no.complain.sp.solve <- NA_real_
data.ch.i$no.complain.sp.solve[data.ch.i$complain.sp.solve == 1] <- 0
data.ch.i$no.complain.sp.solve[data.ch.i$complain.sp.solve == 0] <- 1
# Specify the list of dependent variables as a vector
dv.problems <- c("no.complain.sp.solve", "problem.photo", "problem.info", "problem.deadline", "problem.unspecified")
# Calculate the percentages using the reusable function
prob.prop.df <- f.prop.2f(my.data = data.ch.i, my.dv.list = dv.problems, my.factor1 = "condition", my.factor2 = "country")
# See the results
prob.prop.df
To facilitate interpretation, we visualize the percentages of each problem type along with their 95% confidence intervals using bar graphs.
# Setting the order of the complaints variables
prob.prop.df$dep.variable <- factor(prob.prop.df$dep.variable, levels = c("no.complain.sp.solve", "problem.photo", "problem.info", "problem.deadline", "problem.unspecified"))
# Titles for each faceted graph representing a type of complaint
custom_titles <- c(
no.complain.sp.solve = "No Complaints",
problem.photo = "Problems Uploading Photos",
problem.info = "Delays in Verifying Info",
problem.deadline = "Unmet Deadlines",
problem.unspecified = "Unspecified Problems"
)
# Plotting the bar graph
ggplot(prob.prop.df, aes(x = factor1, y = percentage, fill = factor2)) +
geom_col(position = position_dodge(width = 0.8), width = 0.7) +
geom_errorbar(
aes(ymin = ci_lower.perc, ymax = ci_upper.perc),
position = position_dodge(width = 0.8),
width = 0.2
) +
facet_wrap(~ dep.variable,
scales = "fixed",
axes = "all",
labeller = labeller(dep.variable = custom_titles)) +
labs(
y = "% of Cases",
fill = "Country",
title = "Problems in Slow Resolution Complaints"
) +
scale_x_discrete(labels = c ("incorrect item", "other issue")) +
ylim(0, 110) + # allow space for percentage labels above bars
geom_text(
aes(label = paste0(round(percentage, 0), "%")), # label with percentage rounded
position = position_dodge(width = 0.8), # match the dodge of bars
vjust = -2, # slightly above the bar
size = 2.5 # text size
) +
theme_minimal(base_size = 12) +
theme(
axis.title.x = element_blank(),
legend.position = c(1, 0),
legend.justification = c("right", "bottom"),
legend.title = element_text(size = 10, face = "bold"),
legend.text = element_text(size = 10),
panel.spacing = unit(1, "lines"),
plot.title = element_text(hjust = 0.5, face = "bold"),
strip.text = element_text(face = "bold"),
axis.text.x = element_text(angle = 0, vjust = 0, hjust = 0.5),
plot.margin = margin(10, 5, 10, 5)
)
Cases attended by chat for incorrect item issues differ not only in having more slow resolution complaints, but also in the nature of these complaints:
Most complaints in this condition concern difficulties uploading photos, which are rare in other issues. This aligns with stakeholder expectations, as customers with incorrect items are asked to provide photos as proof, while this requirement is uncommon for other issues.
Delays in verifying customer information are also more common in this condition.
We continue with formal testing of these differences.
To formally test these differences, we fit binomial logistic regression models predicting each type of slow resolution complaint from condition (incorrect item vs. other issues), controlling for country. Given similar proportions across countries, multicollinearity is not a concern.
##-- Logistic Models for predicting different type of slow resolution complaints
# Problems loading the photo
m.problem.photo <- glm(problem.photo ~ condition + country, data = data.ch.i, family = binomial)
summary(m.problem.photo)
##
## Call:
## glm(formula = problem.photo ~ condition + country, family = binomial,
## data = data.ch.i)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -5.6430 1.0078 -5.599 2.15e-08 ***
## conditionchat_incorrect_item 5.2118 1.0087 5.167 2.38e-07 ***
## countryDrinkland -0.1184 0.2364 -0.501 0.617
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 586.36 on 599 degrees of freedom
## Residual deviance: 411.59 on 597 degrees of freedom
## AIC: 417.59
##
## Number of Fisher Scoring iterations: 8
# Problems of slow information check
m.problem.info <- glm(problem.info ~ condition + country, data = data.ch.i, family = binomial)
summary(m.problem.info)
##
## Call:
## glm(formula = problem.info ~ condition + country, family = binomial,
## data = data.ch.i)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -2.96739 0.29535 -10.047 < 2e-16 ***
## conditionchat_incorrect_item 1.51617 0.30257 5.011 5.42e-07 ***
## countryDrinkland 0.04544 0.25602 0.177 0.859
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 444.28 on 599 degrees of freedom
## Residual deviance: 413.69 on 597 degrees of freedom
## AIC: 419.69
##
## Number of Fisher Scoring iterations: 5
# Problems of unmet deadlines
m.problem.deadline <- glm(problem.deadline ~ condition + country, data = data.ch.i, family = binomial)
summary(m.problem.deadline)
##
## Call:
## glm(formula = problem.deadline ~ condition + country, family = binomial,
## data = data.ch.i)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -3.0490 0.3107 -9.813 <2e-16 ***
## conditionchat_incorrect_item 0.6184 0.3158 1.958 0.0502 .
## countryDrinkland 0.4324 0.3102 1.394 0.1633
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 329.62 on 599 degrees of freedom
## Residual deviance: 323.69 on 597 degrees of freedom
## AIC: 329.69
##
## Number of Fisher Scoring iterations: 5
# Unspecified problems
m.problem.unspecified <- glm(problem.unspecified ~ condition + country, data = data.ch.i, family = binomial)
summary(m.problem.unspecified)
##
## Call:
## glm(formula = problem.unspecified ~ condition + country, family = binomial,
## data = data.ch.i)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -5.3127 0.7709 -6.891 5.53e-12 ***
## conditionchat_incorrect_item 1.1279 0.6731 1.676 0.0938 .
## countryDrinkland 1.1346 0.6731 1.686 0.0919 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 117.65 on 599 degrees of freedom
## Residual deviance: 111.17 on 597 degrees of freedom
## AIC: 117.17
##
## Number of Fisher Scoring iterations: 7
The logistic models confirm the higher presence of photo problems and slow information check problems in chat cases attending incorrect item issues compared to other issues.
To further understand to what extent these problems explain the high proportion of detractors in chat cases attending incorrect item issues, we proceed to model these effects.
To evaluate the impact of these problems on the likelihood to recommend, we fit binomial logistic regression models predicting detractors from the specific slow resolution problems, condition, and country.
Before fitting the models, we build mosaic plots of all considered predictors to visually inspect collinearity and potential interactions. Given the number of predictors, we examine the mosaics in several steps.
## Checking interactions between problems, countries and conditions
# Problems Loading Photos
table.photos <- with(data.ch.i, table(problem.photo, condition, country, detractor))
table.photos
## , , country = Eatland, detractor = 0
##
## condition
## problem.photo chat_other_issues chat_incorrect_item
## 0 88 56
## 1 0 2
##
## , , country = Drinkland, detractor = 0
##
## condition
## problem.photo chat_other_issues chat_incorrect_item
## 0 81 47
## 1 0 2
##
## , , country = Eatland, detractor = 1
##
## condition
## problem.photo chat_other_issues chat_incorrect_item
## 0 62 35
## 1 0 58
##
## , , country = Drinkland, detractor = 1
##
## condition
## problem.photo chat_other_issues chat_incorrect_item
## 0 68 48
## 1 1 52
doubledecker(table.photos)
# Problems of slow information check
table.info <- with(data.ch.i, table(problem.info, condition, country, detractor))
table.info
## , , country = Eatland, detractor = 0
##
## condition
## problem.info chat_other_issues chat_incorrect_item
## 0 88 56
## 1 0 2
##
## , , country = Drinkland, detractor = 0
##
## condition
## problem.info chat_other_issues chat_incorrect_item
## 0 81 47
## 1 0 2
##
## , , country = Eatland, detractor = 1
##
## condition
## problem.info chat_other_issues chat_incorrect_item
## 0 56 65
## 1 6 28
##
## , , country = Drinkland, detractor = 1
##
## condition
## problem.info chat_other_issues chat_incorrect_item
## 0 60 74
## 1 9 26
doubledecker(table.info)
# Problems of failing the deadline
table.deadline <- with(data.ch.i, table(problem.deadline, condition, country, detractor))
table.deadline
## , , country = Eatland, detractor = 0
##
## condition
## problem.deadline chat_other_issues chat_incorrect_item
## 0 88 57
## 1 0 1
##
## , , country = Drinkland, detractor = 0
##
## condition
## problem.deadline chat_other_issues chat_incorrect_item
## 0 81 48
## 1 0 1
##
## , , country = Eatland, detractor = 1
##
## condition
## problem.deadline chat_other_issues chat_incorrect_item
## 0 55 82
## 1 7 11
##
## , , country = Drinkland, detractor = 1
##
## condition
## problem.deadline chat_other_issues chat_incorrect_item
## 0 59 83
## 1 10 17
doubledecker(table.deadline)
## Checking interactions between problems
table.problems <- with(data.ch.i, table(problem.photo, problem.info, problem.deadline, detractor))
table.problems
## , , problem.deadline = 0, detractor = 0
##
## problem.info
## problem.photo 0 1
## 0 270 2
## 1 1 1
##
## , , problem.deadline = 1, detractor = 0
##
## problem.info
## problem.photo 0 1
## 0 0 0
## 1 1 1
##
## , , problem.deadline = 0, detractor = 1
##
## problem.info
## problem.photo 0 1
## 0 159 27
## 1 65 28
##
## , , problem.deadline = 1, detractor = 1
##
## problem.info
## problem.photo 0 1
## 0 20 7
## 1 11 7
doubledecker(table.problems)
We do not identify relevant interactions between the factors used to predict detractors: across conditions and countries, the presence of each problem is consistently associated with most users being detractors. Moreover, this ceiling effect makes it difficult to observe whether the effects of the different problems interact with each other.
We also do not identify any issues of collinearity beyond the expected strong relation between problems loading photos and the incorrect item condition. Although we initially include the condition factor to illustrate how much of its effect is explained by the problems, it will be removed in the final model.
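This relation can be made explicit with a simple cross-tabulation (a sketch):
# Photo problems occur almost exclusively in the incorrect-item condition
with(data.ch.i, table(problem.photo, condition))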
We start by fitting a model that includes all potential predictors to evaluate their individual contributions.
# Specify the model
m.problems.sp.solve <- glm(detractor ~ problem.photo + problem.info + problem.deadline + country + condition, data = data.ch.i, family = binomial)
summary(m.problems.sp.solve)
##
## Call:
## glm(formula = detractor ~ problem.photo + problem.info + problem.deadline +
## country + condition, family = binomial, data = data.ch.i)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -0.6034 0.1575 -3.830 0.000128 ***
## problem.photo 3.4225 0.5406 6.331 2.44e-10 ***
## problem.info 2.4683 0.5490 4.496 6.93e-06 ***
## problem.deadline 2.7492 0.7485 3.673 0.000240 ***
## countryDrinkland 0.3180 0.1922 1.655 0.097987 .
## conditionchat_incorrect_item -0.1455 0.2040 -0.714 0.475449
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 827.93 on 599 degrees of freedom
## Residual deviance: 633.58 on 594 degrees of freedom
## AIC: 645.58
##
## Number of Fisher Scoring iterations: 6
The model indicates that the presence of all the expected problems is associated with a significantly higher probability of users becoming detractors.
Furthermore, once we account for these specific problems, the effect of condition is no longer significant, suggesting that these problems explain the higher presence of detractors in chat cases attending incorrect item issues.
To evaluate the effect of each problem with a more parsimonious model, we fit a model excluding the non-significant factors.
# Specify the model
m.problems.sp.solve2 <- glm(detractor ~ problem.photo + problem.info + problem.deadline, data = data.ch.i, family = "binomial")
# Obtain the results
summary(m.problems.sp.solve2)
##
## Call:
## glm(formula = detractor ~ problem.photo + problem.info + problem.deadline,
## family = "binomial", data = data.ch.i)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -0.49250 0.09938 -4.956 7.21e-07 ***
## problem.photo 3.29055 0.52339 6.287 3.24e-10 ***
## problem.info 2.39322 0.54268 4.410 1.03e-05 ***
## problem.deadline 2.75534 0.74717 3.688 0.000226 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 827.93 on 599 degrees of freedom
## Residual deviance: 636.79 on 596 degrees of freedom
## AIC: 644.79
##
## Number of Fisher Scoring iterations: 6
# Compare the models with and without the country and condition factors
anova(m.problems.sp.solve2 , m.problems.sp.solve, test = "Chisq")
The likelihood ratio test confirms that removing the country and condition variables does not significantly reduce model fit. Therefore, we keep the simplified model that excludes them.
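An AIC comparison points in the same direction (a sketch; the simpler model shows the slightly lower AIC):
# Compare the information criteria of the full and simplified models
AIC(m.problems.sp.solve, m.problems.sp.solve2)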
To interpret the model, we derive meaningful metrics using the reusable function for logistic regression already specified in section 4.2.
#####
# Design Matrix for Predictions
#####
# Extract the factor names from the model, with and without the intercept
f.names.interc <- names(coef(m.problems.sp.solve2)) #specify the new model here
f.names <- f.names.interc[-1]
# Create a matrix with 1s on the diagonal (each case represents one factor set to 1)
my.matrix.df <- as.data.frame(diag(length(f.names.interc)))
my.matrix.df[, 1] <- f.names.interc
names(my.matrix.df) <- c("case", f.names)
#####
# Calculation of the Metrics (probability predictions, risk ratios, odds ratios)
#####
# The function has already been specified in section 4.2
# Apply the function to the final binomial model
m.problems.sp.solve2.metrics.df <- f.metrics.binomial(m.problems.sp.solve2, my.matrix.df)
## Waiting for profiling to be done...
# Display the resulting metrics
m.problems.sp.solve2.metrics.df
We can see that the presence of each of the slow resolution problems is strongly associated with a higher likelihood of detractors. This relationship can be better visualized with a dot-and-whisker plot, where each dot represents the expected percentage of detractors and the whiskers represent the 95% confidence intervals.
# Add descriptive labels for each factor
m.problems.sp.solve2.metrics.df$Label <- NA_character_
m.problems.sp.solve2.metrics.df$Label[m.problems.sp.solve2.metrics.df$Term == "(Intercept)"] <-
"No Specific Problems"
m.problems.sp.solve2.metrics.df$Label[m.problems.sp.solve2.metrics.df$Term == "problem.photo"] <- "Problems Loading Photos"
m.problems.sp.solve2.metrics.df$Label[m.problems.sp.solve2.metrics.df$Term == "problem.info"] <- "Problems Checking Information"
m.problems.sp.solve2.metrics.df$Label[m.problems.sp.solve2.metrics.df$Term == "problem.deadline"] <- "Problems of Failing Deadlines"
# Create the dot-and-whisker plot for the predictions
ggplot(m.problems.sp.solve2.metrics.df, aes(x = Prediction,
y = reorder(Label, Prediction))) +
geom_vline(xintercept = 0, linetype = "dashed", color = "darkgrey") +
geom_errorbarh(aes(xmin = Lower_Prediction, xmax = Upper_Prediction), height = 0.2) +
geom_point(size = 3) +
geom_text(aes(label = paste(round(Prediction,1), "%")),
vjust = -1, size = 3.5) +
labs(
title = "Expected Detractors by Slow Resolution Problems",
x = "% of Detractors",
y = "Stated Problems"
) +
theme_minimal() +
theme(plot.title = element_text(hjust = 0.5, face = "bold"))
## Warning: `geom_errorbarh()` was deprecated in ggplot2 4.0.0.
## ℹ Please use the `orientation` argument of `geom_errorbar()` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
## `height` was translated to `width`.
We can see that when the slow resolution problems are present, nearly all users are expected to become detractors, and this effect is similar across the three problems. In contrast, in the absence of these problems, only around 37% of users are predicted to be detractors.
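The baseline figure can be recovered directly from the intercept of the final model (a sketch):
# Inverse-logit of the intercept: expected proportion of detractors with no problems
plogis(coef(m.problems.sp.solve2)[["(Intercept)"]]) # ~0.379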
We can further investigate the impact of correcting these problems.
To predict the potential reduction in detractors from correcting these problems, we need to consider two sources of uncertainty:
- The uncertainty of the estimated model coefficients.
- The sampling variability of the cases in each subgroup.
Given that these uncertainties may not follow normal distributions, we will use bootstrapping to estimate the variability of the predicted impact.
We define a reusable function to estimate the uncertainty of predicted outcomes for any scenario. The function simulates sets of coefficients from the multivariate normal distribution implied by the fitted model, draws bootstrap samples within each subgroup, and averages the case-level predicted probabilities to build a sampling distribution for each group.
f.bootstrap.predictions <- function (my.n, my.model, my.data, my.factor){
# --- Setup
# Make sure library MASS is installed and loaded
if (!require(MASS, quietly = TRUE)) {
install.packages("MASS")
library(MASS)
}
# Create a list to store the final results and the summary of the results
final.results <- list()
summary.list <- list()
pt <- 1
# Create subsets for subgroups
groups <- unique(my.data [[my.factor]])
# Extract coefficients from the original model
original.coefficients <- coef(my.model)
# Extract the names of the predictors in the original model
predictor.names <- names(original.coefficients)[-1]
# --- 1 Generate Simulated Coefficients
# Extract the covariance matrix
matriz_vcov <- vcov(my.model)
# Generate a matrix where each line is a set of simulated coefficients
simulated_coefficients <- mvrnorm(n = my.n,
mu = original.coefficients,
Sigma = matriz_vcov)
simulated_coefficients.df <- as.data.frame (simulated_coefficients)
# --- 2. Calculate Probabilities with Bootstrap Samples
# - A loop for each subgroup
for (group in groups) {
#Create an empty vector for the sampling distribution
sampling.distribution <- rep(NA, my.n)
# Select the subsample for the subgroup
subsample <- my.data[my.data [[my.factor]] == group, ]
# - A loop for each sample we simulate
for (i in 1:my.n) {
# Select the simulated coefficients for the sample
sample.coefficients <- simulated_coefficients.df [i , ]
# Select a bootstrap sample from our sample
bootstrap.indexes <- sample(x = 1:nrow(subsample),
size = nrow(subsample),
replace = TRUE)
bootstrap.sample <- subsample [bootstrap.indexes, ]
# Create a vector to keep the results of the sample cases
prob.success <- numeric(nrow(bootstrap.sample))
# A loop for each case in the bootstrapped sample
for (c in 1:nrow(bootstrap.sample)){
# Empty vector to keep the effect for the presence of each predictor
predictor.effects.vector <- numeric(length(predictor.names))
# A loop for each predictor
for (p in 1:length(predictor.names)) {
# Extract the coefficient and the data value for this predictor
name <- predictor.names[p] # save the name of the predictor
coef_value <- as.numeric(sample.coefficients[[name]])
data_value <- as.numeric(bootstrap.sample[[name]][c])
predictor.effects.vector[p] <- coef_value * data_value # multiply coefficient by predictor value
} # Close the loop each predictor
# Extract the remaining values of the predictive model
intercept <- as.numeric (sample.coefficients[1])
predictors.effect <- sum (predictor.effects.vector)
# Calculate the probability of each case
prob.success [c] <- 1 / (1 + exp (-1 * (intercept + predictors.effect)))
# Debug check
if (is.na(prob.success[c])) {
print(paste("NA in case ", c, ", sample ", i, "subgroup ", group))
}
} # Close the loop for each case in the bootstrapped sample
# Verification of NAs in the estimation of sample cases
if (any(is.na(prob.success))) {
print(paste("ADVERTENCIA: NAs en prob.success para iteración", i))
print(summary(prob.success))
}
# Calculate the average probability of success in the bootstrapped sample
proportion.sample <- mean(prob.success, na.rm = TRUE)
# Store the sample probability average in a vector for the sampling distribution
sampling.distribution [i] <- proportion.sample
} #Close the loop for each sample
# Save results of all the samples
estimate.probability <- median (sampling.distribution, na.rm = TRUE)
lower.ci.estimate <- quantile(sampling.distribution, probs = 0.025, na.rm = TRUE)
upper.ci.estimate <- quantile(sampling.distribution, probs = 0.975, na.rm = TRUE)
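# (Percentile method: the 2.5% and 97.5% quantiles of the bootstrap sampling distribution form the 95% CI)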
# Create a name for the sublist of results for each subgroup
name.group.results <- paste0(group, "results")
# Store the results
final.results [[name.group.results]] <- list(
estimate = estimate.probability,
lower.ci = lower.ci.estimate,
upper.ci = upper.ci.estimate,
bootstrap.sampling.dist = sampling.distribution
)
# Create a one-row data frame to build a summary of results
summary.list [[pt]] <- data.frame(group.factor = paste0 (group),
estimate = estimate.probability,
lower.ci = lower.ci.estimate,
upper.ci = upper.ci.estimate,
stringsAsFactors = FALSE)
pt <- pt + 1
} #Close the loop for each subgroup
# Add the summary to the final results as a dataframe
summary.df <- do.call (rbind, summary.list)
rownames(summary.df) <- NULL
final.results [["summary.df"]] <- summary.df
# Return the final results
return (final.results)
}
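A note on performance: the nested per-case and per-predictor loops above are written for readability, but they are slow when my.n = 5000. As a sketch (under the assumption that all predictors are numeric 0/1 columns), the two inner loops could be replaced by a single matrix product:
# Sketch only, not part of the function above
X <- as.matrix(bootstrap.sample[, predictor.names]) # cases x predictors
beta <- unlist(sample.coefficients[predictor.names]) # simulated slopes for this draw
eta <- unlist(sample.coefficients[1]) + X %*% beta # linear predictor for every case at once
prob.success <- as.numeric(plogis(eta)) # same as 1 / (1 + exp(-eta))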
Now we are ready to apply the bootstrapping function to scenarios where specific problems are corrected. We start with photo issues, as they are the most common slow resolution problem.
We compare the current probability of being a detractor with the expected probability if photo issues were fully corrected.
#####
# 1. Obtaining estimations for the present situation
#####
# Apply the bootstrapping function to the present data
sim.present <- f.bootstrap.predictions (my.n = 5000,
my.model = m.problems.sp.solve2,
my.data = data.ch.i,
my.factor = "condition")
# Extract a dataframe indicating the present moment
present.df <- data.frame(moment = rep("present", nrow(sim.present$summary.df)))
present.df <- cbind(present.df, sim.present$summary.df)
#####
# 2. Obtaining estimations for the corrected scenario
#####
# Represent the problem photo as totally eliminated
data.corrected <- data.ch.i
data.corrected$problem.photo <- 0
# Apply the bootstrapping function
sim.photo.corrected <- f.bootstrap.predictions (my.n = 5000,
my.model = m.problems.sp.solve2,
my.data = data.corrected,
my.factor = "condition")
# Extract a dataframe indicating the moment
photo.corrected.df <- data.frame(moment = rep("expected if photo issues are fixed", nrow(sim.photo.corrected$summary.df)))
photo.corrected.df <- cbind(photo.corrected.df, sim.photo.corrected$summary.df)
#####
# 3. Save the results in a dataframe
#####
# Bind the results for the present and the photo correction simulations
impact.photo.correction.df <- rbind(present.df, photo.corrected.df)
# Express in percentages
impact.photo.correction.df [, c(3:5)] <- impact.photo.correction.df [, c(3:5)] * 100
impact.photo.correction.df
To better illustrate the comparison between the present and the corrected scenario, we can use a bar graph.
# Setting the order of the complaints variables
impact.photo.correction.df$group.factor <- factor(impact.photo.correction.df$group.factor, levels = c("chat_incorrect_item", "chat_other_issues"))
impact.photo.correction.df$moment <- factor(impact.photo.correction.df$moment, levels = c("present", "expected if photo issues are fixed"))
# Titles for each faceted graph representing a type of complaint
custom_titles <- c(
chat_incorrect_item = "Incorrect Item Issues",
chat_other_issues = "Other Issues"
)
# Plotting the bar graph
ggplot(impact.photo.correction.df, aes(x = moment, y = estimate)) +
geom_col(position = position_dodge(width = 0.8), width = 0.7, fill = "grey80") +
geom_errorbar(
aes(ymin = lower.ci, ymax = upper.ci),
position = position_dodge(width = 0.8),
width = 0.2
) +
facet_wrap(~ group.factor,
scales = "fixed",
axes = "all",
labeller = labeller(group.factor = custom_titles)) +
labs(
y = "% of Detractors",
title = "Impact of Correcting Photo Issues in Chat"
) +
scale_x_discrete(labels = c ("Current", "After\nCorrection")) +
ylim(0, 100) +
geom_text(
aes(label = paste0(round(estimate, 0), "%")), # label with percentage rounded
position = position_dodge(width = 0.8), # match the dodge of bars
vjust = -3, # slightly above the bar
size = 3.5 # text size
) +
theme_minimal(base_size = 12) +
theme(
axis.title.x = element_blank(),
panel.spacing = unit(1, "lines"),
plot.title = element_text(hjust = 0.5, face = "bold"),
strip.text = element_text(face = "bold", size = 11), # the panel titles
axis.text.x = element_text(angle = 0, vjust = 0, hjust = 0.5, size = 11),
plot.margin = margin(10, 5, 10, 5)
)
The impact of correcting photo issues in the chat can be substantial for cases involving incorrect item issues, with an expected reduction in the percentage of detractors from 65% to 51%.
Yet, to estimate the global impact, it is useful to evaluate the reduction in detractors across all customer service cases.
To calculate the overall impact on detractor reduction, we need to weight the results above by the percentage of customer service cases that each scenario represents. In line with Part 1, we will assume that:
- Incorrect item issues attended by chat represent 10% of all customer service cases.
- Other issues attended by chat represent 30% of all customer service cases.
# --- 1. Extract the sampling distributions simulated for each scenario
# Incorrect item issues for the present moment
dist.present.ii <- sim.present[["chat_incorrect_itemresults"]][["bootstrap.sampling.dist"]]
# Incorrect item issues after the correction of the issue
dist.photo.corrected.ii <- sim.photo.corrected[["chat_incorrect_itemresults"]][["bootstrap.sampling.dist"]]
# Other issues for the present moment
dist.present.oi <- sim.present[["chat_other_issuesresults"]][["bootstrap.sampling.dist"]]
# Other issues after the correction of the issue
dist.photo.corrected.oi <- sim.photo.corrected[["chat_other_issuesresults"]][["bootstrap.sampling.dist"]]
# --- 2. Define a function to calculate differences between two distributions
f.distribution.differences <- function(my.vector1, my.vector2) {
# Obtain the difference between the two distributions
distribution.differences <- my.vector1 - my.vector2
# Define the distribution of differences with metrics
dif.estimate <- median (distribution.differences, na.rm = TRUE) *100
lower.ci.estimate <- quantile(distribution.differences, probs = 0.025, na.rm = TRUE) *100
upper.ci.estimate <- quantile(distribution.differences, probs = 0.975, na.rm = TRUE) *100
# Save the results in a dataframe
results.df <- data.frame (dif = dif.estimate, lower.ci = lower.ci.estimate, upper.ci = upper.ci.estimate)
rownames(results.df) <- NULL
return(results.df)
}
# --- Calculate the distribution of differences
# for incorrect item issues
dist.differences.ii <- f.distribution.differences(dist.present.ii, dist.photo.corrected.ii)
# for other issues
dist.differences.oi <- f.distribution.differences(dist.present.oi, dist.photo.corrected.oi)
# total (including uncertainty for other issues)
dist.differences.total.inc <- f.distribution.differences ((dist.present.ii * 0.10 + dist.present.oi * 0.30), (dist.photo.corrected.ii * 0.10 + dist.photo.corrected.oi *0.30))
# Since the reduction of detractors for other issues was not significant, we also calculate the total without this factor, whose variability can add unnecessary error to the predictions
# for the total - excluding uncertainty from other issues
dist.differences.total.ex <- f.distribution.differences ((dist.present.ii * 0.10), (dist.photo.corrected.ii * 0.10))
# --- Bind the results in a single data frame
results.dist.differences <- data.frame (condition = c("within incorrect item issues attended by chat","within other issues attended by chat", "within all cases - including variability from non-significant factors", "within all cases - excluding variability from non-significant factors"))
merged.results <- bind_rows(dist.differences.ii, dist.differences.oi, dist.differences.total.inc, dist.differences.total.ex)
results.dist.differences <- cbind (results.dist.differences, merged.results)
results.dist.differences
The results indicate that we would expect about 1.40 percentage points fewer detractors across all customer service cases after the correction of the issue.
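As a quick sanity check, the rounded point estimates from the bar chart above imply roughly the same figure:
# Sanity check with rounded point estimates (sketch)
# chat incorrect item cases are 10% of all cases and drop from ~65% to ~51% detractors
0.10 * (65 - 51) # ~1.4 percentage points across all customer service cases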
Similarly to the procedure applied for correcting photo issues, we can calculate the expected impact of addressing other types of problems and compare them. Due to space constraints, we simulate how this comparison could look.
# Extract the results for the impact of correcting photo issues
impact.across.corrections <- results.dist.differences [ 4, ]
impact.across.corrections [1,1] <- "Issues loading photos in the chat"
#Introduce made-up results to display
impact.across.corrections [2, ] <- c ("Lag issues in the checking information tool", 1.23, 0.85, 1.55)
impact.across.corrections [3, ] <- c ("Connectivity issues in Drinkland", 0.84, 0.35, 1.35)
impact.across.corrections [4, ] <- c ("Slow email responses", 0.60, 0.22, 1.01)
#Convert values to numeric
impact.across.corrections [, 2:4] <- apply(impact.across.corrections[ ,2:4], 2, as.numeric)
# Visualize the results
rownames(impact.across.corrections) <- NULL
impact.across.corrections
We can also visualize these results in a dot-and-whisker plot:
# Create the dot-and-whisker plot
ggplot(impact.across.corrections, aes(x = dif,
y = reorder(condition, dif))) +
geom_vline(xintercept = 0, linetype = "dashed", color = "darkgrey") +
geom_errorbarh(aes(xmin = lower.ci, xmax = upper.ci), height = 0.2) +
geom_point(size = 3) +
geom_text(aes(label = paste(round(dif, 1), "%")),
vjust = -1, size = 3.5) +
labs(
title = "Expected Impact of Correcting Each Problem",
x = "% of Less Detractors",
y = "Problems"
) +
theme_minimal() +
theme(
plot.title = element_text(face = "bold", hjust = 0.5, margin = margin(b = 40)),
plot.title.position = "plot", # To center the title on the whole area
axis.title.y = element_text(angle = 90, size = 12, face = "bold", margin = margin(r = 20)),
axis.title.x = element_text(angle = 0, size = 12, face = "bold", margin = margin(t = 20))
)
## `height` was translated to `width`.
The comparison shows that correcting photo issues has the largest expected impact on reducing detractors, which supports prioritizing this correction.
However, it is important to note that simulations of future scenarios are inherently uncertain, so they should be followed up with analyses once the corrections are applied.
As an example, we illustrate a follow-up of the effect of correcting the photo loading issue in the chat, assuming it was corrected on October 1st.
To the data we already had for the first semester, we add simulated data for the second semester. The simulation is similar to the one in section 4.1, but applies a progressive reduction in detractors after October 1st to represent the effect of correcting the photo issues.
#####
# Simulation of data for the second semester
#####
# Define dataset size
data.semester2.n <- 40000 # Large enough to allow subgroup analysis despite unbalanced category probabilities
# Initialize dataframe with IDs
data.semester2 <- data.frame(id = factor((data.n + 1):(data.n + data.semester2.n)))
# Simulate the date of the evaluation
data.semester2$date <- sample(seq(from = as.Date("2024-08-01"), to = as.Date("2024-12-31"), by = "day"),
size = data.semester2.n,
replace = TRUE)
# ------ 1. Demographic Information and Contextual Variables
# Simulate age with a log-normal distribution, capped between 15 and 90
data.semester2$age <- round(rlnorm(n = data.semester2.n, meanlog = log(40), sdlog = log(1.4)))
data.semester2$age[data.semester2$age > 90] <- 90
data.semester2$age[data.semester2$age < 15] <- 15
# Simulate gender
data.semester2$gender <- factor(sample(c("man", "woman"),
prob = c(0.5, 0.5),
replace = TRUE, size = data.semester2.n))
# Simulate reason for contacting
data.semester2$reason <- factor(sample(c("incorrect_item", "delay", "status", "cancellation", "other"),
prob = c(0.20, 0.30, 0.20, 0.10, 0.20),
replace = TRUE, size = data.semester2.n))
# Simulate contact method
data.semester2$method <- factor(sample(c("chat", "web_form", "email", "phone"),
prob = c(0.40, 0.10, 0.15, 0.35),
replace = TRUE, size = data.semester2.n))
# Simulate country (two fictional countries for anonymity)
data.semester2$country <- factor(sample(c("Drinkland", "Eatland"),
prob = c(0.5, 0.5),
replace = TRUE, size = data.semester2.n))
data.semester2$country <- factor(data.semester2$country, levels = c("Eatland", "Drinkland")) # match the factor levels used in the first semester
# ------ 2. Customer Complaints
# These variables represent underlying complaints that were not available at this stage of the investigation. We use them only to simulate NPS and satisfaction scores and store them in a separate dataframe to reflect their unavailability.
## Create a dataset to store the complaints in connection to the case id
data.complaints2 <- data.frame(id = data.semester2$id)
## Complaints about connection issues
# Base probability of a connection complaint: 10%
prob_connection <- rep(0.10, data.semester2.n)
# In Drinkland, probability is doubled
prob_connection[data.semester2$country == "Drinkland"] <- prob_connection[data.semester2$country == "Drinkland"] * 2
# Ensure probability values remain within [0, 1]
prob_connection [prob_connection > 1] <- 1
prob_connection [prob_connection < 0] <- 0
# Generate binary variable based on probabilities
data.complaints2$complain.connect <- rbinom(n = data.semester2.n, size = 1, prob = prob_connection)
## Complaints about slow response
# Base probability of a slow response complaint: 5%
prob_slow_response <- rep(0.05, data.semester2.n)
# Email contacts: 6x higher likelihood
prob_slow_response[data.semester2$method == "email"] <- prob_slow_response[data.semester2$method == "email"] * 6
# Chat and web_form: 3x higher likelihood
prob_slow_response[data.semester2$method %in% c("chat", "web_form")] <- prob_slow_response[data.semester2$method %in% c("chat", "web_form")] * 3
# Ensure probability values remain within [0, 1]
prob_slow_response [prob_slow_response > 1] <- 1
prob_slow_response [prob_slow_response < 0] <- 0
# Generate binary variable
data.complaints2$complain.sp.respond <- rbinom(n = data.semester2.n, size = 1, prob = prob_slow_response)
#### SPECIFIC CHANGES FOR SLOW RESOLUTION COMPLAINTS ####
# This part differs from the simulation of the first semester
## --- Complaints about slow resolution
# Specify the base probability of a slow resolution complaint: 10%
prob_slow_resolution <- rep(0.10, data.semester2.n)
# Specify the multiplying factor applied when contact is via chat and the reason is incorrect_item, before the correction day
multiplying_factor <- 5
# Specify the day the correction is implemented
day.correction <- as.Date("2024-10-1")
# Specify the duration of the changing period
days.change <- 25
## Apply a correction factor to the probability so that:
# its value is 1 before the correction date (leaving the elevated probability in place)
# it progressively grows to the multiplying factor over the changing period
# it keeps that value afterwards, so the probability returns to the base level established earlier (.10)
# Create a vector that indicates the days passed after the change was implemented
days.passed <- as.numeric(data.semester2$date - day.correction)
days.passed [days.passed <= 0] <- 0
# Specify the values of the correction factors
correction_factor <- 1 + (days.passed * (multiplying_factor-1))/ days.change
correction_factor [correction_factor > multiplying_factor] <- multiplying_factor
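# Worked example (with multiplying_factor = 5, days.change = 25, base probability 0.10):
# before October 1st: days.passed = 0 -> correction_factor = 1 -> prob = 0.10 * (5/1) = 0.50
# 12 days after: correction_factor = 1 + 12 * 4/25 = 2.92 -> prob = 0.10 * (5/2.92) = 0.17
# 25+ days after: correction_factor = 5 -> prob = 0.10 * (5/5) = 0.10 (base level)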
# Select incorrect item cases attended by chat
is_chat_incorrect <- data.semester2$method == "chat" & data.semester2$reason == "incorrect_item"
# Modify the probability according to the expected parameters
prob_slow_resolution[is_chat_incorrect] <- prob_slow_resolution[is_chat_incorrect] * (multiplying_factor / correction_factor[is_chat_incorrect])
# Ensure probability values remain within [0, 1]
prob_slow_resolution [prob_slow_resolution > 1] <- 1
prob_slow_resolution [prob_slow_resolution < 0] <- 0
# Generate the binary variable for slow resolution complaints
data.complaints2$complain.sp.solve <- rbinom(n = data.semester2.n, size = 1, prob = prob_slow_resolution)
####### END OF SPECIFIC CHANGES FOR SLOW RESOLUTION COMPLAINTS ####
# From here, the simulation is similar to the first semester.
## Complaints about having to repeat information
# Base probability of a repetition complaint: 2%
prob_repeat_info <- rep(0.02, data.semester2.n)
# If contact reason is cancellation → 10x higher likelihood
prob_repeat_info[data.semester2$reason == "cancellation"] <- prob_repeat_info[data.semester2$reason == "cancellation"] * 10
# Ensure probability values remain within [0, 1]
prob_repeat_info [prob_repeat_info > 1] <- 1
prob_repeat_info [prob_repeat_info < 0] <- 0
# Generate binary variable
data.complaints2$complain.repeat <- rbinom(n = data.semester2.n, size = 1, prob = prob_repeat_info)
## Other types of complaints (uniform distribution)
# Constant probability of 20% for other unspecified complaints
data.complaints2$complain.other <- rbinom(n = data.semester2.n, size = 1, prob = 0.20)
# ------ 3. NPS scores
# Simulate base NPS scores assuming no complaints (mean = 9, sd = 2)
data.semester2$nps.score <- rnorm(data.semester2.n, mean = 9, sd = 2)
# Apply fixed penalties based on complaints
# -3 for connection issues
# -4 for slow response
# -6 for slow resolution
# -2 for repeated information
# -1 for other complaints
data.semester2$nps.score <- data.semester2$nps.score -
3 * data.complaints2$complain.connect -
4 * data.complaints2$complain.sp.respond -
6 * data.complaints2$complain.sp.solve -
2 * data.complaints2$complain.repeat -
1 * data.complaints2$complain.other
# Ensure scores stay within the 0–10 range
data.semester2$nps.score[data.semester2$nps.score < 0] <- 0
data.semester2$nps.score[data.semester2$nps.score > 10] <- 10
# Convert to integer values (using floor)
data.semester2$nps.score <- floor(data.semester2$nps.score)
# Add placeholder for open-text comments
data.semester2$open.comment <- rep("bla bla", data.semester2.n)
## Create NPS segments based on score values
data.semester2$nps.segment <- rep(NA_character_, data.semester2.n) # Initialize empty character variable
data.semester2$nps.segment[data.semester2$nps.score <= 6] <- "detractor"
data.semester2$nps.segment[data.semester2$nps.score > 6 & data.semester2$nps.score < 9] <- "passive"
data.semester2$nps.segment[data.semester2$nps.score >= 9] <- "promoter" # Assign segment based on score range
data.semester2$nps.segment <- factor(data.semester2$nps.segment, levels = c("detractor", "passive", "promoter")) # Convert to factor
## Create a binary variable for detractors
data.semester2$detractor <- NA_real_
data.semester2$detractor[data.semester2$nps.segment == "detractor"] <- 1
data.semester2$detractor[data.semester2$nps.segment != "detractor"] <- 0
#------ Satisfaction scores (1 to 5 scale)
# Generate satisfaction scores based on NPS
# We use a normal distribution centered at 3, with some variability (sd = 0.5)
# and add a scaled component based on NPS to simulate a positive relationship.
data.semester2$satisfaction <- floor(
rnorm(data.semester2.n, mean = 3, sd = 0.5) +
0.9 * (1 + as.numeric(scale(data.semester2$nps.score)))
)
# Ensure scores stay within the 1–5 range
data.semester2$satisfaction[data.semester2$satisfaction < 1] <- 1
data.semester2$satisfaction[data.semester2$satisfaction > 5] <- 5
# Check correlation with NPS scores (Spearman method)
with(data.semester2, cor(nps.score, satisfaction, method = "spearman"))
## [1] 0.8270852
# Set satisfaction to missing for Drinkland
data.semester2$satisfaction[data.semester2$country == "Drinkland"] <- NA_real_
# Set satisfaction to missing for email and web_form contacts
data.semester2$satisfaction[data.semester2$method %in% c("email", "web_form")] <- NA_real_
# Introduce 10% additional random missingness among remaining valid values
na_randomizer <- sample(
c(0, 1),
size = sum(!is.na(data.semester2$satisfaction)),
replace = TRUE,
prob = c(0.1, 0.9)
) # Create a random vector (0 = missing, 1 = keep) for non-NA entries
data.semester2$satisfaction[which(!is.na(data.semester2$satisfaction))[na_randomizer == 0]] <- NA_real_ # Apply randomizer
#####
# Saving the data for the whole year
#####
# Merge the databases for the 2 semesters
data.2024 <- bind_rows(data, data.semester2)
#Check the results for the whole dataset
summary(data.2024)
## id date age gender
## 1 : 1 Min. :2024-01-01 Min. :15.0 man :40145
## 2 : 1 1st Qu.:2024-04-16 1st Qu.:32.0 woman:39855
## 3 : 1 Median :2024-07-31 Median :40.0
## 4 : 1 Mean :2024-07-16 Mean :42.2
## 5 : 1 3rd Qu.:2024-10-15 3rd Qu.:50.0
## 6 : 1 Max. :2024-12-31 Max. :90.0
## (Other):79994
## reason method country nps.score
## cancellation : 8016 chat :31883 Eatland :39902 Min. : 0.00
## delay :23972 email :12160 Drinkland:40098 1st Qu.: 4.00
## incorrect_item:16108 phone :28031 Median : 7.00
## other :15939 web_form: 7926 Mean : 6.42
## status :15965 3rd Qu.: 9.00
## Max. :10.00
##
## open.comment nps.segment detractor satisfaction
## Length:80000 detractor:35397 Min. :0.0000 Min. :1.000
## Class :character passive :20857 1st Qu.:0.0000 1st Qu.:3.000
## Mode :character promoter :23746 Median :0.0000 Median :4.000
## Mean :0.4425 Mean :3.465
## 3rd Qu.:1.0000 3rd Qu.:4.000
## Max. :1.0000 Max. :5.000
## NA's :53080
The data is simulated as expected. We can see a slightly lower percentage of detractors for the whole year compared to the first semester alone. Let's analyze this difference in more detail.
As a first exploration of the change in detractors after correcting the photo issue, we calculate the percentage of detractors across the weeks of the year. For this purpose, we use a reusable function from my R toolkit that computes percentages with 95% CIs across the levels of a factor.
# --- 1. Specify the function
f.prop.dicdvl.1f <- function(my.data, my.dv.list, my.factor){
# Initialize an empty list to store the sampled subsets from each stratum.
results <- list()
# Initialize a counter for indexing elements in the 'results' list.
p <- 1
# Obtain the levels of the factor
f.levels <- unique (my.data[[my.factor]])
#Create a loop for each dv and for each level of the factors
for (dv in my.dv.list){
for (lf in f.levels){
# Extract the vector, successes and number of cases
current.vector <- my.data[my.data[[my.factor]]==lf, dv]
current.vector.c <- as.numeric(current.vector[!is.na(current.vector)]) #cleaning missing cases
n <- length(current.vector.c) # number of cases in the vector
s <- length(current.vector.c[current.vector.c == 1]) # number of cases satisfying the condition
# Calculate proportions and their 95% CI
if(n > 0){
proportion <- s / n #calculating proportions
binomial.test <- binom.test (s, n) # binomial test to calculate 95% CI of the proportions
ci_lower <- binomial.test$conf.int[1] #extracting the lower limit of the CI
ci_upper <- binomial.test$conf.int[2] #extracting the upper limit of the CI
}else{
proportion <- NA
ci_lower <- NA
ci_upper <- NA
}
# Save the results in a one-row data frame
results [[p]] <- data.frame(dep.variable = dv, factor = lf, percentage = proportion * 100,
ci_lower.perc = ci_lower * 100, ci_upper.perc = ci_upper * 100, n.cases = n, stringsAsFactors = FALSE)
p <- p + 1
} # close the loop for the factor
} # close the loop for the dv
# Combine the results into a single dataframe
results.df <- (do.call(rbind, results))
rownames(results.df) <- NULL
#Return the dataframe
return(results.df)
}
# --- 2. Apply the function to a factor that specifies the weeks
# Order the data by date to make the weeks easier to recognize
data.2024 <- data.2024[order(data.2024$date), ]
# Group dates in weeks
data.2024$date.week <- as.Date(cut(data.2024$date, "week"))
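# Note: cut(dates, "week") maps each date to the Monday starting its week (e.g., 2024-10-03 -> 2024-09-30)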
# Apply the function to calculate percentages
dv.list = c("detractor")
prop.detractors.by.week <- f.prop.dicdvl.1f (my.data = data.2024,
my.dv.list = dv.list,
my.factor = "date.week")
# Convert the week identifiers back to Date class
prop.detractors.by.week$factor <- as.Date(prop.detractors.by.week$factor)
# Visualize the results
prop.detractors.by.week
We can see that after October 1st there is a downward trend in detractors. This trend becomes clearer when visualized in a line graph that shows the weekly percentages with their 95% confidence intervals, together with a marker of the correction date.
# Mark the correction on the plot (placed at 2024-09-30, the Monday of the correction week, to align with the weekly bins)
date_correction <- as.Date("2024-09-30")
# Create the line graph
ggplot(prop.detractors.by.week, aes(x = factor, y = percentage)) +
geom_line(color = "steelblue", linewidth = 1.2) +
geom_point(aes(), color = "darkblue", alpha = 0.6) +
geom_ribbon(aes(ymin = ci_lower.perc, ymax = ci_upper.perc),
alpha = 0.2, fill = "steelblue") +
# Vertical line to indicate the correction
geom_vline(xintercept = date_correction,
color = "red", linewidth = 1, linetype = "dashed") +
# Annotation of the correction
annotate("text", x = date_correction, y = max(prop.detractors.by.week$percentage, na.rm = TRUE),
label = "Correction of\nphoto issues",
hjust = -0.1, vjust = 1.2, color = "red", fontface = "bold", size = 4) +
scale_x_date(
date_breaks = "1 month",
date_labels = "%b %Y",
expand = expansion(mult = c(0.05, 0.15)) # extra space on the right for the annotation
) +
scale_y_continuous(labels = scales::percent_format(scale = 1)) +
labs(
title = "Weekly Evolution in the Percentage of Detractors",
x = "Week",
y = "Porcentage of Detractors"
) +
theme_minimal() +
theme(axis.text.x = element_text(angle = 45, hjust = 1))
The graph shows that the decrease in detractors after correcting the photo issue is maintained throughout the rest of the period, supporting the claim that this correction affected the likelihood of customers recommending our customer service.
To further validate this claim, we can test its statistical significance.
For this test, we compare the proportion of detractors before and after October 1st, the date of the correction. We first summarize the percentages and their confidence intervals for each period, and then formally evaluate the change with a Chi-square test.
# Create a dichotomous variable for before and after the correction
data.2024$moment.photo.correction <- NA_character_
data.2024$moment.photo.correction [data.2024$date < "2024-10-1"] <- "before"
data.2024$moment.photo.correction [data.2024$date >= "2024-10-1"] <- "after"
data.2024$moment.photo.correction <- factor (data.2024$moment.photo.correction)
# Apply the function to calculate percentages, using detractors as our dv and the moment of the correction as a factor:
dv.list = c("detractor")
prop.detractors.after.photo.cor <- f.prop.dicdvl.1f (my.data = data.2024,
my.dv.list = dv.list,
my.factor = "moment.photo.correction")
# Visualize the results
prop.detractors.after.photo.cor
# Calculate the exact difference in the proportion of detractors
prop.difference.before.after <- prop.detractors.after.photo.cor [1,3] - prop.detractors.after.photo.cor [2,3]
prop.difference.before.after
## [1] 2.442318
# Run a chi-square test to assess the significance of the change
chisq.test(with(data.2024, table(detractor, moment.photo.correction)))
##
## Pearson's Chi-squared test with Yates' continuity correction
##
## data: with(data.2024, table(detractor, moment.photo.correction))
## X-squared = 40.369, df = 1, p-value = 2.103e-10
The results show that the decrease in detractors after correcting the photo issue was statistically significant. Interestingly, the observed reduction (about 2.4 percentage points) was even stronger than our prediction (about 1.4 points).
A plausible explanation is that our earlier prediction was based only on customers who explicitly reported the problem. In practice, however, the correction may have also benefited other customers indirectly affected by the issue, amplifying its positive impact. It is also important to consider that, since we do not have an experimental design, we cannot rule out that other improvements contributed to the reduction of detractors.
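As a complementary check (a sketch, not part of the original analysis), a two-sample proportion test on the same 2x2 table would give an equivalent result plus a 95% confidence interval for the before/after difference:
# Sketch: two-proportion test with a 95% CI for the difference in proportions
tab <- with(data.2024, table(moment.photo.correction, detractor))
prop.test(x = tab[, "1"], n = rowSums(tab))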
Our analysis shows that the correction of the photo issue in chat was followed by a statistically significant 2.4 percentage-point reduction in detractors. This provides managers with a clear, high-ROI business case: targeted investments in resolving specific operational failures can generate significant improvements in customer loyalty and brand reputation.
It is important to recognize some limitations. First, the analyses are based on observational data, so other contextual factors (e.g., seasonality, parallel process improvements) might also contribute to the observed change. Second, some of our predictions were based on self-reported cases, which might underestimate the total number of customers affected. Finally, while we focused on detractors, other loyalty metrics (e.g., promoters, NPS distribution) could provide a more complete picture.
Future work could extend these analyses by:
- Evaluating the impact of other operational fixes in parallel, to prioritize interventions by ROI.
- Using more advanced causal inference techniques (e.g., difference-in-differences, synthetic control) to strengthen the causal attribution of changes.
- Exploring heterogeneity across customer segments, to identify whether some groups benefited more strongly from the correction.
These steps would allow companies not only to validate the robustness of the results, but also to scale the approach into a systematic framework for continuous service improvement.