Week 6 - Scenario 2

Research Scenario 2

A customer service firm wants to test whether customer satisfaction scores differ between customers served by human agents and those served by an AI chatbot. After each interaction, customers rate their satisfaction on a single-item rating scale. Is there a difference in the average satisfaction scores of the two groups?

Hypotheses

Null Hypothesis (H₀)

There is no difference in average customer satisfaction scores between customers served by human agents and those served by an AI chatbot.

Alternative Hypothesis (H₁)

There is a difference in average customer satisfaction scores between customers served by human agents and those served by an AI chatbot.

Results Paragraph

A Mann-Whitney U test was conducted to test whether satisfaction scores differed between customers served by human agents (n = 100) and customers served by an AI chatbot (n = 100). Customers served by human agents had significantly higher satisfaction scores (Mdn = 8) than those served by the AI chatbot (Mdn = 3), U = 497, p < .001. The effect size was large (rank-biserial r = 0.90 in magnitude), indicating a substantial difference between the two groups. Overall, customer satisfaction was higher among customers served by human agents.

R Code

# INSTALL REQUIRED PACKAGE
# install.packages("readxl")

# LOAD THE PACKAGE
library(readxl)

# IMPORT EXCEL FILE INTO R STUDIO
A6R2 <- read_excel("C:/Users/rushi/Downloads/A6R2.xlsx")

# INSTALL REQUIRED PACKAGE
# install.packages("dplyr")

# LOAD THE PACKAGE
library(dplyr)

## 
## Attaching package: 'dplyr'

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

# CALCULATE THE DESCRIPTIVE STATISTICS
A6R2 %>%
  group_by(ServiceType) %>%
  summarise(
    Mean = mean(SatisfactionScore, na.rm = TRUE),
    Median = median(SatisfactionScore, na.rm = TRUE),
    SD = sd(SatisfactionScore, na.rm = TRUE),
    N = n()
  )

## # A tibble: 2 × 5
##   ServiceType  Mean Median    SD     N
##   <chr>       <dbl>  <dbl> <dbl> <int>
## 1 AI           3.6       3  1.60   100
## 2 Human        7.42      8  1.44   100

# CREATE THE HISTOGRAMS (GROUP 1 = HUMAN, GROUP 2 = AI)
hist(A6R2$SatisfactionScore[A6R2$ServiceType == "Human"],
     main = "Histogram of Group 1 (Human) Scores",
     xlab = "Satisfaction Score",
     ylab = "Frequency",
     col = "lightblue",
     border = "black",
     breaks = 20)

hist(A6R2$SatisfactionScore[A6R2$ServiceType == "AI"],
     main = "Histogram of Group 2 (AI) Scores",
     xlab = "Satisfaction Score",
     ylab = "Frequency",
     col = "lightgreen",
     border = "black",
     breaks = 20)

# QUESTIONS

# Q1) Check the SKEWNESS of the VARIABLE 1 histogram. In your opinion, does the histogram look symmetrical, positively skewed, or negatively skewed?
#     It looks almost symmetrical, with a slight negative skew.

# Q2) Check the KURTOSIS of the VARIABLE 1 histogram. In your opinion, does the histogram look too flat, too tall, or does it have a proper bell curve?
#     It almost has a proper bell curve.

# Q3) Check the SKEWNESS of the VARIABLE 2 histogram. In your opinion, does the histogram look symmetrical, positively skewed, or negatively skewed?
#     It does not look symmetrical; it is positively skewed.

# Q4) Check the KURTOSIS of the VARIABLE 2 histogram. In your opinion, does the histogram look too flat, too tall, or does it have a proper bell curve?
#     It appears a bit too flat.
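
# OPTIONAL NUMERIC CHECK (added sketch, not part of the original assignment)
# The answers above are visual judgments. The two helper functions below are
# hypothetical names written for illustration; they compute moment-based
# skewness and excess kurtosis in base R. Values near 0 suggest a symmetric,
# bell-shaped histogram; the sign of the skewness gives the direction of skew.
skew_moment <- function(x) {
  x <- x[!is.na(x)]
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}
excess_kurtosis <- function(x) {
  x <- x[!is.na(x)]
  d <- x - mean(x)
  mean(d^4) / mean(d^2)^2 - 3
}
# Apply to each group only when the data frame has been imported.
if (exists("A6R2")) {
  print(tapply(A6R2$SatisfactionScore, A6R2$ServiceType, skew_moment))
  print(tapply(A6R2$SatisfactionScore, A6R2$ServiceType, excess_kurtosis))
}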

# SHAPIRO-WILK TEST
shapiro.test(A6R2$SatisfactionScore[A6R2$ServiceType == "Human"])

## 
##  Shapiro-Wilk normality test
## 
## data:  A6R2$SatisfactionScore[A6R2$ServiceType == "Human"]
## W = 0.93741, p-value = 0.0001344

shapiro.test(A6R2$SatisfactionScore[A6R2$ServiceType == "AI"])

## 
##  Shapiro-Wilk normality test
## 
## data:  A6R2$SatisfactionScore[A6R2$ServiceType == "AI"]
## W = 0.91143, p-value = 5.083e-06

# QUESTION

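# Was the data normally distributed for Variable 1?
#     No. The Shapiro-Wilk test was significant (W = 0.937, p < .001), so the Human scores are not normally distributed.
# Was the data normally distributed for Variable 2?
#     No. The Shapiro-Wilk test was significant (W = 0.911, p < .001), so the AI scores are not normally distributed.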
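
# DECISION RULE SKETCH (added illustration, not part of the original assignment)
# The p-values are copied from the Shapiro-Wilk output above. The 0.05
# threshold is an assumed conventional cutoff; the assignment does not state
# one. The variable names are illustrative.
p_human <- 0.0001344
p_ai    <- 5.083e-06
alpha   <- 0.05
both_normal <- (p_human > alpha) && (p_ai > alpha)
# If either group departs from normality, use the nonparametric test.
chosen_test <- if (both_normal) "Independent t-test" else "Mann-Whitney U test"
chosen_test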

# INSTALL REQUIRED PACKAGE
#install.packages("ggplot2")
#install.packages("ggpubr")

# LOAD THE PACKAGE
library(ggplot2)
library(ggpubr)

# CREATE THE BOXPLOT
ggboxplot(A6R2, x = "ServiceType", y = "SatisfactionScore",
          color = "ServiceType",
          palette = "jco",
          add = "jitter")

# QUESTION

# Q1) Were there any dots outside of the boxplot? Are these dots close to the whiskers of the boxplot or are they very far away?
#     Yes. There are more than one or two dots outside the boxes, and they lie far from the whiskers. Together with the non-normal Shapiro-Wilk results, this is why we switched to the Mann-Whitney U test.

# MANN-WHITNEY U TEST
wilcox.test(SatisfactionScore ~ ServiceType, data = A6R2, exact = FALSE)

## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  SatisfactionScore by ServiceType
## W = 497, p-value < 2.2e-16
## alternative hypothesis: true location shift is not equal to 0
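
# HOW W IS COMPUTED (added sketch using made-up toy numbers, not the A6R2 data)
# R's W counts, over every pair (one score from the first group, one from the
# second), how often the first group's score is higher, with ties counted as 0.5.
# The vectors toy_ai and toy_human are hypothetical.
toy_ai    <- c(2, 3, 4)
toy_human <- c(6, 7, 3.5)
w_by_hand <- sum(outer(toy_ai, toy_human, ">")) +
  0.5 * sum(outer(toy_ai, toy_human, "=="))
w_from_r <- unname(wilcox.test(toy_ai, toy_human)$statistic)
c(by_hand = w_by_hand, from_r = w_from_r)
# In the real data, W = 497 is small relative to the 100 * 100 = 10000 possible
# pairs, so the first group (AI) rarely outscored the second group (Human).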

# INSTALL REQUIRED PACKAGE
# install.packages("effectsize")

# LOAD THE PACKAGE
library(effectsize)

# CALCULATE EFFECT SIZE (R VALUE)
rank_biserial(SatisfactionScore ~ ServiceType, data = A6R2, exact = FALSE)

## r (rank biserial) |         95% CI
## ----------------------------------
## -0.90             | [-0.93, -0.87]
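
# WORKED CHECK (added sketch, not part of the original assignment)
# The rank-biserial r can be recovered by hand as r = 2 * W / (n1 * n2) - 1,
# assuming W = 497 is the statistic for the first group (AI) and each group
# has 100 customers, as the descriptive table reports. The variable names
# below are illustrative.
w_stat  <- 497
n_ai    <- 100
n_human <- 100
r_by_hand <- 2 * w_stat / (n_ai * n_human) - 1
round(r_by_hand, 2)
# The result is negative because the first group (AI) tended to score lower.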

# QUESTIONS

# Q1) What is the size of the effect?
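#     The rank-biserial correlation is -0.90 (magnitude 0.90), which is a large effect: the difference between the groups was big. The sign is negative because the first group in the comparison (AI) scored lower than the second (Human).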

# Q2) Which group had the higher average rank?
#     The group served by human agents had the higher average rank (mean = 7.42, median = 8).