Research Scenario 2

Research Question

A customer service firm wants to test whether customer satisfaction scores differ between customers served by human agents and customers served by an AI chatbot. After each interaction, customers rate their satisfaction on a single-item rating scale. Is there a difference in the average satisfaction scores of the two groups?

Hypothesis

Null Hypothesis (H0): There is no difference in average customer satisfaction scores between customers served by human agents and those served by an AI chatbot.

Alternate Hypothesis (H1): There is a difference in average customer satisfaction scores between customers served by human agents and those served by an AI chatbot.

# install.packages("readxl")  # run once if the package is not installed
# Load required packages
library(readxl)
library(dplyr)

## 
## Attaching package: 'dplyr'

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

# Import the Excel file
A6R2 <- read_excel("C:/Users/sravz/Downloads/A6R2.xlsx")

# calculate descriptive statistics
A6R2 %>%
  group_by(ServiceType) %>%
  summarise(
    Mean = mean(SatisfactionScore, na.rm = TRUE),
    Median = median(SatisfactionScore, na.rm = TRUE),
    SD = sd(SatisfactionScore, na.rm = TRUE),
    N = n()
  )

## # A tibble: 2 × 5
##   ServiceType  Mean Median    SD     N
##   <chr>       <dbl>  <dbl> <dbl> <int>
## 1 AI           3.6       3  1.60   100
## 2 Human        7.42      8  1.44   100

HISTOGRAMS

hist(A6R2$SatisfactionScore[A6R2$ServiceType == "Human"],
     main = "Histogram of Human Scores",
     xlab = "Value",
     ylab = "Frequency",
     col = "lightpink",
     border = "black",
     breaks = 20)

hist(A6R2$SatisfactionScore[A6R2$ServiceType == "AI"],
     main = "Histogram of AI Scores",
     xlab = "Value",
     ylab = "Frequency",
     col = "orange",
     border = "black",
     breaks = 20)

QUESTIONS

Q1) Check the SKEWNESS of the VARIABLE 1 histogram. In your opinion, does the histogram look symmetrical, positively skewed, or negatively skewed?

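The Human histogram (Variable 1) appears negatively skewed. This agrees with the descriptive statistics, where the mean (7.42) falls below the median (8).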

Q2) Check the KURTOSIS of the VARIABLE 1 histogram. In your opinion, does the histogram look too flat, too tall, or does it have a proper bell curve?

The Human histogram appears too tall (peaked) rather than following a proper bell curve.

Q3) Check the SKEWNESS of the VARIABLE 2 histogram. In your opinion, does the histogram look symmetrical, positively skewed, or negatively skewed?

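The AI histogram (Variable 2) appears positively skewed. This agrees with the descriptive statistics, where the mean (3.6) lies above the median (3).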

Q4) Check the KURTOSIS of the VARIABLE 2 histogram. In your opinion, does the histogram look too flat, too tall, or does it have a proper bell curve?

The AI histogram also appears too tall (peaked) rather than following a proper bell curve.
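The visual judgments above could be backed up with numbers. The following is a minimal sketch in base R, assuming moment-based formulas for skewness and excess kurtosis; the helper names sample_skewness and sample_excess_kurtosis and the demo_scores vector are hypothetical and are not part of the original analysis.

```r
# Moment-based sample skewness (hypothetical helper, base R only).
# Negative values suggest a tail toward low scores, positive toward high scores.
sample_skewness <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^1.5
}

# Moment-based excess kurtosis (hypothetical helper).
# Values above 0 suggest a taller peak or heavier tails than a normal curve.
sample_excess_kurtosis <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  mean((x - m)^4) / mean((x - m)^2)^2 - 3
}

# Made-up ratings, NOT the A6R2 data: mostly high scores with a few low ones.
demo_scores <- c(2, 5, 7, 8, 8, 8, 9, 9, 9, 10)
sample_skewness(demo_scores)         # negative for this left-tailed example
sample_excess_kurtosis(demo_scores)
```

With the real data, the same helpers could be applied to A6R2$SatisfactionScore[A6R2$ServiceType == "Human"] and to the matching "AI" subset.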

SHAPIRO-WILK TEST

shapiro.test(A6R2$SatisfactionScore[A6R2$ServiceType == "Human"])

## 
##  Shapiro-Wilk normality test
## 
## data:  A6R2$SatisfactionScore[A6R2$ServiceType == "Human"]
## W = 0.93741, p-value = 0.0001344

shapiro.test(A6R2$SatisfactionScore[A6R2$ServiceType == "AI"])

## 
##  Shapiro-Wilk normality test
## 
## data:  A6R2$SatisfactionScore[A6R2$ServiceType == "AI"]
## W = 0.91143, p-value = 5.083e-06

QUESTIONS

Was the data normally distributed for Variable 1?

No. The Shapiro-Wilk test was significant for the Human scores (W = 0.937, p < .001), so the data are NOT normally distributed.

Was the data normally distributed for Variable 2?

No. The Shapiro-Wilk test was significant for the AI scores (W = 0.911, p < .001), so the data are NOT normally distributed.
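The decision rule used in these answers (p below .05 means the scores depart from normality) can be written out explicitly. This is only a sketch: check_normality is a hypothetical wrapper around shapiro.test(), the 0.05 cutoff is an assumed convention, and demo_skewed is simulated data rather than the A6R2 scores.

```r
# Hypothetical helper: run shapiro.test() and report the decision at a cutoff.
check_normality <- function(x, alpha = 0.05) {
  res <- shapiro.test(x)
  data.frame(
    W = unname(res$statistic),            # Shapiro-Wilk W statistic
    p_value = res$p.value,                # p-value of the test
    looks_normal = res$p.value >= alpha   # FALSE means normality is rejected
  )
}

set.seed(1)
demo_skewed <- rexp(100)   # simulated, strongly right-skewed values
check_normality(demo_skewed)
```

Calling the helper on each ServiceType subset would return one row per group, which makes the two decisions easy to compare side by side.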

library(ggplot2)
library(ggpubr)

BOXPLOT

ggboxplot(A6R2, x = "ServiceType", y = "SatisfactionScore",
          color = "ServiceType",
          palette = "jco",
          add = "jitter")

QUESTIONS

Q1) Were there any dots outside of the boxplot? Are these dots close to the whiskers of the boxplot or are they very far away?

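Yes. There are many dots outside the boxes, and they are very far away from the whiskers. Because of these outliers and the non-normal distributions found by the Shapiro-Wilk tests, the analysis switched to the Mann-Whitney U test.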
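Because add = "jitter" overlays every observation as a dot, it can be hard to tell true outliers from ordinary points by eye. One possible numeric check, sketched here with base R's boxplot.stats() and its default 1.5 x IQR whisker rule, lists the values that fall beyond the whiskers. The demo_scores vector is made up for illustration and is not the A6R2 data.

```r
# Made-up ratings, NOT the A6R2 data: two low scores far from the rest.
demo_scores <- c(6, 7, 7, 8, 8, 8, 9, 9, 1, 2)

# boxplot.stats() applies the usual 1.5 * IQR whisker rule (coef = 1.5).
stats <- boxplot.stats(demo_scores)

stats$stats   # lower whisker, lower hinge, median, upper hinge, upper whisker
stats$out     # values beyond the whiskers, i.e. the outlier dots
```

For the report data, the same call could be run once per group, for example on A6R2$SatisfactionScore[A6R2$ServiceType == "AI"].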

MANN-WHITNEY U TEST

wilcox.test(SatisfactionScore ~ ServiceType, data = A6R2, exact = FALSE)

## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  SatisfactionScore by ServiceType
## W = 497, p-value < 2.2e-16
## alternative hypothesis: true location shift is not equal to 0

library(effectsize)

EFFECT SIZE (R VALUE)

rank_biserial(SatisfactionScore ~ ServiceType, data = A6R2, exact = FALSE)

## r (rank biserial) |         95% CI
## ----------------------------------
## -0.90             | [-0.93, -0.87]

QUESTIONS

Q1) What is the size of the effect?

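A rank-biserial correlation of -0.90 indicates that the difference between the groups was large. The sign is negative because AI is the first group in the comparison and its scores rank lower than the Human scores.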

Q2) Which group had the higher average rank?

The group served by human agents had the higher average rank.
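The reported effect size can be cross-checked against the W statistic by hand. The sketch below uses the standard relation r = 2W / (n1 * n2) - 1 and assumes that neither group has missing scores, so that both groups contribute the 100 values shown in the descriptive statistics table.

```r
W  <- 497   # W from the wilcox.test() output (first group: AI)
n1 <- 100   # AI group size, assuming no missing scores
n2 <- 100   # Human group size, assuming no missing scores

# Share of AI-Human pairs in which the AI score is the higher one
# (tied pairs count as half a pair).
prob_ai_higher <- W / (n1 * n2)

# Rank-biserial correlation recovered from W.
r_rb <- 2 * prob_ai_higher - 1
round(r_rb, 2)   # -0.9, matching the rank_biserial() output
```

Read this way, the AI score is the higher one in only about 5% of all AI-Human pairs, which is why the effect is classed as large.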

Final Report

A Mann-Whitney U test was conducted to test whether satisfaction scores differed between customers served by human agents (n = 100) and customers served by an AI chatbot (n = 100). Customers served by human agents had significantly higher satisfaction scores (Mdn = 8) than those served by the AI chatbot (Mdn = 3), W = 497, p < .001. The effect size was large (r = -0.90), indicating a substantial difference between the two groups. Overall, customer satisfaction was higher among customers served by human agents.

Sravya Valluri

2025-11-20