MANN-WHITNEY U TEST

HYPOTHESES TESTED:

NULL HYPOTHESIS (H0): There is no difference in customer satisfaction scores between customers served by human agents and those served by an AI chatbot.

ALTERNATE HYPOTHESIS (H1)

NON-DIRECTIONAL ALTERNATE HYPOTHESIS: There is a difference in customer satisfaction scores between customers served by human agents and those served by the AI chatbot.

DIRECTIONAL ALTERNATE HYPOTHESIS ONE: Customer satisfaction scores of those served by human agents are higher than the scores of those served by the AI chatbot.

IMPORT EXCEL FILE

INSTALL REQUIRED PACKAGE

install.packages("readxl")

LOAD THE PACKAGE

library(readxl)

IMPORT EXCEL FILE INTO R STUDIO

A6R2 <- read_excel("C:\\Users\\ADEBAYO\\Documents\\A6R2.xlsx")

DESCRIPTIVE STATISTICS

INSTALL REQUIRED PACKAGE

install.packages("dplyr")

LOAD THE PACKAGE

library(dplyr)

## 
## Attaching package: 'dplyr'

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

CALCULATE THE DESCRIPTIVE STATISTICS

A6R2 %>%
  group_by(ServiceType) %>%
  summarise(
    Mean = mean(SatisfactionScore, na.rm = TRUE),
    Median = median(SatisfactionScore, na.rm = TRUE),
    SD = sd(SatisfactionScore, na.rm = TRUE),
    N = n()
  )
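Because the Mann-Whitney U test compares ranks rather than means, the median and interquartile range are often reported next to the mean and SD. The sketch below shows one way to get them per group in base R. The `toy` data frame is hypothetical and its values are invented for illustration; it only mimics the column names of A6R2.

```r
# Hypothetical toy data (invented values), using the same column names as A6R2
toy <- data.frame(
  ServiceType = rep(c("Human", "AI"), each = 5),
  SatisfactionScore = c(7, 8, 9, 9, 10, 3, 4, 5, 5, 6)
)

# Median and interquartile range of the scores within each service type
group_median <- tapply(toy$SatisfactionScore, toy$ServiceType, median)
group_iqr    <- tapply(toy$SatisfactionScore, toy$ServiceType, IQR)

group_median
group_iqr
```

The same two summaries could be added to the summarise() call as extra columns if the dplyr pipeline is preferred.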

HISTOGRAMS

CREATE THE HISTOGRAMS

hist(A6R2$SatisfactionScore[A6R2$ServiceType == "Human"],
main = "Histogram of Human Agents",
xlab = "Human Agents",
ylab = "Frequency",
col = "lightblue",
border = "black",
breaks = 20)

hist(A6R2$SatisfactionScore[A6R2$ServiceType == "AI"],
main = "Histogram of AI Chatbot",
xlab = "AI Chatbot",
ylab = "Frequency",
col = "lightgreen",
border = "black",
breaks = 20)

Q1) It is negatively skewed

Q2) It has a proper bell curve

Q3) It is positively skewed

Q4) It has a proper bell curve

SHAPIRO-WILK TEST

CONDUCT THE SHAPIRO-WILK TEST

shapiro.test(A6R2$SatisfactionScore[A6R2$ServiceType == "Human"])

## 
##  Shapiro-Wilk normality test
## 
## data:  A6R2$SatisfactionScore[A6R2$ServiceType == "Human"]
## W = 0.93741, p-value = 0.0001344

shapiro.test(A6R2$SatisfactionScore[A6R2$ServiceType == "AI"])

## 
##  Shapiro-Wilk normality test
## 
## data:  A6R2$SatisfactionScore[A6R2$ServiceType == "AI"]
## W = 0.91143, p-value = 5.083e-06

The data are not normally distributed for Human agents (p = 0.0001344, which is below .05).

The data are not normally distributed for AI Chatbot (p = 5.083e-06, which is below .05).
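The normality decision can be written out as a short check. This is a minimal sketch that assumes the conventional significance level of .05 (the level is an assumption, not stated in the output) and reuses the two p-values reported by shapiro.test() above.

```r
# Assumed significance level (conventional choice, not taken from the output)
alpha <- 0.05

# p-values copied from the Shapiro-Wilk output above
p_values <- c(Human = 0.0001344, AI = 5.083e-06)

# TRUE would mean "no evidence against normality"; FALSE means normality is rejected
is_normal <- p_values > alpha
is_normal
```

Both entries come out FALSE, which is why a non-parametric test such as the Mann-Whitney U test is used instead of an independent-samples t-test.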

BOXPLOT

INSTALL REQUIRED PACKAGE

install.packages("ggplot2")

install.packages("ggpubr")

LOAD THE PACKAGE

library(ggplot2)
library(ggpubr)

CREATE THE BOXPLOT

ggboxplot(A6R2, x = "ServiceType", y = "SatisfactionScore",
          color = "ServiceType",
          palette = "jco",
          add = "jitter")

There are dots outside the whiskers of the boxplots, which indicates possible outliers.
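Because add = "jitter" draws every observation as a dot, it can help to list the points beyond the whiskers numerically. The sketch below uses base R's boxplot.stats(), which applies the usual 1.5 times the box length rule. The vector `x` is hypothetical and invented for illustration; with the real data it would be the scores of one group.

```r
# Hypothetical scores (invented values) with one extreme observation
x <- c(5, 6, 6, 7, 7, 7, 8, 8, 20)

# boxplot.stats() returns the points lying beyond the whiskers in $out
outliers <- boxplot.stats(x)$out
outliers
```

For the report data, the same call could be applied to A6R2$SatisfactionScore[A6R2$ServiceType == "Human"] and to the "AI" subset.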

MANN-WHITNEY U TEST

wilcox.test(SatisfactionScore ~ ServiceType, data = A6R2, exact = FALSE)

## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  SatisfactionScore by ServiceType
## W = 497, p-value < 2.2e-16
## alternative hypothesis: true location shift is not equal to 0
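The call above is two-sided, so it addresses the non-directional hypothesis. A minimal sketch of a one-sided test for the directional hypothesis (Human higher than AI) follows. It assumes the grouping column is turned into a factor with "Human" as the first level, because wilcox.test() compares the first level against the second. The `toy` data frame is hypothetical, with invented values.

```r
# Hypothetical toy data (invented values), using the same column names as A6R2
toy <- data.frame(
  ServiceType = rep(c("Human", "AI"), each = 6),
  SatisfactionScore = c(8, 9, 7, 9, 10, 8, 4, 5, 3, 6, 5, 4)
)

# Put "Human" first so that alternative = "greater" means Human > AI
toy$ServiceType <- factor(toy$ServiceType, levels = c("Human", "AI"))

# One-sided Mann-Whitney U test with the normal approximation
res <- wilcox.test(SatisfactionScore ~ ServiceType, data = toy,
                   alternative = "greater", exact = FALSE)
res$statistic
res$p.value
```

Without the explicit factor levels, R orders the groups alphabetically ("AI" before "Human"), and alternative = "greater" would then test the opposite direction.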

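DETERMINE STATISTICAL SIGNIFICANCE

The p-value (< 2.2e-16) is less than .05, so the result is statistically significant. The null hypothesis is rejected: satisfaction scores differ between customers served by human agents and those served by the AI chatbot.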

EFFECT-SIZE

INSTALL REQUIRED PACKAGE

#install.packages("effectsize")

LOAD THE PACKAGE

library(effectsize)

CALCULATE EFFECT SIZE (R VALUE)

#mw_results <- wilcox.test(SatisfactionScore ~ ServiceType, data = A6R2, exact = FALSE)
#rank_biserial(mw_results)

install.packages("coin") # required by rstatix for this function

library(rstatix)

## 
## Attaching package: 'rstatix'

## The following objects are masked from 'package:effectsize':
## 
##     cohens_d, eta_squared

## The following object is masked from 'package:stats':
## 
##     filter

wilcox_effsize(A6R2, SatisfactionScore ~ ServiceType, ci = TRUE)

Assignment 6B

Team 1

2025-09-26

Q1) The size of the effect is large (r = 0.784).

The effect size indicates how large the difference between the two groups is.

± 0.00 to 0.10 = ignore

± 0.10 to 0.30 = small

± 0.30 to 0.50 = moderate

± 0.50 and above = large
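The cut-offs above can be applied programmatically. The sketch below is one possible helper; `classify_effect` is a hypothetical name, not a function from rstatix or effectsize, and it treats each lower bound as inclusive (an assumption, since the table does not say which side the boundaries belong to).

```r
# Hypothetical helper: label an effect size r using the cut-offs listed above
classify_effect <- function(r) {
  labels <- c("ignore", "small", "moderate", "large")
  # abs() because the sign only gives the direction of the difference;
  # right = FALSE makes the intervals [0, 0.10), [0.10, 0.30), [0.30, 0.50), [0.50, Inf)
  as.character(cut(abs(r),
                   breaks = c(0, 0.10, 0.30, 0.50, Inf),
                   labels = labels,
                   right = FALSE))
}

classify_effect(0.784)
classify_effect(c(0.05, -0.2, 0.4))
```

Applied to the reported value of 0.784, the helper returns "large", which matches the answer to Q1.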