MANN-WHITNEY U TEST

This analysis pertains to Research Scenario 2 from Assignment 6. The objective is to determine whether there is a statistically significant difference in the distribution of customer satisfaction scores between two independent groups: customers served by human agents and those assisted by an AI chatbot. Given that the data violated normality assumptions and exhibited outliers, the non-parametric Mann-Whitney U test was selected as the appropriate inferential method.

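Hypotheses

Null hypothesis (H0): There is no difference in the distribution of customer satisfaction scores between customers served by human agents and those assisted by the AI chatbot.

Alternative hypothesis (H1): There is a difference in the distribution of customer satisfaction scores between customers served by human agents and those assisted by the AI chatbot.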

R code and Analysis

Import Excel File

Purpose:

Import the Excel dataset into R to conduct the analyses.

# INSTALL REQUIRED PACKAGE
 
# install.packages("readxl")
 
# LOAD THE PACKAGE
 
library(readxl)
 
# IMPORT EXCEL FILE INTO R STUDIO
 
A6R2 <- read_excel("C:/Users/konifade/Downloads/A6R2.xlsx")
 
head(A6R2)
## # A tibble: 6 × 3
##   CustomerID ServiceType SatisfactionScore
##        <dbl> <chr>                   <dbl>
## 1          1 Human                       9
## 2          2 Human                       6
## 3          3 Human                       6
## 4          4 Human                       9
## 5          5 Human                       9
## 6          6 Human                       8

Calculate Descriptive Statistics

# install.packages("dplyr")

library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
A6R2 %>%
  group_by(ServiceType) %>%
  summarise(
    Mean = mean(SatisfactionScore, na.rm = TRUE),
    Median = median(SatisfactionScore, na.rm = TRUE),
    SD = sd(SatisfactionScore, na.rm = TRUE),
    N = n()
  )
## # A tibble: 2 × 5
##   ServiceType  Mean Median    SD     N
##   <chr>       <dbl>  <dbl> <dbl> <int>
## 1 AI           3.6       3  1.60   100
## 2 Human        7.42      8  1.44   100

Check Normal Distribution

# HISTOGRAMS
# Purpose: Visually check the normality of the scores for each group.

# Group 1 (Variable 1): customers served by human agents
hist(A6R2$SatisfactionScore[A6R2$ServiceType == "Human"],
     main = "Histogram of Group 1 (Human) Scores",
     xlab = "Satisfaction Score",
     ylab = "Frequency",
     col = "lightblue",
     border = "black",
     breaks = 20)

# Group 2 (Variable 2): customers assisted by the AI chatbot
hist(A6R2$SatisfactionScore[A6R2$ServiceType == "AI"],
     main = "Histogram of Group 2 (AI) Scores",
     xlab = "Satisfaction Score",
     ylab = "Frequency",
     col = "lightgreen",
     border = "black",
     breaks = 20)

QUESTIONS

Q1) Check the SKEWNESS of the VARIABLE 1 histogram. In your opinion, does the histogram look symmetrical, positively skewed, or negatively skewed?

  • Ans. The histogram is negatively skewed.

Q2) Check the KURTOSIS of the VARIABLE 1 histogram. In your opinion, does the histogram look too flat, too tall, or does it have a proper bell curve?

  • Ans. The histogram looks too tall.

Q3) Check the SKEWNESS of the VARIABLE 2 histogram. In your opinion, does the histogram look symmetrical, positively skewed, or negatively skewed?

  • Ans. The histogram is positively skewed.

Q4) Check the KURTOSIS of the VARIABLE 2 histogram. In your opinion, does the histogram look too flat, too tall, or does it have a proper bell curve?

  • Ans. The histogram looks too flat.
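The visual judgments above can be cross-checked numerically. The following is a minimal sketch, assuming moment-based definitions of skewness and excess kurtosis; the helper names `sample_skewness` and `sample_excess_kurtosis` and the toy vector are hypothetical and are not part of the assignment template or data.

```r
# Moment-based sample skewness: negative = left (negative) skew,
# positive = right (positive) skew, near 0 = roughly symmetrical.
sample_skewness <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  m2 <- mean((x - m)^2)   # second central moment
  m3 <- mean((x - m)^3)   # third central moment
  m3 / m2^1.5
}

# Moment-based excess kurtosis: positive = taller/heavier-tailed than a
# normal curve, negative = flatter than a normal curve.
sample_excess_kurtosis <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  m2 <- mean((x - m)^2)   # second central moment
  m4 <- mean((x - m)^4)   # fourth central moment
  m4 / m2^2 - 3
}

# Hypothetical toy scores for illustration only (NOT the A6R2 data)
toy <- c(2, 3, 3, 4, 4, 4, 5, 5, 9)
toy_skew <- sample_skewness(toy)
toy_kurt <- sample_excess_kurtosis(toy)
```

With the imported data, the same helpers could be applied per group, for example `sample_skewness(A6R2$SatisfactionScore[A6R2$ServiceType == "Human"])`.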

SHAPIRO-WILK TEST

Purpose:

Statistically check the normality of each group’s scores.

shapiro.test(A6R2$SatisfactionScore[A6R2$ServiceType == "Human"])
## 
##  Shapiro-Wilk normality test
## 
## data:  A6R2$SatisfactionScore[A6R2$ServiceType == "Human"]
## W = 0.93741, p-value = 0.0001344
shapiro.test(A6R2$SatisfactionScore[A6R2$ServiceType == "AI"])
## 
##  Shapiro-Wilk normality test
## 
## data:  A6R2$SatisfactionScore[A6R2$ServiceType == "AI"]
## W = 0.91143, p-value = 5.083e-06

QUESTIONS

Q1) Was the data normally distributed for Variable 1?

  • Ans. No, the data was not normally distributed for Variable 1 (p = 0.0001344 < 0.05).

Q2) Was the data normally distributed for Variable 2?

  • Ans. No, the data was not normally distributed for Variable 2 (p = 5.083e-06 < 0.05).
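The two Shapiro-Wilk calls can also be run in one pass over the groups. This is a sketch only: `shapiro_by_group` is a hypothetical helper, and the simulated toy scores are for illustration and are not the A6R2 data.

```r
# Run shapiro.test() once per group and collect W and the p-value
# in a small data frame (one row per group).
shapiro_by_group <- function(score, group) {
  results <- lapply(split(score, group), shapiro.test)
  data.frame(
    group = names(results),
    W = vapply(results, function(res) unname(res$statistic), numeric(1)),
    p_value = vapply(results, function(res) res$p.value, numeric(1)),
    row.names = NULL
  )
}

# Hypothetical simulated data for illustration only
set.seed(1)
toy_scores <- c(rnorm(30), rexp(30))
toy_groups <- rep(c("A", "B"), each = 30)
toy_result <- shapiro_by_group(toy_scores, toy_groups)
```

For this analysis the call would be `shapiro_by_group(A6R2$SatisfactionScore, A6R2$ServiceType)`; a p-value below 0.05 for a group suggests its scores depart from normality.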

BOXPLOT

Purpose:

Check for any outliers impacting the mean for each group’s scores.

# install.packages("ggplot2")

# install.packages("ggpubr")

library(ggplot2)

library(ggpubr)

ggboxplot(A6R2, x = "ServiceType", y = "SatisfactionScore",
          color = "ServiceType",
          palette = "jco",
          add = "jitter")

QUESTION

Q1) Were there any dots outside of the boxplot? Are these dots close to the whiskers of the boxplot (check if there are any dots past the lines on the boxes) or are they very far away?

  • Ans. Yes, several dots fell outside the whiskers, and some were far from them.
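The visual check can be complemented by listing the flagged values. The sketch below assumes the conventional 1.5 × IQR fence rule that boxplots typically use to draw outlier points; `flag_outliers` is a hypothetical helper and the toy vector is not the A6R2 data.

```r
# Return the values that fall outside the 1.5 * IQR fences.
flag_outliers <- function(x, coef = 1.5) {
  x <- x[!is.na(x)]
  q <- quantile(x, c(0.25, 0.75), names = FALSE)  # first and third quartiles
  iqr <- q[2] - q[1]
  lower <- q[1] - coef * iqr
  upper <- q[2] + coef * iqr
  x[x < lower | x > upper]
}

# Hypothetical toy scores for illustration only
toy <- c(1, 5, 6, 6, 7, 7, 7, 8, 8)
toy_outliers <- flag_outliers(toy)
```

Applied to the study data, `tapply(A6R2$SatisfactionScore, A6R2$ServiceType, flag_outliers)` would list the flagged scores for each service type.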

MANN-WHITNEY U TEST

PURPOSE:

Test if there was a difference between the distributions of the two groups.

wilcox.test(SatisfactionScore ~ ServiceType, data = A6R2, exact = FALSE)
## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  SatisfactionScore by ServiceType
## W = 497, p-value < 2.2e-16
## alternative hypothesis: true location shift is not equal to 0
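To make the W statistic less of a black box, the sketch below recomputes it by hand: rank all scores together, sum the ranks of the first group, and subtract the smallest possible rank sum for that group. `rank_sum_w` and the toy vectors are hypothetical. With the formula interface, the first group is presumably the first factor level, which for a character column is alphabetical (here "AI"), so W = 497 would refer to the AI group.

```r
# W for the first sample: its rank sum minus the minimum possible rank sum.
rank_sum_w <- function(x, y) {
  r <- rank(c(x, y))               # tied values receive average ranks
  n1 <- length(x)
  sum(r[seq_len(n1)]) - n1 * (n1 + 1) / 2
}

# Hypothetical toy scores for illustration only (NOT the A6R2 data)
toy_ai <- c(2, 3, 3, 4)
toy_human <- c(6, 7, 3, 9)

w_manual <- rank_sum_w(toy_ai, toy_human)
w_builtin <- unname(wilcox.test(toy_ai, toy_human, exact = FALSE)$statistic)
```

A small W relative to its maximum of n1 × n2 (here 100 × 100 = 10,000) means the first group's scores rarely exceed the second group's.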

EFFECT-SIZE

PURPOSE:

Determine how big of a difference there was between the group distributions.

# install.packages("effectsize")

library(effectsize)

# CALCULATE EFFECT SIZE (R VALUE)

rank_biserial(SatisfactionScore ~ ServiceType, data = A6R2, exact = FALSE)
## r (rank biserial) |         95% CI
## ----------------------------------
## -0.90             | [-0.93, -0.87]

QUESTIONS

Q1) What is the size of the effect?

  • Ans. The effect size was large (r = -0.90, |r| = 0.90), indicating a substantial difference between the two groups.

Q2) Which group had the higher average rank?

  • Ans. The human-agent group had the higher average rank, consistent with its higher mean (7.42 vs. 3.60) and median (8 vs. 3) satisfaction scores.
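As a sanity check on the effect size and the direction of the difference, both can be recovered from the reported test statistic. This sketch assumes that W = 497 belongs to the AI group (the first factor level alphabetically) and uses the standard identity r = 2W / (n1 × n2) - 1; the helper names are hypothetical.

```r
# Rank-biserial correlation implied by the rank-sum statistic W of group 1.
rank_biserial_from_w <- function(w, n1, n2) {
  2 * w / (n1 * n2) - 1
}

# Mean rank of each group implied by W (W belongs to group 1).
mean_ranks_from_w <- function(w, n1, n2) {
  rank_sum_1 <- w + n1 * (n1 + 1) / 2           # rank sum of group 1
  total <- (n1 + n2) * (n1 + n2 + 1) / 2        # sum of all ranks 1..N
  c(group1 = rank_sum_1 / n1, group2 = (total - rank_sum_1) / n2)
}

# Values reported above: W = 497, n = 100 per group
r <- rank_biserial_from_w(497, 100, 100)   # -0.9006
mr <- mean_ranks_from_w(497, 100, 100)     # group1 (AI) 55.47, group2 (Human) 145.53
```

Under that assumption, the negative sign of r reflects only the group ordering (AI first): the AI group's mean rank is far below the human-agent group's.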

FINAL REPORT

A Mann-Whitney U test was conducted to compare customer satisfaction scores between customers served by human agents (n = 100) and those served by an AI chatbot (n = 100). Customers served by human agents had a higher median satisfaction score (Mdn = 8) than customers served by the AI chatbot (Mdn = 3). The test indicated a statistically significant difference between the two groups, U = 497, p < 0.001. The effect size was large (r = -0.90), indicating a substantial difference in satisfaction scores. Overall, customers served by human agents reported significantly higher satisfaction than those served by the AI chatbot.