Scenario 2: Human vs. AI Service
A customer service firm wants to test whether customer satisfaction scores differ between customers served by human agents and customers served by an AI chatbot. After each interaction, customers rate their satisfaction on a single-item rating scale. Is there a difference in the average satisfaction scores of the two groups?
HYPOTHESIS TESTED:
Used to test if there is a difference between the means of two groups.
NULL HYPOTHESIS (H0)
There is no difference between the scores of Group A (human agents) and Group B (AI chatbot).
ALTERNATE HYPOTHESIS (H1)
There is a difference between the scores of Group A (human agents) and Group B (AI chatbot).
IMPORT EXCEL FILE
Purpose: Import your Excel data set into R to conduct analyses.
INSTALL REQUIRED PACKAGE
options(repos = c(CRAN = "https://cloud.r-project.org"))
install.packages("readxl")
## Installing package into 'C:/Users/N Geetha Shivani/AppData/Local/R/win-library/4.5'
## (as 'lib' is unspecified)
## package 'readxl' successfully unpacked and MD5 sums checked
## Warning: cannot remove prior installation of package 'readxl'
## Warning in file.copy(savedcopy, lib, recursive = TRUE): problem copying
## C:\Users\N Geetha
## Shivani\AppData\Local\R\win-library\4.5\00LOCK\readxl\libs\x64\readxl.dll to
## C:\Users\N Geetha
## Shivani\AppData\Local\R\win-library\4.5\readxl\libs\x64\readxl.dll: Permission
## denied
## Warning: restored 'readxl'
##
## The downloaded binary packages are in
## C:\Users\N Geetha Shivani\AppData\Local\Temp\RtmpAjYFFL\downloaded_packages
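The "cannot remove prior installation" and "Permission denied" warnings above usually appear when a package that is already installed (and often already loaded in a running R session) is reinstalled. A minimal sketch, assuming that is the cause here, checks which packages are missing before installing anything. The helper name `missing_packages` is hypothetical and not part of the original script.

```r
# Hypothetical helper: return the packages in `pkgs` that are not installed yet.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Packages used in this analysis.
needed <- c("readxl", "dplyr", "ggplot2", "ggpubr", "effectsize")
to_install <- missing_packages(needed)
to_install

# Install only what is missing, leaving installed packages untouched.
# Uncomment to run:
# if (length(to_install) > 0) install.packages(to_install)
```

With this pattern, rerunning the script does not try to overwrite package files that Windows has locked.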
LOAD THE PACKAGE
library(readxl)
IMPORT EXCEL FILE INTO R STUDIO
dataset <- read_excel("C:\\Users\\N Geetha Shivani\\Downloads\\A6R2.xlsx")
DESCRIPTIVE STATISTICS
PURPOSE: Calculate the mean, median, SD, and sample size for each group.
INSTALL REQUIRED PACKAGE
install.packages("dplyr")
## Installing package into 'C:/Users/N Geetha Shivani/AppData/Local/R/win-library/4.5'
## (as 'lib' is unspecified)
## package 'dplyr' successfully unpacked and MD5 sums checked
## Warning: cannot remove prior installation of package 'dplyr'
## Warning in file.copy(savedcopy, lib, recursive = TRUE): problem copying
## C:\Users\N Geetha
## Shivani\AppData\Local\R\win-library\4.5\00LOCK\dplyr\libs\x64\dplyr.dll to
## C:\Users\N Geetha
## Shivani\AppData\Local\R\win-library\4.5\dplyr\libs\x64\dplyr.dll: Permission
## denied
## Warning: restored 'dplyr'
##
## The downloaded binary packages are in
## C:\Users\N Geetha Shivani\AppData\Local\Temp\RtmpAjYFFL\downloaded_packages
LOAD THE PACKAGE
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
CALCULATE THE DESCRIPTIVE STATISTICS
dataset %>%
group_by(ServiceType) %>%
summarise(
Mean = mean(SatisfactionScore, na.rm = TRUE),
Median = median(SatisfactionScore, na.rm = TRUE),
SD = sd(SatisfactionScore, na.rm = TRUE),
N = n()
)
## # A tibble: 2 × 5
## ServiceType Mean Median SD N
## <chr> <dbl> <dbl> <dbl> <int>
## 1 AI 3.6 3 1.60 100
## 2 Human 7.42 8 1.44 100
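One detail of the summary above: `n()` counts rows, while `mean()`, `median()`, and `sd()` with `na.rm = TRUE` use only non-missing scores. The two counts agree only when no scores are missing. The sketch below illustrates the distinction in base R on a small toy data frame (hypothetical values, not the A6R2.xlsx data).

```r
# Toy data (hypothetical values) with one missing score in the AI group.
toy <- data.frame(
  ServiceType = c("AI", "AI", "AI", "Human", "Human", "Human"),
  SatisfactionScore = c(3, 4, NA, 8, 7, 9)
)

# Rows per group, including rows whose score is missing.
n_rows <- table(toy$ServiceType)

# Scores per group that mean(), median(), and sd() would actually use.
n_valid <- tapply(toy$SatisfactionScore, toy$ServiceType,
                  function(x) sum(!is.na(x)))

n_rows   # AI: 3, Human: 3
n_valid  # AI: 2, Human: 3
```

Running the same `tapply()` call on `dataset` would confirm whether N = 100 per group reflects complete scores.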
HISTOGRAMS
Purpose: Visually check the normality of the scores for each group.
CREATE THE HISTOGRAMS
hist(dataset$SatisfactionScore[dataset$ServiceType == "Human"],
main = "Histogram of Human Scores",
xlab = "Value",
ylab = "Frequency",
col = "lightblue",
border = "black",
breaks = 20)
hist(dataset$SatisfactionScore[dataset$ServiceType == "AI"],
main = "Histogram of AI Scores",
xlab = "Value",
ylab = "Frequency",
col = "lightgreen",
border = "black",
breaks = 20)
QUESTIONS
Q1) Check the SKEWNESS of the Human Scores histogram. In your opinion, does the histogram look symmetrical, positively skewed, or negatively skewed?
A) The histogram for Human scores looks negatively skewed.
Q2) Check the KURTOSIS of the Human Scores histogram. In your opinion, does the histogram look too flat, too tall, or does it have a proper bell curve?
A) The histogram does not have a proper bell-shaped curve.
Q3) Check the SKEWNESS of the AI Scores histogram. In your opinion, does the histogram look symmetrical, positively skewed, or negatively skewed?
A) The histogram for AI scores looks positively skewed.
Q4) Check the KURTOSIS of the AI Scores histogram. In your opinion, does the histogram look too flat, too tall, or does it have a proper bell curve?
A) The histogram does not have a proper bell-shaped curve.
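The visual judgments above can be complemented with numbers. The following is a minimal sketch using moment-based formulas for skewness and excess kurtosis (one of several conventions, so packaged functions may give slightly different values). The helper names and the toy vectors are hypothetical.

```r
# Hypothetical helper: moment-based sample skewness (0 for symmetric data).
sample_skewness <- function(x) {
  x <- x[!is.na(x)]
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}

# Hypothetical helper: excess kurtosis (about 0 for a normal distribution,
# negative for flatter shapes, positive for more peaked or heavy-tailed shapes).
excess_kurtosis <- function(x) {
  x <- x[!is.na(x)]
  d <- x - mean(x)
  mean(d^4) / mean(d^2)^2 - 3
}

symmetric  <- c(1, 2, 3, 4, 5)      # toy data, symmetric
right_tail <- c(1, 1, 2, 2, 3, 10)  # toy data with a long right tail

sample_skewness(symmetric)   # 0
sample_skewness(right_tail)  # positive, i.e. positively skewed
excess_kurtosis(symmetric)   # -1.3, flatter than a bell curve
```

Applied to the real data, the input would be, for example, `dataset$SatisfactionScore[dataset$ServiceType == "Human"]`.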
SHAPIRO-WILK TEST
Purpose: Check the normality for each group’s score statistically.
The Shapiro-Wilk test checks whether a variable departs from a normal distribution, so it is sensitive to both skewness and kurtosis. The test asks: "Is this variable the SAME as normal data (null hypothesis) or DIFFERENT from normal data (alternate hypothesis)?" If p is GREATER than .05 (p > .05), normality is not rejected and the data are treated as NORMAL. If p is LESS than .05 (p < .05), the data are treated as NOT normal.
CONDUCT THE SHAPIRO-WILK TEST
shapiro.test(dataset$SatisfactionScore[dataset$ServiceType == "Human"])
shapiro.test(dataset$SatisfactionScore[dataset$ServiceType == "AI"])
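As a compact alternative, the Shapiro-Wilk test can be run for every group in one call and reduced to its p-values. The sketch below uses simulated toy data (hypothetical, not the A6R2.xlsx data) so that it runs on its own. The column names mirror those used in this analysis.

```r
set.seed(1)  # make the simulated toy data reproducible

# Toy data: one clearly skewed group and one roughly normal group.
toy <- data.frame(
  ServiceType = rep(c("AI", "Human"), each = 50),
  SatisfactionScore = c(rexp(50), rnorm(50, mean = 7))
)

# One Shapiro-Wilk p-value per group.
shapiro_p <- tapply(toy$SatisfactionScore, toy$ServiceType,
                    function(x) shapiro.test(x)$p.value)

shapiro_p          # p-values by group
shapiro_p < 0.05   # TRUE means normality is rejected for that group
```

Replacing `toy` with `dataset` gives the two p-values needed to answer the questions in this section.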
QUESTION
Q) Was the data normally distributed for Group A (Human)?
A) No, the Human scores were not normally distributed.
Q) Was the data normally distributed for Group B (AI)?
A) No, the AI scores were not normally distributed.
Note:
If p > 0.05 (P-value is GREATER than .05) this means the data is NORMAL.
If p < 0.05 (P-value is LESS than .05) this means the data is NOT normal.
BOXPLOT
Purpose: Check for any outliers impacting the mean for each group’s scores.
INSTALL REQUIRED PACKAGE
install.packages("ggplot2")
## Installing package into 'C:/Users/N Geetha Shivani/AppData/Local/R/win-library/4.5'
## (as 'lib' is unspecified)
## package 'ggplot2' successfully unpacked and MD5 sums checked
##
## The downloaded binary packages are in
## C:\Users\N Geetha Shivani\AppData\Local\Temp\RtmpAjYFFL\downloaded_packages
install.packages("ggpubr")
## Installing package into 'C:/Users/N Geetha Shivani/AppData/Local/R/win-library/4.5'
## (as 'lib' is unspecified)
## package 'ggpubr' successfully unpacked and MD5 sums checked
##
## The downloaded binary packages are in
## C:\Users\N Geetha Shivani\AppData\Local\Temp\RtmpAjYFFL\downloaded_packages
LOAD THE PACKAGE
library(ggplot2)
library(ggpubr)
CREATE THE BOXPLOT
ggboxplot(dataset, x = "ServiceType", y = "SatisfactionScore",
color = "ServiceType",
palette = "jco",
add = "jitter")
QUESTION
Q1) Were there any dots outside of the boxplot? Are these dots close to the whiskers of the boxplot or are they very far away?
[NOTE: If there are no dots, continue with the Independent t-test. If there are a few dots (two or less), and they are close to the whiskers, continue with the Independent t-test. If there are a few dots (two or less), and they are far away from the whiskers, consider switching to the Mann-Whitney U test. If there are many dots (more than one or two) and they are very far away from the whiskers, you should switch to the Mann-Whitney U test.]
A) For Human scores, the boxplot has many dots far away from the whiskers; for AI scores, a smaller proportion of dots falls outside the whiskers. The analysis therefore switches to the Mann-Whitney U test.
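Counting dots by eye can be cross-checked numerically. A minimal sketch, using base R's `boxplot.stats()` and the usual 1.5 × IQR rule, lists the values beyond the whiskers. The toy vector is hypothetical, and the hinges computed here may differ slightly from the quartiles used by the ggplot2-based plot.

```r
# Toy scores (hypothetical values) with one clearly low value.
toy_scores <- c(6, 7, 7, 8, 8, 8, 9, 9, 1)

# Values lying beyond the whiskers (more than 1.5 * IQR from the hinges).
outliers <- boxplot.stats(toy_scores)$out

outliers          # 1
length(outliers)  # 1
```

For the real data, the same call can be applied to each group, for example `boxplot.stats(dataset$SatisfactionScore[dataset$ServiceType == "Human"])$out`.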
MANN-WHITNEY U TEST
PURPOSE: Test if there was a difference between the distributions of the two groups.
wilcox.test(SatisfactionScore ~ ServiceType, data = dataset, exact = FALSE)
##
## Wilcoxon rank sum test with continuity correction
##
## data: SatisfactionScore by ServiceType
## W = 497, p-value < 2.2e-16
## alternative hypothesis: true location shift is not equal to 0
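R reports the statistic as W, which corresponds to the U statistic of the first factor level (here assumed to be "AI", since levels sort alphabetically). The conventional Mann-Whitney U is the smaller of the two groups' U values. A short worked check with the numbers reported above:

```r
W       <- 497   # statistic reported by wilcox.test()
n_ai    <- 100   # group sizes from the descriptive statistics
n_human <- 100

U_ai    <- W                       # U for the first level (assumed: AI)
U_human <- n_ai * n_human - W      # the two U values sum to n1 * n2
U       <- min(U_ai, U_human)      # conventional U: the smaller value

U_human  # 9503
U        # 497
```

Here W is already the smaller value, so U = 497.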
DETERMINE STATISTICAL SIGNIFICANCE
If results were statistically significant (p < .05), continue to effect size section below.
If results were NOT statistically significant (p > .05), skip to reporting section below.
NOTE: The Mann-Whitney U test is used when the data are not normally distributed or when other assumptions of the t-test are not met. It is not chosen based on whether the t-test was significant.
EFFECT-SIZE
PURPOSE: Determine how big of a difference there was between the group distributions.
INSTALL REQUIRED PACKAGE
install.packages("effectsize")
## Installing package into 'C:/Users/N Geetha Shivani/AppData/Local/R/win-library/4.5'
## (as 'lib' is unspecified)
## package 'effectsize' successfully unpacked and MD5 sums checked
##
## The downloaded binary packages are in
## C:\Users\N Geetha Shivani\AppData\Local\Temp\RtmpAjYFFL\downloaded_packages
LOAD THE PACKAGE
library(effectsize)
CALCULATE EFFECT SIZE (R VALUE)
rank_biserial(SatisfactionScore ~ ServiceType, data = dataset, exact = FALSE)
## r (rank biserial) | 95% CI
## ----------------------------------
## -0.90 | [-0.93, -0.87]
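The reported value can be reproduced by hand. The sketch below assumes the unpaired rank-biserial correlation is the difference between the two groups' U values divided by the number of pairs, with W = 497 taken from the Mann-Whitney output and 100 customers per group.

```r
W  <- 497   # from wilcox.test(), U for the first level (assumed: AI)
n1 <- 100   # AI group size
n2 <- 100   # Human group size

# (U1 - U2) / (n1 * n2), which simplifies to 2 * W / (n1 * n2) - 1.
r_rb <- 2 * W / (n1 * n2) - 1

r_rb            # -0.9006
round(r_rb, 2)  # -0.9
```

The negative sign indicates that the first group (AI) tended to have lower scores than the second group (Human).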
QUESTIONS
Q1) What is the size of the effect?
A) A rank-biserial correlation of -0.90 indicates that the difference between the groups was large.
Q2) Which group had the higher average rank?
A) The Human service group had the higher average rank.
WRITTEN REPORT FOR MANN-WHITNEY U TEST
A Mann-Whitney U test was conducted to compare satisfaction scores between human customer service (n = 100) and AI customer service (n = 100). Median satisfaction was higher for the human-assisted group (Mdn = 8) than for the AI chatbot group (Mdn = 3), U = 497, p < .001. The effect size was large (rank-biserial r = -0.90, 95% CI [-0.93, -0.87]), indicating a pronounced difference between the two conditions. Overall, the human customer service condition produced significantly higher satisfaction scores than the AI chatbot condition.