INDEPENDENT T-TEST & MANN-WHITNEY U TEST

QUESTION

What are the null and alternate hypotheses for YOUR research scenario?

H0: There is no difference in SatisfactionScore between Human service and AI service.

H1: There is a difference in SatisfactionScore between Human service and AI service.

IMPORT EXCEL FILE

#install.packages("readxl")

LOAD THE PACKAGE

library(readxl)

dataset <- read_excel("C:\\Users\\navya\\Downloads\\A6R2.xlsx")

DESCRIPTIVE STATISTICS

PURPOSE: Calculate the mean, median, SD, and sample size for each group.

#install.packages("dplyr")
#install.packages("tidyverse")

LOAD THE PACKAGE

library(tidyverse)

## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.1     ✔ stringr   1.6.0
## ✔ ggplot2   4.0.1     ✔ tibble    3.3.0
## ✔ lubridate 1.9.4     ✔ tidyr     1.3.1
## ✔ purrr     1.2.0     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

library(dplyr)

CALCULATE THE DESCRIPTIVE STATISTICS

dataset %>%
  group_by(ServiceType) %>%
  summarise(
    Mean = mean(SatisfactionScore, na.rm = TRUE),
    Median = median(SatisfactionScore, na.rm = TRUE),
    SD = sd(SatisfactionScore, na.rm = TRUE),
    N = n()
  )

## # A tibble: 2 × 5
##   ServiceType  Mean Median    SD     N
##   <chr>       <dbl>  <dbl> <dbl> <int>
## 1 AI           3.6       3  1.60   100
## 2 Human        7.42      8  1.44   100

HISTOGRAMS

hist(dataset$SatisfactionScore[dataset$ServiceType == "Human"],
    main = "Histogram of Human",
    xlab = "Value",
    ylab = "Frequency",
    col = "lightblue",
    border = "black",
    breaks = 20)

hist(dataset$SatisfactionScore[dataset$ServiceType == "AI"],
    main = "Histogram of AI",
    xlab = "Value",
    ylab = "Frequency",
    col = "pink",
    border = "black",
    breaks = 20)

# QUESTIONS
#Q1) Check the SKEWNESS of the VARIABLE 1 histogram. In your opinion, does the histogram look symmetrical, positively skewed, or negatively skewed?
#Answer: The histogram of Human is negatively skewed; its tail extends toward the left.

#Q2) Check the KURTOSIS of the VARIABLE 1 histogram. In your opinion, does the histogram look too flat, too tall, or does it have a proper bell curve?
#Answer: The histogram of Human looks too tall; it deviates from a proper bell curve.

#Q3) Check the SKEWNESS of the VARIABLE 2 histogram. In your opinion, does the histogram look symmetrical, positively skewed, or negatively skewed?
#Answer: The histogram of AI is roughly symmetrical; the two sides are evenly balanced.

#Q4) Check the KURTOSIS of the VARIABLE 2 histogram. In your opinion, does the histogram look too flat, too tall, or does it have a proper bell curve?
#Answer: The histogram of AI looks too flat.
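
The visual answers above can be backed by numbers. The sketch below computes moment-based skewness and excess kurtosis in base R. The helper name `shape_stats` and the `scores` vector are hypothetical and are not part of the A6R2 data; in the real analysis the function would be applied to each group's SatisfactionScore. A negative skewness suggests a left tail, and an excess kurtosis above 0 suggests a distribution that is taller than a normal curve.

```r
# Moment-based sample skewness and excess kurtosis (base R only).
# `shape_stats` is a hypothetical helper written for this illustration.
shape_stats <- function(x) {
  x  <- x[!is.na(x)]          # drop missing values
  m  <- mean(x)
  m2 <- mean((x - m)^2)       # second central moment
  m3 <- mean((x - m)^3)       # third central moment
  m4 <- mean((x - m)^4)       # fourth central moment
  c(skewness        = m3 / m2^1.5,
    excess_kurtosis = m4 / m2^2 - 3)
}

# Made-up scores with a long left tail (NOT the A6R2 data)
scores <- c(2, 5, 7, 7, 8, 8, 8, 9, 9, 10)
shape_stats(scores)
```

With the real data, a call such as `shape_stats(dataset$SatisfactionScore[dataset$ServiceType == "Human"])` would give the same two statistics for the Human group.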

SHAPIRO-WILK TEST

shapiro.test(dataset$SatisfactionScore[dataset$ServiceType == "Human"])

## 
##  Shapiro-Wilk normality test
## 
## data:  dataset$SatisfactionScore[dataset$ServiceType == "Human"]
## W = 0.93741, p-value = 0.0001344

shapiro.test(dataset$SatisfactionScore[dataset$ServiceType == "AI"])

## 
##  Shapiro-Wilk normality test
## 
## data:  dataset$SatisfactionScore[dataset$ServiceType == "AI"]
## W = 0.91143, p-value = 5.083e-06

# QUESTION
#Was the data normally distributed for Variable 1?
#No. Because p = 0.0001344 < 0.05, the Human scores are not normally distributed.

#Was the data normally distributed for Variable 2?
#No. Because p = 5.083e-06 < 0.05, the AI scores are not normally distributed.
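
The same decision rule can be applied to both groups in one step. This is a minimal sketch that uses a simulated stand-in data frame called `toy` (an assumption, not the A6R2 file) with the same two column names. It collects the Shapiro-Wilk p-value for each group and flags the groups where normality is rejected at alpha = 0.05.

```r
# Simulated stand-in for `dataset`; the values are NOT the A6R2 data.
set.seed(1)
toy <- data.frame(
  ServiceType       = rep(c("AI", "Human"), each = 50),
  SatisfactionScore = c(rnorm(50, mean = 4, sd = 1.5),
                        rexp(50, rate = 0.5))
)

# Shapiro-Wilk p-value for each service type
p_values <- sapply(
  split(toy$SatisfactionScore, toy$ServiceType),
  function(x) shapiro.test(x)$p.value
)

# non_normal = TRUE means normality is rejected at alpha = 0.05
data.frame(p_value = p_values, non_normal = p_values < 0.05)
```

If either group is flagged as non-normal, that supports choosing a non-parametric test over the independent t-test.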

BOXPLOT

#install.packages("ggplot2")
#install.packages("ggpubr")

library(ggplot2)
library(ggpubr)

CREATE THE BOXPLOT

ggboxplot(dataset, x = "ServiceType", y = "SatisfactionScore",
          color = "ServiceType",
          palette = "jco",
          add = "jitter")

# QUESTION
#Q1) Were there any dots outside of the boxplot? Are these dots close to the whiskers of the boxplot (check if there are any dots past the lines on the boxes) or are they very far away?
#Answer:
#BOXPLOT HUMAN
#Yes, there are dots outside the boxplot. Four dots lie below the lower whisker, and they are far from it.
#BOXPLOT AI
#Yes, there are 3 dots outside the boxplot, and they are far below the lower whisker.

#Conclusion: Because neither group passed the normality check and the outlying dots are far from the whiskers, we should use the Mann-Whitney U test.
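
Counting dots by eye can be checked with the 1.5 * IQR rule that boxplots use. The sketch below relies on base R's `boxplot.stats()` and a hypothetical `scores` vector (not the A6R2 data). It lists the flagged outliers and measures how far each low outlier sits below the whisker, in units of the box height. Note that `boxplot.stats()` computes hinges slightly differently from ggplot2-based plots, so the flagged points may not match the ggboxplot figure exactly.

```r
# Hypothetical scores for illustration only (not the A6R2 data).
scores <- c(1, 6, 7, 7, 8, 8, 8, 9, 9, 10)

bs <- boxplot.stats(scores)               # default coef = 1.5 (1.5 * IQR rule)
outliers      <- bs$out                   # points beyond the whisker ends
lower_whisker <- bs$stats[1]              # lowest point that is not an outlier
box_height    <- bs$stats[4] - bs$stats[2]  # distance between the hinges

# Distance of each low outlier below the whisker, in box heights
gap <- (lower_whisker - outliers[outliers < lower_whisker]) / box_height

outliers
gap
```

A larger `gap` means the dot is farther from the whisker, which is the situation described in the answers above.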

MANN-WHITNEY U TEST

wilcox.test(SatisfactionScore ~ ServiceType, data = dataset, exact = FALSE)

## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  SatisfactionScore by ServiceType
## W = 497, p-value < 2.2e-16
## alternative hypothesis: true location shift is not equal to 0

DETERMINE STATISTICAL SIGNIFICANCE

#install.packages("effectsize")

library(effectsize)

CALCULATE EFFECT SIZE (R VALUE)

rank_biserial(SatisfactionScore ~ ServiceType, data = dataset, exact = FALSE)

## r (rank biserial) |         95% CI
## ----------------------------------
## -0.90             | [-0.93, -0.87]

# QUESTIONS
#Q1) What is the size of the effect?
#The rank-biserial correlation is -0.90. Its magnitude (0.90) is greater than 0.50, so the size of the effect is large.

#Q2) Which group had the higher average rank?
#The Human group had the higher scores (median = 8.0) compared with the AI group (median = 3.0), so the Human group had the higher average rank.
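
The reported effect size can be cross-checked by hand from the test statistic. This sketch assumes that W = 497 is the U statistic for the first level of ServiceType (AI, by alphabetical order) and that each group has n = 100, as shown in the descriptive statistics. The rank-biserial correlation is then the difference between the two groups' U values divided by the number of possible pairs.

```r
# Values taken from the output reported above
W  <- 497    # rank sum statistic; assumed to be U for the AI group
n1 <- 100    # AI sample size
n2 <- 100    # Human sample size

U_other <- n1 * n2 - W               # U for the Human group
r_rb    <- (W - U_other) / (n1 * n2) # rank-biserial correlation

round(r_rb, 2)
```

The negative sign means that the first group (AI) tends to rank below the second group (Human), which agrees with the answer to Q2.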

WRITTEN REPORT FOR MANN-WHITNEY U TEST

A Mann-Whitney U test was conducted to compare customer satisfaction scores between the human and AI service groups. This non-parametric test was chosen because the satisfaction scores were not normally distributed and contained serious outliers. The median satisfaction score of customers who interacted with a human (Median = 8.00, n = 100) was higher than that of customers who interacted with AI (Median = 3.00, n = 100). The difference was statistically significant, U = 497, p < 0.001. The effect size, calculated as the rank-biserial correlation, was r = −0.90. This indicates a large effect, showing a meaningful difference in satisfaction depending on service type. Overall, customer satisfaction was much higher when interacting with a human than with AI.

A6R2

NAVYA SRI MULUKUNTLA

2025-11-21