INDEPENDENT T-TEST & MANN-WHITNEY U TEST
What are the null and alternate hypotheses for YOUR research
scenario?
H0: There is no difference in satisfaction scores between the two
groups (Human and AI).
H1: There is a difference in satisfaction scores between the two
groups (Human and AI).
LOAD THE PACKAGE
library(readxl)
IMPORT EXCEL FILE INTO R STUDIO
dataset <- read_excel("C:/Users/burug/Downloads/A6R2.xlsx")
LOAD THE PACKAGE
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
CALCULATE THE DESCRIPTIVE STATISTICS
dataset %>%
  group_by(ServiceType) %>%
  summarise(
    Mean = mean(SatisfactionScore, na.rm = TRUE),
    Median = median(SatisfactionScore, na.rm = TRUE),
    SD = sd(SatisfactionScore, na.rm = TRUE),
    N = n()
  )
## # A tibble: 2 × 5
##   ServiceType  Mean Median    SD     N
##   <chr>       <dbl>  <dbl> <dbl> <int>
## 1 AI           3.6       3  1.60   100
## 2 Human        7.42      8  1.44   100
CREATE THE HISTOGRAMS
hist(dataset$SatisfactionScore[dataset$ServiceType == "Human"],
     main = "Histogram of Group 1 Scores",
     xlab = "Value",
     ylab = "Frequency",
     col = "lightblue",
     border = "black",
     breaks = 20)
hist(dataset$SatisfactionScore[dataset$ServiceType == "AI"],
     main = "Histogram of Group 2 Scores",
     xlab = "Value",
     ylab = "Frequency",
     col = "lightgreen",
     border = "black",
     breaks = 20)
Q1) Check the SKEWNESS of the VARIABLE 1 histogram. In your opinion,
does the histogram look symmetrical, positively skewed, or negatively
skewed?
A1: Negatively skewed.
Q2) Check the KURTOSIS of the VARIABLE 1 histogram. In your opinion,
does the histogram look too flat, too tall, or does it have a proper
bell curve?
A2: Too tall.
Q3) Check the SKEWNESS of the VARIABLE 2 histogram. In your opinion,
does the histogram look symmetrical, positively skewed, or negatively
skewed?
A3: Positively skewed.
Q4) Check the KURTOSIS of the VARIABLE 2 histogram. In your opinion,
does the histogram look too flat, too tall, or does it have a proper
bell curve?
A4: Too tall.
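The visual impressions above can be cross-checked numerically. The sketch below computes moment-based skewness and excess kurtosis in base R. The helper names skewness_moment and kurtosis_excess and the example vector scores are hypothetical illustrations, not part of the original analysis.

```r
# Moment-based skewness: negative = longer left tail, positive = longer right tail.
skewness_moment <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  mean((x - m)^3) / (mean((x - m)^2))^1.5
}

# Excess kurtosis: above 0 = more peaked / heavier tails than a normal curve,
# below 0 = flatter than a normal curve.
kurtosis_excess <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  mean((x - m)^4) / (mean((x - m)^2))^2 - 3
}

# Hypothetical scores piled up near the top of a 1-10 scale (illustration only).
scores <- c(2, 5, 6, 7, 7, 8, 8, 8, 9, 9, 9, 10)
skew_value <- skewness_moment(scores)
kurt_value <- kurtosis_excess(scores)
print(c(skewness = skew_value, excess_kurtosis = kurt_value))
```

To apply the same check to the real data, the helpers could be called on dataset$SatisfactionScore[dataset$ServiceType == "Human"] and on the corresponding "AI" subset.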
CONDUCT THE SHAPIRO-WILK TEST
shapiro.test(dataset$SatisfactionScore[dataset$ServiceType == "Human"])
##
## Shapiro-Wilk normality test
##
## data: dataset$SatisfactionScore[dataset$ServiceType == "Human"]
## W = 0.93741, p-value = 0.0001344
shapiro.test(dataset$SatisfactionScore[dataset$ServiceType == "AI"])
##
## Shapiro-Wilk normality test
##
## data: dataset$SatisfactionScore[dataset$ServiceType == "AI"]
## W = 0.91143, p-value = 5.083e-06
Q1: Was the data normally distributed for Variable 1?
A1: No. p = 0.00013, which is below .05, so the data are not normally
distributed.
Q2: Was the data normally distributed for Variable 2?
A2: No. p = 5.083e-06, which is below .05, so the data are not
normally distributed.
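The Shapiro-Wilk results determine which test is run next. As a minimal sketch, assuming the conventional .05 threshold (the cutoff is not stated in the output above) and reusing the two p-values reported by shapiro.test(), the decision could be written as follows. The variable names are illustrative.

```r
alpha <- 0.05  # assumed significance threshold

# p-values copied from the Shapiro-Wilk output above
p_values <- c(Human = 0.0001344, AI = 5.083e-06)

# A group is treated as approximately normal only if its p-value is >= alpha.
looks_normal <- p_values >= alpha

# The independent t-test is used only when both groups look normal;
# otherwise the rank-based Mann-Whitney U test is used.
chosen_test <- if (all(looks_normal)) "Independent t-test" else "Mann-Whitney U test"

print(looks_normal)
print(chosen_test)
```

Because neither group passes the normality check, this logic selects the Mann-Whitney U test, which is the test run later in this report.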
LOAD THE PACKAGE
library(ggplot2)
library(ggpubr)
CREATE THE BOXPLOT
ggboxplot(dataset, x = "ServiceType", y = "SatisfactionScore",
          color = "ServiceType",
          palette = "jco",
          add = "jitter")
Q1) Were there any dots outside of the boxplot? Are these dots close
to the whiskers of the boxplot (check if there are any dots past the
lines on the boxes) or are they very far away?
A1: Yes, there are a few dots far away from the whiskers.
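Outliers can also be listed numerically rather than read off the plot. The sketch below uses base R's boxplot.stats(), which flags points more than 1.5 times the hinge spread beyond the hinges. The scores vector is hypothetical, and the quartiles behind ggboxplot() may be computed slightly differently, so the flagged points might not match the plot exactly.

```r
# Hypothetical scores for one group (illustration only, not the real data)
scores <- c(1, 6, 7, 7, 8, 8, 8, 8, 9, 9, 10)

# boxplot.stats() returns the five box values and any points beyond the whiskers.
stats <- boxplot.stats(scores)
outliers <- stats$out

print(stats$stats)  # lower whisker, lower hinge, median, upper hinge, upper whisker
print(outliers)     # values flagged as outliers
```

For the real data, the same call could be made on each ServiceType subset of dataset$SatisfactionScore.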
MANN-WHITNEY U TEST
wilcox.test(SatisfactionScore ~ ServiceType, data = dataset, exact = FALSE)
##
## Wilcoxon rank sum test with continuity correction
##
## data: SatisfactionScore by ServiceType
## W = 497, p-value < 2.2e-16
## alternative hypothesis: true location shift is not equal to 0
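To make the W statistic less of a black box, the sketch below reproduces it by hand on a tiny example: rank all scores together, sum the ranks of the first group, and subtract n1(n1 + 1)/2. The vectors ai and human are hypothetical values chosen for illustration, not data from A6R2.xlsx.

```r
# Hypothetical scores (illustration only)
ai    <- c(2, 3, 3, 5)
human <- c(6, 7, 8, 4)

# Rank all observations together; tied values share the average rank.
ranks <- rank(c(ai, human))

# W for the first group = its rank sum minus the smallest possible rank sum.
n1 <- length(ai)
W_manual <- sum(ranks[seq_len(n1)]) - n1 * (n1 + 1) / 2

# Compare with the statistic reported by wilcox.test()
W_r <- unname(wilcox.test(ai, human, exact = FALSE)$statistic)

print(c(manual = W_manual, wilcox.test = W_r))
```

A small W for the first group means its scores mostly rank below the second group's. In the output above, W = 497 out of a possible 100 × 100 = 10,000 pairs points the same way for AI versus Human.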
LOAD THE PACKAGE
library(effectsize)
CALCULATE EFFECT SIZE (R VALUE)
rank_biserial(SatisfactionScore ~ ServiceType, data = dataset, exact = FALSE)
## r (rank biserial) | 95% CI
## ----------------------------------
## -0.90 | [-0.93, -0.87]
Q1) What is the size of the effect?
A1: A rank-biserial correlation of r = -0.90 (magnitude 0.90)
indicates that the difference between the groups was very large.
Q2) Which group had the higher average rank?
A2: The Human group had the higher average rank; its satisfaction
scores were higher than those of the AI group.
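As a rough cross-check, the rank-biserial r reported above can be recovered from the Wilcoxon W statistic and the two group sizes. The sketch below assumes that AI is the first group in the formula (R orders the ServiceType levels alphabetically) and uses the standard relation r = 2W / (n1 × n2) - 1. It is a hand calculation, not necessarily how rank_biserial() computes the value internally.

```r
W  <- 497  # Wilcoxon rank sum statistic from the output above
n1 <- 100  # AI (assumed to be the first group)
n2 <- 100  # Human

# W counts the (AI, Human) pairs where the AI score is higher (ties count 0.5),
# so W / (n1 * n2) is the proportion of pairs favouring AI.
prop_ai_higher <- W / (n1 * n2)

# Rank-biserial r = proportion favouring AI minus proportion favouring Human.
r_rb <- 2 * prop_ai_higher - 1

print(round(r_rb, 2))
```

The result rounds to -0.90, in line with the rank_biserial() output. The negative sign means AI scores tend to rank below Human scores, and the magnitude of 0.90 is what is interpreted as the effect size.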
REPORT
A Mann-Whitney U test was conducted to compare satisfaction scores
between the Human service (n = 100) and the AI service (n = 100). The
Human service had significantly higher median scores (Mdn = 8.00) than
the AI service (Mdn = 3.00), W = 497, p < .001. The effect size was
very large (rank-biserial r = -0.90, 95% CI [-0.93, -0.87]; the
negative sign reflects that the AI group ranked lower), indicating a
substantial difference in satisfaction scores. Overall, satisfaction
was higher with the Human service than with the AI service.