library(readxl)
library(ggpubr)
## Loading required package: ggplot2
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(effectsize)
library(effsize)
Dataset6.2 <- read_excel("/Volumes/Anup/SLU/3rd Sem/Applied Analytics/Assignment 6/Dataset6.2.xlsx")
Dataset6.2 %>%
group_by(Work_Status) %>%
summarise(
Mean = mean(Study_Hours, na.rm = TRUE),
Median = median(Study_Hours, na.rm = TRUE),
SD = sd(Study_Hours, na.rm = TRUE),
N = n()
)
## # A tibble: 2 × 5
## Work_Status Mean Median SD N
## <chr> <dbl> <dbl> <dbl> <int>
## 1 Does_Not_Work 9.62 8.54 7.45 30
## 2 Works 6.41 5.64 4.41 30
## Creating Histograms for Each Work Status
hist(Dataset6.2$Study_Hours[Dataset6.2$Work_Status == "Works"],
main = "Histogram of Study Hours of Working Students",
xlab = "Value",
ylab = "Frequency",
col = "lightblue",
border = "black",
breaks = 10)
hist(Dataset6.2$Study_Hours[Dataset6.2$Work_Status == "Does_Not_Work"],
main = "Histogram of Study Hours of Non Working Students",
xlab = "Value",
ylab = "Frequency",
col = "lightgreen",
border = "black",
breaks = 10)
ggboxplot(Dataset6.2, x = "Work_Status", y = "Study_Hours",
color = "Work_Status",
palette = "jco",
add = "jitter")
The Works boxplot appears normal. There are no dots past the whiskers.
The Does not Work boxplot appears abnormal. There are several dots past the whiskers. Although some are very close to the whiskers, some are arguably far away.
We may need to use a Mann-Whitney U test.
shapiro.test(Dataset6.2$Study_Hours[Dataset6.2$Work_Status == "Works"])
##
## Shapiro-Wilk normality test
##
## data: Dataset6.2$Study_Hours[Dataset6.2$Work_Status == "Works"]
## W = 0.94582, p-value = 0.1305
shapiro.test(Dataset6.2$Study_Hours[Dataset6.2$Work_Status == "Does_Not_Work"])
##
## Shapiro-Wilk normality test
##
## data: Dataset6.2$Study_Hours[Dataset6.2$Work_Status == "Does_Not_Work"]
## W = 0.83909, p-value = 0.0003695
The data for works was normal (p > .05).
The data for Does Not Work was abnormal (p < .05).
After conducting all three normality tests, it is clear we must use a Mann-Whitney U test.
wilcox.test(Study_Hours ~ Work_Status, data = Dataset6.2)
##
## Wilcoxon rank sum exact test
##
## data: Study_Hours by Work_Status
## W = 569, p-value = 0.07973
## alternative hypothesis: true location shift is not equal to 0
If p > .05 (greater than .05), this means the results were NOT significant.
studying hours of students who works (Mdn = 5.64) were not significantly different from studying hours of student who do not work (Mdn = 8.54), U = 569, p = 0.07973