library(readxl)
library(ggpubr)
## Loading required package: ggplot2
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(effectsize)
library(effsize)
Dataset6.2 <- read_excel("D:/SLU/AdvAppliedAnalytics/Dataset6.2.xlsx")
Dataset6.2 %>%
  group_by(Work_Status) %>%
  summarise(
    Mean = mean(Study_Hours, na.rm = TRUE),
    Median = median(Study_Hours, na.rm = TRUE),
    SD = sd(Study_Hours, na.rm = TRUE),
    N = n()
  )
## # A tibble: 2 × 5
##   Work_Status    Mean Median    SD     N
##   <chr>         <dbl>  <dbl> <dbl> <int>
## 1 Does_Not_Work  9.62   8.54  7.45    30
## 2 Works          6.41   5.64  4.41    30
Descriptive statistics: the Does_Not_Work group (M = 9.62, SD = 7.45) reported more study hours on average than the Works group (M = 6.41, SD = 4.41).
hist(Dataset6.2$Study_Hours[Dataset6.2$Work_Status == "Works"],
     main = "Histogram of Study Hours (Works)",
     xlab = "Study Hours",
     ylab = "Frequency",
     col = "lightblue",
     border = "black",
     breaks = 10)
For the Works histogram, the data appear positively skewed (not normal). The kurtosis also does not appear bell-shaped (not normal).
hist(Dataset6.2$Study_Hours[Dataset6.2$Work_Status == "Does_Not_Work"],
     main = "Histogram of Study Hours (Does Not Work)",
     xlab = "Study Hours",
     ylab = "Frequency",
     col = "lightgreen",
     border = "black",
     breaks = 10)
For the Does_Not_Work histogram, the data appear positively skewed (not normal). The kurtosis also does not appear bell-shaped (not normal).
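The visual reading of skewness and kurtosis above could be backed up with numbers. The following is a minimal sketch in base R, assuming simple moment-based formulas; the helper names `sample_skewness` and `sample_excess_kurtosis` and the example vector are hypothetical and are not part of the original analysis.

```r
# Moment-based sample skewness (base R only; hypothetical helper).
sample_skewness <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  m2 <- mean((x - m)^2)   # second central moment
  m3 <- mean((x - m)^3)   # third central moment
  m3 / m2^1.5
}

# Moment-based excess kurtosis (0 for a normal distribution; hypothetical helper).
sample_excess_kurtosis <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  m2 <- mean((x - m)^2)   # second central moment
  m4 <- mean((x - m)^4)   # fourth central moment
  m4 / m2^2 - 3
}

# Hypothetical right-skewed values, for illustration only (not from Dataset6.2).
hours_example <- c(1, 2, 2, 3, 3, 4, 5, 7, 12, 25)
sample_skewness(hours_example)         # positive for this right-skewed sample
sample_excess_kurtosis(hours_example)  # above 0 means heavier tails than normal

# With the real data loaded, the same helpers could be applied per group, e.g.:
# sample_skewness(Dataset6.2$Study_Hours[Dataset6.2$Work_Status == "Works"])
```

A skewness clearly above 0 would agree with the positive skew seen in the histograms.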
ggboxplot(Dataset6.2, x = "Work_Status", y = "Study_Hours",
          color = "Work_Status",
          palette = "jco",
          add = "jitter")
In the Works group, the points lie close to the whiskers and the data look approximately normal. In the Does_Not_Work group, a few points lie far from the whisker and the data do not look normal. Therefore, at least one group is not normally distributed.
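Which points count as "far from the whisker" can be checked numerically. This is a small sketch using base R's `boxplot.stats()`, which flags values more than 1.5 times the hinge spread beyond the hinges. The example vector is hypothetical, and the quartiles that ggplot2-based boxplots use may differ slightly from these hinges.

```r
# Hypothetical values for illustration only (not from Dataset6.2).
hours_example <- c(2, 3, 4, 4, 5, 5, 6, 7, 8, 30)

# boxplot.stats() uses Tukey's hinges and a default coefficient of 1.5.
stats <- boxplot.stats(hours_example)

stats$stats  # lower whisker, lower hinge, median, upper hinge, upper whisker
stats$out    # values beyond the whiskers (here: 30)

# With the real data loaded, the same check could be run per group, e.g.:
# boxplot.stats(Dataset6.2$Study_Hours[Dataset6.2$Work_Status == "Does_Not_Work"])$out
```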
shapiro.test(Dataset6.2$Study_Hours[Dataset6.2$Work_Status == "Works"])
##
## Shapiro-Wilk normality test
##
## data: Dataset6.2$Study_Hours[Dataset6.2$Work_Status == "Works"]
## W = 0.94582, p-value = 0.1305
The Works data (p = .13) did not depart significantly from normality (p > .05).
shapiro.test(Dataset6.2$Study_Hours[Dataset6.2$Work_Status == "Does_Not_Work"])
##
## Shapiro-Wilk normality test
##
## data: Dataset6.2$Study_Hours[Dataset6.2$Work_Status == "Does_Not_Work"]
## W = 0.83909, p-value = 0.0003695
The Does_Not_Work data (p = 0.0003695) were not normal (p < .05).
Taken together, the three normality checks (histograms, boxplot, and Shapiro-Wilk tests) show that normality is violated in at least one group, so a Mann-Whitney U test is used.
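The decision above can be sketched as code: run Shapiro-Wilk on each group, then pick the test. This is only an illustration under stated assumptions. The data frame `example_df` is simulated and hypothetical, and the fallback to `t.test()` when both groups look normal is an assumption, since the report only covers the non-normal case.

```r
# Hypothetical simulated data for illustration only (not Dataset6.2).
set.seed(1)
example_df <- data.frame(
  Work_Status = rep(c("Does_Not_Work", "Works"), each = 30),
  Study_Hours = c(rexp(30, rate = 0.1), rnorm(30, mean = 6, sd = 2))
)

# Shapiro-Wilk p-value for each group.
p_values <- tapply(example_df$Study_Hours, example_df$Work_Status,
                   function(v) shapiro.test(v)$p.value)
p_values

# If any group departs from normality, use the Mann-Whitney U (Wilcoxon rank sum) test.
alpha <- 0.05
if (any(p_values < alpha)) {
  result <- wilcox.test(Study_Hours ~ Work_Status, data = example_df)
} else {
  result <- t.test(Study_Hours ~ Work_Status, data = example_df)  # assumed parametric fallback
}
result$method
```

In practice the histograms and boxplot should still be inspected, since the Shapiro-Wilk p-value alone is sensitive to sample size.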
wilcox.test(Study_Hours ~ Work_Status, data = Dataset6.2)
##
## Wilcoxon rank sum exact test
##
## data: Study_Hours by Work_Status
## W = 569, p-value = 0.07973
## alternative hypothesis: true location shift is not equal to 0
The p-value (0.07973) is greater than 0.05, so the result was not statistically significant (p > .05).
Study hours in the Does_Not_Work group (M = 9.62, SD = 7.45, Mdn = 8.54) were not significantly different from those in the Works group (M = 6.41, SD = 4.41, Mdn = 5.64), W = 569, p = .08.
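The report does not give an effect size, and one is often skipped for a non-significant result. If one were wanted, a rank-biserial correlation could be derived from the reported W and the group sizes. The sketch below is an illustrative addition, not part of the original analysis. It assumes W is the U statistic for the first group (Does_Not_Work), which is how `wilcox.test()` reports it.

```r
# Values taken from the output above.
W  <- 569  # Wilcoxon rank sum statistic for Does_Not_Work vs. Works
n1 <- 30   # Does_Not_Work
n2 <- 30   # Works

# Rank-biserial correlation: proportion of favorable pairs minus unfavorable pairs.
rank_biserial <- 2 * W / (n1 * n2) - 1
round(rank_biserial, 2)  # 0.26
```

A positive value means Does_Not_Work tends to have higher study hours. The `effectsize` package loaded at the top also provides a `rank_biserial()` function that should give a comparable estimate with a confidence interval.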