library(readxl)
library(ggpubr)
## Loading required package: ggplot2
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(effectsize)
library(effsize)
Dataset6.2 <- read_excel("C:/Users/Leyav/Downloads/Dataset6.2.xlsx")
Dataset6.2 %>%
  group_by(Work_Status) %>%
  summarise(
    Mean = mean(Study_Hours, na.rm = TRUE),
    Median = median(Study_Hours, na.rm = TRUE),
    SD = sd(Study_Hours, na.rm = TRUE),
    N = n()
  )
## # A tibble: 2 × 5
##   Work_Status    Mean Median    SD     N
##   <chr>         <dbl>  <dbl> <dbl> <int>
## 1 Does_Not_Work  9.62   8.54  7.45    30
## 2 Works          6.41   5.64  4.41    30
hist(Dataset6.2$Study_Hours[Dataset6.2$Work_Status == "Works"],
     main = "Histogram of study hours for students who work",
     xlab = "Study hours",
     ylab = "Frequency",
     col = "lightblue",
     border = "black",
     breaks = 10)

hist(Dataset6.2$Study_Hours[Dataset6.2$Work_Status == "Does_Not_Work"],
     main = "Histogram of study hours for students who do not work",
     xlab = "Study hours",
     ylab = "Frequency",
     col = "lightgreen",
     border = "black",
     breaks = 10)

For the Works histogram, the data appear positively skewed and the distribution looks relatively flat (low kurtosis). For the Does_Not_Work histogram, the data also appear positively skewed, but the distribution looks more peaked (high kurtosis). Because neither histogram looks normal, we may need to use a Mann-Whitney U test.
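
The visual impression of skewness and kurtosis can be backed up with numbers. The following is a minimal sketch in base R, assuming the simple moment-based definitions; the helper names `skewness` and `excess_kurtosis` and the simulated `demo_hours` vector are illustrative and are not part of the original analysis.

```r
# Moment-based skewness: positive values indicate a longer right tail.
skewness <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^1.5
}

# Excess kurtosis: 0 for a normal distribution, positive = more peaked or
# heavy-tailed, negative = flatter.
excess_kurtosis <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  mean((x - m)^4) / mean((x - m)^2)^2 - 3
}

# Hypothetical right-skewed sample standing in for one group's Study_Hours.
set.seed(1)
demo_hours <- rexp(30, rate = 1 / 8)

print(c(skewness = skewness(demo_hours),
        excess_kurtosis = excess_kurtosis(demo_hours)))
```

On the real data, the same helpers could be applied per group with `tapply(Dataset6.2$Study_Hours, Dataset6.2$Work_Status, skewness)`.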

ggboxplot(Dataset6.2, x = "Work_Status", y = "Study_Hours",
          color = "Work_Status",
          palette = "jco",
          add = "jitter")

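The box plot for the Works group appears normal: one dot lies beyond the whisker, but it is not far enough away to suggest a departure from normality. The box plot for the Does_Not_Work group appears abnormal because there is an outlier, so the data are NOT normal. Because the two groups are independent, we might have to use a Mann-Whitney U test (the Wilcoxon signed-rank test is for paired data and does not apply here).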
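
Whether a point counts as an outlier can be checked against the usual 1.5 × IQR whisker rule rather than judged by eye. Here is a small sketch using base R's `boxplot.stats()`; the wrapper name `find_outliers` and the `demo_hours` values are made up for illustration, and `ggboxplot()` may compute its quartiles slightly differently.

```r
# Return the values that fall beyond the box plot whiskers
# (more than 1.5 * IQR outside the hinges, the boxplot.stats() default).
find_outliers <- function(x) {
  boxplot.stats(x[!is.na(x)], coef = 1.5)$out
}

# Hypothetical study hours with one unusually large value.
demo_hours <- c(2, 4, 5, 6, 7, 8, 9, 30)
print(find_outliers(demo_hours))
```

On the real data, something like `tapply(Dataset6.2$Study_Hours, Dataset6.2$Work_Status, find_outliers)` would list the flagged values for each group.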

shapiro.test(Dataset6.2$Study_Hours[Dataset6.2$Work_Status == "Works"])
## 
##  Shapiro-Wilk normality test
## 
## data:  Dataset6.2$Study_Hours[Dataset6.2$Work_Status == "Works"]
## W = 0.94582, p-value = 0.1305
shapiro.test(Dataset6.2$Study_Hours[Dataset6.2$Work_Status == "Does_Not_Work"])
## 
##  Shapiro-Wilk normality test
## 
## data:  Dataset6.2$Study_Hours[Dataset6.2$Work_Status == "Does_Not_Work"]
## W = 0.83909, p-value = 0.0003695

The data for the Works group are normal because the p value is > .05 (p = .1305). The data for the Does_Not_Work group are not normal because the p value is < .05 (p = .0004). After conducting all three normality checks (histograms, box plot, and Shapiro-Wilk test), it is clear we must use a Mann-Whitney U test.
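
The decision rule used here (switch to the Mann-Whitney U test if either group fails Shapiro-Wilk) can be written out explicitly. This is only a sketch of that rule under the conventional alpha of .05; `choose_test` and the simulated `demo` data frame are hypothetical and merely mimic the layout of `Dataset6.2`.

```r
# Pick a two-group test from per-group Shapiro-Wilk p values.
choose_test <- function(p_values, alpha = 0.05) {
  if (any(p_values < alpha)) {
    "Mann-Whitney U test"
  } else {
    "independent-samples t test"
  }
}

# Hypothetical data: one roughly normal group, one right-skewed group.
set.seed(42)
demo <- data.frame(
  Work_Status = rep(c("Works", "Does_Not_Work"), each = 30),
  Study_Hours = c(rnorm(30, mean = 6, sd = 4), rexp(30, rate = 1 / 9))
)

# One Shapiro-Wilk p value per group.
p_by_group <- tapply(demo$Study_Hours, demo$Work_Status,
                     function(v) shapiro.test(v)$p.value)
print(p_by_group)
print(choose_test(p_by_group))

# The same rule applied to the two p values reported in the output above.
print(choose_test(c(Works = 0.1305, Does_Not_Work = 0.0003695)))
```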

wilcox.test(Study_Hours ~ Work_Status, data = Dataset6.2)
## 
##  Wilcoxon rank sum exact test
## 
## data:  Study_Hours by Work_Status
## W = 569, p-value = 0.07973
## alternative hypothesis: true location shift is not equal to 0

Because p > .05 (p = .0797), the results were NOT significant.

Study hours for the Works group (Mdn = 5.64) were not significantly different from those for the Does_Not_Work group (Mdn = 8.54), U = 569, p = .0797. Because the result was not significant, no effect size was calculated.
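
For reference only, if an effect size were wanted for a Mann-Whitney U test, one common choice is the rank-biserial correlation, which can be derived from the statistic that `wilcox.test()` reports. The sketch below assumes that W is the U statistic for the first factor level (Does_Not_Work) and that both groups have N = 30, as in the summary table; the helper name `rank_biserial` is illustrative.

```r
# Rank-biserial correlation from the Mann-Whitney statistic of group 1:
# r = (U1 - U2) / (n1 * n2), where U2 = n1 * n2 - U1.
rank_biserial <- function(U1, n1, n2) {
  2 * U1 / (n1 * n2) - 1
}

# Values taken from the output above: W = 569, 30 students per group.
r_rb <- rank_biserial(U1 = 569, n1 = 30, n2 = 30)
print(round(r_rb, 3))  # 0.264
```

A positive value indicates that the first group (Does_Not_Work) tends to report more study hours, which matches the direction of the medians.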