library(dplyr)
data <- readxl::read_excel('DevStudentsOverTime.xlsx')
group1 <- data %>%
  filter(GRADUATED == 'NO')
group2 <- data %>%
  filter(GRADUATED == 'YES')
Group 1 contains the students who did not graduate; group 2 contains the students who graduated.
hist(group1$HOURS_EARNED)
hist(group2$HOURS_EARNED)
shapiro.test(group1$HOURS_EARNED)
##
## Shapiro-Wilk normality test
##
## data: group1$HOURS_EARNED
## W = 0.83686, p-value < 2.2e-16
shapiro.test(group2$HOURS_EARNED)
##
## Shapiro-Wilk normality test
##
## data: group2$HOURS_EARNED
## W = 0.96089, p-value < 2.2e-16
Both Shapiro-Wilk tests reject normality (p < 2.2e-16 in each group). Is the difference in hours earned between the non-graduate and graduate groups statistically significant, or could it have arisen by chance?
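Since normality is rejected in both groups, one common option is a rank-based comparison such as the Wilcoxon rank-sum (Mann-Whitney) test, which does not assume normally distributed data. The sketch below is only an illustration: the data frame `demo` is synthetic stand-in data generated with a fixed seed, not the contents of DevStudentsOverTime.xlsx, and the sample sizes and distributions are assumptions chosen just to make the example runnable.

```r
# Synthetic stand-in data (assumption): NOT the values from the Excel file.
set.seed(42)
demo <- data.frame(
  GRADUATED = rep(c("NO", "YES"), times = c(200, 150)),
  HOURS_EARNED = c(rexp(200, rate = 1 / 30),          # skewed, like a non-normal group
                   rnorm(150, mean = 90, sd = 20))
)

# Wilcoxon rank-sum (Mann-Whitney) test: compares the two groups without
# assuming normality. conf.int = TRUE also estimates the location shift.
result <- wilcox.test(HOURS_EARNED ~ GRADUATED, data = demo, conf.int = TRUE)
print(result)

# Group medians show the direction and rough size of the difference.
medians <- tapply(demo$HOURS_EARNED, demo$GRADUATED, median)
print(medians)
```

On the real data the equivalent call would presumably be `wilcox.test(HOURS_EARNED ~ GRADUATED, data = data)`, assuming GRADUATED takes only the values 'NO' and 'YES'. A small p-value would suggest the difference is unlikely to be due to chance alone; the medians help judge whether it is large enough to matter.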