This report examines college student sleep patterns using the “SleepStudy” dataset from Lock5Stat (https://www.lock5stat.com/datapage3e.html). The dataset contains 253 observations on 27 variables and covers the sleep habits, psychological well-being, and lifestyle choices of college students.
Our primary objective is to answer a series of research questions about college students’ sleep patterns, academic performance, mental health, and lifestyle choices. By analyzing the dataset, we aim to identify factors associated with sleep and its related outcomes. These findings can then inform further research and interventions designed to improve the well-being and academic success of college students.
The following research questions will guide our investigation:
By addressing these questions, we aim to gain a comprehensive understanding of the sleep patterns and related factors that influence college students. Ultimately, this knowledge can contribute to creating a more supportive environment that fosters the well-being and academic success of this population.
# Load the dataset; SleepStudy.csv is expected in the working directory
sleepStudy <- read.csv("SleepStudy.csv")
head(sleepStudy)
## Gender ClassYear LarkOwl NumEarlyClass EarlyClass GPA ClassesMissed
## 1 0 4 Neither 0 0 3.60 0
## 2 0 4 Neither 2 1 3.24 0
## 3 0 4 Owl 0 0 2.97 12
## 4 0 1 Lark 5 1 3.76 0
## 5 0 4 Owl 0 0 3.20 4
## 6 1 4 Neither 0 0 3.50 0
## CognitionZscore PoorSleepQuality DepressionScore AnxietyScore StressScore
## 1 -0.26 4 4 3 8
## 2 1.39 6 1 0 3
## 3 0.38 18 18 18 9
## 4 1.39 9 1 4 6
## 5 1.22 9 7 25 14
## 6 -0.04 6 14 8 28
## DepressionStatus AnxietyStatus Stress DASScore Happiness AlcoholUse Drinks
## 1 normal normal normal 15 28 Moderate 10
## 2 normal normal normal 4 25 Moderate 6
## 3 moderate severe normal 45 17 Light 3
## 4 normal normal normal 11 32 Light 2
## 5 normal severe normal 46 15 Moderate 4
## 6 moderate moderate high 50 22 Abstain 0
## WeekdayBed WeekdayRise WeekdaySleep WeekendBed WeekendRise WeekendSleep
## 1 25.75 8.70 7.70 25.75 9.50 5.88
## 2 25.70 8.20 6.80 26.00 10.00 7.25
## 3 27.44 6.55 3.00 28.00 12.59 10.09
## 4 23.50 7.17 6.77 27.00 8.00 7.25
## 5 25.90 8.67 6.09 23.75 9.50 7.00
## 6 23.80 8.95 9.05 26.00 10.75 9.00
## AverageSleep AllNighter
## 1 7.18 0
## 2 6.93 0
## 3 5.02 0
## 4 6.90 0
## 5 6.35 0
## 6 9.04 0
The “SleepStudy” dataset contains 253 observations on 27 variables, providing a comprehensive overview of college students’ sleep patterns, academic performance, mental health, and lifestyle choices. Key variables include:
testResult <- t.test(GPA ~ Gender, data = sleepStudy)
testResult
##
## Welch Two Sample t-test
##
## data: GPA by Gender
## t = 3.9139, df = 200.9, p-value = 0.0001243
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
## 0.09982254 0.30252780
## sample estimates:
## mean in group 0 mean in group 1
## 3.324901 3.123725
# Gender is coded 0 = female, 1 = male, so the labels must follow that order
boxplot(GPA ~ Gender,
data = sleepStudy,
main = "GPA Distribution by Gender",
xlab = "Gender",
ylab = "GPA",
col = c("lightpink", "lightblue"),
names = c("Female", "Male"))
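The boxplot compares the distribution of GPA for female and male students. The center line of each box marks the group median; the group means are reported in the t-test output above.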
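One way to keep test output and plot labels consistent is to recode Gender as a labeled factor before any analysis. The sketch below uses only the six rows printed by head() above and assumes the coding 0 = female, 1 = male, which matches the group means discussed in this report. The data frame name demo is hypothetical.

```r
# Six rows copied from the head() output above (illustration only).
demo <- data.frame(
  Gender = c(0, 0, 0, 0, 0, 1),
  GPA    = c(3.60, 3.24, 2.97, 3.76, 3.20, 3.50)
)

# Assumed coding: 0 = female, 1 = male.
demo$Gender <- factor(demo$Gender, levels = c(0, 1),
                      labels = c("Female", "Male"))

# Group means now carry readable names instead of 0/1.
gpa_means <- tapply(demo$GPA, demo$Gender, mean)
print(gpa_means)
```

With the factor in place, t.test(GPA ~ Gender, ...) and boxplot(GPA ~ Gender, ...) label the groups Female and Male on their own, so no names argument is needed.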
sleepStudy$ClassGroup <- ifelse(sleepStudy$ClassYear %in% c(1, 2),
"Lowerclassman",
ifelse(sleepStudy$ClassYear %in% c(3, 4),
"Upperclassman", NA))
testResult <- t.test(EarlyClass ~ ClassGroup, data = sleepStudy)
testResult
##
## Welch Two Sample t-test
##
## data: EarlyClass by ClassGroup
## t = 2.3233, df = 224.26, p-value = 0.02106
## alternative hypothesis: true difference in means between group Lowerclassman and group Upperclassman is not equal to 0
## 95 percent confidence interval:
## 0.02121868 0.25831438
## sample estimates:
## mean in group Lowerclassman mean in group Upperclassman
## 0.7253521 0.5855856
# EarlyClass is a 0/1 indicator, so its group mean is the proportion of students with an early class
mean_values <- aggregate(EarlyClass ~ ClassGroup, data = sleepStudy, FUN = mean)
barplot(mean_values$EarlyClass,
names.arg = mean_values$ClassGroup,
col = c("lightblue", "lightgreen"),
main = "Proportion of Students with an Early Class by Class Year",
xlab = "Class Year Group",
ylab = "Proportion with an Early Class",
ylim = c(0, max(mean_values$EarlyClass) + 0.2))
The bar plot shows the proportion of students with at least one early class in the Freshman/Sophomore group versus the Junior/Senior group.
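Because EarlyClass is a 0/1 indicator, the comparison above is a comparison of two proportions, so a two-sample proportion test is a natural cross-check. The sketch below is an alternative to the t-test, not the method used in this report. The counts are an assumption: they are reconstructed from the reported group means (103/142 ≈ 0.7254 and 65/111 ≈ 0.5856, which together account for all 253 students).

```r
# Counts reconstructed from the reported group means (assumption, see above).
early  <- c(Lowerclassman = 103, Upperclassman = 65)   # students with an early class
totals <- c(Lowerclassman = 142, Upperclassman = 111)  # students per group

# Two-sample test for equality of proportions (base R, stats package).
prop_result <- prop.test(x = early, n = totals)

# Sample proportions; these should match the t-test group means.
round(unname(prop_result$estimate), 4)
prop_result$p.value
```

If the raw data are available, the same counts can be obtained with table(sleepStudy$ClassGroup, sleepStudy$EarlyClass) instead of being reconstructed.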
sleepStudy$LarkOwlGroup <- ifelse(sleepStudy$LarkOwl %in% c("Lark"),
"Lark",
ifelse(sleepStudy$LarkOwl %in% c("Owl"),
"Owl", NA))
sleepStudyFiltered <- sleepStudy[!is.na(sleepStudy$LarkOwlGroup), ]
testResult <- t.test(CognitionZscore ~ LarkOwlGroup, data = sleepStudyFiltered)
testResult
##
## Welch Two Sample t-test
##
## data: CognitionZscore by LarkOwlGroup
## t = 0.80571, df = 75.331, p-value = 0.4229
## alternative hypothesis: true difference in means between group Lark and group Owl is not equal to 0
## 95 percent confidence interval:
## -0.1893561 0.4465786
## sample estimates:
## mean in group Lark mean in group Owl
## 0.09024390 -0.03836735
# Create a boxplot to show the distribution of cognition z-scores by group
boxplot(CognitionZscore ~ LarkOwlGroup,
data = sleepStudyFiltered,
main = "Cognition Z-Score by Lark/Owl Group",
xlab = "Group",
ylab = "Cognition Z-Score",
col = c("lightblue", "lightgreen"),
names = c("Lark", "Owl"))
The boxplot compares the distribution of cognition z-scores for Larks (early risers) and Owls (night owls); the center line of each box is the group median.
testResult <- t.test(ClassesMissed ~ EarlyClass, data = sleepStudy)
testResult
##
## Welch Two Sample t-test
##
## data: ClassesMissed by EarlyClass
## t = 1.4755, df = 152.78, p-value = 0.1421
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
## -0.2233558 1.5412830
## sample estimates:
## mean in group 0 mean in group 1
## 2.647059 1.988095
boxplot(ClassesMissed ~ EarlyClass,
data = sleepStudy,
col = c("lightblue", "lightgreen"),
main = "Classes Missed by Early Class Status",
ylab = "Number of Classes Missed",
xlab = "Early Class Status")
The boxplot compares the distribution of the number of classes missed for students with at least one early class (1) and students with no early class (0).
sleepStudy$DepressionStatusGroup <- ifelse(sleepStudy$DepressionStatus %in% c("normal"), "Normal", "Depressed")
boxplot(Happiness ~ DepressionStatusGroup,
data = sleepStudy,
col = c("lightblue", "lightgreen"),
main = "Happiness Level by Depression Status",
ylab = "Happiness",
xlab = "Depression Status")
testResult <- t.test(Happiness ~ DepressionStatusGroup, data = sleepStudy)
testResult
##
## Welch Two Sample t-test
##
## data: Happiness by DepressionStatusGroup
## t = -5.6339, df = 55.594, p-value = 6.057e-07
## alternative hypothesis: true difference in means between group Depressed and group Normal is not equal to 0
## 95 percent confidence interval:
## -7.379724 -3.507836
## sample estimates:
## mean in group Depressed mean in group Normal
## 21.61364 27.05742
The boxplot compares the distribution of happiness scores for students classified as Depressed and students classified as Normal.
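The confidence interval in the output follows directly from the other printed quantities, which makes for a useful sanity check. The sketch below recomputes it from the reported group means, t statistic, and Welch degrees of freedom. The variable names are illustrative, and small discrepancies from the printed interval come from rounding in the printed values.

```r
# Values copied from the t-test output above.
mean_depressed <- 21.61364
mean_normal    <- 27.05742
t_stat         <- -5.6339
df_welch       <- 55.594

# Difference in means and the standard error implied by t = diff / se.
diff_means <- mean_depressed - mean_normal
std_error  <- diff_means / t_stat

# 95% interval: difference +/- critical t value * standard error.
t_crit <- qt(0.975, df = df_welch)
ci_95  <- diff_means + c(-1, 1) * t_crit * std_error
round(ci_95, 2)
```

The same three steps apply to every Welch test in this report, since each output prints the group means, t, and df.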
testResult <- t.test(PoorSleepQuality ~ AllNighter, data = sleepStudy)
testResult
##
## Welch Two Sample t-test
##
## data: PoorSleepQuality by AllNighter
## t = -1.7068, df = 44.708, p-value = 0.09479
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
## -1.9456958 0.1608449
## sample estimates:
## mean in group 0 mean in group 1
## 6.136986 7.029412
# PoorSleepQuality is scored so that higher values mean worse sleep
boxplot(PoorSleepQuality ~ AllNighter,
data = sleepStudy,
col = c("lightcoral", "lightseagreen"),
main = "Poor Sleep Quality by All-Nighter Status",
ylab = "Poor Sleep Quality Score (higher = worse)",
xlab = "All-Nighter Status",
names = c("No All-Nighter (0)", "At Least One All-Nighter (1)"))
The boxplot compares the distribution of poor sleep quality scores for students who pulled at least one all-nighter and students who did not; higher scores indicate worse sleep.
sleepStudy$AlcoholUseGroup <- ifelse(sleepStudy$AlcoholUse %in% c("Abstain", "Light"), "Low Use", "High Use")
testResult <- t.test(StressScore ~ AlcoholUseGroup, data = sleepStudy)
testResult
##
## Welch Two Sample t-test
##
## data: StressScore by AlcoholUseGroup
## t = 0.24753, df = 248.92, p-value = 0.8047
## alternative hypothesis: true difference in means between group High Use and group Low Use is not equal to 0
## 95 percent confidence interval:
## -1.722125 2.217223
## sample estimates:
## mean in group High Use mean in group Low Use
## 9.580882 9.333333
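# Groups are plotted in alphabetical order ("High Use", then "Low Use"), so the labels must match that order
boxplot(StressScore ~ AlcoholUseGroup,
data = sleepStudy,
col = c("skyblue", "orange"),
main = "Stress Score by Alcohol Use Group",
ylab = "Stress Score",
xlab = "Alcohol Use Group",
names = c("High Use", "Low Use"))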
The boxplot compares the distribution of stress scores for students with high alcohol use (Moderate or Heavy) and students with low alcohol use (Abstain or Light).
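R orders the groups of a character variable alphabetically, so "High Use" is plotted before "Low Use" regardless of the order in which labels are typed. A minimal sketch of fixing the order explicitly, using the AlcoholUse values from the six rows shown by head() above (the data frame name demo is hypothetical):

```r
# AlcoholUse values copied from the head() output above (illustration only).
demo <- data.frame(
  AlcoholUse = c("Moderate", "Moderate", "Light", "Light", "Moderate", "Abstain")
)

# Same grouping rule as in the report.
demo$AlcoholUseGroup <- ifelse(demo$AlcoholUse %in% c("Abstain", "Light"),
                               "Low Use", "High Use")

# Default factor order is alphabetical: "High Use", then "Low Use".
default_order <- levels(factor(demo$AlcoholUseGroup))

# Setting the levels explicitly puts "Low Use" first in tests and plots.
demo$AlcoholUseGroup <- factor(demo$AlcoholUseGroup,
                               levels = c("Low Use", "High Use"))
table(demo$AlcoholUseGroup)
```

Once the levels are set this way, the group order in t.test output and in boxplot axes follows the factor, which removes the need to relabel groups by hand.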
testResult <- t.test(Drinks ~ Gender, data = sleepStudy)
testResult
##
## Welch Two Sample t-test
##
## data: Drinks by Gender
## t = -6.1601, df = 142.75, p-value = 7.002e-09
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
## -4.360009 -2.241601
## sample estimates:
## mean in group 0 mean in group 1
## 4.238411 7.539216
boxplot(Drinks ~ Gender,
data = sleepStudy,
col = c("lightpink", "lightblue"),
main = "Number of Drinks per Week by Gender",
ylab = "Number of Drinks",
xlab = "Gender",
names = c("Female", "Male"))
The boxplot compares the distribution of the number of drinks per week for female and male students.
testResult <- t.test(WeekdayBed ~ Stress, data = sleepStudy)
testResult
##
## Welch Two Sample t-test
##
## data: WeekdayBed by Stress
## t = -1.0746, df = 87.048, p-value = 0.2855
## alternative hypothesis: true difference in means between group high and group normal is not equal to 0
## 95 percent confidence interval:
## -0.4856597 0.1447968
## sample estimates:
## mean in group high mean in group normal
## 24.71500 24.88543
boxplot(WeekdayBed ~ Stress,
data = sleepStudy,
col = c("lightgreen", "lightcoral"),
main = "Weekday Bedtime by Stress Level",
ylab = "Weekday Bedtime (24-hour format)",
xlab = "Stress Level",
names = c("High Stress", "Normal Stress"))
The boxplot compares the distribution of weekday bedtimes for students with High Stress versus Normal Stress. Bedtimes are recorded in hours, and values above 24 denote times after midnight.
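Bedtimes in this dataset run past 24 (for example, 25.75 in the first row), which reads as hours counted through midnight. Under that assumption, a small helper can turn the decimal values into clock times for reporting. The name to_clock is hypothetical, not part of the dataset or any package.

```r
# Convert decimal hours (which may exceed 24) to "HH:MM" clock time.
# Assumes values above 24 mean "after midnight".
to_clock <- function(hours) {
  total_min <- round(hours * 60)   # work in whole minutes
  hh <- (total_min %/% 60) %% 24   # wrap past-midnight hours back into 0-23
  mm <- total_min %% 60
  sprintf("%02d:%02d", as.integer(hh), as.integer(mm))
}

# Two WeekdayBed values from head(), then the two group means from the t-test.
to_clock(c(25.75, 23.50, 24.71500, 24.88543))
```

Working in whole minutes avoids producing a minute value of 60 when a fractional hour rounds up.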
sleepStudy$ClassGroup <- ifelse(sleepStudy$ClassYear %in% c(1, 2),
"Lowerclassman",
ifelse(sleepStudy$ClassYear %in% c(3, 4),
"Upperclassman", NA))
testResult <- t.test(WeekendSleep ~ ClassGroup, data = sleepStudy)
testResult
##
## Welch Two Sample t-test
##
## data: WeekendSleep by ClassGroup
## t = -0.047888, df = 237.36, p-value = 0.9618
## alternative hypothesis: true difference in means between group Lowerclassman and group Upperclassman is not equal to 0
## 95 percent confidence interval:
## -0.3497614 0.3331607
## sample estimates:
## mean in group Lowerclassman mean in group Upperclassman
## 8.213592 8.221892
boxplot(WeekendSleep ~ ClassGroup,
data = sleepStudy,
col = c("lightblue", "lightgreen"),
main = "Weekend Sleep by Class Year",
ylab = "Weekend Sleep Hours",
xlab = "Class Group",
names = c("Lowerclassman", "Upperclassman"))
The boxplot compares the distribution of hours slept on the weekend for the Freshman/Sophomore group versus the Junior/Senior group.
This analysis explored several key aspects of college students’ sleep patterns, mental health, and lifestyle choices using the “SleepStudy” dataset. The findings from the statistical tests and visualizations are summarized below:
A significant difference in average GPA between male and female students was found, with females having higher GPAs on average (mean = 3.32) than males (mean = 3.12). A Welch two-sample t-test supports this (t = 3.9139, p = 0.0001243).
The comparison of early classes between lowerclassmen (1st and 2nd year) and upperclassmen (3rd and 4th year) revealed a significant difference: a larger share of lowerclassmen had at least one early class (72.5% vs. 58.6%; t = 2.3233, p = 0.02106).
There was no significant difference in cognition scores between “Lark” and “Owl” students (t = 0.80571, p = 0.4229). In this sample, a student’s preference for being active in the morning or evening is not associated with a detectable difference in cognition scores.
The analysis revealed no significant difference in the number of classes missed between students with early classes (1, mean = 1.99) and those without (0, mean = 2.65) (t = 1.4755, p = 0.1421). The data therefore provide no evidence that having early classes is associated with missing more classes.
The comparison of happiness levels between students classified as “Depressed” and those classified as “Normal” revealed a significant difference (t = -5.6339, p = 6.057e-07). Students in the “Depressed” group reported significantly lower happiness scores (mean = 21.61) compared to those in the “Normal” group (mean = 27.06).
The comparison of sleep quality between students who reported having at least one all-nighter (AllNighter=1) and those who did not (AllNighter=0) showed no significant difference (t = -1.7068, p = 0.09479). Students who had all-nighters (mean = 7.03) had slightly worse sleep quality compared to those who did not (mean = 6.14), but the difference was not statistically significant.
There was no statistically significant difference in stress scores between high alcohol users (mean = 9.58) and low alcohol users (mean = 9.33) (t = 0.24753, p = 0.8047). The data provide no evidence that alcohol use is associated with stress levels in this sample.
Males reported significantly higher alcohol consumption than females (mean = 7.54 vs. 4.24 drinks per week; t = -6.1601, p = 7.002e-09). This result highlights a notable gender disparity in drinking behavior.
There was no significant difference in weekday bedtime between students with high stress (mean = 24.72) and those with normal stress levels (mean = 24.89) (t = -1.0746, p = 0.2855). Stress level does not appear to be associated with weekday bedtime in this sample.
There was no significant difference in weekend sleep duration between lowerclassmen (mean = 8.21 hours) and upperclassmen (mean = 8.22 hours) (t = -0.047888, p = 0.9618). Class group does not appear to be associated with weekend sleep in this sample.
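Because ten separate tests were run, it can help to view the p-values side by side and to see how they behave under a multiple-comparison adjustment. The sketch below collects the p-values reported above and applies Holm's method with p.adjust. The adjustment is an additional check, not part of the original analysis, and the short comparison labels are ours.

```r
# p-values copied from the ten t-test outputs above.
results <- data.frame(
  comparison = c("GPA by gender", "Early class by class year",
                 "Cognition by lark/owl", "Classes missed by early class",
                 "Happiness by depression status", "Sleep quality by all-nighter",
                 "Stress by alcohol use", "Drinks by gender",
                 "Weekday bedtime by stress", "Weekend sleep by class year"),
  p_value = c(0.0001243, 0.02106, 0.4229, 0.1421, 6.057e-07,
              0.09479, 0.8047, 7.002e-09, 0.2855, 0.9618)
)

# Holm adjustment controls the family-wise error rate across all ten tests.
results$p_holm <- p.adjust(results$p_value, method = "holm")

# Flag results at the 0.05 level before and after adjustment.
results$sig_raw  <- results$p_value < 0.05
results$sig_holm <- results$p_holm < 0.05

results[order(results$p_value), ]
```

In this sketch the gender and depression comparisons stay below 0.05 after adjustment, while the early-class comparison (raw p = 0.02106) does not, so that result is best read as tentative.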
The findings of this study provide insight into the sleep and lifestyle patterns of college students. The clearest differences involved gender (GPA and weekly drinks), class year (having an early class), and depression status (happiness). The sleep-related comparisons (all-nighters and sleep quality, stress and weekday bedtime, class year and weekend sleep) showed no statistically significant differences. These results can inform future research and interventions aimed at improving student well-being, keeping in mind that the tests describe associations in one sample rather than causal effects.
References
Lock5Data. (n.d.). SleepStudy dataset. Retrieved from https://www.lock5stat.com/datapage3e.html
R Core Team. (2024). R: A language and environment for statistical computing. R Foundation for Statistical Computing. Retrieved from https://www.r-project.org