1. Introduction

This report examines college student sleep patterns using the “SleepStudy” dataset from Lock5Stat (https://www.lock5stat.com/datapage3e.html). The dataset contains 253 observations on 27 variables covering students’ sleep habits, psychological well-being, and lifestyle choices.

Our primary objective is to answer a series of research questions about college students’ sleep patterns, academic performance, mental health, and lifestyle choices. By analyzing the dataset, we aim to identify factors associated with sleep and related outcomes. These findings can then inform further research and interventions designed to improve the well-being and academic success of college students.

The following research questions will guide our investigation:

  1. Gender and GPA: Do male and female college students exhibit significant differences in average GPA?
  2. Early Classes and Class Year: Is there a significant difference in the average number of early classes between students in their first two years compared to other class years?
  3. Chronotype and Cognition: Do students who identify as “larks” (early risers) have significantly better cognitive skills (cognition z-score) compared to “owls” (late risers)?
  4. Early Classes and Missed Classes: Do students with at least one early class (EarlyClass=1) miss a significantly different number of classes in a semester compared to those with no early classes (EarlyClass=0)?
  5. Depression and Happiness: Is there a significant difference in average happiness level between students with at least moderate depression and those with normal depression status?
  6. All-nighters and Sleep Quality: Do students who reported having at least one all-nighter (AllNighter=1) experience a significant difference in average sleep quality scores compared to those who haven’t?
  7. Alcohol and Stress: Do students who abstain from alcohol use report significantly better stress scores than those who report heavy alcohol use?
  8. Gender and Alcohol Consumption: Is there a significant difference in the average number of drinks per week between students of different genders?
  9. Stress and Weekday Bedtime: Do students with high and normal stress levels (Stress = high vs. Stress = normal) have a significantly different average weekday bedtime?
  10. Class Year and Weekend Sleep: Is there a significant difference in the average hours of sleep on weekends between students in their first two years and other students?

By addressing these questions, we aim to build a clearer picture of college students’ sleep patterns and the factors related to them. This knowledge can help create a more supportive environment for students’ well-being and academic success.

sleepStudy = read.csv("SleepStudy.csv")
head(sleepStudy)
##   Gender ClassYear LarkOwl NumEarlyClass EarlyClass  GPA ClassesMissed
## 1      0         4 Neither             0          0 3.60             0
## 2      0         4 Neither             2          1 3.24             0
## 3      0         4     Owl             0          0 2.97            12
## 4      0         1    Lark             5          1 3.76             0
## 5      0         4     Owl             0          0 3.20             4
## 6      1         4 Neither             0          0 3.50             0
##   CognitionZscore PoorSleepQuality DepressionScore AnxietyScore StressScore
## 1           -0.26                4               4            3           8
## 2            1.39                6               1            0           3
## 3            0.38               18              18           18           9
## 4            1.39                9               1            4           6
## 5            1.22                9               7           25          14
## 6           -0.04                6              14            8          28
##   DepressionStatus AnxietyStatus Stress DASScore Happiness AlcoholUse Drinks
## 1           normal        normal normal       15        28   Moderate     10
## 2           normal        normal normal        4        25   Moderate      6
## 3         moderate        severe normal       45        17      Light      3
## 4           normal        normal normal       11        32      Light      2
## 5           normal        severe normal       46        15   Moderate      4
## 6         moderate      moderate   high       50        22    Abstain      0
##   WeekdayBed WeekdayRise WeekdaySleep WeekendBed WeekendRise WeekendSleep
## 1      25.75        8.70         7.70      25.75        9.50         5.88
## 2      25.70        8.20         6.80      26.00       10.00         7.25
## 3      27.44        6.55         3.00      28.00       12.59        10.09
## 4      23.50        7.17         6.77      27.00        8.00         7.25
## 5      25.90        8.67         6.09      23.75        9.50         7.00
## 6      23.80        8.95         9.05      26.00       10.75         9.00
##   AverageSleep AllNighter
## 1         7.18          0
## 2         6.93          0
## 3         5.02          0
## 4         6.90          0
## 5         6.35          0
## 6         9.04          0

2. Data

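The “SleepStudy” dataset contains 253 observations on 27 variables, covering college students’ sleep patterns, academic performance, mental health, and lifestyle choices. The variables used in this analysis are: Gender (coded 0 = female, 1 = male), ClassYear (1 to 4), LarkOwl (chronotype: Lark, Neither, or Owl), NumEarlyClass and EarlyClass (the number of early classes, and a 0/1 indicator for having at least one), GPA, ClassesMissed, CognitionZscore, PoorSleepQuality (higher scores indicate worse sleep), DepressionStatus, StressScore and Stress (normal or high), Happiness, AlcoholUse and Drinks (drinks per week), WeekdayBed, WeekendSleep (hours), and AllNighter (0/1 indicator).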

3. Analysis

Question 1: Do male and female college students exhibit significant differences in average GPA?

testResult <- t.test(GPA ~ Gender, data = sleepStudy)
testResult
## 
##  Welch Two Sample t-test
## 
## data:  GPA by Gender
## t = 3.9139, df = 200.9, p-value = 0.0001243
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
##  0.09982254 0.30252780
## sample estimates:
## mean in group 0 mean in group 1 
##        3.324901        3.123725
boxplot(GPA ~ Gender, 
        data = sleepStudy, 
        main = "GPA Distribution by Gender", 
        xlab = "Gender", 
        ylab = "GPA", 
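        col = c("lightpink", "lightblue"),
        names = c("Female", "Male"))  # Gender is coded 0 = female, 1 = male

The boxplot shows the distribution of GPA for female (0) and male (1) students. The center line of each box is the median, not the mean.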
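For readers unfamiliar with the Welch test that t.test() runs by default, the sketch below recomputes the test statistic, the Welch-Satterthwaite degrees of freedom, and the two-sided p-value by hand. The two vectors are made-up toy values, not SleepStudy data, so the numbers will not match the output above.

```r
# Toy GPA-like values (hypothetical, for illustration only)
x <- c(3.6, 3.2, 3.0, 3.8, 3.2, 3.5, 3.4)
y <- c(3.1, 2.9, 3.3, 3.0, 3.4, 2.8)

# Per-group variance of the sample mean (no pooling across groups)
vx <- var(x) / length(x)
vy <- var(y) / length(y)

# Welch t statistic: difference in means over its standard error
t_stat <- (mean(x) - mean(y)) / sqrt(vx + vy)

# Welch-Satterthwaite approximation to the degrees of freedom
df <- (vx + vy)^2 / (vx^2 / (length(x) - 1) + vy^2 / (length(y) - 1))

# Two-sided p-value from the t distribution
p_value <- 2 * pt(-abs(t_stat), df)

# The built-in test should agree with the manual calculation
welch <- t.test(x, y)
c(manual = t_stat, builtin = unname(welch$statistic))
```

Because the variances are not pooled, the degrees of freedom are usually fractional, which is why the output above reports values such as df = 200.9.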

Question 2: Early Classes and Class Year: Is there a significant difference in the average number of early classes between students in their first two years compared to other class years?

sleepStudy$ClassGroup <- ifelse(sleepStudy$ClassYear %in% c(1, 2), 
                                "Lowerclassman", 
                                ifelse(sleepStudy$ClassYear %in% c(3, 4), 
                                       "Upperclassman", NA))
testResult <- t.test(EarlyClass ~ ClassGroup, data = sleepStudy)
testResult
## 
##  Welch Two Sample t-test
## 
## data:  EarlyClass by ClassGroup
## t = 2.3233, df = 224.26, p-value = 0.02106
## alternative hypothesis: true difference in means between group Lowerclassman and group Upperclassman is not equal to 0
## 95 percent confidence interval:
##  0.02121868 0.25831438
## sample estimates:
## mean in group Lowerclassman mean in group Upperclassman 
##                   0.7253521                   0.5855856
mean_values <- aggregate(EarlyClass ~ ClassGroup, data = sleepStudy, FUN = mean)
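barplot(mean_values$EarlyClass, 
        names.arg = mean_values$ClassGroup, 
        col = c("lightblue", "lightgreen"),
        main = "Share of Students with an Early Class by Class Year", 
        xlab = "Class Year Group", 
        ylab = "Proportion with at Least One Early Class",
        ylim = c(0, max(mean_values$EarlyClass) + 0.2))

The bar plot shows the mean of the EarlyClass indicator for the freshman/sophomore and junior/senior groups. Because EarlyClass is a 0/1 indicator, its mean is the proportion of students with at least one early class, so this test compares proportions rather than the average number of early classes, which is recorded in NumEarlyClass.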
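The distinction between the indicator and the count can be illustrated with a small sketch. The data frame below is a made-up toy example, and the rule EarlyClass = 1 when NumEarlyClass > 0 is an assumption inferred from the first rows printed in the Introduction.

```r
# Toy data (hypothetical values, not from SleepStudy)
toy <- data.frame(
  ClassGroup = c("Lowerclassman", "Lowerclassman", "Lowerclassman",
                 "Upperclassman", "Upperclassman", "Upperclassman"),
  NumEarlyClass = c(2, 0, 5, 0, 1, 0)
)

# Assumed definition of the indicator: at least one early class
toy$EarlyClass <- as.integer(toy$NumEarlyClass > 0)

# Mean of a 0/1 indicator = proportion of students with an early class
prop_early <- tapply(toy$EarlyClass, toy$ClassGroup, mean)

# Mean of the count = average number of early classes
mean_count <- tapply(toy$NumEarlyClass, toy$ClassGroup, mean)

rbind(prop_early, mean_count)

# On the real data, the question as worded would correspond to:
# t.test(NumEarlyClass ~ ClassGroup, data = sleepStudy)
```

The commented call is left unevaluated here, so no result for the count-based comparison is reported.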

Question 3: Chronotype and Cognition: Do students who identify as “larks” (early risers) have significantly better cognitive skills (cognition z-score) compared to “owls” (late risers)?

sleepStudy$LarkOwlGroup <- ifelse(sleepStudy$LarkOwl %in% c("Lark"), 
                                "Lark", 
                                ifelse(sleepStudy$LarkOwl %in% c("Owl"), 
                                       "Owl", NA))
sleepStudyFiltered <- sleepStudy[!is.na(sleepStudy$LarkOwlGroup), ]
testResult <- t.test(CognitionZscore ~ LarkOwlGroup, data = sleepStudyFiltered)
testResult
## 
##  Welch Two Sample t-test
## 
## data:  CognitionZscore by LarkOwlGroup
## t = 0.80571, df = 75.331, p-value = 0.4229
## alternative hypothesis: true difference in means between group Lark and group Owl is not equal to 0
## 95 percent confidence interval:
##  -0.1893561  0.4465786
## sample estimates:
## mean in group Lark  mean in group Owl 
##         0.09024390        -0.03836735
# Create a boxplot to show the distribution of cognition z-scores by group
boxplot(CognitionZscore ~ LarkOwlGroup, 
        data = sleepStudyFiltered, 
        main = "Cognition Z-Score by Lark/Owl Group", 
        xlab = "Group", 
        ylab = "Cognition Z-Score", 
        col = c("lightblue", "lightgreen"),
        names = c("Lark", "Owl"))

The boxplot shows the distribution of cognition z-scores for larks (early risers) and owls (late risers). The center line of each box is the median.
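The question asks whether larks score better, which is a one-sided hypothesis, while the test above is two-sided. As a rough sketch, the one-sided p-value can be derived from the reported t statistic and degrees of freedom alone; in t.test() the same result would come from the argument alternative = "greater", since the first group in the formula is Lark.

```r
# Values copied from the Welch test output for Question 3
t_stat <- 0.80571
df <- 75.331

# Two-sided p-value, as reported by t.test()
p_two_sided <- 2 * pt(-abs(t_stat), df)

# One-sided p-value for H1: mean(Lark) > mean(Owl)
p_one_sided <- pt(t_stat, df, lower.tail = FALSE)

round(c(two_sided = p_two_sided, one_sided = p_one_sided), 4)
```

Because the observed difference is in the hypothesized direction, the one-sided p-value is half the two-sided one, which is still far above 0.05.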

Question 4: Do students with at least one early class (EarlyClass=1) miss a significantly different number of classes in a semester compared to those with no early classes (EarlyClass=0)?

testResult <- t.test(ClassesMissed ~ EarlyClass, data = sleepStudy)
testResult
## 
##  Welch Two Sample t-test
## 
## data:  ClassesMissed by EarlyClass
## t = 1.4755, df = 152.78, p-value = 0.1421
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
##  -0.2233558  1.5412830
## sample estimates:
## mean in group 0 mean in group 1 
##        2.647059        1.988095
boxplot(ClassesMissed ~ EarlyClass, 
        data = sleepStudy, 
        col = c("lightblue", "lightgreen"), 
        main = "Classes Missed by Early Class Status", 
        ylab = "Number of Classes Missed", 
        xlab = "Early Class Status")

The boxplot shows the distribution of the number of classes missed for students with no early class (0) and students with at least one early class (1).

Question 5: Is there a significant difference in the average happiness level between students with at least moderate depression and normal depression status?

sleepStudy$DepressionStatusGroup <- ifelse(sleepStudy$DepressionStatus %in% c("normal"), "Normal", "Depressed")

boxplot(Happiness ~ DepressionStatusGroup, 
        data = sleepStudy, 
        col = c("lightblue", "lightgreen"), 
        main = "Happiness Level by Depression Status", 
        ylab = "Happiness", 
        xlab = "Depression Status")

testResult <- t.test(Happiness ~ DepressionStatusGroup, data = sleepStudy)
testResult
## 
##  Welch Two Sample t-test
## 
## data:  Happiness by DepressionStatusGroup
## t = -5.6339, df = 55.594, p-value = 6.057e-07
## alternative hypothesis: true difference in means between group Depressed and group Normal is not equal to 0
## 95 percent confidence interval:
##  -7.379724 -3.507836
## sample estimates:
## mean in group Depressed    mean in group Normal 
##                21.61364                27.05742

The boxplot shows the distribution of happiness scores for students in the “Depressed” group (any depression status other than normal) and students in the “Normal” group.
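The pieces of the output above can be cross-checked against one another. The sketch below, which uses only the numbers printed by t.test(), recovers the standard error of the difference from the confidence interval and confirms that it reproduces the reported t statistic up to rounding.

```r
# Values copied from the Welch test output for Question 5
mean_depressed <- 21.61364
mean_normal <- 27.05742
ci <- c(-7.379724, -3.507836)
df <- 55.594

# Point estimate of the difference (Depressed minus Normal)
mean_diff <- mean_depressed - mean_normal

# A 95% CI is estimate +/- qt(0.975, df) * SE, so SE = half-width / critical value
se <- (ci[2] - ci[1]) / (2 * qt(0.975, df))

# Dividing the estimate by the SE should give back t = -5.6339 (approximately)
t_check <- mean_diff / se
c(difference = mean_diff, se = se, t = t_check)
```

The estimated gap of about 5.4 points on the happiness scale is the midpoint of the confidence interval.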

Question 6: Is there a significant difference in average sleep quality scores between students who reported having at least one all-nighter (AllNighter=1) and those who didn’t (AllNighter=0)?

testResult <- t.test(PoorSleepQuality ~ AllNighter, data = sleepStudy)
testResult
## 
##  Welch Two Sample t-test
## 
## data:  PoorSleepQuality by AllNighter
## t = -1.7068, df = 44.708, p-value = 0.09479
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
##  -1.9456958  0.1608449
## sample estimates:
## mean in group 0 mean in group 1 
##        6.136986        7.029412
boxplot(PoorSleepQuality ~ AllNighter, 
        data = sleepStudy, 
        col = c("lightcoral", "lightseagreen"), 
        main = "Poor Sleep Quality Score by All-Nighter Status",
        ylab = "Poor Sleep Quality Score (higher = worse)", 
        xlab = "All-Nighter Status", 
        names = c("No All-Nighter (0)", "At Least One All-Nighter (1)"))

The boxplot shows the distribution of PoorSleepQuality scores for students who did not pull an all-nighter and students who pulled at least one. Higher scores indicate worse sleep quality.

Question 7: Do students who abstain from alcohol use have significantly better stress scores than those who report heavy alcohol use?

sleepStudy$AlcoholUseGroup <- ifelse(sleepStudy$AlcoholUse %in% c("Abstain", "Light"), "Low Use", "High Use")
testResult <- t.test(StressScore ~ AlcoholUseGroup, data = sleepStudy)
testResult
## 
##  Welch Two Sample t-test
## 
## data:  StressScore by AlcoholUseGroup
## t = 0.24753, df = 248.92, p-value = 0.8047
## alternative hypothesis: true difference in means between group High Use and group Low Use is not equal to 0
## 95 percent confidence interval:
##  -1.722125  2.217223
## sample estimates:
## mean in group High Use  mean in group Low Use 
##               9.580882               9.333333
boxplot(StressScore ~ AlcoholUseGroup, 
        data = sleepStudy, 
        col = c("skyblue", "orange"), 
        main = "Stress Score by Alcohol Use Group", 
        ylab = "Stress Score", 
        xlab = "Alcohol Use Group", 
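        names = c("High Use", "Low Use"))  # boxes follow alphabetical group order

The boxplot shows the distribution of stress scores for the high-use and low-use groups. The low-use group combines students who abstain or drink lightly, and the high-use group contains all other students, so this comparison is broader than the abstain-versus-heavy contrast stated in the question.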
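A comparison restricted to abstainers and heavy drinkers, as the question is worded, could be sketched as follows. The data frame is a made-up toy example, the level name "Heavy" is taken from the question rather than from the printed rows, and treating a lower StressScore as "better" is an assumption.

```r
# Toy data (hypothetical values, not from SleepStudy)
toy <- data.frame(
  AlcoholUse = c("Abstain", "Heavy", "Light", "Abstain",
                 "Moderate", "Heavy", "Abstain", "Heavy"),
  StressScore = c(8, 14, 3, 6, 9, 10, 4, 12)
)

# Keep only the two groups named in the question
extremes <- subset(toy, AlcoholUse %in% c("Abstain", "Heavy"))

# Fix the group order so the difference is Abstain minus Heavy
extremes$AlcoholUse <- factor(extremes$AlcoholUse,
                              levels = c("Abstain", "Heavy"))

# One-sided Welch test: do abstainers have lower (better) stress scores?
res <- t.test(StressScore ~ AlcoholUse, data = extremes,
              alternative = "less")
res$estimate
```

Setting the factor levels explicitly avoids relying on alphabetical ordering when interpreting the sign of the difference.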

Question 8: Is there a significant difference in the average number of drinks per week between students of different genders?

testResult <- t.test(Drinks ~ Gender, data = sleepStudy)
testResult
## 
##  Welch Two Sample t-test
## 
## data:  Drinks by Gender
## t = -6.1601, df = 142.75, p-value = 7.002e-09
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
##  -4.360009 -2.241601
## sample estimates:
## mean in group 0 mean in group 1 
##        4.238411        7.539216
boxplot(Drinks ~ Gender, 
        data = sleepStudy, 
        col = c("lightpink", "lightblue"), 
        main = "Number of Drinks per Week by Gender", 
        ylab = "Number of Drinks", 
        xlab = "Gender", 
        names = c("Female", "Male"))

The boxplot shows the distribution of the number of drinks per week for female (0) and male (1) students.

Question 9: Is there a significant difference in the average weekday bedtime between students with high and normal stress (Stress = high vs. Stress = normal)?

testResult <- t.test(WeekdayBed ~ Stress, data = sleepStudy)
testResult
## 
##  Welch Two Sample t-test
## 
## data:  WeekdayBed by Stress
## t = -1.0746, df = 87.048, p-value = 0.2855
## alternative hypothesis: true difference in means between group high and group normal is not equal to 0
## 95 percent confidence interval:
##  -0.4856597  0.1447968
## sample estimates:
##   mean in group high mean in group normal 
##             24.71500             24.88543
boxplot(WeekdayBed ~ Stress, 
        data = sleepStudy, 
        col = c("lightgreen", "lightcoral"), 
        main = "Weekday Bedtime by Stress Level", 
        ylab = "Weekday Bedtime (24-hour format)", 
        xlab = "Stress Level", 
        names = c("High Stress", "Normal Stress"))

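The boxplot shows the distribution of weekday bedtimes for students with high stress and students with normal stress. Bedtimes appear to be recorded in decimal hours on a clock that continues past midnight, so a value of 25.75 would correspond to 1:45 a.m.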
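To make the group means easier to read, the decimal bedtimes can be converted to clock times. The helper below is a hypothetical function written for this report, and it assumes that 24.0 means midnight, which is consistent with the values shown in the data preview.

```r
# Convert decimal-hour bedtimes (24.0 assumed to be midnight) to "HH:MM"
to_clock <- function(hours) {
  # Round to whole minutes, then wrap around the 24-hour day
  total_min <- as.integer(round(hours * 60)) %% 1440L
  sprintf("%02d:%02d", total_min %/% 60L, total_min %% 60L)
}

# Group means from the Welch test output for Question 9
to_clock(c(high = 24.71500, normal = 24.88543))
# "00:43" "00:53"
```

Under this reading, both groups go to bed shortly before 1 a.m. on average, about ten minutes apart.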

Question 10: Is there a significant difference in the average hours of sleep on weekends between first two year students and other students?

sleepStudy$ClassGroup <- ifelse(sleepStudy$ClassYear %in% c(1, 2), 
                                "Lowerclassman", 
                                ifelse(sleepStudy$ClassYear %in% c(3, 4), 
                                       "Upperclassman", NA))
testResult <- t.test(WeekendSleep ~ ClassGroup, data = sleepStudy)
testResult
## 
##  Welch Two Sample t-test
## 
## data:  WeekendSleep by ClassGroup
## t = -0.047888, df = 237.36, p-value = 0.9618
## alternative hypothesis: true difference in means between group Lowerclassman and group Upperclassman is not equal to 0
## 95 percent confidence interval:
##  -0.3497614  0.3331607
## sample estimates:
## mean in group Lowerclassman mean in group Upperclassman 
##                    8.213592                    8.221892
boxplot(WeekendSleep ~ ClassGroup, 
        data = sleepStudy, 
        col = c("lightblue", "lightgreen"), 
        main = "Weekend Sleep by Class Year", 
        ylab = "Weekend Sleep Hours", 
        xlab = "Class Group", 
        names = c("Lowerclassman", "Upperclassman"))

The boxplot shows the distribution of weekend sleep hours for freshmen/sophomores and juniors/seniors.

4. Summary

This analysis explored several key aspects of college students’ sleep patterns, mental health, and lifestyle choices using the “SleepStudy” dataset. The findings from the statistical tests and visualizations are summarized below:

Gender and GPA:

A significant difference in average GPA between male and female students was found (t = 3.9139, p = 0.0001243), with females (Gender = 0, mean = 3.32) averaging higher GPAs than males (Gender = 1, mean = 3.12).

Early Classes and Class Year:

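The comparison between lowerclassmen (1st and 2nd year) and upperclassmen (3rd and 4th year) used the EarlyClass indicator, so it compares the share of students with at least one early class rather than the number of early classes. The difference was significant (t = 2.3233, p = 0.02106): 72.5% of lowerclassmen had at least one early class, compared with 58.6% of upperclassmen.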

Chronotype and Cognition:

There was no significant difference in cognition scores between “Lark” and “Owl” students (t = 0.80571, p = 0.4229). The data therefore provide no evidence that a preference for mornings or evenings is associated with cognition scores in this sample.

Early Classes and Missed Classes:

The analysis revealed no significant difference in the number of classes missed between students with early classes (mean = 1.99) and those without (mean = 2.65) (t = 1.4755, p = 0.1421). The data therefore provide no evidence that having an early class is associated with missing more or fewer classes.

Depression and Happiness:

The comparison of happiness levels between students classified as “Depressed” and those classified as “Normal” revealed a significant difference (t = -5.6339, p = 6.057e-07). Students in the “Depressed” group reported significantly lower happiness scores (mean = 21.61) compared to those in the “Normal” group (mean = 27.06).

All-nighters and Sleep Quality:

The comparison of sleep quality between students who reported having at least one all-nighter (AllNighter=1) and those who did not (AllNighter=0) showed no significant difference (t = -1.7068, p = 0.09479). Students who had all-nighters (mean = 7.03) had slightly worse sleep quality compared to those who did not (mean = 6.14), but the difference was not statistically significant.

Alcohol Use and Stress:

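There was no statistically significant difference in stress scores between the high-use group (mean = 9.58) and the low-use group of students who abstain or drink lightly (mean = 9.33) (t = 0.24753, p = 0.8047). This grouping is broader than the original question, which contrasts abstainers with heavy drinkers only, so the result does not directly answer that narrower question.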

Gender and Alcohol Consumption:

Males reported significantly higher alcohol consumption (mean = 7.54 drinks per week) than females (mean = 4.24 drinks per week) (t = -6.1601, p = 7.002e-09). This result highlights a notable gender disparity in drinking behavior.

Stress and Weekday Bedtime:

There was no significant difference in weekday bedtime between students with high stress (mean = 24.715) and those with normal stress (mean = 24.885) (t = -1.0746, p = 0.2855). The data provide no evidence that stress level is associated with weekday bedtime in this sample.

Class Year and Weekend Sleep:

There was no significant difference in weekend sleep duration between lowerclassmen (mean = 8.21 hours) and upperclassmen (mean = 8.22 hours) (t = -0.047888, p = 0.9618). The two groups sleep almost exactly the same amount on weekends in this sample.

Taken together, four of the ten comparisons were statistically significant at the 0.05 level: GPA by gender, early-class status by class year, happiness by depression status, and drinks per week by gender. The remaining six showed no significant difference. Because these are observational comparisons, they describe associations rather than causes. They can nonetheless inform further research and interventions aimed at improving student well-being.
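Since ten tests were run on the same dataset, it may be worth checking how the conclusions hold up under a multiple-comparison adjustment. The sketch below applies a Bonferroni correction to the ten p-values reported in Section 3; the correction is an optional robustness check and was not part of the original analysis.

```r
# p-values copied from the ten Welch tests in Section 3
p_values <- c(
  Q1 = 0.0001243, Q2 = 0.02106, Q3 = 0.4229, Q4 = 0.1421,
  Q5 = 6.057e-07, Q6 = 0.09479, Q7 = 0.8047, Q8 = 7.002e-09,
  Q9 = 0.2855, Q10 = 0.9618
)

# Bonferroni: multiply each p-value by the number of tests (capped at 1)
p_bonferroni <- p.adjust(p_values, method = "bonferroni")

# Questions significant at 0.05 before and after the adjustment
names(p_values)[p_values < 0.05]
names(p_bonferroni)[p_bonferroni < 0.05]
```

Under this adjustment, Questions 1, 5, and 8 remain significant, while Question 2 (adjusted p of about 0.21) does not.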

References

Lock5Data. (n.d.). SleepStudy dataset. Retrieved from https://www.lock5stat.com/datapage3e.html

R Core Team. (2024). R: A language and environment for statistical computing. R Foundation for Statistical Computing. Retrieved from https://www.r-project.org