Introduction

Sleep is essential to physical and mental health as well as academic performance. Among college students, sleep is often sacrificed for social life, deadlines, and stress. In this report, we investigate the sleep habits of college students and their relation to lifestyle factors using the “Sleepstudy” dataset from https://www.lock5stat.com/datapage3e.html. It includes 253 observations of 27 variables, including measures of academic success and behavioral traits. The primary goal is to explore how sleep patterns relate to mental health, lifestyle choices, and academic success. The analysis addresses 10 questions that provide insight into how sleep relates to students’ overall well-being. We apply two-sample t-tests to these questions to identify patterns that could inform and improve college students’ lives.

The questions include:

1. Is there a significant difference in the average GPA between male and female college students?

2. Is there a significant difference in the average number of early classes between the first two class years and other class years?

3. Do students who identify as "larks" have significantly better cognitive skills (cognition z-score) compared to "owls"?

4. Is there a significant difference in the average number of classes missed in a semester between students who had at least one early class (EarlyClass=1) and those who didn't (EarlyClass=0)?

5. Is there a significant difference in the average happiness level between students with at least moderate depression and normal depression status?

6. Is there a significant difference in average sleep quality scores between students who reported having at least one all-nighter (AllNighter=1) and those who didn't (AllNighter=0)?

7. Do students who abstain from alcohol use have significantly better stress scores than those who report heavy alcohol use?

8. Is there a significant difference in the average number of drinks per week between students of different genders?

9. Is there a significant difference in the average weekday bedtime between students with high and low stress (Stress=High vs. Stress=Normal)?

10. Is there a significant difference in the average hours of sleep on weekends between first two year students and other students?

Data

Source: “Sleepstudy” from Lock5Stat.

27 Variables: Gender, class year, class time, GPA, missed classes, cognitive score, sleep quality, depression score and status, anxiety score and status, stress score and status, DAS score, happiness, alcohol use, number of drinks, all-nighters, average sleep, and times the students go to bed on weekdays and weekends.

Data collection: Survey-based responses from 253 college students, collecting quantitative and categorical data.

Analysis

We explore the 10 questions in detail.

Sleepstudy = read.csv("https://www.lock5stat.com/datasets3e/SleepStudy.csv")
head(Sleepstudy)
##   Gender ClassYear LarkOwl NumEarlyClass EarlyClass  GPA ClassesMissed
## 1      0         4 Neither             0          0 3.60             0
## 2      0         4 Neither             2          1 3.24             0
## 3      0         4     Owl             0          0 2.97            12
## 4      0         1    Lark             5          1 3.76             0
## 5      0         4     Owl             0          0 3.20             4
## 6      1         4 Neither             0          0 3.50             0
##   CognitionZscore PoorSleepQuality DepressionScore AnxietyScore StressScore
## 1           -0.26                4               4            3           8
## 2            1.39                6               1            0           3
## 3            0.38               18              18           18           9
## 4            1.39                9               1            4           6
## 5            1.22                9               7           25          14
## 6           -0.04                6              14            8          28
##   DepressionStatus AnxietyStatus Stress DASScore Happiness AlcoholUse Drinks
## 1           normal        normal normal       15        28   Moderate     10
## 2           normal        normal normal        4        25   Moderate      6
## 3         moderate        severe normal       45        17      Light      3
## 4           normal        normal normal       11        32      Light      2
## 5           normal        severe normal       46        15   Moderate      4
## 6         moderate      moderate   high       50        22    Abstain      0
##   WeekdayBed WeekdayRise WeekdaySleep WeekendBed WeekendRise WeekendSleep
## 1      25.75        8.70         7.70      25.75        9.50         5.88
## 2      25.70        8.20         6.80      26.00       10.00         7.25
## 3      27.44        6.55         3.00      28.00       12.59        10.09
## 4      23.50        7.17         6.77      27.00        8.00         7.25
## 5      25.90        8.67         6.09      23.75        9.50         7.00
## 6      23.80        8.95         9.05      26.00       10.75         9.00
##   AverageSleep AllNighter
## 1         7.18          0
## 2         6.93          0
## 3         5.02          0
## 4         6.90          0
## 5         6.35          0
## 6         9.04          0

Q1: Is there a significant difference in the average GPA between male and female college students?

t_test_1 <- t.test(GPA ~ Gender, data = Sleepstudy, alternative = "two.sided")
print(t_test_1)
## 
##  Welch Two Sample t-test
## 
## data:  GPA by Gender
## t = 3.9139, df = 200.9, p-value = 0.0001243
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
##  0.09982254 0.30252780
## sample estimates:
## mean in group 0 mean in group 1 
##        3.324901        3.123725

Since the p-value (0.0001243) is less than 0.05, we reject the null hypothesis of equal means and conclude that there is a significant difference between male and female GPAs. The 95% confidence interval for the difference in mean GPA (group 0 minus group 1) runs from 0.0998 to 0.3025 and does not include zero, which reinforces the significance. With Gender coded 0 for female and 1 for male, females have a higher average GPA than males (3.32 vs. 3.12).
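As a quick check on the interval quoted above, the confidence bounds can be rebuilt from the numbers in the printed output alone. The sketch below assumes only the reported group means, t statistic, and degrees of freedom; the variable names are illustrative and are not part of the original analysis.

```r
# Values copied from the Q1 t.test() output above
mean_group0 <- 3.324901   # Gender = 0
mean_group1 <- 3.123725   # Gender = 1
t_stat      <- 3.9139
df_welch    <- 200.9

# Difference in means, and the standard error implied by t = diff / SE
diff_means <- mean_group0 - mean_group1
std_error  <- diff_means / t_stat

# Two-sided 95% interval: estimate +/- critical value * SE
t_crit <- qt(0.975, df = df_welch)
ci_q1  <- diff_means + c(-1, 1) * t_crit * std_error
print(round(ci_q1, 4))
```

The same three steps apply to any of the two-sided tests in this report, since each output prints the group means, t, and df.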

Q2: Is there a significant difference in the average number of early classes between the first two class years and other class years?

Sleepstudy$ClassGroup <- ifelse(Sleepstudy$ClassYear %in% c(1, 2), "FirstTwoYears", "OtherYears")

t_test_2 <- t.test(NumEarlyClass ~ ClassGroup, data = Sleepstudy)
print(t_test_2)
## 
##  Welch Two Sample t-test
## 
## data:  NumEarlyClass by ClassGroup
## t = 4.1813, df = 250.69, p-value = 4.009e-05
## alternative hypothesis: true difference in means between group FirstTwoYears and group OtherYears is not equal to 0
## 95 percent confidence interval:
##  0.4042016 1.1240309
## sample estimates:
## mean in group FirstTwoYears    mean in group OtherYears 
##                    2.070423                    1.306306

The result shows a significant difference in the average number of early classes between students in their first two years and students in other years (p = 4.009e-05). The 95% confidence interval for the difference in means (first two years minus other years) is 0.4042 to 1.1240 early classes. Overall, students in their first two years take significantly more early classes on average (2.07 vs. 1.31).

Q3: Do students who identify as “larks” have significantly better cognitive skills (cognition z-score) compared to “owls”?

Sleepstudy_subset <- subset(Sleepstudy, LarkOwl %in% c("Lark","Owl"))
t_test_3 <- t.test(CognitionZscore ~ LarkOwl, data = Sleepstudy_subset, alternative = "greater")
print(t_test_3)
## 
##  Welch Two Sample t-test
## 
## data:  CognitionZscore by LarkOwl
## t = 0.80571, df = 75.331, p-value = 0.2115
## alternative hypothesis: true difference in means between group Lark and group Owl is greater than 0
## 95 percent confidence interval:
##  -0.1372184        Inf
## sample estimates:
## mean in group Lark  mean in group Owl 
##         0.09024390        -0.03836735

The data show no significant evidence that “larks” have better cognitive scores than “owls” (one-sided p = 0.2115). While larks have a higher mean z-score (0.09 vs. -0.04), the difference is not statistically significant under this test.
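Because this test uses alternative = "greater", its p-value is a single upper-tail area rather than the two-tail area used in the other questions. The sketch below recomputes it from the printed t statistic and degrees of freedom; the variable names are illustrative only.

```r
# Values copied from the Q3 t.test() output above
t_stat   <- 0.80571
df_welch <- 75.331

# alternative = "greater": the p-value is the upper-tail area beyond t
p_greater <- pt(t_stat, df = df_welch, lower.tail = FALSE)

# For comparison, a two-sided test would double the tail area
p_two_sided <- 2 * pt(abs(t_stat), df = df_welch, lower.tail = FALSE)

print(round(c(one_sided = p_greater, two_sided = p_two_sided), 4))
```

Either way the p-value is well above 0.05, so the choice of a one-sided alternative does not change the conclusion here.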

Q4: Is there a significant difference in the average number of classes missed in a semester between students who had at least one early class (EarlyClass=1) and those who didn’t (EarlyClass=0)?

t_test_4 <- t.test(ClassesMissed ~ EarlyClass, data = Sleepstudy, alternative = "two.sided")
print(t_test_4)
## 
##  Welch Two Sample t-test
## 
## data:  ClassesMissed by EarlyClass
## t = 1.4755, df = 152.78, p-value = 0.1421
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
##  -0.2233558  1.5412830
## sample estimates:
## mean in group 0 mean in group 1 
##        2.647059        1.988095

There is not a significant difference in missed classes between students who had an early class and those who did not. Although students with an early class missed fewer classes on average (1.99 vs. 2.65), the difference is not statistically significant because the p-value (0.1421) is greater than 0.05.

Q5: Is there a significant difference in the average happiness level between students with at least moderate depression and normal depression status?

Sleepstudy$DepressionGroup <- ifelse(Sleepstudy$DepressionScore >= 10, "ModerateOrHigher", "Normal")
t_test_5 <- t.test(Happiness ~ DepressionGroup, data = Sleepstudy)
print(t_test_5)
## 
##  Welch Two Sample t-test
## 
## data:  Happiness by DepressionGroup
## t = -5.6339, df = 55.594, p-value = 6.057e-07
## alternative hypothesis: true difference in means between group ModerateOrHigher and group Normal is not equal to 0
## 95 percent confidence interval:
##  -7.379724 -3.507836
## sample estimates:
## mean in group ModerateOrHigher           mean in group Normal 
##                       21.61364                       27.05742

Since the p-value (6.057e-07) is less than 0.05, we can infer that a significant difference exists in happiness levels between students with at least moderate depression and students with normal depression status. Those with normal status have a higher average happiness level (27.06) than those with at least moderate depression (21.61). The 95% confidence interval for the difference in means (moderate or higher minus normal) is approximately -7.4 to -3.5.
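The grouping in the Q5 code relies on a cutoff of DepressionScore >= 10, while the dataset also carries a DepressionStatus column. A minimal sketch of how the two could be compared is shown below, using only the six rows printed by head(Sleepstudy) earlier; the data frame name first_rows is illustrative, and six rows are of course not enough to confirm the cutoff for the whole dataset.

```r
# First six rows as printed by head(Sleepstudy) above
first_rows <- data.frame(
  DepressionScore  = c(4, 1, 18, 1, 7, 14),
  DepressionStatus = c("normal", "normal", "moderate",
                       "normal", "normal", "moderate")
)

# Same grouping rule as in the Q5 code
first_rows$DepressionGroup <- ifelse(first_rows$DepressionScore >= 10,
                                     "ModerateOrHigher", "Normal")

# Cross-tabulate the derived group against the recorded status
check_table <- table(first_rows$DepressionGroup, first_rows$DepressionStatus)
print(check_table)
```

On the full data, table(Sleepstudy$DepressionGroup, Sleepstudy$DepressionStatus) would give the same kind of check across all 253 students.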

Q6: Is there a significant difference in average sleep quality scores between students who reported having at least one all-nighter (AllNighter=1) and those who didn’t (AllNighter=0)?

t_test_6 <- t.test(AverageSleep ~ AllNighter, data = Sleepstudy)
print(t_test_6)
## 
##  Welch Two Sample t-test
## 
## data:  AverageSleep by AllNighter
## t = 4.4256, df = 42.171, p-value = 6.666e-05
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
##  0.4366603 1.1685667
## sample estimates:
## mean in group 0 mean in group 1 
##        8.073790        7.271176

There is a significant difference between students who pulled at least one all-nighter and those who did not (p = 6.666e-05). Note that the test above uses AverageSleep (average hours of sleep) rather than the PoorSleepQuality score, so it speaks to sleep duration rather than to sleep quality as such. Students who had at least one all-nighter slept less on average (7.27 vs. 8.07 hours), with a 95% confidence interval for the difference of approximately 0.44 to 1.17 hours.
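The question asks about sleep quality, while the code above passes AverageSleep to t.test(). A test that matches the question's wording would use the PoorSleepQuality column instead. The sketch below shows the shape of that call; compare_sleep_quality is a hypothetical helper, and the toy data frame contains made-up values that only demonstrate that the call runs. No result for the real data is implied.

```r
# Hypothetical helper: Welch t-test of PoorSleepQuality by AllNighter
compare_sleep_quality <- function(df) {
  t.test(PoorSleepQuality ~ AllNighter, data = df)
}

# Toy data frame with made-up values, used only to show that the call runs;
# these numbers are NOT taken from the Sleepstudy data
toy <- data.frame(
  PoorSleepQuality = c(4, 6, 9, 5, 7, 10, 8, 12),
  AllNighter       = c(0, 0, 0, 0, 1, 1, 1, 1)
)

toy_result <- compare_sleep_quality(toy)
print(toy_result$method)
```

With the real data loaded, compare_sleep_quality(Sleepstudy) would run the comparison that the question describes.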

Q7: Do students who abstain from alcohol use have significantly better stress scores than those who report heavy alcohol use?

Sleepstudy_subset <- subset(Sleepstudy, AlcoholUse %in% c("Abstain","Heavy"))
t_test_7 <- t.test(StressScore ~ AlcoholUse, data = Sleepstudy_subset, alternative = "less")
print(t_test_7)
## 
##  Welch Two Sample t-test
## 
## data:  StressScore by AlcoholUse
## t = -0.62604, df = 28.733, p-value = 0.2681
## alternative hypothesis: true difference in means between group Abstain and group Heavy is less than 0
## 95 percent confidence interval:
##      -Inf 2.515654
## sample estimates:
## mean in group Abstain   mean in group Heavy 
##              8.970588             10.437500

The results do not show that students who abstain from alcohol have significantly lower stress scores than students who report heavy use (one-sided p = 0.2681), even though heavy users have a higher mean stress score (10.44 vs. 8.97).

Q8: Is there a significant difference in the average number of drinks per week between students of different genders?

t_test_8 <- t.test(Drinks ~ Gender, data = Sleepstudy)
print(t_test_8)
## 
##  Welch Two Sample t-test
## 
## data:  Drinks by Gender
## t = -6.1601, df = 142.75, p-value = 7.002e-09
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
##  -4.360009 -2.241601
## sample estimates:
## mean in group 0 mean in group 1 
##        4.238411        7.539216

There is a significant difference between males and females in the average number of drinks per week (p = 7.002e-09). Males average 7.54 drinks per week and females average 4.24, so females report about 44% fewer drinks per week than males.
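The relative gap between the two groups can be computed directly from the reported means. The sketch below uses only the two sample estimates printed in the Q8 output; the variable names are illustrative.

```r
# Sample means copied from the Q8 t.test() output above
mean_female <- 4.238411   # group 0
mean_male   <- 7.539216   # group 1

# Percent fewer drinks for females, relative to the male mean
pct_fewer <- (mean_male - mean_female) / mean_male * 100

# How many times as many drinks males report, relative to the female mean
ratio_male_to_female <- mean_male / mean_female

print(round(c(pct_fewer = pct_fewer, ratio = ratio_male_to_female), 2))
```

Note that the two figures use different baselines: the percentage is relative to the male mean, the ratio to the female mean.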

Q9: Is there a significant difference in the average weekday bedtime between students with high and low stress (Stress=High vs. Stress=Normal)?

Sleepstudy$StressGroup <- ifelse(Sleepstudy$StressScore >= 15, "HighStress", "LowStress")
t_test_9 <- t.test(WeekdayBed ~ StressGroup, data = Sleepstudy, alternative = "two.sided")
print(t_test_9)
## 
##  Welch Two Sample t-test
## 
## data:  WeekdayBed by StressGroup
## t = -1.0746, df = 87.048, p-value = 0.2855
## alternative hypothesis: true difference in means between group HighStress and group LowStress is not equal to 0
## 95 percent confidence interval:
##  -0.4856597  0.1447968
## sample estimates:
## mean in group HighStress  mean in group LowStress 
##                 24.71500                 24.88543

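There is not a significant difference in weekday bedtimes between normal-stress and high-stress students (p = 0.2855). On average, normal-stress students go to bed about 10 minutes later (24.89 vs. 24.72 on the dataset's decimal-hour scale), but this difference is not statistically significant.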
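WeekdayBed is recorded in decimal hours, so the group means are easier to read once converted to clock times. The sketch below assumes that 24.0 corresponds to midnight (values above 24 falling after midnight); to_clock is a hypothetical helper, not part of the original analysis.

```r
# Convert a decimal-hour time (24.0 assumed to be midnight) to "HH:MM"
to_clock <- function(hours) {
  total_min <- round((hours %% 24) * 60)
  sprintf("%02d:%02d",
          as.integer((total_min %/% 60) %% 24),
          as.integer(total_min %% 60))
}

# Group means copied from the Q9 t.test() output above
bed_high <- 24.71500   # HighStress
bed_low  <- 24.88543   # LowStress

# Gap between the two mean bedtimes, in minutes
gap_minutes <- (bed_low - bed_high) * 60

print(to_clock(c(bed_high, bed_low)))
print(round(gap_minutes, 1))
```

Under that assumption the two means fall at roughly 12:43 a.m. and 12:53 a.m., a gap of about 10 minutes in bedtime, not in sleep duration.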

Q10: Is there a significant difference in the average hours of sleep on weekends between first two year students and other students?

Sleepstudy$YearGroup <- ifelse(Sleepstudy$ClassYear %in% c(1, 2), "FirstTwoYears", "OtherYears")
t_test_10 <- t.test(WeekendSleep ~ YearGroup, data = Sleepstudy)
print(t_test_10)
## 
##  Welch Two Sample t-test
## 
## data:  WeekendSleep by YearGroup
## t = -0.047888, df = 237.36, p-value = 0.9618
## alternative hypothesis: true difference in means between group FirstTwoYears and group OtherYears is not equal to 0
## 95 percent confidence interval:
##  -0.3497614  0.3331607
## sample estimates:
## mean in group FirstTwoYears    mean in group OtherYears 
##                    8.213592                    8.221892

There is no significant difference in the average amount of weekend sleep between students in their first two years and students in other years (p = 0.9618). The group means are 8.21 and 8.22 hours, a difference of about 0.01 hours.

Summary

This analysis sheds light on college students’ sleep patterns and related habits. It revealed notable trends in academic performance, lifestyle choices, and mental well-being. The findings indicate the following patterns:

Academic outcomes: First- and second-year students take significantly more early classes than students in other years. Female students achieve higher average GPAs than male students. “Larks” have a slightly higher mean cognition score than “owls”, but the difference is not statistically significant.

Attendance: Students with an early class missed fewer classes on average than those without, but the difference is not statistically significant.

Mental health: Students with normal depression status report higher happiness levels than those with at least moderate depression. Stress level, however, does not appear to be associated with weekday bedtime.

Sleep behavior: Pulling at least one all-nighter is linked to less average sleep. Weekend sleep does not differ significantly between students in their first two years and other students.

Substance use: Students who abstain from alcohol report lower average stress than heavy drinkers, but the difference is not statistically significant. Male students report about 1.8 times as many drinks per week as female students.

Key implications of these results include mental health initiatives and lifestyle education. For students to be successful, they need resources to help balance their busy lives on top of deadlines. Recognizing differences in class times across class years may inform flexible learning options for students who learn better at different times of the day. Our results indicate associations between some lifestyle factors and sleep patterns, most clearly between all-nighters and average sleep. Education promoting healthy sleep routines and discouraging all-nighters could improve both academic and personal well-being. Such education could also include gender-specific approaches, as there are significant differences between female and male students in alcohol consumption and average GPA. In all, the importance of sleep should be emphasized to students because it touches many aspects of their lives.

References

From Lock5Stat.com: Onyper, S., Thacher, P., Gilbert, J., Gradess, S., “Class Start Times, Sleep, and Academic Performance in College: A Path Analysis,” Chronobiology International, April 2012; 29(3): 318-335. Thanks to the authors for supplying the data.

Appendix of All Code Chunks:

# Q1 Code:
#   t_test_1 <- t.test(GPA ~ Gender, data = Sleepstudy, alternative = "two.sided")
#   print(t_test_1)

# Q2 Code:
#   Sleepstudy$ClassGroup <- ifelse(Sleepstudy$ClassYear %in% c(1, 2), "FirstTwoYears", "OtherYears")
#   t_test_2 <- t.test(NumEarlyClass ~ ClassGroup, data = Sleepstudy)
#   print(t_test_2)

# Q3 Code:
#   Sleepstudy_subset <- subset(Sleepstudy, LarkOwl %in% c("Lark","Owl"))
#   t_test_3 <- t.test(CognitionZscore ~ LarkOwl, data = Sleepstudy_subset, alternative = "greater")
#   print(t_test_3)

# Q4 Code:
#   t_test_4 <- t.test(ClassesMissed ~ EarlyClass, data = Sleepstudy, alternative = "two.sided")
#   print(t_test_4)

# Q5 Code:
#   Sleepstudy$DepressionGroup <- ifelse(Sleepstudy$DepressionScore >= 10, "ModerateOrHigher", "Normal")
#   t_test_5 <- t.test(Happiness ~ DepressionGroup, data = Sleepstudy)
#   print(t_test_5)

# Q6 Code:
#   t_test_6 <- t.test(AverageSleep ~ AllNighter, data = Sleepstudy)
#   print(t_test_6)

# Q7 Code:
#   Sleepstudy_subset <- subset(Sleepstudy, AlcoholUse %in% c("Abstain","Heavy"))
#   t_test_7 <- t.test(StressScore ~ AlcoholUse, data = Sleepstudy_subset, alternative = "less")
#   print(t_test_7)

# Q8 Code:
#   t_test_8 <- t.test(Drinks ~ Gender, data = Sleepstudy)
#   print(t_test_8)

# Q9 Code:
#   Sleepstudy$StressGroup <- ifelse(Sleepstudy$StressScore >= 15, "HighStress", "LowStress")
#   t_test_9 <- t.test(WeekdayBed ~ StressGroup, data = Sleepstudy, alternative = "two.sided")
#   print(t_test_9)

# Q10 Code:
#   Sleepstudy$YearGroup <- ifelse(Sleepstudy$ClassYear %in% c(1, 2), "FirstTwoYears", "OtherYears")
#   t_test_10 <- t.test(WeekendSleep ~ YearGroup, data = Sleepstudy)
#   print(t_test_10)