Introduction

This report analyzes sleep patterns among college students using the “SleepStudy” dataset, obtained from https://www.lock5stat.com/datapage3e.html. The dataset comprises 253 observations on 27 variables covering the sleep habits, psychological well-being, and lifestyle choices of college students.

The primary objective of this analysis is to answer a series of research questions about college students’ sleep patterns, academic performance, psychological well-being, and lifestyle choices. The results show which of these factors differ between groups of students and provide a basis for further research and for interventions to improve students’ well-being and academic performance.

This report addresses ten research questions, each stated in full in the Analysis section together with its test and interpretation.

By addressing these questions, we aim to provide a comprehensive understanding of the sleep patterns and related factors among college students, ultimately contributing to the well-being and academic success of this population.

Data

The SleepStudy dataset examines sleep patterns, habits, and related lifestyle and psychological factors among college students. This dataset offers a comprehensive view of how various behaviors and conditions might influence sleep quality and academic performance.

Dataset Structure

  • Number of Observations: 253 students participated in the study.
  • Number of Variables: 27 variables capturing demographic information, sleep patterns, academic habits, and psychological factors.

Variable Descriptions:

  • Gender: Binary indicator (1 = male, 0 = female).
  • ClassYear: Academic year of the student (1 = first year, …, 4 = senior).
  • LarkOwl: Morning or evening preference (Lark = early riser, Owl = night owl, Neither).
  • NumEarlyClass: Number of classes per week scheduled before 9 a.m.
  • EarlyClass: Binary indicator for having any early classes (1 = yes, 0 = no).
  • GPA: Grade point average on a 0–4 scale.
  • ClassesMissed: Total number of classes missed during the semester.
  • CognitionZscore: Z-score for cognitive skills based on a standardized test.
  • PoorSleepQuality: Sleep quality index (higher values indicate poorer sleep).
  • DepressionScore: Measure of depression severity.
  • AnxietyScore: Measure of anxiety severity.
  • StressScore: Measure of stress severity.
  • DepressionStatus: Categorical classification of depression (normal, moderate, severe).
  • AnxietyStatus: Categorical classification of anxiety (normal, moderate, severe).
  • Stress: Binary indicator for stress (normal or high).
  • DASScore: Combined score for depression, anxiety, and stress.
  • Happiness: Measure of subjective happiness.
  • AlcoholUse: Self-reported alcohol consumption level (Abstain, Light, Moderate, Heavy).
  • Drinks: Average number of alcoholic drinks consumed per week.
  • WeekdayBed: Average weekday bedtime (24.0 = midnight).
  • WeekdayRise: Average weekday rise time (e.g., 8.0 = 8 a.m.).
  • WeekdaySleep: Average weekday sleep duration in hours.
  • WeekendBed: Average weekend bedtime (24.0 = midnight).
  • WeekendRise: Average weekend rise time (e.g., 8.0 = 8 a.m.).
  • WeekendSleep: Average weekend sleep duration in hours.
  • AverageSleep: Average sleep duration across all days.
  • AllNighter: Binary indicator for pulling an all-nighter during the semester (1 = yes, 0 = no).
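
The bedtime and rise-time variables are stored as decimal hours, with values past 24.0 falling after midnight. As a minimal sketch, the hypothetical helper below (`decimal_to_clock` is an illustrative name, not part of the dataset or of any package) converts such values to 24-hour clock strings, assuming that values of 24 or more simply wrap into the next day.

```r
# Hypothetical helper: convert decimal-hour times (24.0 = midnight) to "HH:MM".
# Assumption: values of 24 or more wrap past midnight into the next day.
decimal_to_clock <- function(x) {
  total_minutes <- as.integer(round(x * 60)) %% (24L * 60L)
  sprintf("%02d:%02d", total_minutes %/% 60L, total_minutes %% 60L)
}

decimal_to_clock(c(24.0, 8.0, 24.5))
# "00:00" "08:00" "00:30"
```

Applied to a column such as WeekdayBed, this can make group means easier to read as clock times.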

Data Collection Process

The dataset is based on research conducted by Onyper et al. (2012), where college students:

  • Completed Skills Tests: Cognitive function was assessed through standardized testing.

  • Responded to Surveys: Students answered questions on habits, attitudes, and psychological well-being.

  • Maintained Sleep Diaries: Over two weeks, participants recorded their sleep timing and quality.

This comprehensive dataset allows for multifaceted analysis, including the relationships between sleep and academic performance, the effects of psychological health on sleep, and the impact of lifestyle factors like alcohol use or early class schedules.

Analysis

I will now explore each question in detail:

install.packages("dplyr")
## Installing package into '/cloud/lib/x86_64-pc-linux-gnu-library/4.4'
## (as 'lib' is unspecified)
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
sleepStudy <- read.csv("https://www.lock5stat.com/datasets3e/SleepStudy.csv")

Question 1: Is there a significant difference in the average GPA between male and female college students?

t.test(GPA ~ Gender, data = sleepStudy)
## 
##  Welch Two Sample t-test
## 
## data:  GPA by Gender
## t = 3.9139, df = 200.9, p-value = 0.0001243
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
##  0.09982254 0.30252780
## sample estimates:
## mean in group 0 mean in group 1 
##        3.324901        3.123725

The Welch Two Sample t-test indicates a statistically significant difference in average GPA between male and female college students (p = 0.0001243, below the 0.05 threshold). The mean GPA is 3.32 for females (group 0) and 3.12 for males (group 1), a difference of about 0.20 grade points (95% confidence interval: 0.10 to 0.30). The difference is small but statistically significant. It could result from a variety of factors, such as differences in study habits, class attendance, or academic pressures, none of which this test examines.
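
As a rough consistency check, the reported confidence interval and p-value can be recovered from the printed means, t statistic, and degrees of freedom alone. The sketch below copies those numbers from the output above; the variable names are illustrative, and it relies only on the identity t = difference / standard error.

```r
# Values copied from the t.test() output for Question 1
mean_female <- 3.324901
mean_male   <- 3.123725
t_stat      <- 3.9139
df          <- 200.9

diff_means <- mean_female - mean_male     # estimated difference in means
std_error  <- diff_means / t_stat         # since t = difference / standard error
margin     <- qt(0.975, df) * std_error   # half-width of the 95% interval
ci         <- c(diff_means - margin, diff_means + margin)
p_value    <- 2 * pt(-abs(t_stat), df)    # two-sided p-value

round(ci, 3)
signif(p_value, 3)
```

Both results should agree with the printed interval and p-value up to the rounding of the inputs.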

Question 2: Is there a significant difference in the average number of early classes between the first two class years and other class years?

sleepStudy$YearGroup <- ifelse(sleepStudy$ClassYear %in% c(1, 2), "FirstTwoYears", "OtherYears")
t.test(NumEarlyClass ~ YearGroup, data = sleepStudy)
## 
##  Welch Two Sample t-test
## 
## data:  NumEarlyClass by YearGroup
## t = 4.1813, df = 250.69, p-value = 4.009e-05
## alternative hypothesis: true difference in means between group FirstTwoYears and group OtherYears is not equal to 0
## 95 percent confidence interval:
##  0.4042016 1.1240309
## sample estimates:
## mean in group FirstTwoYears    mean in group OtherYears 
##                    2.070423                    1.306306

The Welch Two Sample t-test reveals a statistically significant difference in the average number of early classes between students in their first two years of college and those in later years (p = 4.009e-05, far below the 0.05 threshold). Students in their first two years average 2.07 early classes per week, compared with 1.31 for students in later years (95% confidence interval for the difference: 0.40 to 1.12). Students in their first two years therefore take significantly more early classes, possibly due to fewer scheduling options or academic requirements for underclassmen.

Question 3: Do students who identify as “larks” have significantly better cognitive skills (cognition z-score) compared to “owls”?

sleepStudy_LarkOwl <- filter(sleepStudy, LarkOwl %in% c("Lark", "Owl"))
t.test(CognitionZscore ~ LarkOwl, data = sleepStudy_LarkOwl)
## 
##  Welch Two Sample t-test
## 
## data:  CognitionZscore by LarkOwl
## t = 0.80571, df = 75.331, p-value = 0.4229
## alternative hypothesis: true difference in means between group Lark and group Owl is not equal to 0
## 95 percent confidence interval:
##  -0.1893561  0.4465786
## sample estimates:
## mean in group Lark  mean in group Owl 
##         0.09024390        -0.03836735

The Welch Two Sample t-test indicates no significant difference in average cognition z-scores between students who identify as “larks” (early risers) and those who identify as “owls” (night owls), with a p-value of 0.4229. The larks’ mean cognition z-score is 0.0902 and the owls’ mean is -0.0384. The data provide no evidence that larks outperform owls in cognitive skills.
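
The question is phrased directionally (larks better than owls), whereas the test above is two-sided. As a hedged sketch, the one-sided p-value can be derived from the reported t statistic and degrees of freedom. This assumes the same Welch test and the group order shown in the output (Lark minus Owl), and it should correspond to passing alternative = "greater" to t.test().

```r
# Values copied from the t.test() output for Question 3 (Lark minus Owl)
t_stat <- 0.80571
df     <- 75.331

# Two-sided p-value, as reported in the output
p_two_sided <- 2 * pt(-abs(t_stat), df)

# One-sided p-value for the alternative "larks score higher than owls"
p_one_sided <- pt(t_stat, df, lower.tail = FALSE)

round(c(two_sided = p_two_sided, one_sided = p_one_sided), 4)
```

Because the observed difference is in the hypothesized direction, the one-sided p-value is half the two-sided one, and it remains well above 0.05.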

Question 4: Is there a significant difference in the average number of classes missed in a semester between students who had at least one early class (EarlyClass=1) and those who didn’t (EarlyClass=0)?

t.test(ClassesMissed ~ EarlyClass, data = sleepStudy)
## 
##  Welch Two Sample t-test
## 
## data:  ClassesMissed by EarlyClass
## t = 1.4755, df = 152.78, p-value = 0.1421
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
##  -0.2233558  1.5412830
## sample estimates:
## mean in group 0 mean in group 1 
##        2.647059        1.988095

The Welch Two Sample t-test shows no statistically significant difference in the average number of classes missed between students with early classes and those without (p = 0.1421). Students without early classes missed 2.65 classes on average, while students with early classes missed 1.99. Students with early classes appear to miss slightly fewer classes, but the difference is not statistically significant, so the data give no evidence that having early classes affects attendance.

Question 5: Is there a significant difference in the average happiness level between students with at least moderate depression and normal depression status?

sleepStudy_Depression <- filter(sleepStudy, DepressionStatus %in% c("moderate", "normal"))
t.test(Happiness ~ DepressionStatus, data = sleepStudy_Depression)
## 
##  Welch Two Sample t-test
## 
## data:  Happiness by DepressionStatus
## t = -4.3253, df = 43.992, p-value = 8.616e-05
## alternative hypothesis: true difference in means between group moderate and group normal is not equal to 0
## 95 percent confidence interval:
##  -5.818614 -2.119748
## sample estimates:
## mean in group moderate   mean in group normal 
##               23.08824               27.05742

The Welch Two Sample t-test indicates a statistically significant difference in average happiness between students with moderate depression and those with normal depression status (p = 8.616e-05). Students with moderate depression have a mean happiness score of 23.09, while students with normal depression status have a mean of 27.06 (95% confidence interval for the difference: -5.82 to -2.12). Because the filter above keeps only the “moderate” and “normal” categories, students classified as “severe” are excluded, so this result covers moderate depression rather than at least moderate depression.
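
The question asks about students with at least moderate depression, while the filter used above keeps only the “moderate” and “normal” categories. One possible way to include the “severe” category is to collapse the status into two groups before testing. The helper below is a hypothetical sketch (the names `collapse_depression` and `DepressionGroup` are illustrative); it assumes the status labels are exactly "normal", "moderate", and "severe", and its result on the real data is not shown here.

```r
# Hypothetical helper: collapse depression status into two groups.
# Assumption: labels are exactly "normal", "moderate", and "severe";
# anything else (including NA) is left as NA.
collapse_depression <- function(status) {
  group <- rep(NA_character_, length(status))
  group[status %in% "normal"] <- "Normal"
  group[status %in% c("moderate", "severe")] <- "AtLeastModerate"
  group
}

collapse_depression(c("normal", "moderate", "severe"))
# "Normal" "AtLeastModerate" "AtLeastModerate"

# Possible use with the data frame loaded earlier (not run here):
# sleepStudy$DepressionGroup <- collapse_depression(sleepStudy$DepressionStatus)
# t.test(Happiness ~ DepressionGroup, data = sleepStudy)
```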

Question 6: Is there a significant difference in average sleep quality scores between students who reported having at least one all-nighter (AllNighter=1) and those who didn’t (AllNighter=0)?

t.test(PoorSleepQuality ~ AllNighter, data = sleepStudy)
## 
##  Welch Two Sample t-test
## 
## data:  PoorSleepQuality by AllNighter
## t = -1.7068, df = 44.708, p-value = 0.09479
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
##  -1.9456958  0.1608449
## sample estimates:
## mean in group 0 mean in group 1 
##        6.136986        7.029412

The Welch Two Sample t-test suggests no statistically significant difference in average sleep quality scores between students who reported at least one all-nighter and those who did not (p = 0.09479, above the 0.05 threshold). Students without an all-nighter have a mean sleep quality score of 6.14, and students with at least one all-nighter have a mean of 7.03, where higher values indicate poorer sleep. Although students who reported an all-nighter have slightly poorer sleep quality on average, the difference is not statistically significant. This result suggests that having an all-nighter may not drastically affect perceived sleep quality, though other factors could contribute to individual variability.

Question 7: Do students who abstain from alcohol use have significantly better stress scores than those who report heavy alcohol use?

sleepStudy_Alcohol <- filter(sleepStudy, AlcoholUse %in% c("Abstain", "Heavy"))
t.test(StressScore ~ AlcoholUse, data = sleepStudy_Alcohol)
## 
##  Welch Two Sample t-test
## 
## data:  StressScore by AlcoholUse
## t = -0.62604, df = 28.733, p-value = 0.5362
## alternative hypothesis: true difference in means between group Abstain and group Heavy is not equal to 0
## 95 percent confidence interval:
##  -6.261170  3.327346
## sample estimates:
## mean in group Abstain   mean in group Heavy 
##              8.970588             10.437500

The Welch Two Sample t-test indicates no statistically significant difference in average stress scores between students who abstain from alcohol and those who report heavy alcohol use (p = 0.5362). Abstainers have a mean stress score of 8.97, and heavy alcohol users have a mean of 10.44. Abstainers’ stress scores are slightly lower, but the difference is not statistically significant. The data therefore do not show a measurable association between these alcohol use patterns and stress, although the wide confidence interval (-6.26 to 3.33) leaves the size of any difference uncertain.

Question 8: Is there a significant difference in the average number of drinks per week between students of different genders?

t.test(Drinks ~ Gender, data = sleepStudy)
## 
##  Welch Two Sample t-test
## 
## data:  Drinks by Gender
## t = -6.1601, df = 142.75, p-value = 7.002e-09
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
##  -4.360009 -2.241601
## sample estimates:
## mean in group 0 mean in group 1 
##        4.238411        7.539216

The Welch Two Sample t-test shows a highly significant difference in the average number of alcoholic drinks per week between male and female college students (p = 7.002e-09). Female students average 4.24 drinks per week, while male students average 7.54. Male students therefore consume significantly more alcoholic drinks per week than female students. This notable difference may reflect varying social behaviors, norms, or attitudes toward alcohol use between genders.

Question 9: Is there a significant difference in the average weekday bedtime between students with high and low stress (Stress=High vs. Stress=Normal)?

sleepStudy_Stress <- filter(sleepStudy, Stress %in% c("normal", "high"))
t.test(WeekdayBed ~ Stress, data = sleepStudy_Stress)
## 
##  Welch Two Sample t-test
## 
## data:  WeekdayBed by Stress
## t = -1.0746, df = 87.048, p-value = 0.2855
## alternative hypothesis: true difference in means between group high and group normal is not equal to 0
## 95 percent confidence interval:
##  -0.4856597  0.1447968
## sample estimates:
##   mean in group high mean in group normal 
##             24.71500             24.88543

The Welch Two Sample t-test suggests no statistically significant difference in average weekday bedtime between students with high stress and those with normal stress levels (p = 0.2855). The mean bedtime is 24.72 (about 12:43 a.m.) for high-stress students and 24.89 (about 12:53 a.m.) for students with normal stress. Students with normal stress levels go to bed slightly later on weekdays, but the difference is not statistically significant. This indicates that stress level may not be a key factor influencing weekday bedtime among students.

Question 10: Is there a significant difference in the average hours of sleep on weekends between first two year students and other students?

t.test(WeekendSleep ~ YearGroup, data = sleepStudy)
## 
##  Welch Two Sample t-test
## 
## data:  WeekendSleep by YearGroup
## t = -0.047888, df = 237.36, p-value = 0.9618
## alternative hypothesis: true difference in means between group FirstTwoYears and group OtherYears is not equal to 0
## 95 percent confidence interval:
##  -0.3497614  0.3331607
## sample estimates:
## mean in group FirstTwoYears    mean in group OtherYears 
##                    8.213592                    8.221892

The Welch Two Sample t-test indicates no significant difference in average weekend sleep duration between students in their first two years of college and those in later years (p = 0.9618). Weekend sleep is nearly identical in the two groups (8.21 hours versus 8.22 hours), suggesting that year group does not significantly influence how much sleep students get on weekends. This could imply that factors other than class year, such as personal habits or social life, play a bigger role in weekend sleep patterns.

Summary

The analysis of the ten research questions provided insights into various aspects of student behavior and health. For gender differences, the results showed that female students have significantly higher GPAs than male students, while male students consume significantly more alcoholic drinks per week. Additionally, students in their first two years tend to have more early classes than students in later years, while no significant difference in cognitive performance was observed between larks and owls. The data also revealed that students with at least one early class did not miss significantly more classes, and those with moderate depression exhibited lower happiness levels compared to those with normal depression status.

In terms of sleep-related behaviors, no significant difference was found in sleep quality between students who pulled all-nighters and those who did not. Stress did not show a significant relationship with weekday bedtime, and no significant difference in weekend sleep was found between students in their first two years and those in later years. Finally, students who abstain from alcohol did not report significantly better stress scores than heavy drinkers. These results suggest that, while GPA, alcohol consumption, early class load, and happiness show clear group differences, the comparisons involving cognition, class attendance, sleep quality, stress, bedtime, and weekend sleep did not reach statistical significance.
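
Because ten tests were each run at the 0.05 level, one optional robustness check is to adjust the reported p-values for multiple comparisons. The sketch below applies Holm’s method through p.adjust() from base R’s stats package. This adjustment is not part of the original analysis; the p-values are copied from the test outputs above, and the Q1 to Q10 labels are illustrative.

```r
# p-values copied from the ten t.test() outputs above
p_values <- c(Q1 = 0.0001243, Q2 = 4.009e-05, Q3 = 0.4229, Q4 = 0.1421,
              Q5 = 8.616e-05, Q6 = 0.09479, Q7 = 0.5362, Q8 = 7.002e-09,
              Q9 = 0.2855, Q10 = 0.9618)

# Holm adjustment controls the family-wise error rate across all ten tests
p_holm <- p.adjust(p_values, method = "holm")

# Questions that remain significant at the 0.05 level after adjustment
names(p_values)[p_holm < 0.05]
# "Q1" "Q2" "Q5" "Q8"
```

Under this adjustment the same four comparisons remain significant, so the conclusions above do not depend on the choice between adjusted and unadjusted p-values.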

References