Introduction

This report presents an analysis of sleep patterns, academic performance, well-being, and lifestyle behaviors among college students using the SleepStudy dataset. The objective of this report is to answer the following ten research questions:

  1. Is there a significant difference in average GPA between male and female students?
  2. Is there a significant difference in the number of early classes between underclassmen (Years 1-2) and upperclassmen (Years 3-4)?
  3. Do students who identify as larks have significantly better cognitive z-scores than owls?
  4. Do students with at least one early class miss more classes than those with none?
  5. Is average happiness lower among students with at least moderate depression compared to those with normal depression status?
  6. Do students who pulled an all-nighter report worse sleep quality than those who did not?
  7. Are stress scores significantly different between alcohol abstainers and heavy drinkers?
  8. Do students of different genders differ in drinks per week?
  9. Do students with high stress have later weekday bedtimes compared to those with normal stress?
  10. Do underclassmen get less weekend sleep than upperclassmen?

Data

The SleepStudy dataset includes 253 observations on the following 27 variables:

  Demographics: Gender, ClassYear
  Sleep habits: WeekdayBed, WeekdayRise, WeekdaySleep, WeekendBed, WeekendRise, WeekendSleep, AverageSleep, AllNighter, PoorSleepQuality, LarkOwl
  Psychological variables: StressScore, DepressionScore, AnxietyScore, DepressionStatus, AnxietyStatus, Stress, DASScore, Happiness
  Academic variables: GPA, NumEarlyClass, EarlyClass, ClassesMissed, CognitionZscore
  Lifestyle variables: AlcoholUse, Drinks

The data were obtained from students who completed cognitive skills tests, filled out surveys on attitudes and habits, and kept a two-week sleep diary recording sleep timing and quality.

Analysis

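Each question is analyzed in R with a two-sample t-test and a boxplot of the two groups. R's t.test() defaults to the Welch version, which does not assume equal variances in the two groups.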
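
Every test output in this section is labeled "Welch Two Sample t-test". As a rough sketch of what that test computes, the helper below (welch_t is a hypothetical name, not part of the report's code) calculates the statistic, the Welch-Satterthwaite degrees of freedom, and the two-sided p-value by hand, then compares them with t.test() on made-up toy data rather than the SleepStudy data.

```r
# Welch's two-sample t-test computed by hand (illustrative sketch).
welch_t <- function(x, y) {
  nx <- length(x)
  ny <- length(y)
  vx <- var(x) / nx            # squared standard error of mean(x)
  vy <- var(y) / ny            # squared standard error of mean(y)
  se <- sqrt(vx + vy)          # standard error of the difference in means
  t_stat <- (mean(x) - mean(y)) / se
  # Welch-Satterthwaite approximation to the degrees of freedom
  df <- (vx + vy)^2 / (vx^2 / (nx - 1) + vy^2 / (ny - 1))
  p <- 2 * pt(-abs(t_stat), df)  # two-sided p-value
  c(t = t_stat, df = df, p.value = p)
}

# Toy data, made up for illustration (not from SleepStudy)
set.seed(1)
x <- rnorm(30, mean = 3.3, sd = 0.4)
y <- rnorm(25, mean = 3.1, sd = 0.5)

manual  <- welch_t(x, y)
builtin <- t.test(x, y)        # var.equal = FALSE is the default

print(manual)
print(c(t = unname(builtin$statistic),
        df = unname(builtin$parameter),
        p.value = builtin$p.value))
```

The two printed vectors should agree, which is why the degrees of freedom in the outputs below are fractional rather than whole numbers.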

library(lessR)
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:lessR':
## 
##     order_by, recode, rename
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
sleep = Read("https://www.lock5stat.com/datasets3e/SleepStudy.csv", quiet=TRUE)
head(sleep)
##   Gender ClassYear LarkOwl NumEarlyClass EarlyClass  GPA ClassesMissed
## 1      0         4 Neither             0          0 3.60             0
## 2      0         4 Neither             2          1 3.24             0
## 3      0         4     Owl             0          0 2.97            12
## 4      0         1    Lark             5          1 3.76             0
## 5      0         4     Owl             0          0 3.20             4
## 6      1         4 Neither             0          0 3.50             0
##   CognitionZscore PoorSleepQuality DepressionScore AnxietyScore StressScore
## 1           -0.26                4               4            3           8
## 2            1.39                6               1            0           3
## 3            0.38               18              18           18           9
## 4            1.39                9               1            4           6
## 5            1.22                9               7           25          14
## 6           -0.04                6              14            8          28
##   DepressionStatus AnxietyStatus Stress DASScore Happiness AlcoholUse Drinks
## 1           normal        normal normal       15        28   Moderate     10
## 2           normal        normal normal        4        25   Moderate      6
## 3         moderate        severe normal       45        17      Light      3
## 4           normal        normal normal       11        32      Light      2
## 5           normal        severe normal       46        15   Moderate      4
## 6         moderate      moderate   high       50        22    Abstain      0
##   WeekdayBed WeekdayRise WeekdaySleep WeekendBed WeekendRise WeekendSleep
## 1      25.75        8.70         7.70      25.75        9.50         5.88
## 2      25.70        8.20         6.80      26.00       10.00         7.25
## 3      27.44        6.55         3.00      28.00       12.59        10.09
## 4      23.50        7.17         6.77      27.00        8.00         7.25
## 5      25.90        8.67         6.09      23.75        9.50         7.00
## 6      23.80        8.95         9.05      26.00       10.75         9.00
##   AverageSleep AllNighter
## 1         7.18          0
## 2         6.93          0
## 3         5.02          0
## 4         6.90          0
## 5         6.35          0
## 6         9.04          0
names(sleep)
##  [1] "Gender"           "ClassYear"        "LarkOwl"          "NumEarlyClass"   
##  [5] "EarlyClass"       "GPA"              "ClassesMissed"    "CognitionZscore" 
##  [9] "PoorSleepQuality" "DepressionScore"  "AnxietyScore"     "StressScore"     
## [13] "DepressionStatus" "AnxietyStatus"    "Stress"           "DASScore"        
## [17] "Happiness"        "AlcoholUse"       "Drinks"           "WeekdayBed"      
## [21] "WeekdayRise"      "WeekdaySleep"     "WeekendBed"       "WeekendRise"     
## [25] "WeekendSleep"     "AverageSleep"     "AllNighter"
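# Recode variables so each grouping is an explicit factor or label
sleep <- sleep %>%
  mutate(
    Gender = factor(Gender, levels = c(0, 1), labels = c("Female", "Male")),
    # Years 1-2 are underclassmen ("Lower"); years 3-4 are upperclassmen ("Upper")
    LowerYear = ifelse(ClassYear %in% c(1, 2), "Lower", "Upper"),
    AllNighter = as.factor(AllNighter),
    EarlyClass = as.factor(EarlyClass),
    Stress = factor(Stress),
    DepressionStatus = as.character(DepressionStatus),
    DepressionScore = as.numeric(DepressionScore),
    AlcoholUse = factor(AlcoholUse),
    LarkOwl = factor(LarkOwl)
  )
# Two-group subsets; droplevels() removes the now-empty factor levels
# so that the boxplots show only the two groups being compared
sleep_larkowl <- sleep %>% filter(LarkOwl %in% c("Lark", "Owl")) %>% droplevels()
# Collapse depression status into normal vs at least moderate
sleep$DepBin <- ifelse(sleep$DepressionStatus == "normal", "Normal", "AtLeastModerate")
sleep_alc <- sleep %>% filter(AlcoholUse %in% c("Abstain", "Heavy")) %>% droplevels()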
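
Recodes like LowerYear and DepBin are easy to get wrong silently, so a quick cross-tabulation can help confirm them. The sketch below uses a small toy data frame with hypothetical values (not the SleepStudy data) and base R only. It also shows that subsetting a factor keeps unused levels unless they are dropped explicitly.

```r
# Toy data frame with hypothetical values, mimicking three SleepStudy columns
toy <- data.frame(
  ClassYear        = c(1, 2, 3, 4, 2, 3),
  DepressionStatus = c("normal", "moderate", "severe",
                       "normal", "normal", "moderate"),
  LarkOwl          = factor(c("Lark", "Owl", "Neither",
                              "Owl", "Lark", "Neither"))
)

# Same recoding rules as in the report
toy$LowerYear <- ifelse(toy$ClassYear %in% c(1, 2), "Lower", "Upper")
toy$DepBin    <- ifelse(toy$DepressionStatus == "normal",
                        "Normal", "AtLeastModerate")

# Cross-tabulate original against recoded values: each original value
# should land in exactly one recoded group
print(table(toy$ClassYear, toy$LowerYear))
print(table(toy$DepressionStatus, toy$DepBin))

# Subsetting a factor keeps unused levels unless they are dropped
lark_owl <- toy[toy$LarkOwl %in% c("Lark", "Owl"), ]
print(levels(lark_owl$LarkOwl))              # still lists "Neither"
print(levels(droplevels(lark_owl)$LarkOwl))  # only "Lark" and "Owl"
```

An unused level does not change a t-test result, but it can add an empty group to a boxplot.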

Q1: GPA difference between genders

t.test(GPA ~ Gender, data = sleep)
## 
##  Welch Two Sample t-test
## 
## data:  GPA by Gender
## t = 3.9139, df = 200.9, p-value = 0.0001243
## alternative hypothesis: true difference in means between group Female and group Male is not equal to 0
## 95 percent confidence interval:
##  0.09982254 0.30252780
## sample estimates:
## mean in group Female   mean in group Male 
##             3.324901             3.123725
boxplot(GPA ~ Gender, data = sleep,
        ylab = "GPA", xlab = "Gender",
        main = "GPA by Gender",
        col = c("steelblue", "salmon"))
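
To make the link between the t statistic and the confidence interval concrete, the following sketch rebuilds the 95% interval for Q1 from the numbers printed above. It assumes only the relation t = difference / standard error. Because the inputs are rounded, the result should match the reported interval closely but not to every digit.

```r
# Values copied from the Q1 output above
mean_f <- 3.324901
mean_m <- 3.123725
t_stat <- 3.9139
df     <- 200.9

diff <- mean_f - mean_m          # estimated difference in means
se   <- diff / t_stat            # standard error implied by t = diff / se
crit <- qt(0.975, df)            # two-sided 95% critical value
ci   <- diff + c(-1, 1) * crit * se

print(round(ci, 3))
```

The same calculation applies to any of the other nine outputs by swapping in their means, t statistic, and degrees of freedom.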

Q2: Early classes among underclassmen vs upperclassmen

t.test(NumEarlyClass ~ LowerYear, data = sleep)
## 
##  Welch Two Sample t-test
## 
## data:  NumEarlyClass by LowerYear
## t = 4.1813, df = 250.69, p-value = 0.00004009
## alternative hypothesis: true difference in means between group Lower and group Upper is not equal to 0
## 95 percent confidence interval:
##  0.4042016 1.1240309
## sample estimates:
## mean in group Lower mean in group Upper 
##            2.070423            1.306306
boxplot(NumEarlyClass ~ LowerYear, data = sleep,
        ylab = "Number of Early Classes",
        xlab = "Class Year Group",
        main = "Early Classes by Class Year",
        col = c("lightgreen", "orange"))

Q3: Cognition Z-score for Lark vs Owl

t.test(CognitionZscore ~ LarkOwl, data = sleep_larkowl)
## 
##  Welch Two Sample t-test
## 
## data:  CognitionZscore by LarkOwl
## t = 0.80571, df = 75.331, p-value = 0.4229
## alternative hypothesis: true difference in means between group Lark and group Owl is not equal to 0
## 95 percent confidence interval:
##  -0.1893561  0.4465786
## sample estimates:
## mean in group Lark  mean in group Owl 
##         0.09024390        -0.03836735
boxplot(CognitionZscore ~ LarkOwl, data = sleep_larkowl,
        ylab = "Cognition Z-score",
        xlab = "Chronotype",
        main = "Cognition Z-score: Lark vs Owl",
        col = c("gold", "purple"))

Q4: Early class indicator vs classes missed

t.test(ClassesMissed ~ EarlyClass, data = sleep)
## 
##  Welch Two Sample t-test
## 
## data:  ClassesMissed by EarlyClass
## t = 1.4755, df = 152.78, p-value = 0.1421
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
##  -0.2233558  1.5412830
## sample estimates:
## mean in group 0 mean in group 1 
##        2.647059        1.988095
boxplot(ClassesMissed ~ EarlyClass, data = sleep,
        ylab = "Classes Missed",
        xlab = "Early Class (0=No, 1=Yes)",
        main = "Classes Missed vs Early Class",
        col = c("grey80", "lightblue"))

Q5: Happiness: normal vs at least moderate depression

t.test(Happiness ~ DepBin, data = sleep)
## 
##  Welch Two Sample t-test
## 
## data:  Happiness by DepBin
## t = -5.6339, df = 55.594, p-value = 0.0000006057
## alternative hypothesis: true difference in means between group AtLeastModerate and group Normal is not equal to 0
## 95 percent confidence interval:
##  -7.379724 -3.507836
## sample estimates:
## mean in group AtLeastModerate          mean in group Normal 
##                      21.61364                      27.05742
boxplot(Happiness ~ DepBin, data = sleep,
        ylab = "Happiness",
        xlab = "Depression Status",
        main = "Happiness by Depression Status",
        col = c("lightblue", "coral"))

Q6: All-nighter vs sleep quality

t.test(PoorSleepQuality ~ AllNighter, data = sleep)
## 
##  Welch Two Sample t-test
## 
## data:  PoorSleepQuality by AllNighter
## t = -1.7068, df = 44.708, p-value = 0.09479
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
##  -1.9456958  0.1608449
## sample estimates:
## mean in group 0 mean in group 1 
##        6.136986        7.029412
boxplot(PoorSleepQuality ~ AllNighter, data = sleep,
        ylab = "Poor Sleep Quality Score",
        xlab = "All-Nighter (0=No, 1=Yes)",
        main = "Sleep Quality vs All-Nighter",
        col = c("lightblue", "red"))
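
The research question for Q6 is directional (worse sleep quality after an all-nighter), while the test above is two-sided. As a hedged illustration, the sketch below derives the one-sided p-value from the reported t statistic and degrees of freedom. On the full data, the equivalent call would pass alternative = "less" to t.test(), since group 0 (no all-nighter) is listed first.

```r
# Values copied from the Q6 output above
t_stat <- -1.7068
df     <- 44.708

# Two-sided p-value, as reported by t.test()
p_two <- 2 * pt(-abs(t_stat), df)

# One-sided p-value for H1: mean(group 0) < mean(group 1), i.e. students
# who pulled an all-nighter have a higher PoorSleepQuality score
p_less <- pt(t_stat, df)

print(round(c(two.sided = p_two, less = p_less), 4))
```

Because the observed difference lies in the hypothesized direction, the one-sided p-value is half the two-sided one, about 0.047. A one-sided test is only appropriate when the direction was fixed before looking at the data.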

Q7: Stress score: abstainers vs heavy drinkers

t.test(StressScore ~ AlcoholUse, data = sleep_alc)
## 
##  Welch Two Sample t-test
## 
## data:  StressScore by AlcoholUse
## t = -0.62604, df = 28.733, p-value = 0.5362
## alternative hypothesis: true difference in means between group Abstain and group Heavy is not equal to 0
## 95 percent confidence interval:
##  -6.261170  3.327346
## sample estimates:
## mean in group Abstain   mean in group Heavy 
##              8.970588             10.437500
boxplot(StressScore ~ AlcoholUse, data = sleep_alc,
        ylab = "Stress Score",
        xlab = "Alcohol Use",
        main = "Stress Scores: Abstain vs Heavy",
        col = c("aquamarine", "orchid"))

Q8: Drinks per week by gender

t.test(Drinks ~ Gender, data = sleep)
## 
##  Welch Two Sample t-test
## 
## data:  Drinks by Gender
## t = -6.1601, df = 142.75, p-value = 0.000000007002
## alternative hypothesis: true difference in means between group Female and group Male is not equal to 0
## 95 percent confidence interval:
##  -4.360009 -2.241601
## sample estimates:
## mean in group Female   mean in group Male 
##             4.238411             7.539216
boxplot(Drinks ~ Gender, data = sleep,
        ylab = "Drinks per Week",
        xlab = "Gender",
        main = "Alcohol Consumption by Gender",
        col = c("steelblue", "salmon"))

Q9: Weekday bedtime and stress group

t.test(WeekdayBed ~ Stress, data = sleep)
## 
##  Welch Two Sample t-test
## 
## data:  WeekdayBed by Stress
## t = -1.0746, df = 87.048, p-value = 0.2855
## alternative hypothesis: true difference in means between group high and group normal is not equal to 0
## 95 percent confidence interval:
##  -0.4856597  0.1447968
## sample estimates:
##   mean in group high mean in group normal 
##             24.71500             24.88543
boxplot(WeekdayBed ~ Stress, data = sleep,
        ylab = "Weekday Bedtime (24 = Midnight)",
        xlab = "Stress Status",
        main = "Bedtime by Stress Group",
        col = c("skyblue", "darkred"))
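
Bedtimes in this dataset are recorded as decimal hours with 24 = midnight, so the group means can be hard to read at a glance. A small helper (to_clock is a hypothetical name introduced here) converts them to clock time:

```r
# Convert decimal hours on a scale where 24 = midnight to "HH:MM" clock time
to_clock <- function(h) {
  total_min <- round((h %% 24) * 60)   # minutes past midnight
  sprintf("%02d:%02d", (total_min %/% 60) %% 24, total_min %% 60)
}

# Group means copied from the Q9 output above
print(to_clock(c(high = 24.71500, normal = 24.88543)))
```

Both group means fall between 12:40 and 1:00 a.m., about ten minutes apart.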

Q10: Weekend sleep among underclassmen vs upperclassmen

t.test(WeekendSleep ~ LowerYear, data = sleep)
## 
##  Welch Two Sample t-test
## 
## data:  WeekendSleep by LowerYear
## t = -0.047888, df = 237.36, p-value = 0.9618
## alternative hypothesis: true difference in means between group Lower and group Upper is not equal to 0
## 95 percent confidence interval:
##  -0.3497614  0.3331607
## sample estimates:
## mean in group Lower mean in group Upper 
##            8.213592            8.221892
boxplot(WeekendSleep ~ LowerYear, data = sleep,
        ylab = "Weekend Sleep Hours",
        xlab = "Class Year Group",
        main = "Weekend Sleep by Class Year",
        col = c("lightgreen", "orange"))

Summary

1. GPA by Gender

Female students have a higher average GPA (mean = 3.32) than male students (mean = 3.12). The difference of about 0.20 grade points is statistically significant (p = 0.0001; 95% CI 0.10 to 0.30).

2. Number of Early Classes by Class Year

Underclassmen (Years 1-2, mean = 2.07) have more early classes than upperclassmen (Years 3-4, mean = 1.31). The difference of about 0.76 classes is highly significant (p < 0.0001; 95% CI 0.40 to 1.12), so class year is clearly associated with early scheduling.

3. Cognition Z-Score by Lark/Owl Type

Larks have a slightly higher average cognition z-score (mean = 0.09) than owls (mean = -0.04), but the difference is not statistically significant (p = 0.42; 95% CI -0.19 to 0.45), so the data provide no evidence of a cognitive advantage for either chronotype.

4. Classes Missed by EarlyClass Status

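Students with at least one early class missed fewer classes on average (mean = 1.99) than those with none (mean = 2.65), the opposite of the direction posed in the research question. The difference is not statistically significant (p = 0.14), and the 95% confidence interval (-0.22 to 1.54) includes zero, so there is no evidence that having an early class is associated with missing more classes.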

5. Happiness by Depression Status

Students with at least moderate depression report lower happiness scores (mean = 21.6) than those with normal status (mean = 27.1), a difference of about 5.4 points. This difference is highly significant (p < 0.000001; 95% CI -7.38 to -3.51), indicating a clear relationship between depression status and happiness.

6. Sleep Quality by AllNighter Status

Students who pulled an all-nighter have a slightly higher PoorSleepQuality score, meaning poorer sleep (mean = 7.03), than those who did not (mean = 6.14). The difference is not statistically significant in the two-sided test (p = 0.095), so the evidence for an association is weak.

7. Stress Scores by Alcohol Use

Average stress scores are higher for heavy drinkers (mean = 10.44) than for abstainers (mean = 8.97), but the difference is not statistically significant (p = 0.54), so the data show no clear difference in stress between the two groups.

8. Drinks per Week by Gender

Male students consume significantly more alcoholic drinks per week (mean = 7.54) than female students (mean = 4.24). This difference is highly significant (p < 0.0000001), demonstrating a strong gender disparity in drinking behaviors.

9. Weekday Bedtime by Stress Level

Weekday bedtimes are similar for students with high stress (mean = 24.72, about 12:43 a.m.) and those with normal stress (mean = 24.89, about 12:53 a.m.). The difference is not statistically significant (p = 0.29), so there is no evidence that high-stress students go to bed later on weekdays.

10. Weekend Sleep by Class Year

Underclassmen and upperclassmen report nearly identical hours of sleep on weekends (means = 8.21 and 8.22, respectively). The difference is not statistically significant (p = 0.96), providing no evidence of an association between class year and weekend sleep duration.
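
Because ten separate tests are reported, some readers may want to see whether the conclusions survive a multiple-comparison adjustment. The sketch below applies Holm's method through p.adjust() to the ten p-values printed in the Analysis section. Holm is one reasonable choice among several and is an assumption here, not part of the original analysis.

```r
# Two-sided p-values copied from the t-test outputs for Q1 to Q10
p_raw <- c(Q1 = 0.0001243, Q2 = 0.00004009, Q3 = 0.4229, Q4 = 0.1421,
           Q5 = 0.0000006057, Q6 = 0.09479, Q7 = 0.5362,
           Q8 = 0.000000007002, Q9 = 0.2855, Q10 = 0.9618)

# Holm's step-down adjustment controls the family-wise error rate
p_holm <- p.adjust(p_raw, method = "holm")

print(signif(p_holm, 3))

# Questions that remain significant at the 0.05 level after adjustment
significant <- names(p_holm)[p_holm < 0.05]
print(significant)
```

With these inputs, the four comparisons that were significant before adjustment (Q1, Q2, Q5, and Q8) remain significant afterward, so the overall conclusions are unchanged.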

References