Introduction

This report analyzes college students' sleep patterns and how they relate to academic performance.

The primary objective of this analysis is to address several research questions related to:
- Gender differences in academic performance (GPA)
- The influence of early class schedules on attendance and sleep habits
- The impact of chronotype (i.e., “larks” vs. “owls”) on cognitive performance
- Relationships between depression, happiness, alcohol use, and stress
- Differences in sleep patterns across student class years

Data

library(readxl)  # provides read_excel()
SleepStudy <- read_excel("~/Downloads/SleepStudy.xlsx")
str(SleepStudy)
## tibble [253 × 27] (S3: tbl_df/tbl/data.frame)
##  $ Gender          : num [1:253] 0 0 0 0 0 1 1 0 0 0 ...
##  $ ClassYear       : num [1:253] 4 4 4 1 4 4 2 2 1 4 ...
##  $ LarkOwl         : chr [1:253] "Neither" "Neither" "Owl" "Lark" ...
##  $ NumEarlyClass   : num [1:253] 0 2 0 5 0 0 2 0 2 2 ...
##  $ EarlyClass      : num [1:253] 0 1 0 1 0 0 1 0 1 1 ...
##  $ GPA             : num [1:253] 3.6 3.24 2.97 3.76 3.2 3.5 3.35 3 4 2.9 ...
##  $ ClassesMissed   : num [1:253] 0 0 12 0 4 0 2 0 0 0 ...
##  $ CognitionZscore : num [1:253] -0.26 1.39 0.38 1.39 1.22 -0.04 0.41 -0.59 1.03 0.72 ...
##  $ PoorSleepQuality: num [1:253] 4 6 18 9 9 6 2 10 5 2 ...
##  $ DepressionScore : num [1:253] 4 1 18 1 7 14 1 2 12 6 ...
##  $ AnxietyScore    : num [1:253] 3 0 18 4 25 8 0 2 16 11 ...
##  $ StressScore     : num [1:253] 8 3 9 6 14 28 1 3 20 31 ...
##  $ DepressionStatus: chr [1:253] "normal" "normal" "moderate" "normal" ...
##  $ AnxietyStatus   : chr [1:253] "normal" "normal" "severe" "normal" ...
##  $ Stress          : chr [1:253] "normal" "normal" "normal" "normal" ...
##  $ DASScore        : num [1:253] 15 4 45 11 46 50 2 7 48 48 ...
##  $ Happiness       : num [1:253] 28 25 17 32 15 22 25 29 29 30 ...
##  $ AlcoholUse      : chr [1:253] "Moderate" "Moderate" "Light" "Light" ...
##  $ Drinks          : num [1:253] 10 6 3 2 4 0 6 3 3 6 ...
##  $ WeekdayBed      : num [1:253] 25.8 25.7 27.4 23.5 25.9 ...
##  $ WeekdayRise     : num [1:253] 8.7 8.2 6.55 7.17 8.67 8.95 8.48 9.07 8.75 8 ...
##  $ WeekdaySleep    : num [1:253] 7.7 6.8 3 6.77 6.09 9.05 7.73 9.02 8.25 6.6 ...
##  $ WeekendBed      : num [1:253] 25.8 26 28 27 23.8 ...
##  $ WeekendRise     : num [1:253] 9.5 10 12.6 8 9.5 ...
##  $ WeekendSleep    : num [1:253] 5.88 7.25 10.09 7.25 7 ...
##  $ AverageSleep    : num [1:253] 7.18 6.93 5.02 6.9 6.35 9.04 7.52 9.01 8.54 6.68 ...
##  $ AllNighter      : num [1:253] 0 0 0 0 0 0 1 0 0 0 ...
print(names(SleepStudy))
##  [1] "Gender"           "ClassYear"        "LarkOwl"          "NumEarlyClass"   
##  [5] "EarlyClass"       "GPA"              "ClassesMissed"    "CognitionZscore" 
##  [9] "PoorSleepQuality" "DepressionScore"  "AnxietyScore"     "StressScore"     
## [13] "DepressionStatus" "AnxietyStatus"    "Stress"           "DASScore"        
## [17] "Happiness"        "AlcoholUse"       "Drinks"           "WeekdayBed"      
## [21] "WeekdayRise"      "WeekdaySleep"     "WeekendBed"       "WeekendRise"     
## [25] "WeekendSleep"     "AverageSleep"     "AllNighter"
print(table(SleepStudy$ClassYear, useNA = "ifany"))
## 
##  1  2  3  4 
## 47 95 54 57
print(table(SleepStudy$LarkOwl, useNA = "ifany"))
## 
##    Lark Neither     Owl 
##      41     163      49
print(table(SleepStudy$DepressionStatus, useNA = "ifany"))
## 
## moderate   normal   severe 
##       34      209       10
if(!is.factor(SleepStudy$ClassYear)) {
  SleepStudy$ClassYear <- factor(SleepStudy$ClassYear)
}
if(!is.factor(SleepStudy$Gender)) {
  SleepStudy$Gender <- factor(SleepStudy$Gender)
}


if(is.factor(SleepStudy$EarlyClass)) {
  SleepStudy$EarlyClass <- as.numeric(as.character(SleepStudy$EarlyClass))
}


if(!("StressScore" %in% names(SleepStudy)) && "Stress" %in% names(SleepStudy)) {
  if(is.factor(SleepStudy$Stress)) {
    SleepStudy$StressScore <- as.numeric(as.character(SleepStudy$Stress))
  } else {
    SleepStudy$StressScore <- SleepStudy$Stress
  }
}


if(is.numeric(SleepStudy$Stress)) {
  SleepStudy$StressCat <- ifelse(SleepStudy$Stress >= median(SleepStudy$Stress, na.rm = TRUE),
                                 "High", "Normal")
  SleepStudy$StressCat <- factor(SleepStudy$StressCat, levels = c("Normal", "High"))
}


if(any(tolower(as.character(SleepStudy$ClassYear)) %in% c("freshman", "sophomore"))) {
  SleepStudy$Group <- ifelse(tolower(as.character(SleepStudy$ClassYear)) %in% c("freshman", "sophomore"),
                             "Underclassmen", "Upperclassmen")
} else if(!anyNA(suppressWarnings(as.numeric(as.character(SleepStudy$ClassYear))))) {
  # If every ClassYear value parses as a number (e.g., 1, 2, 3, 4), treat values <= 2 as underclassmen.
  class_year_numeric <- as.numeric(as.character(SleepStudy$ClassYear))
  SleepStudy$Group <- ifelse(class_year_numeric <= 2, "Underclassmen", "Upperclassmen")
} else {
  
  levs <- sort(unique(SleepStudy$ClassYear))
  cutoff_index <- ceiling(length(levs) / 2)
  lower_levels <- levs[1:cutoff_index]
  SleepStudy$Group <- ifelse(SleepStudy$ClassYear %in% lower_levels, "Underclassmen", "Upperclassmen")
}
SleepStudy$Group <- factor(SleepStudy$Group, levels = c("Underclassmen", "Upperclassmen"))


print(table(SleepStudy$ClassYear))
## 
##  1  2  3  4 
## 47 95 54 57
print(table(SleepStudy$Group))
## 
## Underclassmen Upperclassmen 
##           142           111
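
The grouping step can be checked against the printed counts. The following is a minimal sketch, assuming the numeric coding 1 to 4 shown above; it uses `cut()` as a compact alternative to the `ifelse()` branch and is not the code that produced the report.

```r
# Class-year counts copied from the table() output above.
year_counts <- c("1" = 47, "2" = 95, "3" = 54, "4" = 57)
years <- as.numeric(names(year_counts))

# Years 1-2 fall in (0, 2] and years 3-4 fall in (2, 4].
group <- cut(years, breaks = c(0, 2, 4),
             labels = c("Underclassmen", "Upperclassmen"))

# Sum the per-year counts within each group.
group_counts <- tapply(year_counts, group, sum)
print(group_counts)
```

The totals (142 and 111) match the `table(SleepStudy$Group)` output.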

Analysis

Each research question below compares an outcome between groups. When the grouping variable does not have the two levels the test requires, the code prints a message and reports descriptive statistics for the outcome instead of skipping the analysis.
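
This fallback pattern repeats in every question, so it could be factored into one helper. The sketch below is illustrative only: `compare_two_groups` and the toy data frame are hypothetical names that do not appear in the original analysis.

```r
# Hypothetical helper: run a Welch t-test when the grouping variable has
# exactly two observed levels, otherwise fall back to a descriptive summary.
compare_two_groups <- function(data, outcome, group) {
  g <- droplevels(factor(data[[group]]))
  y <- data[[outcome]]
  if (nlevels(g) != 2) {
    message(group, " has ", nlevels(g), " level(s). Displaying summary for ",
            outcome, ":")
    return(summary(y))
  }
  t.test(y ~ g)
}

# Toy data (made up) to exercise both branches.
set.seed(1)
toy <- data.frame(
  score = rnorm(30),
  two   = rep(c("a", "b"), each = 15),
  three = rep(c("x", "y", "z"), times = 10)
)
res_two   <- compare_two_groups(toy, "score", "two")    # Welch t-test result
res_three <- compare_two_groups(toy, "score", "three")  # summary only
```

Returning the result rather than printing it lets the caller decide whether to print, plot, or store it.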

Question 1: Gender Differences in GPA

boxplot(GPA ~ Gender, data = SleepStudy,
        main = "GPA Distribution by Gender",
        xlab = "Gender", ylab = "GPA")

t_test_gender <- t.test(GPA ~ Gender, data = SleepStudy)
print(t_test_gender)
## 
##  Welch Two Sample t-test
## 
## data:  GPA by Gender
## t = 3.9139, df = 200.9, p-value = 0.0001243
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
##  0.09982254 0.30252780
## sample estimates:
## mean in group 0 mean in group 1 
##        3.324901        3.123725
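
The confidence interval above is for the difference mean(group 0) minus mean(group 1). As a quick arithmetic check, using only the numbers printed in the output, the midpoint of the interval should equal that difference:

```r
# Values copied from the printed t-test output above.
mean_g0 <- 3.324901
mean_g1 <- 3.123725
ci      <- c(0.09982254, 0.30252780)

diff_means <- mean_g0 - mean_g1  # point estimate: group 0 minus group 1
ci_mid     <- mean(ci)           # a t-based interval is centred on the estimate

print(c(difference = diff_means, ci_midpoint = ci_mid))
```

The two values agree up to the rounding of the printed means, and the interval excludes zero, which is consistent with the small p-value.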

Question 2: Early Classes Across Class Years

SleepStudy_dropped <- droplevels(SleepStudy)
group_counts <- table(SleepStudy_dropped$Group)
print(group_counts)
## 
## Underclassmen Upperclassmen 
##           142           111
if(length(group_counts[group_counts > 0]) < 2) {
  message("Only one class year group present (", names(group_counts)[group_counts > 0],
          "). Displaying summary for NumEarlyClass:")
  print(summary(SleepStudy_dropped$NumEarlyClass))
} else {
  boxplot(NumEarlyClass ~ Group, data = SleepStudy_dropped,
          main = "Number of Early Classes by Class Year Group",
          xlab = "Class Year Group", ylab = "Number of Early Classes")
  anova_early <- aov(NumEarlyClass ~ Group, data = SleepStudy_dropped)
  print(summary(anova_early))
}

##              Df Sum Sq Mean Sq F value   Pr(>F)    
## Group         1   36.4   36.38   16.34 7.06e-05 ***
## Residuals   251  558.9    2.23                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
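
With only two groups, a one-way ANOVA is equivalent to a pooled-variance t-test: the F statistic equals the squared t statistic. The sketch below illustrates this on made-up data (the `toy` data frame is hypothetical). Unlike the Welch tests used elsewhere in this report, both calls here assume equal variances.

```r
# Made-up data with two groups of unequal size.
set.seed(42)
toy <- data.frame(
  y = c(rnorm(20, mean = 2), rnorm(25, mean = 3)),
  g = factor(rep(c("Underclassmen", "Upperclassmen"), times = c(20, 25)))
)

# F statistic from the one-way ANOVA table.
f_val <- summary(aov(y ~ g, data = toy))[[1]][["F value"]][1]

# t statistic from the pooled-variance (Student) t-test.
t_val <- unname(t.test(y ~ g, data = toy, var.equal = TRUE)$statistic)

print(all.equal(f_val, t_val^2))
```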

Question 3: Chronotype and Cognitive Performance

SleepStudy$LarkOwl <- factor(SleepStudy$LarkOwl)
if(length(levels(SleepStudy$LarkOwl)) != 2) {
  message("LarkOwl has ", length(levels(SleepStudy$LarkOwl)), " level(s). Displaying summary for CognitionZscore:")
  print(summary(SleepStudy$CognitionZscore))
} else {
  boxplot(CognitionZscore ~ LarkOwl, data = SleepStudy,
          main = "Cognition (Z-Score) by Chronotype",
          xlab = "Chronotype (Lark/Owl)", ylab = "Cognition Z-Score")
  t_test_chrono <- t.test(CognitionZscore ~ LarkOwl, data = SleepStudy)
  print(t_test_chrono)
}
## LarkOwl has 3 level(s). Displaying summary for CognitionZscore:
##      Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
## -1.62e+00 -4.80e-01 -1.00e-02 -3.95e-05  4.40e-01  1.96e+00
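
Because `LarkOwl` has three levels, the two-group test was not run. One possible way to compare larks and owls directly is to drop the "Neither" students first. The sketch below uses a made-up stand-in for `SleepStudy` (only the column names match the report), so its output says nothing about the real data.

```r
# Toy stand-in for SleepStudy; the values are made up.
set.seed(7)
toy <- data.frame(
  LarkOwl         = rep(c("Lark", "Neither", "Owl"), times = c(10, 30, 12)),
  CognitionZscore = rnorm(52)
)

# Keep only the two chronotypes of interest and drop the unused level.
lark_owl <- subset(toy, LarkOwl %in% c("Lark", "Owl"))
lark_owl$LarkOwl <- droplevels(factor(lark_owl$LarkOwl))

# Welch t-test on the remaining two groups.
t_test_chrono <- t.test(CognitionZscore ~ LarkOwl, data = lark_owl)
print(t_test_chrono)
```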

Question 4: Impact of Early Classes on Attendance

if(length(unique(SleepStudy$EarlyClass)) < 2) {
  message("EarlyClass has only one value. Displaying summary for ClassesMissed:")
  print(summary(SleepStudy$ClassesMissed))
} else {
  boxplot(ClassesMissed ~ EarlyClass, data = SleepStudy,
          main = "Classes Missed by Early Class Attendance",
          xlab = "Early Class (0 = None, 1 = At Least One)", ylab = "Classes Missed")
  t_test_attendance <- t.test(ClassesMissed ~ EarlyClass, data = SleepStudy)
  print(t_test_attendance)
}

## 
##  Welch Two Sample t-test
## 
## data:  ClassesMissed by EarlyClass
## t = 1.4755, df = 152.78, p-value = 0.1421
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
##  -0.2233558  1.5412830
## sample estimates:
## mean in group 0 mean in group 1 
##        2.647059        1.988095

Question 5: Depression Status and Happiness

SleepStudy$DepressionStatus <- factor(SleepStudy$DepressionStatus)
if(length(levels(SleepStudy$DepressionStatus)) != 2) {
  message("DepressionStatus has ", length(levels(SleepStudy$DepressionStatus)), " level(s). Displaying summary for Happiness:")
  print(summary(SleepStudy$Happiness))
} else {
  boxplot(Happiness ~ DepressionStatus, data = SleepStudy,
          main = "Happiness by Depression Status",
          xlab = "Depression Status", ylab = "Happiness")
  t_test_happiness <- t.test(Happiness ~ DepressionStatus, data = SleepStudy)
  print(t_test_happiness)
}
## DepressionStatus has 3 level(s). Displaying summary for Happiness:
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    0.00   24.00   28.00   26.11   30.00   35.00
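
`DepressionStatus` has three levels, so a two-sample t-test does not apply. A one-way ANOVA across all three levels is one option, in the same style as the `aov()` calls used for the class-year questions. The sketch below runs on made-up data (the `toy` data frame is hypothetical) and shows only the mechanics.

```r
# Toy stand-in for SleepStudy; the values are made up.
set.seed(11)
toy <- data.frame(
  DepressionStatus = rep(c("normal", "moderate", "severe"), times = c(40, 12, 6)),
  Happiness        = round(runif(58, min = 0, max = 35))
)

# Order the levels by severity so plots and tables read naturally.
toy$DepressionStatus <- factor(toy$DepressionStatus,
                               levels = c("normal", "moderate", "severe"))

# One-way ANOVA of Happiness across the three depression levels.
anova_happiness <- aov(Happiness ~ DepressionStatus, data = toy)
print(summary(anova_happiness))
```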

Question 6: All-Nighter Experience and Sleep Quality

if(length(unique(SleepStudy$AllNighter)) < 2) {
  message("AllNighter has only one level. Displaying summary for PoorSleepQuality:")
  print(summary(SleepStudy$PoorSleepQuality))
} else {
  boxplot(PoorSleepQuality ~ AllNighter, data = SleepStudy,
          main = "Sleep Quality by All-Nighter Experience",
          xlab = "All-Nighter (0 = None, 1 = At Least One)", ylab = "Poor Sleep Quality")
  t_test_sleepquality <- t.test(PoorSleepQuality ~ AllNighter, data = SleepStudy)
  print(t_test_sleepquality)
}

## 
##  Welch Two Sample t-test
## 
## data:  PoorSleepQuality by AllNighter
## t = -1.7068, df = 44.708, p-value = 0.09479
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
##  -1.9456958  0.1608449
## sample estimates:
## mean in group 0 mean in group 1 
##        6.136986        7.029412

Question 7: Alcohol Abstinence and Stress Scores

dat <- subset(SleepStudy, !is.na(StressScore) & !is.na(AlcoholUse) & is.finite(StressScore))
if(nrow(dat) == 0) {
  message("No valid observations for StressScore and AlcoholUse. Cannot perform analysis.")
} else {
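  # Note: AlcoholUse is a character column ("Light", "Moderate", ...), so levels = c(0, 1)
  # matches no value and every entry becomes NA; the check below then sees a single group.
  dat$AlcoholUse <- factor(dat$AlcoholUse, levels = c(0, 1), labels = c("Abstinent", "Heavy Use"))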
  if(length(unique(dat$AlcoholUse)) < 2) {
    message("Only one group in AlcoholUse present. Displaying summary for StressScore:")
    print(summary(dat$StressScore))
  } else {
    stress_range <- range(dat$StressScore[is.finite(dat$StressScore)], na.rm = TRUE)
    boxplot(StressScore ~ AlcoholUse, data = dat,
            main = "Stress Scores by Alcohol Use",
            xlab = "Alcohol Use", ylab = "Stress Score", ylim = stress_range)
    t_test_stress <- t.test(StressScore ~ AlcoholUse, data = dat)
    print(t_test_stress)
  }
}
## Only one group in AlcoholUse present. Displaying summary for StressScore:
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   0.000   3.000   8.000   9.466  14.000  37.000
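
In the `str()` output, `AlcoholUse` is a character column with labels such as "Light" and "Moderate". Recoding it with numeric levels therefore yields only `NA`, which would explain the single-group message above. The sketch below selects the two groups by text label instead. The labels "Abstain" and "Heavy" are assumptions (only "Light" and "Moderate" are visible in this report), and the data are made up.

```r
# Toy stand-in for SleepStudy; values are made up and the "Abstain" and
# "Heavy" labels are assumed rather than taken from the report.
set.seed(3)
toy <- data.frame(
  AlcoholUse  = rep(c("Abstain", "Light", "Moderate", "Heavy"),
                    times = c(8, 20, 25, 7)),
  StressScore = round(runif(60, min = 0, max = 37))
)

# Recoding text labels against numeric levels matches nothing: all NA.
all_na <- all(is.na(factor(toy$AlcoholUse, levels = c(0, 1))))

# Select the two groups by their text labels instead.
keep <- c("Abstain", "Heavy")
dat  <- subset(toy, AlcoholUse %in% keep)
dat$AlcoholUse <- factor(dat$AlcoholUse, levels = keep)

t_test_stress <- t.test(StressScore ~ AlcoholUse, data = dat)
print(t_test_stress)
```

Running `table(SleepStudy$AlcoholUse)` on the real data would show the labels to substitute for `keep`.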

Question 8: Gender Differences in Weekly Alcohol Consumption

if(length(unique(SleepStudy$Gender)) < 2) {
  message("Only one gender level present. Displaying summary for Drinks:")
  print(summary(SleepStudy$Drinks))
} else {
  boxplot(Drinks ~ Gender, data = SleepStudy,
          main = "Weekly Alcohol Consumption by Gender",
          xlab = "Gender", ylab = "Number of Drinks")
  t_test_alcohol <- t.test(Drinks ~ Gender, data = SleepStudy)
  print(t_test_alcohol)
}

## 
##  Welch Two Sample t-test
## 
## data:  Drinks by Gender
## t = -6.1601, df = 142.75, p-value = 7.002e-09
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
##  -4.360009 -2.241601
## sample estimates:
## mean in group 0 mean in group 1 
##        4.238411        7.539216

Question 9: Stress Levels and Weekday Bedtime

if(length(unique(SleepStudy$Stress)) < 2) {
  message("Only one stress level present (", paste(unique(SleepStudy$Stress), collapse=", "),
          "). Displaying summary for WeekdayBed:")
  print(summary(SleepStudy$WeekdayBed))
} else {
  boxplot(WeekdayBed ~ Stress, data = SleepStudy,
          main = "Weekday Bedtime by Stress Level",
          xlab = "Stress Level (Normal vs. High)", ylab = "Weekday Bedtime")
  t_test_bedtime <- t.test(WeekdayBed ~ Stress, data = SleepStudy)
  print(t_test_bedtime)
}

## 
##  Welch Two Sample t-test
## 
## data:  WeekdayBed by Stress
## t = -1.0746, df = 87.048, p-value = 0.2855
## alternative hypothesis: true difference in means between group high and group normal is not equal to 0
## 95 percent confidence interval:
##  -0.4856597  0.1447968
## sample estimates:
##   mean in group high mean in group normal 
##             24.71500             24.88543
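
The group means above are easier to read as clock times. The report does not define the `WeekdayBed` encoding. The sketch below assumes the values are decimal hours counted from midnight, so anything at or above 24 falls after midnight. The helper `to_clock` is a hypothetical name.

```r
# Hypothetical helper: convert decimal hours (possibly >= 24) to "HH:MM".
to_clock <- function(h) {
  total_min <- round((h %% 24) * 60)   # minutes since midnight, wrapped to one day
  sprintf("%02d:%02d", (total_min %/% 60) %% 24, total_min %% 60)
}

# Group means copied from the output above, plus one pre-midnight example.
clock_times <- to_clock(c(24.715, 24.88543, 23.5))
print(clock_times)
# [1] "00:43" "00:53" "23:30"
```

Under that assumption, both stress groups go to bed on average a little before 1 a.m., about ten minutes apart.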

Question 10: Class Year and Weekend Sleep Hours

SleepStudy$Group <- droplevels(SleepStudy$Group)
if(nlevels(SleepStudy$Group) < 2) {
  message("Only one level present in Group (", paste(levels(SleepStudy$Group), collapse=", "),
          "). Displaying summary for WeekendSleep:")
  print(summary(SleepStudy$WeekendSleep))
} else {
  boxplot(WeekendSleep ~ Group, data = SleepStudy,
          main = "Weekend Sleep Hours by Class Year Group",
          xlab = "Class Year Group", ylab = "Weekend Sleep Hours")
  anova_weekend_sleep <- aov(WeekendSleep ~ Group, data = SleepStudy)
  print(summary(anova_weekend_sleep))
}

##              Df Sum Sq Mean Sq F value Pr(>F)
## Group         1    0.0  0.0043   0.002  0.962
## Residuals   251  470.8  1.8755

Summary

The analysis of the SleepStudy dataset (253 students, 27 variables) produced the following results:
- Gender Differences in GPA: A Welch t-test found a significant difference in mean GPA between the two gender groups (3.32 for group 0 vs. 3.12 for group 1; t = 3.91, p = 0.0001).
- Early Classes Across Class Years: The number of early classes (NumEarlyClass) differed significantly between underclassmen and upperclassmen (F = 16.34, p < 0.001).
- Chronotype and Cognitive Performance: LarkOwl has three levels (Lark, Neither, Owl), so no two-group test was run and only a summary of CognitionZscore was produced.
- Impact of Early Classes on Attendance: Mean ClassesMissed did not differ significantly between students without and with an early class (2.65 vs. 1.99; t = 1.48, p = 0.14).
- Depression Status and Happiness: DepressionStatus has three levels (normal, moderate, severe), so only a summary of Happiness was produced (median 28, mean 26.11).
- All-Nighter Experience and Sleep Quality: Students who reported an all-nighter had a higher mean PoorSleepQuality score (7.03 vs. 6.14), but the difference was not significant at the 0.05 level (t = -1.71, p = 0.095).
- Alcohol Use and Stress: The intended comparison of abstinent and heavy drinkers was not run because the code found only one AlcoholUse group after recoding; only a summary of StressScore was produced.
- Gender and Alcohol Consumption: Weekly alcohol consumption (Drinks) differed significantly by gender (4.24 for group 0 vs. 7.54 for group 1; t = -6.16, p < 0.001).
- Stress Levels and Weekday Bedtime: Mean weekday bedtime did not differ significantly between the high and normal stress groups (24.72 vs. 24.89; t = -1.07, p = 0.29).
- Class Year and Weekend Sleep Hours: Weekend sleep hours did not differ between underclassmen and upperclassmen (F = 0.002, p = 0.96).

References