This report analyzes sleep patterns, academic performance, well-being, and lifestyle behaviors among college students using the SleepStudy dataset. Its objective is to answer ten research questions, each comparing two groups of students on a single outcome. The dataset contains the following variables, grouped by theme:
Demographics: Gender, ClassYear
Sleep habits: WeekdayBed, WeekdayRise, WeekdaySleep, WeekendBed, WeekendRise, WeekendSleep, AverageSleep, AllNighter
Psychological variables: StressScore, DepressionScore, AnxietyScore, DepressionStatus, AnxietyStatus, Stress, DASScore
Academic variables: GPA, NumEarlyClass, EarlyClass, ClassesMissed, CognitionZscore, PoorSleepQuality, LarkOwl, Happiness
Lifestyle variables: AlcoholUse, Drinks
The data were obtained from students who completed cognitive skills tests, filled out surveys on attitudes and habits, and kept a two-week sleep diary recording sleep timing and quality.
Here we will analyze the questions in further detail using R.
library(lessR)
##
## lessR 4.4.5 feedback: gerbing@pdx.edu
## --------------------------------------------------------------
## > d <- Read("") Read data file, many formats available, e.g., Excel
## d is default data frame, data= in analysis routines optional
##
## Many examples of reading, writing, and manipulating data,
## graphics, testing means and proportions, regression, factor analysis,
## customization, forecasting, and aggregation from pivot tables
## Enter: browseVignettes("lessR")
##
## View lessR updates, now including time series forecasting
## Enter: news(package="lessR")
##
## Interactive data analysis
## Enter: interact()
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:lessR':
##
## order_by, recode, rename
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
sleep = Read("https://www.lock5stat.com/datasets3e/SleepStudy.csv", quiet=TRUE)
head(sleep)
##   Gender ClassYear LarkOwl NumEarlyClass EarlyClass  GPA ClassesMissed
## 1      0         4 Neither             0          0 3.60             0
## 2      0         4 Neither             2          1 3.24             0
## 3      0         4     Owl             0          0 2.97            12
## 4      0         1    Lark             5          1 3.76             0
## 5      0         4     Owl             0          0 3.20             4
## 6      1         4 Neither             0          0 3.50             0
##   CognitionZscore PoorSleepQuality DepressionScore AnxietyScore StressScore
## 1           -0.26                4               4            3           8
## 2            1.39                6               1            0           3
## 3            0.38               18              18           18           9
## 4            1.39                9               1            4           6
## 5            1.22                9               7           25          14
## 6           -0.04                6              14            8          28
##   DepressionStatus AnxietyStatus Stress DASScore Happiness AlcoholUse Drinks
## 1           normal        normal normal       15        28   Moderate     10
## 2           normal        normal normal        4        25   Moderate      6
## 3         moderate        severe normal       45        17      Light      3
## 4           normal        normal normal       11        32      Light      2
## 5           normal        severe normal       46        15   Moderate      4
## 6         moderate      moderate   high       50        22    Abstain      0
##   WeekdayBed WeekdayRise WeekdaySleep WeekendBed WeekendRise WeekendSleep
## 1      25.75        8.70         7.70      25.75        9.50         5.88
## 2      25.70        8.20         6.80      26.00       10.00         7.25
## 3      27.44        6.55         3.00      28.00       12.59        10.09
## 4      23.50        7.17         6.77      27.00        8.00         7.25
## 5      25.90        8.67         6.09      23.75        9.50         7.00
## 6      23.80        8.95         9.05      26.00       10.75         9.00
##   AverageSleep AllNighter
## 1         7.18          0
## 2         6.93          0
## 3         5.02          0
## 4         6.90          0
## 5         6.35          0
## 6         9.04          0
names(sleep)
## [1] "Gender" "ClassYear" "LarkOwl" "NumEarlyClass"
## [5] "EarlyClass" "GPA" "ClassesMissed" "CognitionZscore"
## [9] "PoorSleepQuality" "DepressionScore" "AnxietyScore" "StressScore"
## [13] "DepressionStatus" "AnxietyStatus" "Stress" "DASScore"
## [17] "Happiness" "AlcoholUse" "Drinks" "WeekdayBed"
## [21] "WeekdayRise" "WeekdaySleep" "WeekendBed" "WeekendRise"
## [25] "WeekendSleep" "AverageSleep" "AllNighter"
# Recode variables for the two-group comparisons
sleep <- sleep %>%
  mutate(
    Gender = factor(Gender, levels = c(0, 1), labels = c("Female", "Male")),
    # Years 1-2 are "Lower" (underclassmen); years 3-4 are "Upper"
    LowerYear = ifelse(ClassYear %in% c(1, 2), "Lower", "Upper"),
    AllNighter = as.factor(AllNighter),
    EarlyClass = as.factor(EarlyClass),
    Stress = factor(Stress),
    DepressionStatus = as.character(DepressionStatus),
    DepressionScore = as.numeric(DepressionScore),
    AlcoholUse = factor(AlcoholUse),
    LarkOwl = factor(LarkOwl)
  )
# Keep only Larks and Owls (drops students classified as "Neither")
sleep_larkowl <- sleep %>% filter(LarkOwl %in% c("Lark", "Owl"))
# Collapse depression status into two groups: normal vs. anything more severe
sleep$DepBin <- ifelse(sleep$DepressionStatus == "normal", "Normal", "AtLeastModerate")
# Keep only abstainers and heavy drinkers
sleep_alc <- sleep %>% filter(AlcoholUse %in% c("Abstain", "Heavy"))
t.test(GPA ~ Gender, data = sleep)
##
## Welch Two Sample t-test
##
## data: GPA by Gender
## t = 3.9139, df = 200.9, p-value = 0.0001243
## alternative hypothesis: true difference in means between group Female and group Male is not equal to 0
## 95 percent confidence interval:
## 0.09982254 0.30252780
## sample estimates:
## mean in group Female mean in group Male
## 3.324901 3.123725
boxplot(GPA ~ Gender, data = sleep,
        ylab = "GPA", xlab = "Gender",
        main = "GPA by Gender",
        col = c("steelblue", "salmon"))
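Every comparison in this report is labelled "Welch Two Sample t-test" in the output, because `t.test()` does not assume equal variances by default. As a rough illustration of what that test computes, the sketch below derives the t statistic, the Welch-Satterthwaite degrees of freedom, and the two-sided p-value by hand and compares them with `t.test()`. The helper name `welch_by_hand` is hypothetical, and the two samples are simulated purely for illustration; they are not the SleepStudy values.

```r
# Welch's two-sample t-test computed step by step.
# `welch_by_hand` is a hypothetical helper, not part of base R or lessR.
welch_by_hand <- function(x, y) {
  vx <- var(x) / length(x)          # squared standard error of mean(x)
  vy <- var(y) / length(y)          # squared standard error of mean(y)
  se <- sqrt(vx + vy)               # standard error of the difference
  t_stat <- (mean(x) - mean(y)) / se
  # Welch-Satterthwaite approximation to the degrees of freedom
  df <- (vx + vy)^2 / (vx^2 / (length(x) - 1) + vy^2 / (length(y) - 1))
  p <- 2 * pt(-abs(t_stat), df)     # two-sided p-value
  c(t = t_stat, df = df, p = p)
}

# Simulated samples for illustration only (NOT the SleepStudy data)
set.seed(1)
x <- rnorm(40, mean = 3.3, sd = 0.4)
y <- rnorm(30, mean = 3.1, sd = 0.5)

by_hand  <- welch_by_hand(x, y)
built_in <- t.test(x, y)            # var.equal = FALSE is the default

print(by_hand)
print(c(t = unname(built_in$statistic),
        df = unname(built_in$parameter),
        p = built_in$p.value))
```

The two printed vectors should agree, which is a quick way to see why the reported degrees of freedom (for example 200.9) are not whole numbers.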
t.test(NumEarlyClass ~ LowerYear, data = sleep)
##
## Welch Two Sample t-test
##
## data: NumEarlyClass by LowerYear
## t = 4.1813, df = 250.69, p-value = 0.00004009
## alternative hypothesis: true difference in means between group Lower and group Upper is not equal to 0
## 95 percent confidence interval:
## 0.4042016 1.1240309
## sample estimates:
## mean in group Lower mean in group Upper
## 2.070423 1.306306
boxplot(NumEarlyClass ~ LowerYear, data = sleep,
        ylab = "Number of Early Classes",
        xlab = "Class Year Group",
        main = "Early Classes by Class Year",
        col = c("lightgreen", "orange"))
t.test(CognitionZscore ~ LarkOwl, data = sleep_larkowl)
##
## Welch Two Sample t-test
##
## data: CognitionZscore by LarkOwl
## t = 0.80571, df = 75.331, p-value = 0.4229
## alternative hypothesis: true difference in means between group Lark and group Owl is not equal to 0
## 95 percent confidence interval:
## -0.1893561 0.4465786
## sample estimates:
## mean in group Lark mean in group Owl
## 0.09024390 -0.03836735
boxplot(CognitionZscore ~ LarkOwl, data = sleep_larkowl,
        ylab = "Cognition Z-score",
        xlab = "Chronotype",
        main = "Cognition Z-score: Lark vs Owl",
        col = c("gold", "purple"))
t.test(ClassesMissed ~ EarlyClass, data = sleep)
##
## Welch Two Sample t-test
##
## data: ClassesMissed by EarlyClass
## t = 1.4755, df = 152.78, p-value = 0.1421
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
## -0.2233558 1.5412830
## sample estimates:
## mean in group 0 mean in group 1
## 2.647059 1.988095
boxplot(ClassesMissed ~ EarlyClass, data = sleep,
        ylab = "Classes Missed",
        xlab = "Early Class (0=No, 1=Yes)",
        main = "Classes Missed vs Early Class",
        col = c("grey80", "lightblue"))
t.test(Happiness ~ DepBin, data = sleep)
##
## Welch Two Sample t-test
##
## data: Happiness by DepBin
## t = -5.6339, df = 55.594, p-value = 0.0000006057
## alternative hypothesis: true difference in means between group AtLeastModerate and group Normal is not equal to 0
## 95 percent confidence interval:
## -7.379724 -3.507836
## sample estimates:
## mean in group AtLeastModerate mean in group Normal
## 21.61364 27.05742
boxplot(Happiness ~ DepBin, data = sleep,
        ylab = "Happiness",
        xlab = "Depression Status",
        main = "Happiness by Depression Status",
        col = c("lightblue", "coral"))
t.test(PoorSleepQuality ~ AllNighter, data = sleep)
##
## Welch Two Sample t-test
##
## data: PoorSleepQuality by AllNighter
## t = -1.7068, df = 44.708, p-value = 0.09479
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
## -1.9456958 0.1608449
## sample estimates:
## mean in group 0 mean in group 1
## 6.136986 7.029412
boxplot(PoorSleepQuality ~ AllNighter, data = sleep,
        ylab = "Poor Sleep Quality Score",
        xlab = "All-Nighter (0=No, 1=Yes)",
        main = "Sleep Quality vs All-Nighter",
        col = c("lightblue", "red"))
t.test(StressScore ~ AlcoholUse, data = sleep_alc)
##
## Welch Two Sample t-test
##
## data: StressScore by AlcoholUse
## t = -0.62604, df = 28.733, p-value = 0.5362
## alternative hypothesis: true difference in means between group Abstain and group Heavy is not equal to 0
## 95 percent confidence interval:
## -6.261170 3.327346
## sample estimates:
## mean in group Abstain mean in group Heavy
## 8.970588 10.437500
boxplot(StressScore ~ AlcoholUse, data = sleep_alc,
        ylab = "Stress Score",
        xlab = "Alcohol Use",
        main = "Stress Scores: Abstain vs Heavy",
        col = c("aquamarine", "orchid"))
t.test(Drinks ~ Gender, data = sleep)
##
## Welch Two Sample t-test
##
## data: Drinks by Gender
## t = -6.1601, df = 142.75, p-value = 0.000000007002
## alternative hypothesis: true difference in means between group Female and group Male is not equal to 0
## 95 percent confidence interval:
## -4.360009 -2.241601
## sample estimates:
## mean in group Female mean in group Male
## 4.238411 7.539216
boxplot(Drinks ~ Gender, data = sleep,
        ylab = "Drinks per Week",
        xlab = "Gender",
        main = "Alcohol Consumption by Gender",
        col = c("steelblue", "salmon"))
t.test(WeekdayBed ~ Stress, data = sleep)
##
## Welch Two Sample t-test
##
## data: WeekdayBed by Stress
## t = -1.0746, df = 87.048, p-value = 0.2855
## alternative hypothesis: true difference in means between group high and group normal is not equal to 0
## 95 percent confidence interval:
## -0.4856597 0.1447968
## sample estimates:
## mean in group high mean in group normal
## 24.71500 24.88543
boxplot(WeekdayBed ~ Stress, data = sleep,
        ylab = "Weekday Bedtime (24 = Midnight)",
        xlab = "Stress Status",
        main = "Bedtime by Stress Group",
        col = c("skyblue", "darkred"))
t.test(WeekendSleep ~ LowerYear, data = sleep)
##
## Welch Two Sample t-test
##
## data: WeekendSleep by LowerYear
## t = -0.047888, df = 237.36, p-value = 0.9618
## alternative hypothesis: true difference in means between group Lower and group Upper is not equal to 0
## 95 percent confidence interval:
## -0.3497614 0.3331607
## sample estimates:
## mean in group Lower mean in group Upper
## 8.213592 8.221892
boxplot(WeekendSleep ~ LowerYear, data = sleep,
        ylab = "Weekend Sleep Hours",
        xlab = "Class Year Group",
        main = "Weekend Sleep by Class Year",
        col = c("lightgreen", "orange"))
Female students have a higher mean GPA (3.32) than male students (3.12), a difference of about 0.20 grade points. The difference is statistically significant (t = 3.91, p = 0.0001), and the 95% confidence interval (0.10 to 0.30) excludes zero.
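The reported confidence interval can be cross-checked from the other numbers in the `t.test()` output. The sketch below is one way to do this: it recovers the standard error from the mean difference and the t statistic, then rebuilds the interval and the two-sided p-value. The helper name `ci_from_t` is hypothetical; the inputs are the rounded values printed for the GPA-by-Gender test, so agreement is only expected to about four decimal places.

```r
# Rebuild a confidence interval and two-sided p-value from a reported
# t statistic, degrees of freedom, and the two group means.
# `ci_from_t` is a hypothetical helper, not part of base R or lessR.
ci_from_t <- function(mean1, mean2, t_stat, df, level = 0.95) {
  diff <- mean1 - mean2
  se <- abs(diff / t_stat)                   # since t = diff / SE
  margin <- qt(1 - (1 - level) / 2, df) * se # critical value times SE
  list(
    diff  = diff,
    se    = se,
    lower = diff - margin,
    upper = diff + margin,
    p     = 2 * pt(-abs(t_stat), df)
  )
}

# Values copied from the GPA by Gender output (Female mean, Male mean, t, df)
gpa <- ci_from_t(3.324901, 3.123725, t_stat = 3.9139, df = 200.9)
print(unlist(gpa))
```

The same call can be repeated with the means, t, and df of any of the other nine tests to confirm that each reported interval is consistent with its test statistic.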
Underclassmen (years 1-2) take more early classes on average (mean = 2.07) than upperclassmen (years 3-4, mean = 1.31). The difference of about 0.76 classes is statistically significant (p < 0.0001; 95% CI 0.40 to 1.12), indicating a clear association between class year and early scheduling.
Larks have a slightly higher mean cognition z-score (0.09) than Owls (-0.04), but the difference is not statistically significant (p = 0.42; 95% CI -0.19 to 0.45). The data therefore provide no evidence of a cognitive advantage for either chronotype. Students classified as "Neither" are excluded from this comparison.
Students with early classes missed fewer classes on average (mean = 1.99) than those without (mean = 2.65), but the difference is not statistically significant (p = 0.14). The 95% confidence interval (-0.22 to 1.54) includes zero, so the evidence for an association is weak.
Students classified with at least moderate depression report lower happiness scores (mean = 21.6) than those with normal status (mean = 27.1), a gap of about 5.4 points. The difference is highly significant (p < 0.000001; 95% CI -7.38 to -3.51), indicating a clear relationship between depression status and happiness.
Students who pulled at least one all-nighter have a slightly higher poor-sleep-quality score (mean = 7.03), meaning poorer sleep, than those who did not (mean = 6.14). The difference is not statistically significant at the 5% level (p = 0.095; 95% CI -1.95 to 0.16), so there is only weak evidence of an association.
Mean stress scores are higher for heavy drinkers (10.44) than for abstainers (8.97), but the difference is not statistically significant (p = 0.54). The 95% confidence interval is wide (-6.26 to 3.33), so the data cannot rule out a difference of several points in either direction. Light and moderate drinkers are excluded from this comparison.
Male students consume more alcoholic drinks per week on average (mean = 7.54) than female students (mean = 4.24), a difference of about 3.3 drinks. The difference is highly significant (p < 0.0000001; 95% CI -4.36 to -2.24), showing a clear gender gap in drinking behavior.
Weekday bedtimes are similar for students with high stress (mean = 24.72, about 12:43 a.m.) and those with normal stress (mean = 24.89, about 12:53 a.m.), where 24 denotes midnight. The difference is not statistically significant (p = 0.29; 95% CI -0.49 to 0.14), showing no clear link between stress level and weekday bedtime.
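The bedtime variables are stored as decimal hours in which 24 means midnight (see the axis label "Weekday Bedtime (24 = Midnight)"), so a mean of 24.72 is easier to read once converted to clock time. The sketch below shows one possible conversion; `to_clock` is a hypothetical helper name, and rounding to the nearest minute is an assumption.

```r
# Convert decimal hours (where 24 = midnight, 25.5 = 1:30 a.m.) to "HH:MM".
# `to_clock` is a hypothetical helper, not part of base R or lessR.
to_clock <- function(h) {
  total_min <- round((h %% 24) * 60)   # minutes past midnight, nearest minute
  total_min <- total_min %% (24 * 60)  # wrap 24:00 back to 00:00 after rounding
  sprintf("%02d:%02d",
          as.integer(total_min %/% 60),
          as.integer(total_min %% 60))
}

# Group means from the WeekdayBed by Stress output (high, normal)
to_clock(c(24.71500, 24.88543))
```

Because `sprintf()` is vectorized, the same helper could be applied to a whole column such as `sleep$WeekdayBed` when labelling plots.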
Underclassmen and upperclassmen report nearly identical weekend sleep (means = 8.21 and 8.22 hours, respectively). The difference is not statistically significant (p = 0.96; 95% CI -0.35 to 0.33), providing no evidence of an association between class year and weekend sleep duration.
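Because the report runs ten separate tests, some readers may want to see whether the conclusions hold after a multiple-comparison adjustment. The sketch below applies the Holm and Bonferroni corrections with `p.adjust()` to the ten p-values printed above. This adjustment is not part of the original analysis, and the vector names are illustrative labels only.

```r
# Raw p-values copied from the ten t.test() outputs in this report.
# The element names are illustrative labels, not variables in the dataset.
p_raw <- c(
  GPA_by_Gender          = 0.0001243,
  EarlyClasses_by_Year   = 0.00004009,
  Cognition_by_LarkOwl   = 0.4229,
  Missed_by_EarlyClass   = 0.1421,
  Happiness_by_DepBin    = 0.0000006057,
  SleepQual_by_AllNight  = 0.09479,
  Stress_by_Alcohol      = 0.5362,
  Drinks_by_Gender       = 0.000000007002,
  Bedtime_by_Stress      = 0.2855,
  WeekendSleep_by_Year   = 0.9618
)

p_holm <- p.adjust(p_raw, method = "holm")
p_bonf <- p.adjust(p_raw, method = "bonferroni")

# Side-by-side comparison of raw and adjusted p-values
print(signif(cbind(raw = p_raw, holm = p_holm, bonferroni = p_bonf), 3))

# Tests that remain significant at the 5% level after the Holm adjustment
names(p_raw)[p_holm < 0.05]
```

Under either adjustment the same four comparisons (GPA by gender, early classes by class year, happiness by depression status, and drinks by gender) stay below 0.05, so the conclusions above do not depend on whether a correction is applied.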