This report analyzes sleep patterns among college students using the “SleepStudy” dataset from https://www.lock5stat.com/datapage3e.html. The dataset comprises 253 observations on 27 variables covering students’ sleep habits, psychological well-being, and lifestyle choices.
The analysis addresses a series of research questions about college students’ sleep patterns, academic performance, psychological well-being, and lifestyle choices. Each question is examined with a Welch two-sample t-test, and the results provide a basis for further research and for interventions to improve students’ well-being and academic performance.
The following research questions will be addressed in this report:
1. Is there a significant difference in the average GPA between male and female college students?
2. Is there a significant difference in the average number of early classes between students in the first two class years and students in other class years?
3. Do students who identify as “larks” have significantly better cognitive skills (cognition z-score) than “owls”?
4. Is there a significant difference in the average number of classes missed in a semester between students who had at least one early class (EarlyClass=1) and those who didn’t (EarlyClass=0)?
5. Is there a significant difference in the average happiness level between students with at least moderate depression and students with normal depression status?
6. Is there a significant difference in average sleep quality scores between students who reported having at least one all-nighter (AllNighter=1) and those who didn’t (AllNighter=0)?
7. Do students who abstain from alcohol use have significantly better stress scores than those who report heavy alcohol use?
8. Is there a significant difference in the average number of drinks per week between students of different genders?
9. Is there a significant difference in the average weekday bedtime between students with high and normal stress (Stress=high vs. Stress=normal)?
10. Is there a significant difference in the average hours of sleep on weekends between students in their first two class years and other students?
Together, these questions give a broad picture of sleep patterns and related factors among college students.
data <- read.csv("https://www.lock5stat.com/datasets3e/SleepStudy.csv")
head(data)
## Gender ClassYear LarkOwl NumEarlyClass EarlyClass GPA ClassesMissed
## 1 0 4 Neither 0 0 3.60 0
## 2 0 4 Neither 2 1 3.24 0
## 3 0 4 Owl 0 0 2.97 12
## 4 0 1 Lark 5 1 3.76 0
## 5 0 4 Owl 0 0 3.20 4
## 6 1 4 Neither 0 0 3.50 0
## CognitionZscore PoorSleepQuality DepressionScore AnxietyScore StressScore
## 1 -0.26 4 4 3 8
## 2 1.39 6 1 0 3
## 3 0.38 18 18 18 9
## 4 1.39 9 1 4 6
## 5 1.22 9 7 25 14
## 6 -0.04 6 14 8 28
## DepressionStatus AnxietyStatus Stress DASScore Happiness AlcoholUse Drinks
## 1 normal normal normal 15 28 Moderate 10
## 2 normal normal normal 4 25 Moderate 6
## 3 moderate severe normal 45 17 Light 3
## 4 normal normal normal 11 32 Light 2
## 5 normal severe normal 46 15 Moderate 4
## 6 moderate moderate high 50 22 Abstain 0
## WeekdayBed WeekdayRise WeekdaySleep WeekendBed WeekendRise WeekendSleep
## 1 25.75 8.70 7.70 25.75 9.50 5.88
## 2 25.70 8.20 6.80 26.00 10.00 7.25
## 3 27.44 6.55 3.00 28.00 12.59 10.09
## 4 23.50 7.17 6.77 27.00 8.00 7.25
## 5 25.90 8.67 6.09 23.75 9.50 7.00
## 6 23.80 8.95 9.05 26.00 10.75 9.00
## AverageSleep AllNighter
## 1 7.18 0
## 2 6.93 0
## 3 5.02 0
## 4 6.90 0
## 5 6.35 0
## 6 9.04 0
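The data set contains 253 observations on 27 variables; the first six rows are shown above. The variables cover demographics (Gender, ClassYear), chronotype (LarkOwl), class schedule and academic outcomes (NumEarlyClass, EarlyClass, GPA, ClassesMissed), cognition and sleep quality (CognitionZscore, PoorSleepQuality), mental health (DepressionScore, AnxietyScore, StressScore, their status categories, DASScore, Happiness), alcohol use (AlcoholUse, Drinks), and sleep timing (weekday and weekend bedtime, rise time, and hours of sleep, plus AverageSleep and AllNighter).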
data$Gender <- as.factor(data$Gender)
t.test(GPA ~ Gender, data = data, var.equal = FALSE)
##
## Welch Two Sample t-test
##
## data: GPA by Gender
## t = 3.9139, df = 200.9, p-value = 0.0001243
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
## 0.09982254 0.30252780
## sample estimates:
## mean in group 0 mean in group 1
## 3.324901 3.123725
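Yes. The mean GPA of female students (group 0, 3.32) is about 0.20 points higher than that of male students (group 1, 3.12), and the difference is statistically significant (t = 3.91, p = 0.0001; 95% CI 0.10 to 0.30).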
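To make the Welch output above easier to read, the following sketch recomputes the t statistic, the Welch-Satterthwaite degrees of freedom, and the two-sided p-value by hand and compares them with `t.test()`. The helper name `welch_by_hand` and the toy vectors are assumptions made for illustration; they are not part of the original analysis or the SleepStudy data.

```r
# Welch's t statistic and Welch-Satterthwaite degrees of freedom,
# computed directly from two numeric vectors (assumes no missing values).
welch_by_hand <- function(x, y) {
  nx <- length(x)
  ny <- length(y)
  vx <- var(x) / nx  # squared standard error of mean(x)
  vy <- var(y) / ny  # squared standard error of mean(y)
  t_stat <- (mean(x) - mean(y)) / sqrt(vx + vy)
  df <- (vx + vy)^2 / (vx^2 / (nx - 1) + vy^2 / (ny - 1))
  p_value <- 2 * pt(-abs(t_stat), df)  # two-sided p-value
  c(t = t_stat, df = df, p.value = p_value)
}

# Toy values for illustration only; these are NOT from SleepStudy.
x <- c(3.6, 3.2, 3.0, 3.8, 3.3, 3.5)
y <- c(3.1, 2.9, 3.4, 3.0, 2.8)

by_hand  <- welch_by_hand(x, y)
built_in <- t.test(x, y, var.equal = FALSE)

print(by_hand)
print(c(built_in$statistic, built_in$parameter, p.value = built_in$p.value))
```

With the report's data, the same helper could be called as `with(data, welch_by_hand(GPA[Gender == 0], GPA[Gender == 1]))`, provided GPA has no missing values (unlike `t.test()`, the helper does not drop them).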
data$YearGroup <- ifelse(data$ClassYear %in% c(1, 2), "FirstTwoYears", "OtherYears")
data$YearGroup <- as.factor(data$YearGroup)
t.test(NumEarlyClass ~ YearGroup, data = data, var.equal = FALSE)
##
## Welch Two Sample t-test
##
## data: NumEarlyClass by YearGroup
## t = 4.1813, df = 250.69, p-value = 4.009e-05
## alternative hypothesis: true difference in means between group FirstTwoYears and group OtherYears is not equal to 0
## 95 percent confidence interval:
## 0.4042016 1.1240309
## sample estimates:
## mean in group FirstTwoYears mean in group OtherYears
## 2.070423 1.306306
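Yes. Students in their first two class years take about 0.76 more early classes on average than students in other years (2.07 vs. 1.31, roughly 1.6 times as many), and the difference is statistically significant (t = 4.18, p < 0.001; 95% CI 0.40 to 1.12).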
data$DataPull <- factor(data$LarkOwl, exclude = "Neither")
t.test(CognitionZscore ~ DataPull, data = data, var.equal = FALSE)
##
## Welch Two Sample t-test
##
## data: CognitionZscore by DataPull
## t = 0.80571, df = 75.331, p-value = 0.4229
## alternative hypothesis: true difference in means between group Lark and group Owl is not equal to 0
## 95 percent confidence interval:
## -0.1893561 0.4465786
## sample estimates:
## mean in group Lark mean in group Owl
## 0.09024390 -0.03836735
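No. “Larks” have a mean cognition z-score about 0.13 higher than “owls” (0.09 vs. -0.04), but the difference is not statistically significant (t = 0.81, p = 0.42; 95% CI -0.19 to 0.45).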
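The question about larks and owls is directional ("better"), while the test above is two-sided. A one-sided Welch test is one way to match the test to the question; the sketch below shows the idea on a toy data frame. The data frame `toy` and its values are assumptions for illustration and are not taken from SleepStudy.

```r
# Toy data for illustration only; NOT taken from SleepStudy.
toy <- data.frame(
  CognitionZscore = c(0.4, -0.1, 0.9, 0.2, -0.3, 0.1, -0.6, 0.5, 0.0),
  LarkOwl = c("Lark", "Lark", "Lark", "Lark",
              "Owl", "Owl", "Owl", "Owl", "Neither")
)

# Keep only the two chronotypes being compared and fix the level order.
two_groups <- subset(toy, LarkOwl %in% c("Lark", "Owl"))
two_groups$LarkOwl <- factor(two_groups$LarkOwl, levels = c("Lark", "Owl"))

# alternative = "greater" tests H1: mean(Lark) - mean(Owl) > 0,
# because the difference is taken as first factor level minus second.
one_sided <- t.test(CognitionZscore ~ LarkOwl, data = two_groups,
                    var.equal = FALSE, alternative = "greater")
two_sided <- t.test(CognitionZscore ~ LarkOwl, data = two_groups,
                    var.equal = FALSE)

print(c(one_sided = one_sided$p.value, two_sided = two_sided$p.value))
```

When the t statistic is positive, the one-sided p-value is half the two-sided one. For the report's output (t = 0.80571, two-sided p = 0.4229) that gives roughly 0.21, which is still above 0.05.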
data$EarlyClass <- as.factor(data$EarlyClass)
t.test(GPA ~ EarlyClass, data = data, var.equal = FALSE)
##
## Welch Two Sample t-test
##
## data: GPA by EarlyClass
## t = -1.2376, df = 159.74, p-value = 0.2177
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
## -0.17625626 0.04045374
## sample estimates:
## mean in group 0 mean in group 1
## 3.198706 3.266607
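The test above compares GPA, not ClassesMissed, between the two groups, so it does not answer this question. It shows no significant difference in GPA between students without and with an early class (3.20 vs. 3.27; t = -1.24, p = 0.22; 95% CI -0.18 to 0.04). Answering the question as posed requires rerunning the test with ClassesMissed as the response variable.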
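The code above uses GPA as the response, although the question concerns missed classes. A minimal sketch of the intended comparison follows, run on a toy data frame so that it is self-contained; the object `toy` and its values are assumptions for illustration, not SleepStudy results.

```r
# Toy data for illustration only; NOT taken from SleepStudy.
toy <- data.frame(
  EarlyClass    = factor(c(0, 0, 0, 0, 1, 1, 1, 1, 1)),
  ClassesMissed = c(0, 4, 12, 1, 0, 0, 2, 3, 1)
)

# The response is ClassesMissed (the variable the question asks about),
# grouped by EarlyClass; Welch's test does not assume equal variances.
result <- t.test(ClassesMissed ~ EarlyClass, data = toy, var.equal = FALSE)

print(result$data.name)
print(result$estimate)
```

With the report's data, the corresponding call would be `t.test(ClassesMissed ~ EarlyClass, data = data, var.equal = FALSE)`. Its output is not shown here because it has not been run as part of this report.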
data$DepressionGroup <- ifelse(data$DepressionStatus %in% c("moderate","severe"), "Moderate+Severe", "Normal")
data$DepressionGroup <- as.factor(data$DepressionGroup)
t.test(Happiness ~ DepressionGroup, data = data, var.equal = FALSE)
##
## Welch Two Sample t-test
##
## data: Happiness by DepressionGroup
## t = -5.6339, df = 55.594, p-value = 6.057e-07
## alternative hypothesis: true difference in means between group Moderate+Severe and group Normal is not equal to 0
## 95 percent confidence interval:
## -7.379724 -3.507836
## sample estimates:
## mean in group Moderate+Severe mean in group Normal
## 21.61364 27.05742
Yes. Students with normal depression status are significantly happier than those with moderate or severe depression (mean happiness 27.06 vs. 21.61; t = -5.63, p < 0.001; 95% CI for the difference -7.38 to -3.51).
6. Is there a significant difference in average sleep quality scores between students who reported having at least one all-nighter (AllNighter=1) and those who didn’t (AllNighter=0)?
data$AllNighter <- as.factor(data$AllNighter)
t.test(PoorSleepQuality ~ AllNighter, data = data, var.equal = FALSE)
##
## Welch Two Sample t-test
##
## data: PoorSleepQuality by AllNighter
## t = -1.7068, df = 44.708, p-value = 0.09479
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
## -1.9456958 0.1608449
## sample estimates:
## mean in group 0 mean in group 1
## 6.136986 7.029412
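No, not at the 0.05 level. Students who reported at least one all-nighter have a higher mean PoorSleepQuality score (7.03 vs. 6.14; higher values indicate poorer sleep), but the difference is not statistically significant (t = -1.71, p = 0.095; 95% CI -1.95 to 0.16).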
data$DataPull2 <- factor(data$AlcoholUse, exclude = c("Moderate", "Light"))
t.test(StressScore ~ DataPull2, data = data, var.equal = FALSE)
##
## Welch Two Sample t-test
##
## data: StressScore by DataPull2
## t = -0.62604, df = 28.733, p-value = 0.5362
## alternative hypothesis: true difference in means between group Abstain and group Heavy is not equal to 0
## 95 percent confidence interval:
## -6.261170 3.327346
## sample estimates:
## mean in group Abstain mean in group Heavy
## 8.970588 10.437500
No. Students who abstain from alcohol have a lower mean stress score than heavy drinkers (8.97 vs. 10.44), but the difference is not statistically significant (t = -0.63, p = 0.54; 95% CI -6.26 to 3.33).
data$Gender <- as.factor(data$Gender)
t.test(Drinks ~ Gender, data = data, var.equal = FALSE)
##
## Welch Two Sample t-test
##
## data: Drinks by Gender
## t = -6.1601, df = 142.75, p-value = 7.002e-09
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
## -4.360009 -2.241601
## sample estimates:
## mean in group 0 mean in group 1
## 4.238411 7.539216
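Yes. Male students (group 1) report significantly more drinks per week than female students (group 0): 7.54 vs. 4.24 on average (t = -6.16, p < 0.001; 95% CI for the difference -4.36 to -2.24).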
data$Stress <- as.factor(data$Stress)
t.test(WeekdayBed ~ Stress, data = data, var.equal = FALSE)
##
## Welch Two Sample t-test
##
## data: WeekdayBed by Stress
## t = -1.0746, df = 87.048, p-value = 0.2855
## alternative hypothesis: true difference in means between group high and group normal is not equal to 0
## 95 percent confidence interval:
## -0.4856597 0.1447968
## sample estimates:
## mean in group high mean in group normal
## 24.71500 24.88543
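No. Students with high stress go to bed about 0.17 hours (roughly 10 minutes) earlier on weekdays than students with normal stress (24.72 vs. 24.89), but the difference is not statistically significant (t = -1.07, p = 0.29; 95% CI -0.49 to 0.14).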
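Bedtimes such as 24.72 are easier to interpret as clock times. The sketch below converts decimal hours to "HH:MM", assuming that WeekdayBed counts hours so that 24.0 means midnight (an assumption consistent with values such as 23.50 and 25.75 in the data preview, but not confirmed in this report). The helper name `decimal_to_clock` is hypothetical.

```r
# Convert decimal hours to a 24-hour clock string.
# ASSUMPTION: 24.0 = midnight, so 25.5 = 1:30 a.m. on the following day.
decimal_to_clock <- function(hours) {
  # Round to whole minutes first, then wrap around midnight.
  total_min <- as.integer(round(hours * 60)) %% 1440L
  sprintf("%02d:%02d", total_min %/% 60L, total_min %% 60L)
}

# Group means from the WeekdayBed output above.
print(decimal_to_clock(c(24.71500, 24.88543)))
# → "00:43" "00:53"
```

Under that assumption, the two group means correspond to about 12:43 a.m. and 12:53 a.m.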
data$YearGroup <- as.factor(data$YearGroup)
t.test(WeekendSleep ~ YearGroup, data = data, var.equal = FALSE)
##
## Welch Two Sample t-test
##
## data: WeekendSleep by YearGroup
## t = -0.047888, df = 237.36, p-value = 0.9618
## alternative hypothesis: true difference in means between group FirstTwoYears and group OtherYears is not equal to 0
## 95 percent confidence interval:
## -0.3497614 0.3331607
## sample estimates:
## mean in group FirstTwoYears mean in group OtherYears
## 8.213592 8.221892
No. Mean weekend sleep is nearly identical for students in their first two class years and students in other years (8.21 vs. 8.22 hours, a difference of about 0.01 hours; t = -0.05, p = 0.96).
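Because the report runs many t-tests on the same data set, some readers may want the p-values adjusted for multiple comparisons. The sketch below applies a Holm adjustment with `p.adjust()` to the two-sided p-values printed in the outputs above. This step is an addition for illustration and was not part of the original analysis; the vector names are made up for readability.

```r
# Two-sided p-values copied from the t-test outputs in this report.
# The fourth test is omitted: its output is for GPA by EarlyClass,
# not for ClassesMissed, so it does not correspond to question 4.
p_raw <- c(
  GPA_by_Gender           = 0.0001243,
  NumEarlyClass_by_Year   = 4.009e-05,
  Cognition_LarkOwl       = 0.4229,
  Happiness_by_Depression = 6.057e-07,
  SleepQuality_AllNighter = 0.09479,
  Stress_by_Alcohol       = 0.5362,
  Drinks_by_Gender        = 7.002e-09,
  WeekdayBed_by_Stress    = 0.2855,
  WeekendSleep_by_Year    = 0.9618
)

# Holm's step-down method controls the family-wise error rate.
p_holm <- p.adjust(p_raw, method = "holm")

print(data.frame(p_raw, p_holm, significant = p_holm < 0.05))
```

With these nine p-values, the same four comparisons remain significant at the 0.05 level after adjustment: GPA by gender, early classes by year group, happiness by depression group, and drinks by gender.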
Data source: “SleepStudy” dataset, Lock5Stat, https://www.lock5stat.com/datapage3e.html (CSV: https://www.lock5stat.com/datasets3e/SleepStudy.csv).