Project #2

Introduction

This report analyzes sleep patterns among college students using the “SleepStudy” dataset from https://www.lock5stat.com/datapage3e.html. The dataset comprises 253 observations on 27 variables covering students’ sleep habits, psychological well-being, and lifestyle choices.

The primary objective is to answer a series of research questions about college students’ sleep patterns, academic performance, psychological well-being, and lifestyle choices. Each question is addressed with a Welch two-sample t-test comparing the means of two groups. The results show which of these outcomes differ between groups and provide a basis for further research and interventions to improve students’ well-being and academic performance.

The following research questions will be addressed in this report:

Is there a significant difference in the average GPA between male and female college students?
Is there a significant difference in the average number of early classes between the first two class years and other class years?
Do students who identify as “larks” have significantly better cognitive skills (cognition z-score) compared to “owls”?
Is there a significant difference in the average number of classes missed in a semester between students who had at least one early class (EarlyClass=1) and those who didn’t (EarlyClass=0)?
Is there a significant difference in the average happiness level between students with at least moderate depression and normal depression status?
Is there a significant difference in average sleep quality scores between students who reported having at least one all-nighter (AllNighter=1) and those who didn’t (AllNighter=0)?
Do students who abstain from alcohol use have significantly better stress scores than those who report heavy alcohol use?
Is there a significant difference in the average number of drinks per week between students of different genders?
Is there a significant difference in the average weekday bedtime between students with high and normal stress (Stress=high vs. Stress=normal)?
Is there a significant difference in the average hours of sleep on weekends between students in their first two class years and other students?

By addressing these questions, we aim to describe how sleep patterns and related factors differ across groups of college students, as a step toward supporting the well-being and academic success of this population.

Data:

data <- read.csv("https://www.lock5stat.com/datasets3e/SleepStudy.csv")
head(data)

##   Gender ClassYear LarkOwl NumEarlyClass EarlyClass  GPA ClassesMissed
## 1      0         4 Neither             0          0 3.60             0
## 2      0         4 Neither             2          1 3.24             0
## 3      0         4     Owl             0          0 2.97            12
## 4      0         1    Lark             5          1 3.76             0
## 5      0         4     Owl             0          0 3.20             4
## 6      1         4 Neither             0          0 3.50             0
##   CognitionZscore PoorSleepQuality DepressionScore AnxietyScore StressScore
## 1           -0.26                4               4            3           8
## 2            1.39                6               1            0           3
## 3            0.38               18              18           18           9
## 4            1.39                9               1            4           6
## 5            1.22                9               7           25          14
## 6           -0.04                6              14            8          28
##   DepressionStatus AnxietyStatus Stress DASScore Happiness AlcoholUse Drinks
## 1           normal        normal normal       15        28   Moderate     10
## 2           normal        normal normal        4        25   Moderate      6
## 3         moderate        severe normal       45        17      Light      3
## 4           normal        normal normal       11        32      Light      2
## 5           normal        severe normal       46        15   Moderate      4
## 6         moderate      moderate   high       50        22    Abstain      0
##   WeekdayBed WeekdayRise WeekdaySleep WeekendBed WeekendRise WeekendSleep
## 1      25.75        8.70         7.70      25.75        9.50         5.88
## 2      25.70        8.20         6.80      26.00       10.00         7.25
## 3      27.44        6.55         3.00      28.00       12.59        10.09
## 4      23.50        7.17         6.77      27.00        8.00         7.25
## 5      25.90        8.67         6.09      23.75        9.50         7.00
## 6      23.80        8.95         9.05      26.00       10.75         9.00
##   AverageSleep AllNighter
## 1         7.18          0
## 2         6.93          0
## 3         5.02          0
## 4         6.90          0
## 5         6.35          0
## 6         9.04          0

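The data set contains 253 observations on 27 variables. The variables cover demographics and class schedule (Gender, ClassYear, NumEarlyClass, EarlyClass), academic outcomes (GPA, ClassesMissed), chronotype (LarkOwl), cognition (CognitionZscore), sleep quality (PoorSleepQuality), mood and stress (DepressionScore, AnxietyScore, StressScore, DepressionStatus, AnxietyStatus, Stress, DASScore, Happiness), alcohol use (AlcoholUse, Drinks), and sleep timing (WeekdayBed, WeekdayRise, WeekdaySleep, WeekendBed, WeekendRise, WeekendSleep, AverageSleep, AllNighter).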

Analysis:

Is there a significant difference in the average GPA between male and female college students?

data$Gender <- as.factor(data$Gender)
t.test(GPA ~ Gender, data = data, var.equal = FALSE)

## 
##  Welch Two Sample t-test
## 
## data:  GPA by Gender
## t = 3.9139, df = 200.9, p-value = 0.0001243
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
##  0.09982254 0.30252780
## sample estimates:
## mean in group 0 mean in group 1 
##        3.324901        3.123725

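Yes. Female students (group 0) have a mean GPA of 3.32 versus 3.12 for male students (group 1), a difference of about 0.20 points (95% CI 0.10 to 0.30). The difference is statistically significant (t = 3.91, p = 0.0001).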
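As a check on the output above, the 95% confidence interval can be rebuilt from the reported group means, t statistic, and degrees of freedom. This is only a sketch: it uses the rounded numbers printed by t.test(), and the variable names are chosen here for illustration.

```r
# Values copied from the Welch t-test output above (rounded as printed)
mean_group0 <- 3.324901   # mean GPA, group 0
mean_group1 <- 3.123725   # mean GPA, group 1
t_stat      <- 3.9139
df          <- 200.9

diff_means <- mean_group0 - mean_group1   # estimated difference in means
std_err    <- diff_means / t_stat         # standard error implied by t = diff / SE
margin     <- qt(0.975, df) * std_err     # half-width of the 95% interval

ci <- c(lower = diff_means - margin, upper = diff_means + margin)
print(round(ci, 3))
```

The result should agree with the interval printed by t.test() up to rounding, which confirms that the interval is simply the difference in means plus or minus a t quantile times the standard error.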

Is there a significant difference in the average number of early classes between the first two class years and other class years?

data$YearGroup <- ifelse(data$ClassYear %in% c(1, 2), "FirstTwoYears", "OtherYears")
data$YearGroup <- as.factor(data$YearGroup)
t.test(NumEarlyClass ~ YearGroup, data = data, var.equal = FALSE)

## 
##  Welch Two Sample t-test
## 
## data:  NumEarlyClass by YearGroup
## t = 4.1813, df = 250.69, p-value = 4.009e-05
## alternative hypothesis: true difference in means between group FirstTwoYears and group OtherYears is not equal to 0
## 95 percent confidence interval:
##  0.4042016 1.1240309
## sample estimates:
## mean in group FirstTwoYears    mean in group OtherYears 
##                    2.070423                    1.306306

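Yes. Freshmen and sophomores average 2.07 early classes versus 1.31 for juniors and seniors, about 0.76 more, or roughly 1.6 times as many (95% CI 0.40 to 1.12). The difference is statistically significant (p < 0.001).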

Do students who identify as “larks” have significantly better cognitive skills (cognition z-score) compared to “owls”?

data$DataPull <- factor(data$LarkOwl, exclude = "Neither")
t.test(CognitionZscore ~ DataPull, data = data, var.equal = FALSE)

## 
##  Welch Two Sample t-test
## 
## data:  CognitionZscore by DataPull
## t = 0.80571, df = 75.331, p-value = 0.4229
## alternative hypothesis: true difference in means between group Lark and group Owl is not equal to 0
## 95 percent confidence interval:
##  -0.1893561  0.4465786
## sample estimates:
## mean in group Lark  mean in group Owl 
##         0.09024390        -0.03836735

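No. Larks have a mean cognition z-score about 0.13 higher than owls (0.09 vs. -0.04), but the difference is not statistically significant (t = 0.81, p = 0.42), and the 95% confidence interval (-0.19 to 0.45) includes zero.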
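The question is directional (larks better than owls), while the test above is two-sided. A rough sketch of the conversion follows, using only the t statistic and degrees of freedom printed above; the variable names are illustrative. Passing alternative = "greater" to t.test() would run the one-sided test directly, because Lark is the first factor level.

```r
# Values copied from the Welch t-test output above (rounded as printed)
t_stat <- 0.80571
df     <- 75.331

# Two-sided p-value, as reported by t.test()
p_two_sided <- 2 * pt(abs(t_stat), df, lower.tail = FALSE)

# One-sided p-value for H1: mean(Lark) > mean(Owl)
p_greater <- pt(t_stat, df, lower.tail = FALSE)

print(round(c(two_sided = p_two_sided, greater = p_greater), 4))
```

Because the t statistic is positive, the one-sided p-value is half the two-sided one, which still leaves it well above 0.05.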

Is there a significant difference in the average number of classes missed in a semester between students who had at least one early class (EarlyClass=1) and those who didn’t (EarlyClass=0)?

data$EarlyClass <- as.factor(data$EarlyClass)
t.test(GPA ~ EarlyClass, data = data, var.equal = FALSE)

## 
##  Welch Two Sample t-test
## 
## data:  GPA by EarlyClass
## t = -1.2376, df = 159.74, p-value = 0.2177
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
##  -0.17625626  0.04045374
## sample estimates:
## mean in group 0 mean in group 1 
##        3.198706        3.266607

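The test above uses GPA as the response rather than ClassesMissed, so it does not answer the question as posed. What it shows is that mean GPA does not differ significantly between students without an early class (3.20) and students with at least one (3.27); p = 0.22. No conclusion about classes missed can be drawn from this output.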
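A minimal sketch of the call that matches the question as worded, with ClassesMissed as the response. The helper name missed_by_early_class is hypothetical, and the sketch assumes a data frame with the SleepStudy columns ClassesMissed and EarlyClass. No output is shown because this test was not run in the report.

```r
# Hypothetical helper: Welch t-test of ClassesMissed by EarlyClass.
# `d` is assumed to be a data frame with the SleepStudy columns
# ClassesMissed (numeric) and EarlyClass (coded 0/1).
missed_by_early_class <- function(d) {
  d$EarlyClass <- as.factor(d$EarlyClass)
  t.test(ClassesMissed ~ EarlyClass, data = d, var.equal = FALSE)
}

# Usage with the data frame loaded earlier in this report:
# missed_by_early_class(data)
```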

Is there a significant difference in the average happiness level between students with at least moderate depression and normal depression status?

data$DepressionGroup <- ifelse(data$DepressionStatus %in% c("moderate","severe"), "Moderate+Severe", "Normal")
data$DepressionGroup <- as.factor(data$DepressionGroup)
t.test(Happiness ~ DepressionGroup, data = data, var.equal = FALSE)

## 
##  Welch Two Sample t-test
## 
## data:  Happiness by DepressionGroup
## t = -5.6339, df = 55.594, p-value = 6.057e-07
## alternative hypothesis: true difference in means between group Moderate+Severe and group Normal is not equal to 0
## 95 percent confidence interval:
##  -7.379724 -3.507836
## sample estimates:
## mean in group Moderate+Severe          mean in group Normal 
##                      21.61364                      27.05742

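Yes. Students with normal depression status report a mean happiness score of 27.06, compared with 21.61 for students with moderate or severe depression, a difference of about 5.4 points (95% CI 3.5 to 7.4). The difference is statistically significant (p < 0.001).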

Is there a significant difference in average sleep quality scores between students who reported having at least one all-nighter (AllNighter=1) and those who didn’t (AllNighter=0)?

data$AllNighter <- as.factor(data$AllNighter)
t.test(PoorSleepQuality ~ AllNighter, data = data, var.equal = FALSE)

## 
##  Welch Two Sample t-test
## 
## data:  PoorSleepQuality by AllNighter
## t = -1.7068, df = 44.708, p-value = 0.09479
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
##  -1.9456958  0.1608449
## sample estimates:
## mean in group 0 mean in group 1 
##        6.136986        7.029412

No, not at the 5% level. Students who pulled at least one all-nighter have a higher mean PoorSleepQuality score (7.03 vs. 6.14; higher values indicate poorer sleep), but the difference is not statistically significant (t = -1.71, p = 0.095).

Do students who abstain from alcohol use have significantly better stress scores than those who report heavy alcohol use?

data$DataPull2 <- factor(data$AlcoholUse, exclude = c("Moderate", "Light"))
t.test(StressScore ~ DataPull2, data = data, var.equal = FALSE)

## 
##  Welch Two Sample t-test
## 
## data:  StressScore by DataPull2
## t = -0.62604, df = 28.733, p-value = 0.5362
## alternative hypothesis: true difference in means between group Abstain and group Heavy is not equal to 0
## 95 percent confidence interval:
##  -6.261170  3.327346
## sample estimates:
## mean in group Abstain   mean in group Heavy 
##              8.970588             10.437500

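No. Students who abstain have a lower mean stress score than heavy drinkers (8.97 vs. 10.44), but the difference is not statistically significant (t = -0.63, p = 0.54), and the 95% confidence interval (-6.26 to 3.33) includes zero.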
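The question asks whether abstainers do better, which is a one-sided hypothesis. The sketch below keeps only the two groups of interest and tests whether the first has the lower mean. The helper name stress_abstain_vs_heavy is hypothetical, and the sketch assumes that a lower StressScore means less stress.

```r
# Hypothetical helper: keep the Abstain and Heavy groups and test whether
# Abstain has a lower mean StressScore than Heavy (one-sided Welch test).
# Assumes lower StressScore means less stress.
stress_abstain_vs_heavy <- function(d) {
  sub <- d[d$AlcoholUse %in% c("Abstain", "Heavy"), ]
  # Explicit level order: the alternative refers to Abstain minus Heavy
  sub$AlcoholUse <- factor(sub$AlcoholUse, levels = c("Abstain", "Heavy"))
  t.test(StressScore ~ AlcoholUse, data = sub,
         var.equal = FALSE, alternative = "less")
}

# Usage with the data frame loaded earlier in this report:
# stress_abstain_vs_heavy(data)
```

Since the reported t statistic is negative, the one-sided p-value would be half of the two-sided 0.5362, about 0.27, which does not change the conclusion.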

Is there a significant difference in the average number of drinks per week between students of different genders?

data$Gender <- as.factor(data$Gender)
t.test(Drinks ~ Gender, data = data, var.equal = FALSE)

## 
##  Welch Two Sample t-test
## 
## data:  Drinks by Gender
## t = -6.1601, df = 142.75, p-value = 7.002e-09
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
##  -4.360009 -2.241601
## sample estimates:
## mean in group 0 mean in group 1 
##        4.238411        7.539216

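Yes. Male students (group 1) report 7.54 drinks per week on average versus 4.24 for female students (group 0), about 3.3 more (95% CI 2.2 to 4.4). The difference is statistically significant (p < 0.001).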

Is there a significant difference in the average weekday bedtime between students with high and normal stress (Stress=high vs. Stress=normal)?

data$Stress <- as.factor(data$Stress)
t.test(WeekdayBed ~ Stress, data = data, var.equal = FALSE)

## 
##  Welch Two Sample t-test
## 
## data:  WeekdayBed by Stress
## t = -1.0746, df = 87.048, p-value = 0.2855
## alternative hypothesis: true difference in means between group high and group normal is not equal to 0
## 95 percent confidence interval:
##  -0.4856597  0.1447968
## sample estimates:
##   mean in group high mean in group normal 
##             24.71500             24.88543

No. Students with high stress go to bed about 0.17 hours (roughly 10 minutes) earlier on weekdays than students with normal stress (24.72 vs. 24.89), but the difference is not statistically significant (p = 0.29).
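The bedtime means are easier to read as clock times. The sketch below assumes that WeekdayBed is recorded in decimal hours with 24.0 meaning midnight, so values above 24 fall after midnight; the helper name to_clock is illustrative.

```r
# Convert decimal-hour bedtimes to 24-hour clock strings.
# Assumes 24.0 means midnight, so 25.75 is 1:45 AM.
to_clock <- function(hours) {
  total_min <- round(hours * 60) %% (24 * 60)   # minutes since midnight
  sprintf("%02d:%02d",
          as.integer(total_min %/% 60),
          as.integer(total_min %% 60))
}

bed_high   <- 24.71500   # mean WeekdayBed, high stress (from the output above)
bed_normal <- 24.88543   # mean WeekdayBed, normal stress (from the output above)

print(to_clock(c(bed_high, bed_normal)))
print(round((bed_normal - bed_high) * 60, 1))   # difference in minutes
```

Under that assumption, both group means fall between 12:30 AM and 1:00 AM.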

Is there a significant difference in the average hours of sleep on weekends between students in their first two class years and other students?

data$YearGroup <- as.factor(data$YearGroup)
t.test(WeekendSleep ~ YearGroup, data = data, var.equal = FALSE)

## 
##  Welch Two Sample t-test
## 
## data:  WeekendSleep by YearGroup
## t = -0.047888, df = 237.36, p-value = 0.9618
## alternative hypothesis: true difference in means between group FirstTwoYears and group OtherYears is not equal to 0
## 95 percent confidence interval:
##  -0.3497614  0.3331607
## sample estimates:
## mean in group FirstTwoYears    mean in group OtherYears 
##                    8.213592                    8.221892

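No. Mean weekend sleep is nearly identical in the two groups (8.21 hours for the first two class years vs. 8.22 hours for the other years), a difference of less than 0.01 hours, and the test is far from significant (p = 0.96).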


Summary:

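Four of the ten comparisons are statistically significant at the 5% level. Female students have a higher mean GPA than male students (3.32 vs. 3.12, p = 0.0001). Students in their first two class years take more early classes than other students (2.07 vs. 1.31, p < 0.001). Students with normal depression status are happier than those with moderate or severe depression (27.06 vs. 21.61, p < 0.001). Male students report more drinks per week than female students (7.54 vs. 4.24, p < 0.001).

Five comparisons show no significant difference: cognition z-score for larks vs. owls (p = 0.42), sleep quality by all-nighter status (p = 0.095), stress score for abstainers vs. heavy drinkers (p = 0.54), weekday bedtime by stress level (p = 0.29), and weekend sleep by class year group (p = 0.96).

The question about classes missed remains unanswered, because the test that was run compared GPA by EarlyClass (p = 0.22) rather than ClassesMissed.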
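The decisions at the 5% level can be tabulated directly from the p-values reported in the Analysis section. This is a small sketch: the Q1 to Q10 labels follow the order of the research questions, the 0.05 threshold is the conventional level assumed here, and the fourth entry reflects the GPA test that was actually run for that question.

```r
# p-values copied from the t-test outputs in the Analysis section.
# Q4's value comes from the GPA ~ EarlyClass test that was actually run.
results <- data.frame(
  question = paste0("Q", 1:10),
  response = c("GPA", "NumEarlyClass", "CognitionZscore", "GPA",
               "Happiness", "PoorSleepQuality", "StressScore",
               "Drinks", "WeekdayBed", "WeekendSleep"),
  p_value  = c(0.0001243, 4.009e-05, 0.4229, 0.2177, 6.057e-07,
               0.09479, 0.5362, 7.002e-09, 0.2855, 0.9618)
)

alpha <- 0.05   # conventional significance level, assumed here
results$significant <- results$p_value < alpha

print(results)
```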

References:

Lock5Stat. SleepStudy data set. Data page: https://www.lock5stat.com/datapage3e.html. Data file: https://www.lock5stat.com/datasets3e/SleepStudy.csv.