Introduction

This report analyzes sleep patterns among college students using the “SleepStudy” dataset obtained from https://www.lock5stat.com/datapage3e.html. The dataset comprises 253 observations on 27 variables covering students’ sleep habits, psychological well-being, and lifestyle choices. The analysis addresses ten research questions, each comparing the mean of one variable between two groups of students, on topics ranging from academic performance to alcohol use and stress. The results indicate which group differences are statistically detectable in this sample and provide a basis for further research and for interventions aimed at improving students’ well-being and academic performance.

Research Questions

The following research questions will be addressed in this report:

1. Is there a significant difference in the average GPA between male and female college students?
2. Is there a significant difference in the average number of early classes between the first two class years and other class years?
3. Do students who identify as “larks” have significantly better cognitive skills (cognition z-score) compared to “owls”?
4. Is there a significant difference in the average number of classes missed in a semester between students who had at least one early class (EarlyClass=1) and those who didn’t (EarlyClass=0)?
5. Is there a significant difference in the average happiness level between students with at least moderate depression and normal depression status?
6. Is there a significant difference in average sleep quality scores between students who reported having at least one all-nighter (AllNighter=1) and those who didn’t (AllNighter=0)?
7. Do students who abstain from alcohol use have significantly better stress scores than those who report heavy alcohol use?
8. Is there a significant difference in the average number of drinks per week between students of different genders?
9. Is there a significant difference in the average weekday bedtime between students with high and normal stress (Stress=high vs. Stress=normal)?
10. Is there a significant difference in the average hours of sleep on weekends between students in their first two years and other students?

Data

The data were obtained from a sample of students who took skills tests to measure cognitive function, completed a survey about attitudes and habits, and kept a sleep diary recording the time and quality of sleep over a two-week period. The dataset contains 253 observations on the following 27 variables.

Gender: 1=male, 0=female
ClassYear: Year in school, 1=first year, …, 4=senior
LarkOwl: Early riser or night owl? Lark, Neither, or Owl
NumEarlyClass: Number of classes per week before 9 am
EarlyClass: Indicator for any early classes
GPA: Grade point average (0-4 scale)
ClassesMissed: Number of classes missed in a semester
CognitionZscore: Z-score on a test of cognitive skills
PoorSleepQuality: Measure of sleep quality (higher values are poorer sleep)
DepressionScore: Measure of degree of depression
AnxietyScore: Measure of amount of anxiety
StressScore: Measure of amount of stress
DepressionStatus: Coded depression score: normal, moderate, or severe
AnxietyStatus: Coded anxiety score: normal, moderate, or severe
Stress: Coded stress score: normal or high
DASScore: Combined score for depression, anxiety and stress
Happiness: Measure of degree of happiness
AlcoholUse: Self-reported: Abstain, Light, Moderate, or Heavy
Drinks: Number of alcoholic drinks per week
WeekdayBed: Average weekday bedtime (24.0=midnight)
WeekdayRise: Average weekday rise time (8.0=8 am)
WeekdaySleep: Average hours of sleep on weekdays
WeekendBed: Average weekend bedtime (24.0=midnight)
WeekendRise: Average weekend rise time (8.0=8 am)
WeekendSleep: Average hours of sleep on weekends
AverageSleep: Average hours of sleep for all days
AllNighter: Had an all-nighter this semester? 1=yes, 0=no

The full dataset is shown in the data table below.

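library(DT)  # interactive HTML tables
# Read the dataset directly from the Lock5 site
SleepStudy <- read.csv("https://www.lock5stat.com/datasets3e/SleepStudy.csv")
# Show 15 rows per page, with horizontal scrolling for the wide table
datatable(SleepStudy, options = list(scrollX = TRUE, pageLength = 15))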

Analysis/Results

1. Is there a significant difference in GPA between males and females?

The t-test compares mean GPA between genders. The p-value = 0.0001243 is below 0.05, so the difference in GPA is statistically significant. Because Gender is coded 1=male and 0=female, group 0 in the output is female: the mean GPA for females (3.32) is higher than for males (3.12). The 95% CI for the difference (female minus male), [0.0998, 0.3025], does not include 0, further supporting this conclusion.

t_test_result <- t.test(GPA ~ Gender, data = SleepStudy) 
print(t_test_result)

## 
##  Welch Two Sample t-test
## 
## data:  GPA by Gender
## t = 3.9139, df = 200.9, p-value = 0.0001243
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
##  0.09982254 0.30252780
## sample estimates:
## mean in group 0 mean in group 1 
##        3.324901        3.123725
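
Because the output labels the groups only as 0 and 1, it is easy to attach the means to the wrong gender. One way to guard against this, sketched below on a small made-up vector rather than the real data, is to convert the 0/1 code into a labeled factor before testing. The helper name label_gender is hypothetical; the coding (1=male, 0=female) comes from the variable list in the Data section.

```r
# Hypothetical helper: turn the 0/1 Gender code into a labeled factor.
# Coding follows the data description: 0 = female, 1 = male.
label_gender <- function(gender_code) {
  factor(gender_code, levels = c(0, 1), labels = c("Female", "Male"))
}

# Toy illustration (made-up codes, not the SleepStudy values)
toy_codes <- c(0, 1, 1, 0)
toy_labels <- label_gender(toy_codes)
print(toy_labels)
```

Applied to the real data, SleepStudy$GenderLabel <- label_gender(SleepStudy$Gender) followed by t.test(GPA ~ GenderLabel, data = SleepStudy) should run the same test with the groups named Female and Male instead of 0 and 1.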

2. Is there a significant difference in the number of early classes between first/second year students and others?

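The t-test examines the number of early classes by class-year group. The p-value = 0.0043 is below 0.05, indicating a significant difference, and the 95% CI [0.2506, 1.2883] does not include 0. Note, however, that the grouping below uses right = FALSE, which makes the intervals [0, 2) and [2, Inf): only first-year students (ClassYear = 1) fall in the group labeled “FirstTwoYears”, while second-year students are placed in “OtherYears”. The output therefore compares first-year students (2.36 early classes on average) with all other students (1.59). Answering the question as posed would require regrouping with ClassYear <= 2 (equivalent to the ClassYear < 3 split used in question 10) and rerunning the test.

# NOTE: right = FALSE gives [0, 2) and [2, Inf), so only ClassYear 1 is labeled "FirstTwoYears"
SleepStudy$ClassYearGroup <- cut(SleepStudy$ClassYear, breaks = c(0, 2, Inf),
                                 labels = c("FirstTwoYears", "OtherYears"), right = FALSE)
t_test_result <- t.test(NumEarlyClass ~ ClassYearGroup, data = SleepStudy)
print(t_test_result)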

## 
##  Welch Two Sample t-test
## 
## data:  NumEarlyClass by ClassYearGroup
## t = 2.9623, df = 64.352, p-value = 0.004275
## alternative hypothesis: true difference in means between group FirstTwoYears and group OtherYears is not equal to 0
## 95 percent confidence interval:
##  0.2506028 1.2883354
## sample estimates:
## mean in group FirstTwoYears    mean in group OtherYears 
##                    2.361702                    1.592233
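
The labels produced by cut() depend on its right argument, so it is worth checking the grouping on a few known values. The sketch below uses a toy vector of class years (not the real data) and compares the grouping as coded above with an ifelse() alternative like the one used in question 10. The variable names as_coded and alternative are introduced here for illustration only.

```r
class_year <- 1:4  # toy values: one student per class year

# Grouping as written in the report: intervals [0, 2) and [2, Inf)
as_coded <- cut(class_year, breaks = c(0, 2, Inf),
                labels = c("FirstTwoYears", "OtherYears"), right = FALSE)

# Possible alternative: years 1-2 versus years 3-4
alternative <- ifelse(class_year <= 2, "FirstTwoYears", "OtherYears")

print(data.frame(class_year, as_coded, alternative))
```

On these toy values, the cut() call with right = FALSE places class year 2 in “OtherYears”, whereas the ifelse() version keeps years 1 and 2 together.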

3. Do “larks” have better cognitive skills than “owls”?

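The t-test compares cognition z-scores for “larks” and “owls.” The p-value = 0.4229 is not significant, so there is no evidence of a difference in cognitive skills (mean z-score 0.09 for larks vs. -0.04 for owls). The 95% CI [-0.1894, 0.4466] includes 0, consistent with this result. The test as run is two-sided, although the question is directional (larks better than owls).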

larkdata <- SleepStudy[SleepStudy$LarkOwl == "Lark", ] 
owldata <- SleepStudy[SleepStudy$LarkOwl == "Owl", ] 
t_test_result <- t.test(larkdata$CognitionZscore, owldata$CognitionZscore) 
print(t_test_result)

## 
##  Welch Two Sample t-test
## 
## data:  larkdata$CognitionZscore and owldata$CognitionZscore
## t = 0.80571, df = 75.331, p-value = 0.4229
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.1893561  0.4465786
## sample estimates:
##   mean of x   mean of y 
##  0.09024390 -0.03836735
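
Since the question asks whether larks score better, a one-sided test is a natural alternative to the two-sided test above. The sketch below shows the difference using the alternative argument of t.test() on simulated scores; the data, seed, and variable names are assumptions made for illustration and are not the SleepStudy values.

```r
set.seed(353)  # arbitrary seed so the toy example is reproducible

# Simulated scores, NOT the SleepStudy values
lark_scores <- rnorm(40, mean = 0.1, sd = 0.7)
owl_scores  <- rnorm(50, mean = 0.0, sd = 0.7)

two_sided <- t.test(lark_scores, owl_scores)
one_sided <- t.test(lark_scores, owl_scores, alternative = "greater")

# Same t statistic and degrees of freedom; only the p-value changes
print(c(t = unname(two_sided$statistic),
        p_two_sided = two_sided$p.value,
        p_greater = one_sided$p.value))
```

For the reported test (t = 0.80571, two-sided p = 0.4229), the t statistic is positive, so the one-sided p-value would be half the two-sided value, roughly 0.21, which is still not significant.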

4. Is there a significant difference in classes missed between students with early classes and those without?

The t-test compares classes missed between students with early classes and those without. The p-value = 0.1421 (not significant) suggests no evidence of a difference, although students with early classes missed slightly fewer classes on average (1.99 vs. 2.65). The 95% CI [-1.5413, 0.2234] includes 0, consistent with this conclusion.

earlyclass <- subset(SleepStudy, EarlyClass == 1) 
noearlyclass <- subset(SleepStudy, EarlyClass == 0) 
t_test_result <- t.test(earlyclass$ClassesMissed, noearlyclass$ClassesMissed) 
print(t_test_result)

## 
##  Welch Two Sample t-test
## 
## data:  earlyclass$ClassesMissed and noearlyclass$ClassesMissed
## t = -1.4755, df = 152.78, p-value = 0.1421
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -1.5412830  0.2233558
## sample estimates:
## mean of x mean of y 
##  1.988095  2.647059

5. Is there a significant difference in happiness levels between students with moderate/severe and normal depression?

The t-test compares happiness levels between students with moderate or severe depression and students with normal depression status. The p-value = 6.057e-07 is significant: students with moderate/severe depression are significantly less happy on average (21.61 vs. 27.06). The 95% CI [-7.3797, -3.5078] does not include 0, further supporting this result.

moderatedepressedstatus <- subset(SleepStudy, DepressionStatus %in% c("moderate", "severe")) 
normaldepressedstatus <- subset(SleepStudy, DepressionStatus == "normal") 
t_test_result <- t.test(moderatedepressedstatus$Happiness, normaldepressedstatus$Happiness) 
print(t_test_result)

## 
##  Welch Two Sample t-test
## 
## data:  moderatedepressedstatus$Happiness and normaldepressedstatus$Happiness
## t = -5.6339, df = 55.594, p-value = 6.057e-07
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -7.379724 -3.507836
## sample estimates:
## mean of x mean of y 
##  21.61364  27.05742

6. Is there a significant difference in sleep quality between students who pulled an all-nighter and those who didn’t?

The t-test compares poor sleep quality scores. The p-value = 0.0948 (not significant), indicating no evidence of a difference in sleep quality. The 95% CI [-0.1608, 1.9457] includes 0, supporting this conclusion.

allnighter <- subset(SleepStudy, AllNighter == 1) 
noallnighter <- subset(SleepStudy, AllNighter == 0) 
t_test_result <- t.test(allnighter$PoorSleepQuality, 
noallnighter$PoorSleepQuality) 
print(t_test_result)

## 
##  Welch Two Sample t-test
## 
## data:  allnighter$PoorSleepQuality and noallnighter$PoorSleepQuality
## t = 1.7068, df = 44.708, p-value = 0.09479
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.1608449  1.9456958
## sample estimates:
## mean of x mean of y 
##  7.029412  6.136986

7. Do students who abstain from alcohol have better stress scores than heavy drinkers?

The t-test compares stress scores. The p-value = 0.5362 (not significant), indicating no evidence of a difference in stress scores. The 95% CI [-6.2612, 3.3273] includes 0, confirming this result.

abstain <- subset(SleepStudy, AlcoholUse == "Abstain") 
heavyalcohol <- subset(SleepStudy, AlcoholUse == "Heavy") 
t_test_result <- t.test(abstain$StressScore, heavyalcohol$StressScore) 
print(t_test_result)

## 
##  Welch Two Sample t-test
## 
## data:  abstain$StressScore and heavyalcohol$StressScore
## t = -0.62604, df = 28.733, p-value = 0.5362
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -6.261170  3.327346
## sample estimates:
## mean of x mean of y 
##  8.970588 10.437500

8. Is there a significant difference in weekly drinks between genders?

The t-test compares the number of drinks per week between genders. The p-value = 7.002e-09 is significant. Males drink more on average (7.54 vs. 4.24 drinks per week). The 95% CI for the difference (male minus female), [2.2416, 4.3600], does not include 0, further supporting this conclusion.

maledrinks <- subset(SleepStudy, Gender == 1) 
femaledrinks <- subset(SleepStudy, Gender == 0) 
t_test_result <- t.test(maledrinks$Drinks, femaledrinks$Drinks) 
print(t_test_result)

## 
##  Welch Two Sample t-test
## 
## data:  maledrinks$Drinks and femaledrinks$Drinks
## t = 6.1601, df = 142.75, p-value = 7.002e-09
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  2.241601 4.360009
## sample estimates:
## mean of x mean of y 
##  7.539216  4.238411

9. Is there a significant difference in weekday bedtimes between students with high and normal stress?

The t-test examines weekday bedtimes. The p-value = 0.2855 (not significant), indicating no evidence of a difference. The 95% CI [-0.4857, 0.1448] includes 0, confirming this result.

stresslevelbedtime <- subset(SleepStudy, Stress %in% c("high", "normal"))
t.test(WeekdayBed ~ Stress, data = stresslevelbedtime, alternative = "two.sided")

## 
##  Welch Two Sample t-test
## 
## data:  WeekdayBed by Stress
## t = -1.0746, df = 87.048, p-value = 0.2855
## alternative hypothesis: true difference in means between group high and group normal is not equal to 0
## 95 percent confidence interval:
##  -0.4856597  0.1447968
## sample estimates:
##   mean in group high mean in group normal 
##             24.71500             24.88543

10. Is there a significant difference in weekend sleep hours between first-year/second-year students and others?

The t-test compares average hours of weekend sleep for first- and second-year students with those of other students. The p-value = 0.9618 (not significant) means there is no evidence of a difference; the group means are nearly identical (8.21 vs. 8.22 hours). The 95% CI [-0.3498, 0.3332] includes 0, supporting this conclusion.

SleepStudy$WeekendSleepComparison <- ifelse(SleepStudy$ClassYear < 3, "FirstTwoYears", "OtherYears")
t_test_Averageclass <- t.test(WeekendSleep ~ WeekendSleepComparison, data = SleepStudy)
print(t_test_Averageclass)

## 
##  Welch Two Sample t-test
## 
## data:  WeekendSleep by WeekendSleepComparison
## t = -0.047888, df = 237.36, p-value = 0.9618
## alternative hypothesis: true difference in means between group FirstTwoYears and group OtherYears is not equal to 0
## 95 percent confidence interval:
##  -0.3497614  0.3331607
## sample estimates:
## mean in group FirstTwoYears    mean in group OtherYears 
##                    8.213592                    8.221892

Conclusion/Summary

This report presented a statistical analysis of sleep patterns and associated factors among college students, based on the “SleepStudy” dataset. The tool used throughout was the two-sample t-test, a hypothesis test for whether the means of two independent groups differ by more than random sampling variation would explain. R’s t.test() runs the Welch version by default, which does not assume equal variances in the two groups. We applied it to each research question to determine whether groups differed significantly in GPA, sleep habits, cognitive skills, happiness, stress, and other measures. Because the data are observational, the results describe associations between group membership and these outcomes rather than causal effects. For each test, a p-value and a 95% confidence interval were computed to judge whether the observed difference is statistically significant.
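
The output throughout the report is labeled “Welch Two Sample t-test”. For reference, its test statistic and approximate degrees of freedom can be written as follows, where \bar{x}_i, s_i^2, and n_i (the sample mean, sample variance, and size of group i) are notation introduced here rather than taken from the report:

```latex
t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}},
\qquad
\nu \approx \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^{2}}
{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}
```

The non-integer df values in the output (for example, df = 200.9 in question 1) come from this approximation.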

The p-value is a key output of the t-test. It is the probability of observing a difference at least as large as the one in the sample if the null hypothesis (no difference between the group means) were true. By convention, a p-value below 0.05 is taken as statistically significant, and we reject the null hypothesis. A p-value above 0.05 means the data do not provide sufficient evidence of a difference, and we fail to reject the null hypothesis; this is not proof that the means are equal. The 95% confidence interval (CI) for the difference in means leads to the same decision: if the interval includes 0, the result is not significant at the 0.05 level, and if it excludes 0, the result is significant.
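
To make the link between the t statistic, the p-value, and the confidence interval concrete, the sketch below recomputes a Welch two-sample test by hand on simulated data so that it can be compared with the output of t.test(). The helper welch_by_hand and the simulated vectors are hypothetical and are not part of the original analysis.

```r
# Hypothetical helper: Welch t statistic, df, two-sided p-value, and CI
welch_by_hand <- function(x, y, conf_level = 0.95) {
  se2_x <- var(x) / length(x)  # squared standard error of mean(x)
  se2_y <- var(y) / length(y)  # squared standard error of mean(y)
  se <- sqrt(se2_x + se2_y)
  t_stat <- (mean(x) - mean(y)) / se
  # Welch-Satterthwaite approximation to the degrees of freedom
  df <- (se2_x + se2_y)^2 /
    (se2_x^2 / (length(x) - 1) + se2_y^2 / (length(y) - 1))
  p_value <- 2 * pt(-abs(t_stat), df)
  margin <- qt(1 - (1 - conf_level) / 2, df) * se
  list(t = t_stat, df = df, p_value = p_value,
       ci = (mean(x) - mean(y)) + c(-1, 1) * margin)
}

set.seed(1)
x <- rnorm(30, mean = 3.3, sd = 0.4)  # simulated, not real GPAs
y <- rnorm(25, mean = 3.1, sd = 0.5)
by_hand <- welch_by_hand(x, y)
built_in <- t.test(x, y)
print(by_hand)
```

The hand-computed values should agree with built_in$statistic, built_in$parameter, built_in$p.value, and built_in$conf.int up to floating-point error.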

Using the t-test, we found four significant differences. Males reported more drinks per week than females (7.54 vs. 4.24). Females had a higher average GPA than males (3.32 vs. 3.12). Students with moderate or severe depression reported lower happiness than students with normal depression status (21.61 vs. 27.06). Finally, the number of early classes differed by class-year group (2.36 vs. 1.59 on average); as coded, the group with the higher mean contains only first-year students, so this result compares first-year students with all others rather than the first two years with later years. The remaining six tests showed no significant difference between the groups compared.
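
The ten p-values reported in the Analysis/Results section can be collected in one place to see which tests cross the 0.05 threshold. The sketch below copies them from the printed output. The Bonferroni adjustment at the end is an optional extra check for running ten tests at once and was not part of the original analysis.

```r
# p-values copied from the t-test output for questions 1-10
p_values <- c(Q1 = 0.0001243, Q2 = 0.004275, Q3 = 0.4229, Q4 = 0.1421,
              Q5 = 6.057e-07, Q6 = 0.09479, Q7 = 0.5362, Q8 = 7.002e-09,
              Q9 = 0.2855, Q10 = 0.9618)

# Tests that are significant at the 0.05 level
significant <- names(p_values)[p_values < 0.05]
print(significant)

# Optional check, not part of the original analysis: Bonferroni adjustment
adjusted <- p.adjust(p_values, method = "bonferroni")
significant_adjusted <- names(adjusted)[adjusted < 0.05]
print(significant_adjusted)
```

With these values, questions 1, 2, 5, and 8 fall below 0.05 both before and after the adjustment.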

This analysis illustrates why t-tests are widely used in fields such as engineering, psychology, healthcare, and economics: they provide a simple way to test hypotheses about differences between two groups. With t-tests, engineers and researchers can assess whether two products differ in effectiveness, whether a new technology has a measurable impact, or whether interventions produce different outcomes. Decisions can then rest on statistical evidence rather than on assumptions or anecdotal observations.

Stat 353 Project 2

Justin Lewis

2024-11-26