Introduction

This report analyzes sleep patterns and related factors among college students using a sleep study dataset. The goal is to explore differences in student behaviors and academic performance, focusing on gender and other variables.

The research questions that will be answered in this report include:

Q1: Is there a significant difference in the average GPA between male and female college students?

Q2: Is there a significant difference in the average number of early classes between the first two class years and other class years?

Q3: Do students who identify as “larks” have significantly better cognitive skills (cognition z-score) compared to “owls”?

Q4: Is there a significant difference in the average number of classes missed in a semester between students who had at least one early class (EarlyClass=1) and those who didn’t (EarlyClass=0)?

Q5: Is there a significant difference in the average happiness level between students with at least moderate depression and normal depression status?

Q6: Is there a significant difference in average sleep quality scores between students who reported having at least one all-nighter (AllNighter=1) and those who didn’t (AllNighter=0)?

Q7: Do students who abstain from alcohol use have significantly better stress scores than those who report heavy alcohol use?

Q8: Is there a significant difference in the average number of drinks per week between students of different genders?

Q9: Is there a significant relationship between the amount of weekly sleep (AverageSleep) and the number of alcoholic drinks consumed per week (Drinks)?

Q10: Is there a significant difference in the average hours of sleep on weekends between students in their first two years and other students?

Data

The dataset used in this analysis comes from a study of the sleeping habits of college students. It contains 253 observations and 27 variables that measure factors related to students’ sleeping habits, mental health, and academic achievement. These include demographic information (gender and class year), academic performance, mental health measures, and various sleep measures.

The dataset includes the following variables:

Gender: 1=male, 0=female

ClassYear: Year in school, 1=first year, …, 4=senior

LarkOwl: Early riser or night owl? Lark, Neither, or Owl

NumEarlyClass: Number of classes per week before 9 am

EarlyClass: Indicator for any early classes

GPA: Grade point average (0-4 scale)

ClassesMissed: Number of classes missed in a semester

CognitionZscore: Z-score on a test of cognitive skills

PoorSleepQuality: Measure of sleep quality (higher values are poorer sleep)

DepressionScore: Measure of degree of depression

AnxietyScore: Measure of amount of anxiety

StressScore: Measure of amount of stress

DepressionStatus: Coded depression score: normal, moderate, or severe

AnxietyStatus: Coded anxiety score: normal, moderate, or severe

Stress: Coded stress score: normal or high

DASScore: Combined score for depression, anxiety and stress

Happiness: Measure of degree of happiness

AlcoholUse: Self-reported: Abstain, Light, Moderate, or Heavy

Drinks: Number of alcoholic drinks per week

WeekdayBed: Average weekday bedtime (24.0=midnight)

WeekdayRise: Average weekday rise time (8.0=8 am)

WeekdaySleep: Average hours of sleep on weekdays

WeekendBed: Average weekend bedtime (24.0=midnight)

WeekendRise: Average weekend rise time (8.0=8 am)

WeekendSleep: Average hours of sleep on weekends

AverageSleep: Average hours of sleep for all days

AllNighter: Had an all-nighter this semester? 1=yes, 0=no

Analysis

Q1: Is there a significant difference in the average GPA between male and female college students?

To examine this, we conducted a Welch’s t-test, which compares the means of two groups without assuming equal variances. The null hypothesis ($H_0$) states that there is no difference in mean GPA between the two groups, while the alternative hypothesis ($H_1$) states that the means differ.
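For reference, the statistic behind a Welch test can be written as follows. The symbols are notation introduced here rather than taken from the report: $\bar{x}_i$, $s_i$ and $n_i$ denote the sample mean, sample standard deviation and size of group $i$, and $\nu$ is the approximate (Welch-Satterthwaite) degrees of freedom, which is why the reported df values are not whole numbers.

```latex
\[
t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}},
\qquad
\nu \approx
\frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^{2}}
     {\dfrac{\left(s_1^2/n_1\right)^{2}}{n_1 - 1} + \dfrac{\left(s_2^2/n_2\right)^{2}}{n_2 - 1}}
\]
```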

## 
##  Welch Two Sample t-test
## 
## data:  GPA by Gender
## t = 3.9139, df = 200.9, p-value = 0.0001243
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
##  0.09982254 0.30252780
## sample estimates:
## mean in group 0 mean in group 1 
##        3.324901        3.123725
## Confidence Interval for the Difference in Means:  0.09982254 0.3025278

The Welch’s t-test revealed a statistically significant difference in average GPA between male and female college students, t(200.9) = 3.91, p < 0.001. Because Gender is coded 1=male and 0=female, group 0 in the output is the female group: females (M = 3.32) had a higher average GPA than males (M = 3.12). The 95% confidence interval for the difference in means (female minus male) ranged from 0.10 to 0.30.
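The confidence interval in the Q1 output can be reproduced from the printed test statistic alone. The following is a minimal sketch, assuming only the rounded values shown in that output; the variable names are illustrative and are not part of the original analysis code.

```r
# Values copied from the Q1 Welch t-test output (rounded as printed)
mean_group0 <- 3.324901   # Gender = 0 (female, per the variable list)
mean_group1 <- 3.123725   # Gender = 1 (male)
t_stat <- 3.9139
df <- 200.9

# t = difference / standard error, so the standard error can be backed out
diff_means <- mean_group0 - mean_group1
std_error <- diff_means / t_stat

# 95% interval: difference plus or minus the t critical value times the SE
margin <- qt(0.975, df) * std_error
ci <- c(lower = diff_means - margin, upper = diff_means + margin)
print(round(ci, 4))
```

The result should agree with the printed interval up to rounding of the inputs. The same steps apply to every Welch test in this report.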

Q2: Is there a significant difference in the average number of early classes between the first two class years and other class years?

For this question, we aim to determine if there is a significant difference in the number of early classes between students in their first two years of college versus those in later years.

## 
##  Welch Two Sample t-test
## 
## data:  NumEarlyClass by ClassYearGroup
## t = 4.1813, df = 250.69, p-value = 4.009e-05
## alternative hypothesis: true difference in means between group First Two Years and group Other Years is not equal to 0
## 95 percent confidence interval:
##  0.4042016 1.1240309
## sample estimates:
## mean in group First Two Years     mean in group Other Years 
##                      2.070423                      1.306306
## Confidence Interval for the Difference in Means:  0.4042016 1.124031

This test showed a statistically significant difference in the average number of early classes between the first two class years and other class years (t = 4.18, p < 0.001), with first- and second-year students (M = 2.07) having more early classes on average than upperclassmen (M = 1.31), and a 95% confidence interval for the difference in means ranging from 0.40 to 1.12.

Q3: Do students who identify as “larks” have significantly better cognitive skills (cognition z-score) compared to “owls”?

For this question, we examine whether cognitive skills, measured by the Cognition Z-score, differ between students who identify as “larks” and those who identify as “owls.” Specifically, we want to determine whether morning-oriented students (larks) perform better on cognitive tasks than night-oriented students (owls).

## 
## Lark  Owl 
##   41   49
## Lark Mean: 0.0902439  | SD: 0.8295676
## Owl Mean: -0.03836735  | SD: 0.6527421

## 
##  Welch Two Sample t-test
## 
## data:  CognitionZscore by LarkOwl
## t = 0.80571, df = 75.331, p-value = 0.4229
## alternative hypothesis: true difference in means between group Lark and group Owl is not equal to 0
## 95 percent confidence interval:
##  -0.1893561  0.4465786
## sample estimates:
## mean in group Lark  mean in group Owl 
##         0.09024390        -0.03836735

The results showed no significant difference in means, with a t-value of 0.806, degrees of freedom of approximately 75.33, and a p-value of 0.423. The 95% confidence interval for the difference in means ranges from -0.189 to 0.447, which includes zero, further indicating no significant effect.
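Because the group sizes, means and standard deviations for Q3 are printed above, the Welch statistic can be recomputed by hand as a check. This is a minimal sketch; `welch_from_summary` is a hypothetical helper written for illustration, not a function from R or from the original analysis.

```r
# Hypothetical helper: Welch two-sample t-test from summary statistics only
welch_from_summary <- function(mean1, sd1, n1, mean2, sd2, n2) {
  v1 <- sd1^2 / n1                      # squared standard error, group 1
  v2 <- sd2^2 / n2                      # squared standard error, group 2
  t_stat <- (mean1 - mean2) / sqrt(v1 + v2)
  # Welch-Satterthwaite approximation to the degrees of freedom
  df <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))
  p_two_sided <- 2 * pt(-abs(t_stat), df)
  list(t = t_stat, df = df, p = p_two_sided)
}

# Lark and Owl values copied from the Q3 output
res <- welch_from_summary(mean1 = 0.0902439,   sd1 = 0.8295676, n1 = 41,
                          mean2 = -0.03836735, sd2 = 0.6527421, n2 = 49)
print(res)
```

The recomputed t, df and p-value should agree with the `t.test` output up to rounding.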

Q4: Is there a significant difference in the average number of classes missed in a semester between students who had at least one early class (EarlyClass=1) and those who didn’t (EarlyClass=0)?

To determine if students with early classes missed significantly fewer or more classes compared to those without early classes, a Welch two-sample t-test was conducted.

## [1] 1.988095
## [1] 2.647059
## [1] 3.101068
## [1] 3.476814

## 
##  Welch Two Sample t-test
## 
## data:  early_yes and early_no
## t = -1.4755, df = 152.78, p-value = 0.1421
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -1.5412830  0.2233558
## sample estimates:
## mean of x mean of y 
##  1.988095  2.647059

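The test showed no statistically significant difference in the number of classes missed, t(152.78) = -1.48, p = 0.142. Students with at least one early class missed an average of 1.99 classes (SD = 3.10), compared with 2.65 classes (SD = 3.48) for students without early classes. The 95% confidence interval for the difference in means ranged from -1.54 to 0.22, which includes zero, so there is no strong evidence of a difference between the two groups.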

Q5: Is there a significant difference in the average happiness level between students with at least moderate depression and normal depression status?

To address this question, we conducted a Welch two-sample t-test comparing the average happiness scores of students with moderate or severe depression to those with normal depression status. This test is appropriate because we are comparing means between two independent groups and cannot assume equal variances.

## 
##  Welch Two Sample t-test
## 
## data:  depressed and normal
## t = -5.6339, df = 55.594, p-value = 6.057e-07
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -7.379724 -3.507836
## sample estimates:
## mean of x mean of y 
##  21.61364  27.05742

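The test revealed a statistically significant difference, t(55.59) = -5.63, p < 0.001. Students with moderate or severe depression had a lower mean happiness score (M = 21.61) than those with normal depression status (M = 27.06), and the 95% confidence interval for the difference in means ranged from -7.38 to -3.51. This suggests that higher levels of depression are associated with reduced happiness.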

Q6: Is there a significant difference in average sleep quality scores between students who reported having at least one all-nighter (AllNighter=1) and those who didn’t (AllNighter=0)?

To explore whether pulling an all-nighter is associated with differences in sleep quality, we compared the average PoorSleepQuality scores between students who reported having at least one all-nighter and those who did not.

## 
##  Welch Two Sample t-test
## 
## data:  allnighter_yes and allnighter_no
## t = 1.7068, df = 44.708, p-value = 0.09479
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.1608449  1.9456958
## sample estimates:
## mean of x mean of y 
##  7.029412  6.136986

The results of the t-test showed no statistically significant difference in sleep quality scores between students who had an all-nighter (mean = 7.03) and those who did not (mean = 6.14), t(44.71) = 1.71, p = 0.095. The 95% confidence interval for the difference in means ranged from -0.16 to 1.95, indicating that the true difference could be slightly negative or as high as nearly 2 points in favor of all-nighter students reporting worse sleep.

Q7: Do students who abstain from alcohol use have significantly better stress scores than those who report heavy alcohol use?

To investigate whether students who abstain from alcohol report significantly different stress levels compared to those who report heavy alcohol use, we conducted a Welch two-sample t-test.

## 
##  Welch Two Sample t-test
## 
## data:  abstain and heavy
## t = -0.62604, df = 28.733, p-value = 0.5362
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -6.261170  3.327346
## sample estimates:
## mean of x mean of y 
##  8.970588 10.437500

The results showed no statistically significant difference between the two groups, t(28.73) = -0.63, p = 0.536. The 95% confidence interval for the difference in means ranged from -6.26 to 3.33. Students who abstained from alcohol had a mean stress score of 8.97, while heavy drinkers had a mean of 10.44.
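Q7 is phrased directionally, while the output above is a two-sided test. As a minimal sketch, and assuming that “better” means a lower StressScore, the one-sided p-value can be derived from the printed t and df; the variable names are illustrative.

```r
# Values copied from the Q7 output (abstain is x, heavy is y)
t_stat <- -0.62604
df <- 28.733

# Two-sided p-value, as reported by t.test()
p_two_sided <- 2 * pt(-abs(t_stat), df)

# One-sided p-value for H1: mean(abstain) < mean(heavy),
# i.e. abstainers have lower (assumed "better") stress scores
p_less <- pt(t_stat, df)

cat("two-sided p:", round(p_two_sided, 4), "\n")
cat("one-sided p:", round(p_less, 4), "\n")
```

With the raw data, the same one-sided test corresponds to `t.test(abstain, heavy, alternative = "less")`. Because the observed difference is in the hypothesized direction, the one-sided p-value is half the two-sided one, which still does not indicate a significant difference.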

Q8: Is there a significant difference in the average number of drinks per week between students of different genders?

To explore whether students’ average number of alcoholic drinks per week varies by gender, we conducted a Welch two-sample t-test comparing male and female students.

## 
##  Welch Two Sample t-test
## 
## data:  drinks_male and drinks_female
## t = 6.1601, df = 142.75, p-value = 7.002e-09
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  2.241601 4.360009
## sample estimates:
## mean of x mean of y 
##  7.539216  4.238411

The test revealed a statistically significant difference, t(142.75) = 6.16, p < 0.001. The 95% confidence interval for the difference in means ranged from 2.24 to 4.36. Male students reported an average of 7.54 drinks per week, while female students reported an average of 4.24.

Q9: Is there a significant relationship between the amount of weekly sleep (AverageSleep) and the number of alcoholic drinks consumed per week (Drinks)?

For this question, we examine the relationship between the number of alcoholic drinks consumed per week (Drinks) and the average hours of sleep across all days (AverageSleep).

## 
##  Pearson's product-moment correlation
## 
## data:  sleepData$AverageSleep and sleepData$Drinks
## t = -0.59079, df = 251, p-value = 0.5552
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.15985784  0.08646079
## sample estimates:
##         cor 
## -0.03726453

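Based on the Pearson correlation test, there is no significant linear correlation between the number of drinks consumed per week and the average hours of sleep, r = -0.04, t(251) = -0.59, p = 0.555. The 95% confidence interval for the correlation ranged from -0.16 to 0.09, which includes zero.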
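The t statistic in the Pearson output is a direct function of the correlation and the sample size. The following minimal sketch recomputes it from the printed correlation and the 253 observations in the dataset; the variable names are illustrative.

```r
# Values copied from the Q9 output and the Data section
r <- -0.03726453   # sample Pearson correlation
n <- 253           # number of observations

df <- n - 2
# Test statistic for H0: true correlation equals 0
t_stat <- r * sqrt(df) / sqrt(1 - r^2)
p_value <- 2 * pt(-abs(t_stat), df)

# Share of variance in one variable linearly explained by the other
r_squared <- r^2

cat("t =", round(t_stat, 5), " df =", df, " p =", round(p_value, 4), "\n")
cat("r^2 =", signif(r_squared, 3), "\n")
```

The squared correlation is well below 1%, which is consistent with the nearly flat regression line drawn in the appendix scatter plot.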

Q10: Is there a significant difference in the average hours of sleep on weekends between students in their first two years and other students?

For this question, we investigate whether there is a significant difference in the average hours of sleep on weekends between first-year and second-year students (grouped together as “First Two Years”) and third-year and fourth-year students (grouped together as “Other Students”).

## 
##  Welch Two Sample t-test
## 
## data:  WeekendSleep by YearGroup
## t = -0.047888, df = 237.36, p-value = 0.9618
## alternative hypothesis: true difference in means between group FirstTwoYears and group OtherStudents is not equal to 0
## 95 percent confidence interval:
##  -0.3497614  0.3331607
## sample estimates:
## mean in group FirstTwoYears mean in group OtherStudents 
##                    8.213592                    8.221892

Based on the results of the Welch two-sample t-test, t(237.36) = -0.05, p = 0.962, we conclude that there is no significant difference in the average hours of sleep on weekends between students in their first two years and other students. Both groups sleep approximately the same number of hours on weekends, with a very small difference in their means (8.21 vs 8.22 hours).

Summary

The analysis revealed several notable findings. Female students had significantly higher GPAs than male students (3.32 vs 3.12), which points to a gender difference in academic performance in this sample. Students in their first two years had significantly more early classes than students in later years, indicating that class schedules may change as students progress. Male students reported significantly more drinks per week than female students, which could reflect differing social or cultural dynamics around alcohol consumption. The stark difference in happiness levels between students with moderate or severe depression and those with normal depression status underscores the importance of mental health support for students struggling with depression.

Several comparisons showed no statistically significant difference. Cognition z-scores did not differ significantly between “larks” and “owls”, so these data give no evidence that being a morning or night person is related to cognitive ability. The number of classes missed did not differ significantly between students with and without early classes, so early start times may not be as disruptive to attendance as expected. Sleep quality scores did not differ significantly between students who pulled an all-nighter and those who didn’t, although the confidence interval leaves room for somewhat poorer sleep among students who did. Stress scores did not differ significantly between students who abstain from alcohol and heavy drinkers. There was no significant correlation between alcohol consumption and average sleep duration, and weekend sleep hours were nearly identical for students in their first two years and students in later years, suggesting that weekend sleep habits might remain consistent throughout college.

In conclusion, these findings offer insights into student behaviors and academic experiences. A non-significant result does not establish that no difference exists, and further research is needed to get a better understanding of these variables.

References

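The first six rows of the data used in this report, including the two grouping variables (ClassYearGroup and YearGroup) created for Q2 and Q10: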

##   Gender ClassYear LarkOwl NumEarlyClass EarlyClass  GPA ClassesMissed
## 1      0         4 Neither             0          0 3.60             0
## 2      0         4 Neither             2          1 3.24             0
## 3      0         4     Owl             0          0 2.97            12
## 4      0         1    Lark             5          1 3.76             0
## 5      0         4     Owl             0          0 3.20             4
## 6      1         4 Neither             0          0 3.50             0
##   CognitionZscore PoorSleepQuality DepressionScore AnxietyScore StressScore
## 1           -0.26                4               4            3           8
## 2            1.39                6               1            0           3
## 3            0.38               18              18           18           9
## 4            1.39                9               1            4           6
## 5            1.22                9               7           25          14
## 6           -0.04                6              14            8          28
##   DepressionStatus AnxietyStatus Stress DASScore Happiness AlcoholUse Drinks
## 1           normal        normal normal       15        28   Moderate     10
## 2           normal        normal normal        4        25   Moderate      6
## 3         moderate        severe normal       45        17      Light      3
## 4           normal        normal normal       11        32      Light      2
## 5           normal        severe normal       46        15   Moderate      4
## 6         moderate      moderate   high       50        22    Abstain      0
##   WeekdayBed WeekdayRise WeekdaySleep WeekendBed WeekendRise WeekendSleep
## 1      25.75        8.70         7.70      25.75        9.50         5.88
## 2      25.70        8.20         6.80      26.00       10.00         7.25
## 3      27.44        6.55         3.00      28.00       12.59        10.09
## 4      23.50        7.17         6.77      27.00        8.00         7.25
## 5      25.90        8.67         6.09      23.75        9.50         7.00
## 6      23.80        8.95         9.05      26.00       10.75         9.00
##   AverageSleep AllNighter  ClassYearGroup     YearGroup
## 1         7.18          0     Other Years OtherStudents
## 2         6.93          0     Other Years OtherStudents
## 3         5.02          0     Other Years OtherStudents
## 4         6.90          0 First Two Years FirstTwoYears
## 5         6.35          0     Other Years OtherStudents
## 6         9.04          0     Other Years OtherStudents

The data were obtained from https://www.lock5stat.com/datapage3e.html

Appendix

Listed below is the R code used to answer the questions in this report.
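The code below assumes a data frame named sleepData already exists, but the loading step is not shown in the report. The following is a minimal sketch of one way to load it, assuming the dataset was downloaded from the page listed in the References and saved locally; the helper `load_sleep_data` and the file name `SleepStudy.csv` are hypothetical.

```r
# Hypothetical helper: read the CSV and confirm that every column
# used in the appendix code is present before running the analyses.
load_sleep_data <- function(path) {
  df <- read.csv(path, stringsAsFactors = FALSE)
  needed <- c("Gender", "ClassYear", "LarkOwl", "NumEarlyClass", "EarlyClass",
              "GPA", "ClassesMissed", "CognitionZscore", "PoorSleepQuality",
              "StressScore", "DepressionStatus", "Happiness", "AlcoholUse",
              "Drinks", "WeekendSleep", "AverageSleep", "AllNighter")
  missing_cols <- setdiff(needed, names(df))
  if (length(missing_cols) > 0) {
    stop("Missing columns: ", paste(missing_cols, collapse = ", "))
  }
  df
}

# Assumed local file name; adjust to wherever the download was saved.
# sleepData <- load_sleep_data("SleepStudy.csv")
```

Checking the column names up front makes a misnamed or truncated file fail immediately rather than partway through the ten analyses.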

Q1: Is there a significant difference in the average GPA between male and female college students?

#t-test for GPA by Gender
t_test_result <- t.test(GPA ~ Gender, data = sleepData, var.equal = FALSE)

#results of the t-test
print(t_test_result)

#confidence interval from the t-test result
confidence_interval <- t_test_result$conf.int

# Print the confidence interval
cat("Confidence Interval for the Difference in Means: ", confidence_interval, "\n")

Q2: Is there a significant difference in the average number of early classes between the first two class years and other class years?

#new variable for class year group (first two years vs other years)
sleepData$ClassYearGroup <- ifelse(sleepData$ClassYear %in% c(1, 2), "First Two Years", "Other Years")

#t-test
t_test_result <- t.test(NumEarlyClass ~ ClassYearGroup, data = sleepData, var.equal = FALSE)

#t-test result
print(t_test_result)

#confidence interval
confidence_interval <- t_test_result$conf.int
cat("Confidence Interval for the Difference in Means: ", confidence_interval, "\n")

#Boxplot 
boxplot(NumEarlyClass ~ ClassYearGroup, data = sleepData,
        main = "Early Classes by Class Year",
        xlab = "Class Year Group",
        ylab = "Number of Early Classes",
        col = c("lightblue", "lightgreen"))

Q3: Do students who identify as “larks” have significantly better cognitive skills (cognition z-score) compared to “owls”?

# Filter data
lark_owl <- subset(sleepData, LarkOwl %in% c("Lark", "Owl"))

# Check how many are in each group
table(lark_owl$LarkOwl)

# Means and standard deviations
lark_mean <- mean(lark_owl$CognitionZscore[lark_owl$LarkOwl == "Lark"], na.rm = TRUE)
lark_sd <- sd(lark_owl$CognitionZscore[lark_owl$LarkOwl == "Lark"], na.rm = TRUE)

owl_mean <- mean(lark_owl$CognitionZscore[lark_owl$LarkOwl == "Owl"], na.rm = TRUE)
owl_sd <- sd(lark_owl$CognitionZscore[lark_owl$LarkOwl == "Owl"], na.rm = TRUE)

cat("Lark Mean:", lark_mean, " | SD:", lark_sd, "\n")
cat("Owl Mean:", owl_mean, " | SD:", owl_sd, "\n")

# Boxplot to visualize differences
boxplot(CognitionZscore ~ LarkOwl, data = lark_owl,
        main = "Cognition Z-Score by Chronotype",
        xlab = "Chronotype", ylab = "Cognition Z-Score",
        col = c("lightblue", "lightgreen"))

# Histograms for both groups
hist(lark_owl$CognitionZscore[lark_owl$LarkOwl == "Lark"],
     main = "Lark Cognition Z-Score", xlab = "Z-Score", col = "skyblue", breaks = 15)

hist(lark_owl$CognitionZscore[lark_owl$LarkOwl == "Owl"],
     main = "Owl Cognition Z-Score", xlab = "Z-Score", col = "lightgreen", breaks = 15)

# Two-sample t-test 
test_result <- t.test(CognitionZscore ~ LarkOwl, data = lark_owl)

# Display t-test result
print(test_result)

Q4: Is there a significant difference in the average number of classes missed in a semester between students who had at least one early class (EarlyClass=1) and those who didn’t (EarlyClass=0)?

# Subset data into two groups
early_yes <- subset(sleepData, EarlyClass == 1)$ClassesMissed
early_no <- subset(sleepData, EarlyClass == 0)$ClassesMissed

# Summary statistics
mean(early_yes)
mean(early_no)
sd(early_yes)
sd(early_no)


# Histograms for each group
hist(early_yes, main = "Classes Missed (Early Class = 1)",
     xlab = "Classes Missed", col = "lightgreen")
hist(early_no, main = "Classes Missed (Early Class = 0)",
     xlab = "Classes Missed", col = "skyblue")

# Perform two-sample t-test 
t.test(early_yes, early_no)

Q5: Is there a significant difference in the average happiness level between students with at least moderate depression and normal depression status?

# Subset the data 
depressed <- subset(sleepData, DepressionStatus %in% c("moderate", "severe"))$Happiness
normal <- subset(sleepData, DepressionStatus == "normal")$Happiness

# Calculate means
mean_depressed <- mean(depressed, na.rm = TRUE)
mean_normal <- mean(normal, na.rm = TRUE)

# Two Sample t-test
t.test(depressed, normal, var.equal = FALSE)

# Boxplot
boxplot(depressed, normal,
        names = c("Moderate/Severe", "Normal"),
        main = "Happiness by Depression Status",
        ylab = "Happiness Score",
        col = c("lightblue", "lightgreen"))

Q6: Is there a significant difference in average sleep quality scores between students who reported having at least one all-nighter (AllNighter=1) and those who didn’t (AllNighter=0)?

# Subset the data into two groups 
allnighter_yes <- sleepData$PoorSleepQuality[sleepData$AllNighter == 1]
allnighter_no <- sleepData$PoorSleepQuality[sleepData$AllNighter == 0]

# Two Sample t-test
t.test(allnighter_yes, allnighter_no)

# Boxplot
boxplot(allnighter_yes, allnighter_no,
        names = c("All-Nighter", "No All-Nighter"),
        ylab = "Poor Sleep Quality Score",
        main = "Sleep Quality by All-Nighter Status",
        col = c("tomato", "skyblue"))

Q7: Do students who abstain from alcohol use have significantly better stress scores than those who report heavy alcohol use?

# Subset data
abstain <- sleepData$StressScore[sleepData$AlcoholUse == "Abstain"]
heavy <- sleepData$StressScore[sleepData$AlcoholUse == "Heavy"]

#Two Sample t-test
t_test_result <- t.test(abstain, heavy)
print(t_test_result)

# Create a new data vector
stress_scores <- c(abstain, heavy)
group_labels <- c(rep("Abstain", length(abstain)), rep("Heavy", length(heavy)))

# Boxplot
boxplot(stress_scores ~ group_labels,
        main = "Stress Scores by Alcohol Use",
        xlab = "Alcohol Use Group",
        ylab = "Stress Score",
        col = c("lightblue", "lightcoral"),
        border = "darkblue")

Q8: Is there a significant difference in the average number of drinks per week between students of different genders?

# Subset data 
drinks_male <- sleepData$Drinks[sleepData$Gender == 1]
drinks_female <- sleepData$Drinks[sleepData$Gender == 0]

#Two Sample t-test
t_test_result <- t.test(drinks_male, drinks_female)
print(t_test_result)

# Boxplot
drinks_all <- c(drinks_male, drinks_female)
gender_labels <- c(rep("Male", length(drinks_male)), rep("Female", length(drinks_female)))

boxplot(drinks_all ~ gender_labels,
        main = "Average Number of Drinks per Week by Gender",
        xlab = "Gender",
        ylab = "Drinks per Week",
        col = c("lightblue", "lightpink"),
        border = "gray40")

Q9: Is there a significant relationship between the amount of weekly sleep (AverageSleep) and the number of alcoholic drinks consumed per week (Drinks)?

#correlation test
cor_test <- cor.test(sleepData$AverageSleep, sleepData$Drinks)

# Print correlation test results
print(cor_test)

#scatter plot 
plot(sleepData$AverageSleep, sleepData$Drinks, 
     main="Scatter Plot of Average Sleep vs. Drinks per Week",
     xlab="Average Sleep (hours per night)", ylab="Drinks per Week", 
     pch=19, col=rgb(0.1, 0.2, 0.5, 0.6)) 

#regression line
abline(lm(Drinks ~ AverageSleep, data=sleepData), col="red")

Q10: Is there a significant difference in the average hours of sleep on weekends between students in their first two years and other students?

# Create a new variable to categorize students into two groups
sleepData$YearGroup <- ifelse(sleepData$ClassYear <= 2, "FirstTwoYears", "OtherStudents")

# Perform t-test 
t_test_result <- t.test(WeekendSleep ~ YearGroup, data = sleepData)

# Print the results 
print(t_test_result)

# Boxplot 
boxplot(WeekendSleep ~ YearGroup, data = sleepData, 
        main = "Weekend Sleep Hours by Student Year",
        xlab = "Student Group", ylab = "Average Weekend Sleep Hours", 
        col = c("lightblue", "lightgreen"))