This report analyzes sleep patterns and related factors among college students using a sleep study dataset. The goal is to explore differences in student behaviors and academic performance, focusing on gender and other variables.
The research questions that will be answered in this report include:
Q1: Is there a significant difference in the average GPA between male and female college students?
Q2: Is there a significant difference in the average number of early classes between students in their first two class years and students in later class years?
Q3: Do students who identify as “larks” have significantly better cognitive skills (cognition z-score) compared to “owls”?
Q4: Is there a significant difference in the average number of classes missed in a semester between students who had at least one early class (EarlyClass=1) and those who didn’t (EarlyClass=0)?
Q5: Is there a significant difference in the average happiness level between students with at least moderate depression and normal depression status?
Q6: Is there a significant difference in average sleep quality scores between students who reported having at least one all-nighter (AllNighter=1) and those who didn’t (AllNighter=0)?
Q7: Do students who abstain from alcohol use have significantly better stress scores than those who report heavy alcohol use?
Q8: Is there a significant difference in the average number of drinks per week between students of different genders?
Q9: Is there a significant relationship between average hours of sleep across all days (AverageSleep) and the number of alcoholic drinks consumed per week (Drinks)?
Q10: Is there a significant difference in the average hours of sleep on weekends between students in their first two years and other students?
The dataset used in this analysis comes from a study of the sleeping habits of college students. It contains 253 observations and 27 variables that measure a variety of factors related to students’ sleeping habits, mental health, and academic achievement. These include demographic information (gender and class year), academic performance, mental health measures, alcohol use, and various sleep factors.
The dataset includes the following variables:
Gender: 1=male, 0=female
ClassYear: Year in school, 1=first year, …, 4=senior
LarkOwl: Early riser or night owl? Lark, Neither, or Owl
NumEarlyClass: Number of classes per week before 9 am
EarlyClass: Indicator for any early classes
GPA: Grade point average (0-4 scale)
ClassesMissed: Number of classes missed in a semester
CognitionZscore: Z-score on a test of cognitive skills
PoorSleepQuality: Measure of sleep quality (higher values are poorer sleep)
DepressionScore: Measure of degree of depression
AnxietyScore: Measure of amount of anxiety
StressScore: Measure of amount of stress
DepressionStatus: Coded depression score: normal, moderate, or severe
AnxietyStatus: Coded anxiety score: normal, moderate, or severe
Stress: Coded stress score: normal or high
DASScore: Combined score for depression, anxiety and stress
Happiness: Measure of degree of happiness
AlcoholUse: Self-reported: Abstain, Light, Moderate, or Heavy
Drinks: Number of alcoholic drinks per week
WeekdayBed: Average weekday bedtime (24.0=midnight)
WeekdayRise: Average weekday rise time (8.0=8 am)
WeekdaySleep: Average hours of sleep on weekdays
WeekendBed: Average weekend bedtime (24.0=midnight)
WeekendRise: Average weekend rise time (8.0=8 am)
WeekendSleep: Average hours of sleep on weekends
AverageSleep: Average hours of sleep for all days
AllNighter: Had an all-nighter this semester? 1=yes, 0=no
To examine whether average GPA differs between male and female students, we conducted a Welch’s t-test, which compares the means of two groups without assuming equal variances. The null hypothesis (H0) states that mean GPA is the same for both groups, while the alternative hypothesis (H1) states that the two means differ.
##
## Welch Two Sample t-test
##
## data: GPA by Gender
## t = 3.9139, df = 200.9, p-value = 0.0001243
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
## 0.09982254 0.30252780
## sample estimates:
## mean in group 0 mean in group 1
## 3.324901 3.123725
## Confidence Interval for the Difference in Means: 0.09982254 0.3025278
Welch’s t-test revealed a statistically significant difference in average GPA between male and female college students (t = 3.91, df = 200.9, p < 0.001). Because Gender is coded 0 = female and 1 = male, group 0 in the output is the female group: female students (M = 3.32) had a higher average GPA than male students (M = 3.12), with a 95% confidence interval for the difference in means (female minus male) ranging from 0.10 to 0.30.
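Because Gender is stored as 0/1, the test output labels the groups only as "group 0" and "group 1", which makes the two means easy to misattribute. One way to guard against this is to recode the column as a labelled factor before testing. The following is a minimal sketch, assuming a small made-up data frame (`toy`, with invented GPA values, not the study data) and a hypothetical new column name `GenderLabel`:

```r
# Toy data for illustration only; these GPA values are invented,
# not taken from the sleep study.
toy <- data.frame(
  Gender = c(0, 0, 0, 0, 1, 1, 1, 1),
  GPA    = c(3.6, 3.2, 3.4, 3.1, 3.0, 3.3, 2.9, 3.2)
)

# Recode 0/1 into labelled levels, following the codebook
# (1 = male, 0 = female). GenderLabel is a hypothetical column name.
toy$GenderLabel <- factor(toy$Gender, levels = c(0, 1),
                          labels = c("Female", "Male"))

# Welch's t-test; the estimates are now named by label instead of 0/1.
res <- t.test(GPA ~ GenderLabel, data = toy, var.equal = FALSE)
print(res$estimate)
```

With labelled levels, the sample estimates are printed as "mean in group Female" and "mean in group Male", so the direction of the difference can be read directly from the output.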
For this question, we aim to determine if there is a significant difference in the number of early classes between students in their first two years of college versus those in later years.
##
## Welch Two Sample t-test
##
## data: NumEarlyClass by ClassYearGroup
## t = 4.1813, df = 250.69, p-value = 4.009e-05
## alternative hypothesis: true difference in means between group First Two Years and group Other Years is not equal to 0
## 95 percent confidence interval:
## 0.4042016 1.1240309
## sample estimates:
## mean in group First Two Years mean in group Other Years
## 2.070423 1.306306
## Confidence Interval for the Difference in Means: 0.4042016 1.124031
This test showed a statistically significant difference in the average number of early classes between the first two class years and other class years (t = 4.18, p < 0.001), with first- and second-year students (M = 2.07) having more early classes on average than upperclassmen (M = 1.31), and a 95% confidence interval for the difference in means ranging from 0.40 to 1.12.
For this question, we aim to examine whether there is a significant difference in cognitive skills—measured by the Cognition Z-score—between students who identify as “larks” and those who identify as “owls.” Specifically, we want to determine if morning-oriented students (larks) perform better on cognitive tasks compared to night-oriented students (owls).
##
## Lark Owl
## 41 49
## Lark Mean: 0.0902439 | SD: 0.8295676
## Owl Mean: -0.03836735 | SD: 0.6527421
##
## Welch Two Sample t-test
##
## data: CognitionZscore by LarkOwl
## t = 0.80571, df = 75.331, p-value = 0.4229
## alternative hypothesis: true difference in means between group Lark and group Owl is not equal to 0
## 95 percent confidence interval:
## -0.1893561 0.4465786
## sample estimates:
## mean in group Lark mean in group Owl
## 0.09024390 -0.03836735
The results showed no significant difference in means, with a t-value of 0.806, degrees of freedom of approximately 75.33, and a p-value of 0.423. The 95% confidence interval for the difference in means ranges from -0.189 to 0.447, which includes zero, further indicating no significant effect.
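The Welch statistic, its degrees of freedom, and the confidence interval can be recomputed from the group sizes, means, and standard deviations printed above, which is a useful check on the reported output. The sketch below assumes only those printed summary values; `welch_from_summary()` is a hypothetical helper written for this illustration, not a base R function.

```r
# Summary statistics as printed in the output above.
n1 <- 41; m1 <- 0.0902439;   s1 <- 0.8295676   # Lark
n2 <- 49; m2 <- -0.03836735; s2 <- 0.6527421   # Owl

# Hypothetical helper: Welch's t-test from summary statistics.
welch_from_summary <- function(m1, s1, n1, m2, s2, n2, conf = 0.95) {
  v1 <- s1^2 / n1                      # squared standard error, group 1
  v2 <- s2^2 / n2                      # squared standard error, group 2
  se <- sqrt(v1 + v2)                  # standard error of the difference
  t_stat <- (m1 - m2) / se
  # Welch-Satterthwaite approximation to the degrees of freedom
  df <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))
  p <- 2 * pt(-abs(t_stat), df)        # two-sided p-value
  half <- qt(1 - (1 - conf) / 2, df) * se
  list(t = t_stat, df = df, p = p,
       ci = c(m1 - m2 - half, m1 - m2 + half))
}

q3 <- welch_from_summary(m1, s1, n1, m2, s2, n2)
str(q3)
```

Up to rounding of the printed inputs, the recomputed values should agree with the t, df, p-value, and confidence interval shown in the test output.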
To determine if students with early classes missed significantly fewer or more classes compared to those without early classes, a Welch two-sample t-test was conducted.
## [1] 1.988095
## [1] 2.647059
## [1] 3.101068
## [1] 3.476814
##
## Welch Two Sample t-test
##
## data: early_yes and early_no
## t = -1.4755, df = 152.78, p-value = 0.1421
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -1.5412830 0.2233558
## sample estimates:
## mean of x mean of y
## 1.988095 2.647059
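The test found no statistically significant difference in the number of classes missed between students with at least one early class (M = 1.99, SD = 3.10) and those without (M = 2.65, SD = 3.48), t(152.78) = -1.48, p = 0.142. The 95% confidence interval for the true difference in means ranged from -1.54 to 0.22, which includes zero, further supporting that there is no strong evidence of a difference between the two groups.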
To address this question, we conducted a Welch two-sample t-test comparing the average happiness scores of students with moderate or severe depression to those with normal depression status. This test is appropriate because we are comparing means between two independent groups and cannot assume equal variances.
##
## Welch Two Sample t-test
##
## data: depressed and normal
## t = -5.6339, df = 55.594, p-value = 6.057e-07
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -7.379724 -3.507836
## sample estimates:
## mean of x mean of y
## 21.61364 27.05742
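Students with moderate or severe depression had a significantly lower mean happiness score (M = 21.61) than those with normal depression status (M = 27.06), t(55.59) = -5.63, p < 0.001. The 95% confidence interval for the difference in means ranged from -7.38 to -3.51, suggesting that higher levels of depression are associated with reduced happiness.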
To explore whether pulling an all-nighter is associated with differences in sleep quality, we compared the average PoorSleepQuality scores between students who reported having at least one all-nighter and those who did not.
##
## Welch Two Sample t-test
##
## data: allnighter_yes and allnighter_no
## t = 1.7068, df = 44.708, p-value = 0.09479
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.1608449 1.9456958
## sample estimates:
## mean of x mean of y
## 7.029412 6.136986
The results of the t-test showed no statistically significant difference in sleep quality scores between students who had an all-nighter (mean = 7.03) and those who did not (mean = 6.14), t(44.71) = 1.71, p = 0.095. The 95% confidence interval for the difference in means ranged from -0.16 to 1.95, so the true difference could be slightly negative or as large as nearly 2 points, with higher scores (poorer sleep) among students who had an all-nighter.
To investigate whether students who abstain from alcohol report significantly different stress levels compared to those who report heavy alcohol use, we conducted a Welch two-sample t-test.
##
## Welch Two Sample t-test
##
## data: abstain and heavy
## t = -0.62604, df = 28.733, p-value = 0.5362
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -6.261170 3.327346
## sample estimates:
## mean of x mean of y
## 8.970588 10.437500
The results showed no statistically significant difference between the two groups, t(28.73) = -0.63, p = 0.536. The 95% confidence interval for the difference in means ranged from -6.26 to 3.33. Students who abstained from alcohol had a mean stress score of 8.97, while heavy drinkers had a mean of 10.44.
To explore whether students’ average number of alcoholic drinks per week varies by gender, we conducted a Welch two-sample t-test comparing male and female students.
##
## Welch Two Sample t-test
##
## data: drinks_male and drinks_female
## t = 6.1601, df = 142.75, p-value = 7.002e-09
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 2.241601 4.360009
## sample estimates:
## mean of x mean of y
## 7.539216 4.238411
The test revealed a statistically significant difference, t(142.75) = 6.16, p < 0.001. The 95% confidence interval for the difference in means ranged from 2.24 to 4.36. Male students reported an average of 7.54 drinks per week, while female students reported an average of 4.24.
For this question, we aim to examine the relationship between the number of drinks consumed per week (Drinks) and the average hours of sleep across all days (AverageSleep), using a Pearson correlation test.
##
## Pearson's product-moment correlation
##
## data: sleepData$AverageSleep and sleepData$Drinks
## t = -0.59079, df = 251, p-value = 0.5552
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.15985784 0.08646079
## sample estimates:
## cor
## -0.03726453
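Based on the Pearson correlation test, there is no significant linear correlation between the number of drinks consumed per week and the average hours of sleep (r = -0.04, t(251) = -0.59, p = 0.555). The 95% confidence interval for the correlation ranged from -0.16 to 0.09, which includes zero.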
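The t statistic and confidence interval in the correlation output can be recovered from the sample correlation and the sample size alone. The sketch below assumes the values printed above (r = -0.03726453 and n = 253, so df = 251) and uses the Fisher z-transformation for the interval, which is the approach base R's `cor.test` takes for Pearson correlations.

```r
r <- -0.03726453   # sample correlation, from the output above
n <- 253           # number of observations (df = n - 2 = 251)

# Test statistic for H0: true correlation = 0
t_stat  <- r * sqrt(n - 2) / sqrt(1 - r^2)
p_value <- 2 * pt(-abs(t_stat), df = n - 2)

# 95% confidence interval via the Fisher z-transformation
z    <- atanh(r)
half <- qnorm(0.975) / sqrt(n - 3)
ci   <- tanh(c(z - half, z + half))

cat("t =", t_stat, " p =", p_value, "\n")
cat("95% CI:", ci, "\n")
```

Because r is so close to zero, t is small relative to its 251 degrees of freedom and the interval comfortably spans zero.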
For this question, we were investigating whether there is a significant difference in the average hours of sleep on weekends between first-year and second-year students (grouped together as “First Two Years”) and third-year and fourth-year students (grouped together as “Other Students”).
##
## Welch Two Sample t-test
##
## data: WeekendSleep by YearGroup
## t = -0.047888, df = 237.36, p-value = 0.9618
## alternative hypothesis: true difference in means between group FirstTwoYears and group OtherStudents is not equal to 0
## 95 percent confidence interval:
## -0.3497614 0.3331607
## sample estimates:
## mean in group FirstTwoYears mean in group OtherStudents
## 8.213592 8.221892
Based on the results of the Welch two-sample t-test (t = -0.05, df = 237.36, p = 0.962), we conclude that there is no significant difference in the average hours of sleep on weekends between students in their first two years and other students. Both groups sleep approximately the same number of hours on weekends, with a very small difference in their means (8.21 vs. 8.22 hours).
The analysis revealed several notable findings. Female students had significantly higher GPAs than male students, which suggests a possible gender disparity in academic performance. Students in their first two years had more early classes than students in later years, which indicates that class schedules may change as students progress. The cognition z-scores of “larks” and “owls” did not differ significantly, so this sample provides no evidence that being a morning or night person is related to cognitive skills. Students with and without early classes did not differ significantly in the number of classes missed, which suggests that early start times might not be as disruptive to attendance as expected. The markedly lower happiness scores among students with moderate or severe depression underscore the importance of mental health support for students struggling with depression. Sleep quality did not differ significantly between students who pulled an all-nighter and those who did not, although the confidence interval leaves room for poorer sleep in the all-nighter group. Stress scores did not differ significantly between students who abstain from alcohol and those who report heavy use. Male students reported more drinks per week than female students, which could reflect differing social or cultural dynamics around alcohol consumption. Alcohol consumption and average sleep duration were not significantly correlated, and weekend sleep hours did not differ between students in their first two years and other students, which suggests that weekend sleep habits might remain consistent throughout college. In conclusion, these findings offer insight into student behaviors and academic experiences; because a non-significant result does not establish that no difference exists, further research is needed to better understand these variables.
The first six rows of the data referenced in this report:
## Gender ClassYear LarkOwl NumEarlyClass EarlyClass GPA ClassesMissed
## 1 0 4 Neither 0 0 3.60 0
## 2 0 4 Neither 2 1 3.24 0
## 3 0 4 Owl 0 0 2.97 12
## 4 0 1 Lark 5 1 3.76 0
## 5 0 4 Owl 0 0 3.20 4
## 6 1 4 Neither 0 0 3.50 0
## CognitionZscore PoorSleepQuality DepressionScore AnxietyScore StressScore
## 1 -0.26 4 4 3 8
## 2 1.39 6 1 0 3
## 3 0.38 18 18 18 9
## 4 1.39 9 1 4 6
## 5 1.22 9 7 25 14
## 6 -0.04 6 14 8 28
## DepressionStatus AnxietyStatus Stress DASScore Happiness AlcoholUse Drinks
## 1 normal normal normal 15 28 Moderate 10
## 2 normal normal normal 4 25 Moderate 6
## 3 moderate severe normal 45 17 Light 3
## 4 normal normal normal 11 32 Light 2
## 5 normal severe normal 46 15 Moderate 4
## 6 moderate moderate high 50 22 Abstain 0
## WeekdayBed WeekdayRise WeekdaySleep WeekendBed WeekendRise WeekendSleep
## 1 25.75 8.70 7.70 25.75 9.50 5.88
## 2 25.70 8.20 6.80 26.00 10.00 7.25
## 3 27.44 6.55 3.00 28.00 12.59 10.09
## 4 23.50 7.17 6.77 27.00 8.00 7.25
## 5 25.90 8.67 6.09 23.75 9.50 7.00
## 6 23.80 8.95 9.05 26.00 10.75 9.00
## AverageSleep AllNighter ClassYearGroup YearGroup
## 1 7.18 0 Other Years OtherStudents
## 2 6.93 0 Other Years OtherStudents
## 3 5.02 0 Other Years OtherStudents
## 4 6.90 0 First Two Years FirstTwoYears
## 5 6.35 0 Other Years OtherStudents
## 6 9.04 0 Other Years OtherStudents
The data were obtained from https://www.lock5stat.com/datapage3e.html
Listed below is the R code used to answer the questions in this report.
#t-test for GPA by Gender
t_test_result <- t.test(GPA ~ Gender, data = sleepData, var.equal = FALSE)
#results of the t-test
print(t_test_result)
#confidence interval from the t-test result
confidence_interval <- t_test_result$conf.int
# Print the confidence interval
cat("Confidence Interval for the Difference in Means: ", confidence_interval, "\n")
#new variable for class year group (first two years vs other years)
sleepData$ClassYearGroup <- ifelse(sleepData$ClassYear %in% c(1, 2), "First Two Years", "Other Years")
#t-test
t_test_result <- t.test(NumEarlyClass ~ ClassYearGroup, data = sleepData, var.equal = FALSE)
#t-test result
print(t_test_result)
#confidence interval
confidence_interval <- t_test_result$conf.int
cat("Confidence Interval for the Difference in Means: ", confidence_interval, "\n")
#Boxplot
boxplot(NumEarlyClass ~ ClassYearGroup, data = sleepData,
main = "Early Classes by Class Year",
xlab = "Class Year Group",
ylab = "Number of Early Classes",
col = c("lightblue", "lightgreen"))
# Filter data
lark_owl <- subset(sleepData, LarkOwl %in% c("Lark", "Owl"))
# Check how many are in each group
table(lark_owl$LarkOwl)
# Means and standard deviations
lark_mean <- mean(lark_owl$CognitionZscore[lark_owl$LarkOwl == "Lark"], na.rm = TRUE)
lark_sd <- sd(lark_owl$CognitionZscore[lark_owl$LarkOwl == "Lark"], na.rm = TRUE)
owl_mean <- mean(lark_owl$CognitionZscore[lark_owl$LarkOwl == "Owl"], na.rm = TRUE)
owl_sd <- sd(lark_owl$CognitionZscore[lark_owl$LarkOwl == "Owl"], na.rm = TRUE)
cat("Lark Mean:", lark_mean, " | SD:", lark_sd, "\n")
cat("Owl Mean:", owl_mean, " | SD:", owl_sd, "\n")
# Boxplot to visualize differences
boxplot(CognitionZscore ~ LarkOwl, data = lark_owl,
main = "Cognition Z-Score by Chronotype",
xlab = "Chronotype", ylab = "Cognition Z-Score",
col = c("lightblue", "lightgreen"))
# Histograms for both groups
hist(lark_owl$CognitionZscore[lark_owl$LarkOwl == "Lark"],
main = "Lark Cognition Z-Score", xlab = "Z-Score", col = "skyblue", breaks = 15)
hist(lark_owl$CognitionZscore[lark_owl$LarkOwl == "Owl"],
main = "Owl Cognition Z-Score", xlab = "Z-Score", col = "lightgreen", breaks = 15)
# Two-sample t-test
test_result <- t.test(CognitionZscore ~ LarkOwl, data = lark_owl)
# Display t-test result
print(test_result)
# Subset data into two groups
early_yes <- subset(sleepData, EarlyClass == 1)$ClassesMissed
early_no <- subset(sleepData, EarlyClass == 0)$ClassesMissed
# Summary statistics
mean(early_yes)
mean(early_no)
sd(early_yes)
sd(early_no)
# Histograms for each group
hist(early_yes, main = "Classes Missed (Early Class = 1)",
xlab = "Classes Missed", col = "lightgreen")
hist(early_no, main = "Classes Missed (Early Class = 0)",
xlab = "Classes Missed", col = "skyblue")
# Perform two-sample t-test
t.test(early_yes, early_no)
# Subset the data
depressed <- subset(sleepData, DepressionStatus %in% c("moderate", "severe"))$Happiness
normal <- subset(sleepData, DepressionStatus == "normal")$Happiness
# Calculate means
mean_depressed <- mean(depressed, na.rm = TRUE)
mean_normal <- mean(normal, na.rm = TRUE)
# Two Sample t-test
t.test(depressed, normal, var.equal = FALSE)
# Boxplot
boxplot(depressed, normal,
names = c("Moderate/Severe", "Normal"),
main = "Happiness by Depression Status",
ylab = "Happiness Score",
col = c("lightblue", "lightgreen"))
# Subset the data into two groups
allnighter_yes <- sleepData$PoorSleepQuality[sleepData$AllNighter == 1]
allnighter_no <- sleepData$PoorSleepQuality[sleepData$AllNighter == 0]
# Two Sample t-test
t.test(allnighter_yes, allnighter_no)
# Boxplot
boxplot(allnighter_yes, allnighter_no,
names = c("All-Nighter", "No All-Nighter"),
ylab = "Poor Sleep Quality Score",
main = "Sleep Quality by All-Nighter Status",
col = c("tomato", "skyblue"))
# Subset data
abstain <- sleepData$StressScore[sleepData$AlcoholUse == "Abstain"]
heavy <- sleepData$StressScore[sleepData$AlcoholUse == "Heavy"]
#Two Sample t-test
t.test_result <- t.test(abstain, heavy)
print(t.test_result)
# Create a new data vector
stress_scores <- c(abstain, heavy)
group_labels <- c(rep("Abstain", length(abstain)), rep("Heavy", length(heavy)))
# Boxplot
boxplot(stress_scores ~ group_labels,
main = "Stress Scores by Alcohol Use",
xlab = "Alcohol Use Group",
ylab = "Stress Score",
col = c("lightblue", "lightcoral"),
border = "darkblue")
# Subset data
drinks_male <- sleepData$Drinks[sleepData$Gender == 1]
drinks_female <- sleepData$Drinks[sleepData$Gender == 0]
#Two Sample t-test
t.test_result <- t.test(drinks_male, drinks_female)
print(t.test_result)
# Boxplot
drinks_all <- c(drinks_male, drinks_female)
gender_labels <- c(rep("Male", length(drinks_male)), rep("Female", length(drinks_female)))
boxplot(drinks_all ~ gender_labels,
main = "Average Number of Drinks per Week by Gender",
xlab = "Gender",
ylab = "Drinks per Week",
col = c("lightblue", "lightpink"),
border = "gray40")
#correlation test
cor_test <- cor.test(sleepData$AverageSleep, sleepData$Drinks)
# Print correlation test results
print(cor_test)
#scatter plot
plot(sleepData$AverageSleep, sleepData$Drinks,
main="Scatter Plot of Average Sleep vs. Drinks per Week",
xlab="Average Sleep (hours)", ylab="Drinks per Week",
pch=19, col=rgb(0.1, 0.2, 0.5, 0.6))
#regression line
abline(lm(Drinks ~ AverageSleep, data=sleepData), col="red")
# Create a new variable to categorize students into two groups
sleepData$YearGroup <- ifelse(sleepData$ClassYear <= 2, "FirstTwoYears", "OtherStudents")
# Perform t-test
t_test_result <- t.test(WeekendSleep ~ YearGroup, data = sleepData)
# Print the results
print(t_test_result)
# Boxplot
boxplot(WeekendSleep ~ YearGroup, data = sleepData,
main = "Weekend Sleep Hours by Student Year",
xlab = "Student Group", ylab = "Average Weekend Sleep Hours",
col = c("lightblue", "lightgreen"))
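Several results in this report are written in the form t(44.71) = 1.71, p = 0.095. A small helper could build that string directly from a test result so the reported numbers cannot drift from the output. This is a minimal sketch: `format_welch()` is a hypothetical function that is not part of base R or of the original analysis, and the usage example runs on R's built-in `sleep` data rather than the study data.

```r
# Hypothetical helper: run Welch's t-test on two numeric vectors and
# return a short report string such as "t(44.71) = 1.71, p = 0.095".
format_welch <- function(x, y) {
  res <- t.test(x, y, var.equal = FALSE)
  # Report very small p-values as "< 0.001" rather than as 0.000
  p_txt <- if (res$p.value < 0.001) {
    "< 0.001"
  } else {
    sprintf("= %.3f", res$p.value)
  }
  sprintf("t(%.2f) = %.2f, p %s",
          unname(res$parameter), unname(res$statistic), p_txt)
}

# Usage with the built-in `sleep` dataset (not the study data)
g1 <- sleep$extra[sleep$group == 1]
g2 <- sleep$extra[sleep$group == 2]
print(format_welch(g1, g2))
```

The same call would work for any of the two-vector comparisons above, for example `format_welch(allnighter_yes, allnighter_no)`.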