This report analyzes the sleep patterns of college students. The data come from a sleep study and were obtained from https://www.lock5stat.com/datasets3e/SleepStudy.csv. The dataset contains 253 students and 27 variables, including GPA, average sleep per night, and number of early classes per week.
The goal of this report, and of the 10 questions that follow, is to explore the relationships between college students’ lifestyle decisions and their academics.
The 10 research questions:
1. What is the average GPA of the dataset?
2. What proportion of students are male vs. female?
3. Is there a correlation between average sleep and happiness?
4. Is there a correlation between poor sleep quality and stress?
5. What is the standard deviation of weekday sleep?
6. Which class year drinks the most?
7. Which class year gets the most sleep on average?
8. What proportion of students identify as a lark, a night owl, or neither?
9. Do males or females miss more classes in a semester?
10. What is the average GPA of students across different levels of alcohol use?
Using the data and R code, we will answer the 10 research questions proposed above.
The first six rows of the data set are shown below.
sleep = read.csv("https://www.lock5stat.com/datasets3e/SleepStudy.csv")
head(sleep)
## Gender ClassYear LarkOwl NumEarlyClass EarlyClass GPA ClassesMissed
## 1 0 4 Neither 0 0 3.60 0
## 2 0 4 Neither 2 1 3.24 0
## 3 0 4 Owl 0 0 2.97 12
## 4 0 1 Lark 5 1 3.76 0
## 5 0 4 Owl 0 0 3.20 4
## 6 1 4 Neither 0 0 3.50 0
## CognitionZscore PoorSleepQuality DepressionScore AnxietyScore StressScore
## 1 -0.26 4 4 3 8
## 2 1.39 6 1 0 3
## 3 0.38 18 18 18 9
## 4 1.39 9 1 4 6
## 5 1.22 9 7 25 14
## 6 -0.04 6 14 8 28
## DepressionStatus AnxietyStatus Stress DASScore Happiness AlcoholUse Drinks
## 1 normal normal normal 15 28 Moderate 10
## 2 normal normal normal 4 25 Moderate 6
## 3 moderate severe normal 45 17 Light 3
## 4 normal normal normal 11 32 Light 2
## 5 normal severe normal 46 15 Moderate 4
## 6 moderate moderate high 50 22 Abstain 0
## WeekdayBed WeekdayRise WeekdaySleep WeekendBed WeekendRise WeekendSleep
## 1 25.75 8.70 7.70 25.75 9.50 5.88
## 2 25.70 8.20 6.80 26.00 10.00 7.25
## 3 27.44 6.55 3.00 28.00 12.59 10.09
## 4 23.50 7.17 6.77 27.00 8.00 7.25
## 5 25.90 8.67 6.09 23.75 9.50 7.00
## 6 23.80 8.95 9.05 26.00 10.75 9.00
## AverageSleep AllNighter
## 1 7.18 0
## 2 6.93 0
## 3 5.02 0
## 4 6.90 0
## 5 6.35 0
## 6 9.04 0
mean(sleep$GPA, na.rm = TRUE)
## [1] 3.243794
Across the 253 students, the average GPA is 3.24, which is roughly a B average.
gender_props <- (table(sleep$Gender))
gender_props
##
## 0 1
## 151 102
barplot(gender_props,
main = "Number of Students by Gender",
ylab = "Number of students",
xlab = "Gender",
col = c("pink", "skyblue"),
names.arg = c("Female", "Male"))
The bar chart shows that a majority of the students in this study were
female: 151 female and 102 male students, which comes out to about 60%
female and 40% male.
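Because question 2 asks for proportions rather than counts, the percentages can be checked directly. The following is a minimal sketch that re-enters the two counts by hand from the table output (151 and 102) instead of re-reading the CSV; the names gender_counts and gender_pct are introduced here for illustration only.

```r
# Counts copied from the table() output in this report (0 = Female, 1 = Male)
gender_counts <- c(Female = 151, Male = 102)

# prop.table() divides each count by the total (253)
gender_pct <- round(100 * prop.table(gender_counts), 1)
gender_pct
```

This gives 59.7% female and 40.3% male, in line with the rounded 60% and 40%.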
cor(sleep$AverageSleep, sleep$Happiness, use = "complete.obs")
## [1] 0.1038736
The correlation came out to be about 0.10, which is very surprising to me. I had assumed that more sleep would lead to more happiness, but the dataset suggests that there is only a very weak positive linear relationship between the average amount of sleep a student gets and their happiness.
cor(sleep$PoorSleepQuality, sleep$StressScore, use = "complete.obs")
## [1] 0.3275876
The correlation came out to be about 0.33 this time. This suggests a weak-to-moderate positive linear relationship between the two: students with worse sleep quality tend to report higher stress scores.
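Neither correlation in this report comes with a significance test. One rough way to gauge them is the usual t statistic for a Pearson correlation, t = r * sqrt(n - 2) / sqrt(1 - r^2). The sketch below applies it to the two reported values. It assumes that all 253 rows are complete for these columns, which is not checked here, and cor_t is a hypothetical helper name. With the data loaded, cor.test() would run the same test directly.

```r
# Hypothetical helper: t statistic and two-sided p-value for a Pearson r
cor_t <- function(r, n) {
  t_stat <- r * sqrt(n - 2) / sqrt(1 - r^2)
  p_val  <- 2 * pt(-abs(t_stat), df = n - 2)
  c(t = t_stat, p = p_val)
}

n <- 253  # assumption: no missing values in the columns used

sleep_happy    <- cor_t(0.1038736, n)  # AverageSleep vs Happiness
quality_stress <- cor_t(0.3275876, n)  # PoorSleepQuality vs StressScore

round(sleep_happy, 3)
round(quality_stress, 3)
```

Under that assumption, the sleep and happiness correlation gives a t statistic of about 1.65, which falls short of the conventional 5% level, while the sleep quality and stress correlation gives a t statistic of about 5.5, which is well past it.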
sd(sleep$WeekdaySleep, na.rm = TRUE)
## [1] 1.167788
mean(sleep$WeekdaySleep, na.rm = TRUE)
## [1] 7.866008
The standard deviation of weekday sleep is 1.17 hours, with the average being roughly 7.87 hours. If weekday sleep is roughly normally distributed, this means that about 68% of the students sleep between about 6.7 and 9.0 hours on weekdays.
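The range of one standard deviation around the mean can be reproduced from the reported mean and standard deviation. This sketch uses the printed values rather than the raw column, and the coverage figure relies on the assumption that weekday sleep is roughly normal, which this report does not check.

```r
m <- 7.866008  # reported mean of WeekdaySleep (hours)
s <- 1.167788  # reported standard deviation of WeekdaySleep (hours)

# Range covering one standard deviation on either side of the mean
one_sd_range <- c(lower = m - s, upper = m + s)
round(one_sd_range, 2)

# Share of a normal distribution that falls within one SD of its mean
normal_share <- pnorm(1) - pnorm(-1)
round(100 * normal_share, 1)
```

The range comes out to about 6.70 to 9.03 hours, and the normal share is about 68.3%. With the sleep data frame loaded, mean(abs(sleep$WeekdaySleep - m) <= s) would give the share actually observed in the sample.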
avg_drinks <- tapply(sleep$Drinks, sleep$ClassYear, mean, na.rm = TRUE)
barplot(avg_drinks,
main = "Average Drinks per Week by Class Year",
xlab = "Class Year",
ylab = "Average Drinks per Week",
col = "lightgreen",
names.arg = c("Freshman", "Sophomore", "Junior", "Senior"))
This bar chart shows that of the four classes, Freshmen drink the least, with an average around 4.5 drinks a week, while Sophomores drink the most at around 6 drinks per week.
avg_sleep <- tapply(sleep$AverageSleep, sleep$ClassYear, mean, na.rm = TRUE)
barplot(avg_sleep,
main = "Average Hours of Sleep per Night by Class Year",
xlab = "Class Year",
ylab = "Average Hours of Sleep per Night",
col = "lightgreen",
names.arg = c("Freshman", "Sophomore", "Junior", "Senior"))
The bar chart shows that Sophomores get slightly more sleep on average than the other classes, but the difference is small, and we did not test whether it is statistically significant. I find this funny because the second-year students are also the ones who do the most drinking on average.
lark_counts <- table(sleep$LarkOwl)
lark_percent <- round(100 * prop.table(lark_counts), 1)
pie(lark_counts,
labels = paste(names(lark_counts), "(", lark_percent, "%)", sep=""),
main = "Student Sleep Habit Distribution",
col = c("lightblue", "tan", "darkblue"))
Based on the pie chart, almost 20% of students identify as night owls, a little over 15% identify as larks (morning people), and the remaining 65% identify as neither a night owl nor a lark. I personally identify as a lark.
avg_missed <- tapply(sleep$ClassesMissed, sleep$Gender, mean, na.rm = TRUE)
avg_missed
## 0 1
## 1.860927 2.725490
barplot(avg_missed,
main = "Average Classes Missed by Gender",
ylab = "Average Number of Classes Missed",
xlab = "Gender",
col = c("pink", "lightblue"),
names.arg = c("Female", "Male"))
The bar plot shows that, on average, females miss 1.86 classes a semester, while males miss 2.73.
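The group sizes matter when reading this comparison. As a small check, multiplying each average by its group size (151 women and 102 men, from the gender table earlier in this report) recovers the total number of classes missed per group. The vector names below are chosen for illustration.

```r
group_n              <- c(Female = 151, Male = 102)            # counts from table(sleep$Gender)
avg_missed_by_gender <- c(Female = 1.860927, Male = 2.725490)  # reported group means

# Total classes missed = mean * group size (rounded to undo print rounding)
total_missed <- round(avg_missed_by_gender * group_n)
total_missed

# Overall average across all 253 students
overall_avg <- sum(total_missed) / sum(group_n)
round(overall_avg, 2)
```

The two groups miss nearly the same total number of classes (281 and 278), but because there are fewer men in the sample, their per-student average is higher. This is why the comparison uses averages rather than totals.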
#Change order
sleep$AlcoholUse <- factor(sleep$AlcoholUse,
levels = c("Abstain", "Light", "Moderate", "Heavy"))
avg_gpa_alcohol <- tapply(sleep$GPA, sleep$AlcoholUse, mean, na.rm = TRUE)
avg_gpa_alcohol
## Abstain Light Moderate Heavy
## 3.321471 3.280482 3.208750 3.151250
barplot(avg_gpa_alcohol,
main = "Average GPA by Alcohol Use",
ylab = "Average GPA",
xlab = "Alcohol Use Category",
col = c("lightgreen", "yellow", "orange", "red"))
The data show that students who abstain average a GPA of 3.32, students who drink lightly average 3.28, students who drink moderately average 3.21, and students who drink heavily average 3.15. The bar plot shows a slight decline in average GPA as the level of drinking increases.
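To put numbers on this decline, the step between neighboring categories can be computed with diff(). This sketch re-enters the four reported means by hand; gpa_by_alcohol, gpa_steps, and gpa_gap are illustrative names.

```r
gpa_by_alcohol <- c(Abstain = 3.321471, Light = 3.280482,
                    Moderate = 3.208750, Heavy = 3.151250)

# Change in average GPA from each category to the next heavier one
gpa_steps <- diff(gpa_by_alcohol)
gpa_steps

# Total gap between students who abstain and heavy drinkers
gpa_gap <- unname(gpa_by_alcohol["Abstain"] - gpa_by_alcohol["Heavy"])
round(gpa_gap, 2)
```

Each step is negative, and the total gap is about 0.17 grade points. These are differences in group means only; group sizes and spread are not taken into account.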
Overall, we found that, contrary to popular belief, more sleep is not strongly linked to being happier, but poorer sleep quality is associated with higher stress. We also found that even though Sophomores drink the most and Freshmen drink the least, all the classes get about the same amount of sleep on average, roughly 8 hours a night. Most students also identified themselves as neither a lark nor a night owl. We found that men tend to miss more classes per semester than women, and that most of the students participating in the study were women. What we can take away from this analysis is that a vast number of factors determine a student’s success. It’s not as simple as sleeping more. In the future, I would like to explore more relationships, such as lark GPA vs. owl GPA, the GPAs and stress levels of students who have early morning classes, and male GPA vs. female GPA.
# Question 1:
mean(sleep$GPA, na.rm = TRUE)
## [1] 3.243794
# Question 2:
gender_props <- (table(sleep$Gender))
gender_props
##
## 0 1
## 151 102
barplot(gender_props,
main = "Number of Students by Gender",
ylab = "Number of students",
xlab = "Gender",
col = c("pink", "skyblue"),
names.arg = c("Female", "Male"))
# Question 3:
cor(sleep$AverageSleep, sleep$Happiness, use = "complete.obs")
## [1] 0.1038736
# Question 4:
cor(sleep$PoorSleepQuality, sleep$StressScore, use = "complete.obs")
## [1] 0.3275876
# Question 5:
sd(sleep$WeekdaySleep, na.rm = TRUE)
## [1] 1.167788
mean(sleep$WeekdaySleep, na.rm = TRUE)
## [1] 7.866008
# Question 6:
avg_drinks <- tapply(sleep$Drinks, sleep$ClassYear, mean, na.rm = TRUE)
barplot(avg_drinks,
main = "Average Drinks per Week by Class Year",
xlab = "Class Year",
ylab = "Average Drinks per Week",
col = "lightgreen",
names.arg = c("Freshman", "Sophomore", "Junior", "Senior"))
# Question 7:
avg_sleep <- tapply(sleep$AverageSleep, sleep$ClassYear, mean, na.rm = TRUE)
barplot(avg_sleep,
main = "Average Hours of Sleep per Night by Class Year",
xlab = "Class Year",
ylab = "Average Hours of Sleep per Night",
col = "lightgreen",
names.arg = c("Freshman", "Sophomore", "Junior", "Senior"))
# Question 8:
lark_counts <- table(sleep$LarkOwl)
lark_percent <- round(100 * prop.table(lark_counts), 1)
pie(lark_counts,
labels = paste(names(lark_counts), "(", lark_percent, "%)", sep=""),
main = "Student Sleep Habit Distribution",
col = c("lightblue", "tan", "darkblue"))
# Question 9:
avg_missed <- tapply(sleep$ClassesMissed, sleep$Gender, mean, na.rm = TRUE)
avg_missed
## 0 1
## 1.860927 2.725490
barplot(avg_missed,
main = "Average Classes Missed by Gender",
ylab = "Average Number of Classes Missed",
xlab = "Gender",
col = c("pink", "lightblue"),
names.arg = c("Female", "Male"))
# Question 10:
#Change order
sleep$AlcoholUse <- factor(sleep$AlcoholUse,
levels = c("Abstain", "Light", "Moderate", "Heavy"))
avg_gpa_alcohol <- tapply(sleep$GPA, sleep$AlcoholUse, mean, na.rm = TRUE)
avg_gpa_alcohol
## Abstain Light Moderate Heavy
## 3.321471 3.280482 3.208750 3.151250
barplot(avg_gpa_alcohol,
main = "Average GPA by Alcohol Use",
ylab = "Average GPA",
xlab = "Alcohol Use Category",
col = c("lightgreen", "yellow", "orange", "red"))