Introduction

This report analyzes the sleep patterns of college students. The data come from a sleep study and were obtained from https://www.lock5stat.com/datasets3e/SleepStudy.csv. The dataset contains 253 individuals and 27 variables, including GPA, average sleep per night, and number of early classes per week.

The goal of this report, and of the 10 questions that follow, is to explore the relationships between college students’ lifestyle decisions and their academics.

The 10 research questions:

1. What is the average GPA of the dataset?
2. What proportion of students are male vs female?
3. Is there a correlation between average sleep and happiness?
4. Is there a correlation between poor sleep quality and stress?
5. What is the standard deviation of weekday sleep?
6. Which class year of students drink the most?
7. Which class year of students get the most average sleep?
8. What portion of students identify as a lark, night owl, or neither?
9. Do males or females miss more classes in a semester?
10. What is the average GPA of students across different alcohol uses?

Analysis

Using the data and R code, we will answer the 10 research questions proposed above. The first six rows of the data set are shown below.

sleep = read.csv("https://www.lock5stat.com/datasets3e/SleepStudy.csv")
head(sleep)
##   Gender ClassYear LarkOwl NumEarlyClass EarlyClass  GPA ClassesMissed
## 1      0         4 Neither             0          0 3.60             0
## 2      0         4 Neither             2          1 3.24             0
## 3      0         4     Owl             0          0 2.97            12
## 4      0         1    Lark             5          1 3.76             0
## 5      0         4     Owl             0          0 3.20             4
## 6      1         4 Neither             0          0 3.50             0
##   CognitionZscore PoorSleepQuality DepressionScore AnxietyScore StressScore
## 1           -0.26                4               4            3           8
## 2            1.39                6               1            0           3
## 3            0.38               18              18           18           9
## 4            1.39                9               1            4           6
## 5            1.22                9               7           25          14
## 6           -0.04                6              14            8          28
##   DepressionStatus AnxietyStatus Stress DASScore Happiness AlcoholUse Drinks
## 1           normal        normal normal       15        28   Moderate     10
## 2           normal        normal normal        4        25   Moderate      6
## 3         moderate        severe normal       45        17      Light      3
## 4           normal        normal normal       11        32      Light      2
## 5           normal        severe normal       46        15   Moderate      4
## 6         moderate      moderate   high       50        22    Abstain      0
##   WeekdayBed WeekdayRise WeekdaySleep WeekendBed WeekendRise WeekendSleep
## 1      25.75        8.70         7.70      25.75        9.50         5.88
## 2      25.70        8.20         6.80      26.00       10.00         7.25
## 3      27.44        6.55         3.00      28.00       12.59        10.09
## 4      23.50        7.17         6.77      27.00        8.00         7.25
## 5      25.90        8.67         6.09      23.75        9.50         7.00
## 6      23.80        8.95         9.05      26.00       10.75         9.00
##   AverageSleep AllNighter
## 1         7.18          0
## 2         6.93          0
## 3         5.02          0
## 4         6.90          0
## 5         6.35          0
## 6         9.04          0

Question 1: What is the average GPA of the dataset?

mean(sleep$GPA, na.rm = TRUE)
## [1] 3.243794

Across the 253 students, the average GPA is 3.24, which corresponds to roughly a B average.

Question 2: What proportion of students are male vs female?

gender_props <- (table(sleep$Gender))
gender_props
## 
##   0   1 
## 151 102
barplot(gender_props,
        main = "Number of Students by Gender",
        ylab = "Number of students",
        xlab = "Gender",
        col = c("pink", "skyblue"),
        names.arg = c("Female", "Male"))

The bar chart shows that a majority of the students in this study were female. With 151 female and 102 male students, the sample is about 60% female and 40% male.
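The percentages quoted above can be reproduced from the two printed counts. The sketch below is only an illustration: it re-enters the counts by hand instead of re-reading the data, and the names `gender_counts` and `gender_pct` are ours, not part of the original analysis.

```r
# Counts copied from the table output above (0 = female, 1 = male,
# following the labels used in the bar chart).
gender_counts <- c(Female = 151, Male = 102)

# Convert counts to proportions, then to percentages rounded to one decimal.
gender_pct <- round(100 * prop.table(gender_counts), 1)
gender_pct
```

This gives 59.7% female and 40.3% male. With the full data loaded, `prop.table(table(sleep$Gender))` should return the same proportions.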

Question 3: Is there a correlation between average sleep and happiness?

cor(sleep$AverageSleep, sleep$Happiness, use = "complete.obs")
## [1] 0.1038736

The correlation came out to be about 0.10, which is very surprising to me. I had assumed that more sleep would lead to more happiness, but the dataset suggests that there is only a very weak positive linear relationship between the average amount of sleep a student gets and their happiness.

Question 4: Is there a correlation between poor sleep quality and stress?

cor(sleep$PoorSleepQuality, sleep$StressScore, use = "complete.obs")
## [1] 0.3275876

The correlation came out to be about 0.33 this time. This suggests a weak-to-moderate positive linear relationship between the two: students with worse sleep quality tend to report higher stress scores.
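One way to put the two correlations on a common scale is to square them, which gives the share of variance a straight-line fit would account for. The sketch below is a small illustration using the values printed for Questions 3 and 4; the variable names are ours.

```r
# Correlations copied from the output of Questions 3 and 4.
r_sleep_happiness <- 0.1038736
r_quality_stress  <- 0.3275876

# Squaring r gives the share of variance in one variable that a
# straight-line fit on the other would account for.
r_squared <- c(SleepHappiness = r_sleep_happiness^2,
               QualityStress  = r_quality_stress^2)

# Express as percentages, rounded to one decimal.
round(100 * r_squared, 1)
```

By this measure, average sleep accounts for about 1.1% of the variation in happiness, while poor sleep quality accounts for about 10.7% of the variation in stress scores.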

Question 5: What is the standard deviation of weekday sleep?

sd(sleep$WeekdaySleep, na.rm = TRUE)
## [1] 1.167788
mean(sleep$WeekdaySleep, na.rm = TRUE)
## [1] 7.866008

The standard deviation of weekday sleep is 1.17 hours, with the average being roughly 7.87 hours. If weekday sleep is approximately normally distributed, this means that about 68% of the students sleep between 6.70 and 9.03 hours.
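The range of one standard deviation around the mean can be checked from the two printed values. The sketch below is only an illustration: the variable names are ours, and the normal-distribution share applies to this sample only if weekday sleep is roughly normal, which this report does not check.

```r
# Summary statistics copied from the output above.
mean_weekday <- 7.866008
sd_weekday   <- 1.167788

# Interval covering one standard deviation on either side of the mean.
one_sd_range <- c(lower = mean_weekday - sd_weekday,
                  upper = mean_weekday + sd_weekday)
round(one_sd_range, 2)

# Share of a normal distribution that falls within one SD of its mean.
within_one_sd <- pnorm(1) - pnorm(-1)
round(within_one_sd, 3)
```

The interval runs from about 6.70 to 9.03 hours, and a normal distribution places about 68.3% of its values within one standard deviation of the mean.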

Question 6: Which class year of students drink the most?

avg_drinks <- tapply(sleep$Drinks, sleep$ClassYear, mean, na.rm = TRUE)

barplot(avg_drinks,
        main = "Average Drinks per Week by Class Year",
        xlab = "Class Year",
        ylab = "Average Drinks per Week",
        col = "lightgreen",
        names.arg = c("Freshman", "Sophomore", "Junior", "Senior"))

This bar chart shows that of the four classes, Freshmen drink the least, with an average around 4.5 drinks a week, while Sophomores drink the most at around 6 drinks per week.
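The group with the highest or lowest mean can also be read off in code rather than from the chart. The sketch below uses a made-up toy data frame (`toy`, not values from SleepStudy.csv) and assumes that class-year codes 1 to 4 mean Freshman through Senior, as in the bar chart labels above.

```r
# Made-up toy data for illustration only; NOT values from SleepStudy.csv.
toy <- data.frame(ClassYear = c(1, 1, 2, 2, 3, 4),
                  Drinks    = c(2, 4, 6, 8, 5, 4))

# Attach class names to the codes so labels cannot drift out of order.
# Assumes codes 1-4 mean Freshman through Senior.
toy$ClassYear <- factor(toy$ClassYear, levels = 1:4,
                        labels = c("Freshman", "Sophomore", "Junior", "Senior"))

# Mean drinks per class year; the result carries the class names.
toy_avg <- tapply(toy$Drinks, toy$ClassYear, mean)
toy_avg

# Names of the groups with the highest and lowest mean.
names(which.max(toy_avg))
names(which.min(toy_avg))
```

Because the labels live in the factor, `barplot(toy_avg)` would pick them up without a separate `names.arg`. The same pattern could be applied to `sleep$ClassYear`.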

Question 7: Which class year of students get the most average sleep?

avg_sleep <- tapply(sleep$AverageSleep, sleep$ClassYear, mean, na.rm = TRUE)

barplot(avg_sleep,
        main = "Average Hours of Sleep per Night by Class Year",
        xlab = "Class Year",
        ylab = "Average Hours of Sleep per Night",
        col = "lightgreen",
        names.arg = c("Freshman", "Sophomore", "Junior", "Senior"))

The bar chart shows that Sophomores get slightly more average sleep than the other classes, but the difference is small. I find this funny because the second-year students are also the ones who do the most drinking on average.

Question 8: What portion of students identify as a lark, night owl, or neither?

lark_counts <- table(sleep$LarkOwl)
lark_percent <- round(100 * prop.table(lark_counts), 1)

pie(lark_counts,
    labels = paste0(names(lark_counts), " (", lark_percent, "%)"),
    main = "Student Sleep Habit Distribution",
    col = c("lightblue", "tan", "darkblue"))

Based on the pie chart, almost 20% of students identify as night owls, a little over 15% identify as larks (morning people), and the remaining 65% identify as neither. Personally, I identify as a lark.

Question 9: Do males or females miss more classes in a semester?

avg_missed <- tapply(sleep$ClassesMissed, sleep$Gender, mean, na.rm = TRUE)
avg_missed
##        0        1 
## 1.860927 2.725490
barplot(avg_missed,
        main = "Average Classes Missed by Gender",
        ylab = "Average Number of Classes Missed",
        xlab = "Gender",
        col = c("pink", "lightblue"),
        names.arg = c("Female", "Male"))

The bar plot shows that, on average, females miss 1.86 classes a semester, while males miss 2.73.

Question 10: What is the average GPA of students across different levels of alcohol use?

# Order the AlcoholUse levels from least to most drinking (the default is alphabetical)
sleep$AlcoholUse <- factor(sleep$AlcoholUse,
                           levels = c("Abstain", "Light", "Moderate", "Heavy"))

avg_gpa_alcohol <- tapply(sleep$GPA, sleep$AlcoholUse, mean, na.rm = TRUE)
avg_gpa_alcohol
##  Abstain    Light Moderate    Heavy 
## 3.321471 3.280482 3.208750 3.151250
barplot(avg_gpa_alcohol,
        main = "Average GPA by Alcohol Use",
        ylab = "Average GPA",
        xlab = "Alcohol Use Category",
        col = c("lightgreen", "yellow", "orange", "red"))

The data show that students who abstain average a GPA of 3.32, students who drink lightly average 3.28, students who drink moderately average 3.21, and students who drink heavily average 3.15. The bar plot shows the slight decline in average GPA as the level of drinking increases.
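The size of the decline can be worked out from the printed group means. The sketch below is only an illustration: it re-enters the four means by hand, and the names `avg_gpa` and `step_change` are ours.

```r
# Group means copied from the output above, ordered by drinking level.
avg_gpa <- c(Abstain = 3.321471, Light = 3.280482,
             Moderate = 3.208750, Heavy = 3.151250)

# Change in mean GPA from each category to the next heavier one.
step_change <- diff(avg_gpa)
round(step_change, 2)

# Total gap between students who abstain and heavy drinkers.
round(avg_gpa[["Abstain"]] - avg_gpa[["Heavy"]], 2)
```

Each step is negative, and the total gap between the Abstain and Heavy groups is about 0.17 grade points. These are differences in group means only. No test of significance is run here.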

Conclusion

Overall, we found that, contrary to popular belief, more sleep is not strongly associated with more happiness, but poorer sleep quality is associated with higher stress. We also found that even though Sophomores drink the most and Freshmen drink the least, all the classes get about the same amount of sleep on average, about 8 hours. Most students identified themselves as neither a lark nor a night owl. We found that men tend to miss more classes per semester than women, and that most of the students participating in the study were women. What we can take away from this analysis is that a vast number of factors determine a student’s success. It’s not as simple as sleeping more. In the future, I would like to explore more relationships, such as lark GPA vs owl GPA, the GPAs and stress levels of students who have early morning classes, and male GPA vs female GPA.

Appendix

# Question 1:
mean(sleep$GPA, na.rm = TRUE)
## [1] 3.243794
# Question 2:
gender_props <- (table(sleep$Gender))
gender_props
## 
##   0   1 
## 151 102
barplot(gender_props,
        main = "Number of Students by Gender",
        ylab = "Number of students",
        xlab = "Gender",
        col = c("pink", "skyblue"),
        names.arg = c("Female", "Male"))

# Question 3:
cor(sleep$AverageSleep, sleep$Happiness, use = "complete.obs")
## [1] 0.1038736
# Question 4:
cor(sleep$PoorSleepQuality, sleep$StressScore, use = "complete.obs")
## [1] 0.3275876
# Question 5:
sd(sleep$WeekdaySleep, na.rm = TRUE)
## [1] 1.167788
mean(sleep$WeekdaySleep, na.rm = TRUE)
## [1] 7.866008
# Question 6:
avg_drinks <- tapply(sleep$Drinks, sleep$ClassYear, mean, na.rm = TRUE)
barplot(avg_drinks,
        main = "Average Drinks per Week by Class Year",
        xlab = "Class Year",
        ylab = "Average Drinks per Week",
        col = "lightgreen",
        names.arg = c("Freshman", "Sophomore", "Junior", "Senior"))

# Question 7:
avg_sleep <- tapply(sleep$AverageSleep, sleep$ClassYear, mean, na.rm = TRUE)
barplot(avg_sleep,
        main = "Average Hours of Sleep per Night by Class Year",
        xlab = "Class Year",
        ylab = "Average Hours of Sleep per Night",
        col = "lightgreen",
        names.arg = c("Freshman", "Sophomore", "Junior", "Senior"))

# Question 8:
lark_counts <- table(sleep$LarkOwl)
lark_percent <- round(100 * prop.table(lark_counts), 1)

pie(lark_counts,
    labels = paste0(names(lark_counts), " (", lark_percent, "%)"),
    main = "Student Sleep Habit Distribution",
    col = c("lightblue", "tan", "darkblue"))

# Question 9:
avg_missed <- tapply(sleep$ClassesMissed, sleep$Gender, mean, na.rm = TRUE)
avg_missed
##        0        1 
## 1.860927 2.725490
barplot(avg_missed,
        main = "Average Classes Missed by Gender",
        ylab = "Average Number of Classes Missed",
        xlab = "Gender",
        col = c("pink", "lightblue"),
        names.arg = c("Female", "Male"))

# Question 10:
# Order the AlcoholUse levels from least to most drinking (the default is alphabetical)
sleep$AlcoholUse <- factor(sleep$AlcoholUse,
                           levels = c("Abstain", "Light", "Moderate", "Heavy"))

avg_gpa_alcohol <- tapply(sleep$GPA, sleep$AlcoholUse, mean, na.rm = TRUE)
avg_gpa_alcohol
##  Abstain    Light Moderate    Heavy 
## 3.321471 3.280482 3.208750 3.151250
barplot(avg_gpa_alcohol,
        main = "Average GPA by Alcohol Use",
        ylab = "Average GPA",
        xlab = "Alcohol Use Category",
        col = c("lightgreen", "yellow", "orange", "red"))