Introduction

We use the SleepStudy data from lock5stat.com.

I propose the following 10 questions:

  1. Is there a significant difference in the average GPA between male and female college students?

  2. Is there a significant difference in the average number of early classes between the first two class years and the other class years?

  3. Do students who identify as “larks” have significantly better cognitive skills (cognition z-score) compared to “owls”?

  4. Is there a significant difference in the average number of classes missed in a semester between students who had at least one early class (EarlyClass=1) and those who didn’t (EarlyClass=0)?

  5. Is there a significant difference in the average happiness level between students with at least moderate depression and normal depression status?

  6. Is there a significant difference in average sleep quality scores between students who reported having at least one all-nighter (AllNighter=1) and those who didn’t (AllNighter=0)?

  7. Do students who abstain from alcohol use have significantly better stress scores than those who report heavy alcohol use?

  8. Is there a significant difference in the average number of drinks per week between students of different genders?

  9. Is there a significant difference in the average weekday bedtime between students with high and low stress (Stress=High vs. Stress=Normal)?

  10. Is there a significant difference in the average hours of sleep on weekends between first two year students and other students?

Analysis

We will explore the questions here.

sleep <- read.csv("https://www.lock5stat.com/datasets3e/SleepStudy.csv")
head(sleep)
##   Gender ClassYear LarkOwl NumEarlyClass EarlyClass  GPA ClassesMissed
## 1      0         4 Neither             0          0 3.60             0
## 2      0         4 Neither             2          1 3.24             0
## 3      0         4     Owl             0          0 2.97            12
## 4      0         1    Lark             5          1 3.76             0
## 5      0         4     Owl             0          0 3.20             4
## 6      1         4 Neither             0          0 3.50             0
##   CognitionZscore PoorSleepQuality DepressionScore AnxietyScore StressScore
## 1           -0.26                4               4            3           8
## 2            1.39                6               1            0           3
## 3            0.38               18              18           18           9
## 4            1.39                9               1            4           6
## 5            1.22                9               7           25          14
## 6           -0.04                6              14            8          28
##   DepressionStatus AnxietyStatus Stress DASScore Happiness AlcoholUse Drinks
## 1           normal        normal normal       15        28   Moderate     10
## 2           normal        normal normal        4        25   Moderate      6
## 3         moderate        severe normal       45        17      Light      3
## 4           normal        normal normal       11        32      Light      2
## 5           normal        severe normal       46        15   Moderate      4
## 6         moderate      moderate   high       50        22    Abstain      0
##   WeekdayBed WeekdayRise WeekdaySleep WeekendBed WeekendRise WeekendSleep
## 1      25.75        8.70         7.70      25.75        9.50         5.88
## 2      25.70        8.20         6.80      26.00       10.00         7.25
## 3      27.44        6.55         3.00      28.00       12.59        10.09
## 4      23.50        7.17         6.77      27.00        8.00         7.25
## 5      25.90        8.67         6.09      23.75        9.50         7.00
## 6      23.80        8.95         9.05      26.00       10.75         9.00
##   AverageSleep AllNighter
## 1         7.18          0
## 2         6.93          0
## 3         5.02          0
## 4         6.90          0
## 5         6.35          0
## 6         9.04          0
# 1 = Lark or Owl, 2 = Neither (larks and owls end up in the same group)
sleep$LarkOwlGroup <- ifelse(sleep$LarkOwl %in% c("Lark", "Owl"), 1,2)
sleep$LarkOwlGroup
##   [1] 2 2 1 1 1 2 1 1 2 2 2 2 2 2 2 1 2 1 2 1 2 2 2 2 2 1 1 2 2 2 2 2 2 2 2 2 2
##  [38] 2 1 2 2 2 2 2 2 1 2 2 1 2 2 2 2 2 2 2 1 2 2 2 2 1 2 2 1 2 1 2 2 2 2 2 1 1
##  [75] 2 1 2 2 1 1 2 2 1 2 1 2 2 1 1 2 1 2 2 2 2 1 2 2 1 1 1 2 1 2 2 1 1 2 2 2 1
## [112] 1 2 2 2 1 1 1 2 2 2 2 1 2 2 1 2 2 2 2 2 1 2 2 2 2 1 1 2 2 2 1 2 2 1 1 1 1
## [149] 1 2 2 2 2 2 2 1 2 2 2 1 2 1 1 1 1 1 2 1 1 2 2 2 1 2 1 2 1 2 2 2 2 2 2 2 2
## [186] 2 1 2 2 1 1 1 1 1 2 2 2 2 1 1 2 2 1 2 1 2 2 1 2 1 2 2 2 1 1 2 2 1 2 1 1 2
## [223] 2 1 2 1 2 1 1 1 2 2 2 2 1 2 2 2 2 1 2 1 2 2 1 1 2 1 2 2 2 2 2
# 1 = Abstain or Light, 2 = any other AlcoholUse value
sleep$AlcoholUseGroup <- ifelse(sleep$AlcoholUse %in% c("Abstain", "Light"), 1, 2)
sleep$AlcoholUseGroup
##   [1] 2 2 1 1 2 1 2 1 1 2 2 2 2 1 1 1 2 2 2 2 1 1 2 1 1 1 2 2 1 2 2 1 1 2 1 1 1
##  [38] 1 2 2 2 2 1 1 1 2 1 1 2 1 2 1 1 2 2 1 2 2 1 2 1 1 2 2 1 1 1 2 2 1 1 1 2 2
##  [75] 1 1 1 1 2 2 1 2 1 1 1 1 1 1 1 2 2 2 1 2 2 2 2 2 2 2 1 1 1 1 1 2 2 2 2 2 1
## [112] 1 1 2 2 2 2 1 1 2 2 2 1 2 1 1 1 2 2 1 2 1 2 1 2 1 2 1 2 1 1 2 1 1 1 1 2 2
## [149] 2 2 1 2 2 1 2 2 2 1 1 2 2 1 2 2 1 1 2 2 2 1 1 1 2 2 2 2 2 2 2 1 2 1 1 2 2
## [186] 2 2 1 2 2 2 1 2 2 1 2 1 2 2 2 2 1 1 2 2 2 2 1 2 2 1 1 2 1 1 1 1 1 2 2 2 2
## [223] 2 1 1 2 2 2 1 1 1 1 2 1 2 2 2 1 2 2 1 1 2 2 2 2 1 1 1 1 2 2 2
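# Note: this condition tests AlcoholUse, not DepressionStatus, so in the output
# below 1 marks rows with AlcoholUse "Moderate" and 2 marks all other rows
sleep$DepressionStatusGroup <- ifelse(sleep$AlcoholUse %in% c("Normal", "Moderate"), 1, 2)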
sleep$DepressionStatusGroup
##   [1] 1 1 2 2 1 2 1 2 2 1 1 1 1 2 2 2 1 2 1 1 2 2 1 2 2 2 1 1 2 1 1 2 2 1 2 2 2
##  [38] 2 1 1 1 1 2 2 2 1 2 2 1 2 1 2 2 1 1 2 1 2 2 1 2 2 1 1 2 2 2 1 1 2 2 2 1 1
##  [75] 2 2 2 2 1 1 2 1 2 2 2 2 2 2 2 1 1 2 2 1 2 1 1 1 1 1 2 2 2 2 2 1 1 1 1 1 2
## [112] 2 2 1 1 1 1 2 2 1 1 1 2 1 2 2 2 1 1 2 1 2 1 2 2 2 1 2 1 2 2 1 2 2 2 2 1 1
## [149] 1 1 2 1 1 2 1 1 1 2 2 1 1 2 1 2 2 2 1 1 1 2 2 2 1 2 1 1 1 1 1 2 1 2 2 1 1
## [186] 1 1 2 1 1 1 2 2 1 2 1 2 1 2 1 1 2 2 1 2 2 1 2 1 1 2 2 1 2 2 2 2 2 2 1 1 2
## [223] 1 2 2 2 2 1 2 2 2 2 1 2 1 2 1 2 1 1 2 2 1 1 1 1 2 2 2 2 1 1 1
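
The recodes above do not line up exactly with the questions as posed: LarkOwlGroup puts larks and owls in the same group, and DepressionStatusGroup is built from AlcoholUse. A minimal sketch of groupings that match the questions more closely is shown here. The helper names add_groups and lark_owl_only and the new columns YearGroup and DepressionGroup are my own and not part of the analysis; I am assuming ClassYear runs from 1 to 4 and that DepressionStatus uses lowercase labels such as "normal", as in the head(sleep) output.

```r
# Hypothetical helpers; column names follow the head(sleep) output above.
add_groups <- function(data) {
  # First two class years (1, 2) versus the later years
  data$YearGroup <- ifelse(data$ClassYear <= 2, "FirstTwo", "Other")
  # Normal versus at least moderate depression (assumes lowercase labels)
  data$DepressionGroup <- ifelse(data$DepressionStatus == "normal",
                                 "normal", "at_least_moderate")
  data
}

# Keep larks and owls only, dropping "Neither" instead of recoding it
lark_owl_only <- function(data) {
  data[data$LarkOwl %in% c("Lark", "Owl"), ]
}
```

With the data loaded as above, this could be used as sleep <- add_groups(sleep), after which table(sleep$YearGroup) shows the group sizes.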

Q1: Is there a significant difference in the average GPA between male and female college students?

cor(sleep$Gender, sleep$GPA, use = "complete.obs")
## [1] -0.2445769
scatter.smooth(sleep$GPA, main = "Average GPA", xlab= "Students", ylab= "GPA" )

hist(sleep$Gender, main="Gender", xlab="Female                                 Male",ylab="Number of Students")

There is a weak negative correlation (about -0.24) between gender (0 = female, 1 = male) and GPA. Males have a lower average GPA than females in this sample.

Q2: Is there a significant difference in the average number of early classes between the first two class years and the other class years?
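
Since Q1 asks about a difference in averages, one way to test it directly is a Welch two-sample t-test, which compares the two group means without assuming equal variances. The sketch below is only an illustration: compare_means is a hypothetical helper name, and it assumes the grouping column has exactly two distinct values.

```r
# Hypothetical helper: Welch two-sample t-test of a numeric column
# across a grouping column that has exactly two levels.
compare_means <- function(data, outcome, group) {
  # Drop rows where either column is missing
  keep <- complete.cases(data[, c(outcome, group)])
  d <- data[keep, ]
  g <- factor(d[[group]])
  stopifnot(nlevels(g) == 2)
  x <- d[[outcome]][g == levels(g)[1]]
  y <- d[[outcome]][g == levels(g)[2]]
  test <- t.test(x, y)  # var.equal = FALSE by default, i.e. Welch
  list(
    means = tapply(d[[outcome]], g, mean),   # mean per group
    statistic = unname(test$statistic),      # t for first level minus second
    p_value = test$p.value,
    conf_int = as.numeric(test$conf.int)     # CI for the difference in means
  )
}
```

For Q1 the call would be compare_means(sleep, "GPA", "Gender"). A small p_value would indicate a significant difference in average GPA between the two genders.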

cor(sleep$ClassYear, sleep$NumEarlyClass, use = "complete.obs")
## [1] -0.2687247
scatter.smooth(sleep$ClassYear, main = "Class Year by Student", xlab= "Students", ylab= "Class Year" )

There is a weak negative correlation (about -0.27) between class year and the number of early classes. Students in later class years tend to take fewer early classes.

Q3: Do students who identify as “larks” have significantly better cognitive skills (cognition z-score) compared to “owls”?

cor(sleep$LarkOwlGroup, sleep$CognitionZscore, use = "complete.obs")
## [1] -0.02134276
hist(sleep$LarkOwlGroup, main= "Lark or Owl vs. Neither", xlab= "Lark or Owl                                          Neither", ylab="Students")

hist(sleep$CognitionZscore, main= "Cognition Z Score", xlab= "Z Score", ylab="Students")
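
Because LarkOwlGroup puts larks and owls in the same group, the correlation above cannot separate them. A minimal sketch of a direct comparison is a one-sided Welch t-test on larks versus owls only, leaving out students coded "Neither". The function name lark_vs_owl is hypothetical; the column names are the ones shown in head(sleep).

```r
# Hypothetical helper: do larks have a higher mean cognition z-score than owls?
lark_vs_owl <- function(data) {
  larks <- data$CognitionZscore[data$LarkOwl == "Lark"]
  owls  <- data$CognitionZscore[data$LarkOwl == "Owl"]
  # alternative = "greater" tests mean(larks) > mean(owls)
  t.test(larks, owls, alternative = "greater")
}
```

Calling lark_vs_owl(sleep) prints both group means and the one-sided p-value.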

Q4: Is there a significant difference in the average number of classes missed in a semester between students who had at least one early class (EarlyClass=1) and those who didn’t (EarlyClass=0)?

cor(sleep$ClassesMissed, sleep$NumEarlyClass, use = "complete.obs")
## [1] -0.08284114
scatter.smooth(sleep$ClassesMissed, main = "Students Missed Class", xlab= "Students", ylab= "Classes Missed")

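scatter.smooth(sleep$NumEarlyClass, main = "Number of Early Classes", xlab= "Students", ylab= "Early Classes")

There is only a very weak negative correlation (about -0.08) between the number of classes missed and the number of early classes, so the two are close to unrelated in this sample. If anything, students with more early classes missed slightly fewer classes. Note that the code uses NumEarlyClass rather than the EarlyClass indicator named in the question.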

Q5: Is there a significant difference in the average happiness level between students with at least moderate depression and normal depression status?

cor(sleep$Happiness, sleep$DepressionStatusGroup, use = "complete.obs")
## [1] 0.01485652
hist(sleep$Happiness, main = "Happiness vs Depression", xlab="Happiness", ylab="Students")

hist(sleep$DepressionStatusGroup, main= "Depression Status", xlab="Moderate or Normal Depression", ylab="Students")

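The correlation is 0.01485652, which is very close to zero. However, DepressionStatusGroup was built from AlcoholUse rather than DepressionStatus, so this value does not actually compare happiness between depression groups. According to the histograms, most students report fairly high happiness, and the two groups are of roughly equal size.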

Q6: Is there a significant difference in average sleep quality scores between students who reported having at least one all-nighter (AllNighter=1) and those who didn’t (AllNighter=0)?

cor(sleep$PoorSleepQuality, sleep$AllNighter, use = "complete.obs")
## [1] 0.1044542
scatter.smooth(sleep$PoorSleepQuality, main = "Sleep Quality", xlab= "Students", ylab= "Poor Sleep Quality")

hist(sleep$AllNighter, main = "Pulling an All Nighter", xlab= "Pulled an All Nighter", ylab= "Students")

The correlation between poor sleep quality and having pulled an all-nighter is weak (about 0.10). Students who pulled an all-nighter report slightly poorer sleep quality, but the relationship is small.
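
A correlation coefficient alone does not say whether the relationship is statistically significant. A short sketch using cor.test, which reports a p-value along with r, is given below; cor_with_p is a hypothetical wrapper name. When one of the two variables is a 0/1 indicator such as AllNighter, the Pearson correlation test gives the same p-value as a two-sample t-test with pooled variance.

```r
# Hypothetical wrapper: Pearson correlation together with its p-value.
cor_with_p <- function(x, y) {
  test <- cor.test(x, y)  # Pearson by default; incomplete pairs are dropped
  c(r = unname(test$estimate), p_value = test$p.value)
}
```

For Q6 the call would be cor_with_p(sleep$AllNighter, sleep$PoorSleepQuality).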

Q7: Do students who abstain from alcohol use have significantly better stress scores than those who report heavy alcohol use?

cor(sleep$AlcoholUseGroup, sleep$StressScore, use = "complete.obs")
## [1] 0.01555206
hist(sleep$AlcoholUseGroup, main = "Alcohol Use", xlab="Not Drink and Drinking", ylab="Students")

hist(sleep$StressScore, main="Stress Levels", xlab="Amount of Stress", ylab="Students")

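The correlation between AlcoholUseGroup and stress score is extremely low (about 0.02), so the two appear unrelated. Note that AlcoholUseGroup compares abstainers and light drinkers with all other students, not abstainers with heavy drinkers only.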

Q8: Is there a significant difference in the average number of drinks per week between students of different genders?

cor(sleep$Drinks, sleep$Gender, use = "complete.obs")
## [1] 0.3961698
hist(sleep$Drinks, main= "Number of Drinks per Week", xlab="Number of Drinks", ylab= "Students")

hist(sleep$Gender, main="Gender", xlab="Female                                                                    Male", ylab="Students")

There is a moderate positive correlation (about 0.40) between gender and the number of drinks per week. Male students (Gender = 1) tend to report more drinks per week than female students.

Q9: Is there a significant difference in the average weekday bedtime between students with high and low stress (Stress=High vs. Stress=Normal)?

cor(sleep$WeekdaySleep, sleep$StressScore)
## [1] -0.09220388
hist(sleep$WeekdaySleep, main="Average Hours of Sleep on Weekdays", ylab="Students", xlab= "Hours of Sleep")

hist(sleep$StressScore, main= "Amount of Stress", xlab="Stress Score", ylab="Students")

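The correlation between weekday hours of sleep and stress score is low (about -0.09). Note that the code uses WeekdaySleep and StressScore rather than WeekdayBed and the Stress category named in the question, so what it shows is that the hours you sleep on weekdays are only weakly related to one’s stress.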

Q10: Is there a significant difference in the average hours of sleep on weekends between first two year students and other students?

cor(sleep$WeekdayBed, sleep$ClassYear)
## [1] -0.002674365
hist(sleep$WeekdayBed, main= "Average Weekday Bedtime", ylab="Students", xlab="Time(24 is Midnight)")

hist(sleep$ClassYear, main="Year of Schooling", xlab="Year", ylab="Students")

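There is essentially no correlation between weekday bedtime and the year of schooling you’re in. The correlation is very close to zero. Note that the code uses WeekdayBed, so it does not address the weekend hours of sleep (WeekendSleep) named in the question.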
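
To address Q10 as it is worded, the comparison would use WeekendSleep and a two-level class year grouping. The sketch below assumes ClassYear values of 1 and 2 are the first two years; the function name weekend_sleep_by_year and the column YearGroup are hypothetical.

```r
# Hypothetical helper: weekend hours of sleep, first two years vs. other years.
weekend_sleep_by_year <- function(data) {
  # Keep rows where both variables are present
  d <- data[!is.na(data$ClassYear) & !is.na(data$WeekendSleep), ]
  d$YearGroup <- factor(ifelse(d$ClassYear <= 2, "FirstTwo", "Other"))
  list(
    # Mean weekend sleep in each group
    means = aggregate(WeekendSleep ~ YearGroup, data = d, FUN = mean),
    # Welch two-sample t-test of the difference in means
    test = t.test(WeekendSleep ~ YearGroup, data = d)
  )
}
```

Calling weekend_sleep_by_year(sleep) returns the two group means and the test result, whose p.value can be compared with the chosen significance level.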