Introduction

In this report, we analyze the sleep patterns of college students using the SleepStudy data set, which contains 253 observations of 27 variables. To decide whether a significant difference exists between the compared values, a significance level of 0.05 was chosen, because it is a very common threshold for statistical significance. If a test's p-value is below 0.05, the difference is statistically significant; otherwise it is not.
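The decision rule described above can be sketched in R, the language used in the Appendix. The helper name `is_significant` is an assumption made for illustration and is not part of the original analysis; the two p-values passed to it are the ones reported for Q1 and Q3.

```r
# Significance level chosen in the Introduction
alpha <- 0.05

# Hypothetical helper (not part of the original analysis):
# returns TRUE when a p-value falls below the significance level
is_significant <- function(p_value, level = alpha) {
  p_value < level
}

q1_significant <- is_significant(0.0001243)  # p-value reported for Q1
q3_significant <- is_significant(0.4229)     # p-value reported for Q3
print(c(Q1 = q1_significant, Q3 = q3_significant))
```

The comparison is strict, so a p-value of exactly 0.05 would not count as significant under this rule.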

Questions

In this report, we will analyze the following questions:

  1. Is there a significant difference in the average GPA between male and female college students? Hypothesis: The null hypothesis is true.

  2. Is there a significant difference in the average number of early classes between the first two class years and other class years? Hypothesis: The alternative hypothesis is true.

  3. Do students who identify as “larks” have significantly better cognitive skills (cognition z-score) compared to “owls”? Hypothesis: The null hypothesis is true.

  4. Is there a significant difference in the average happiness level between students with at least moderate depression and normal depression status? Hypothesis: The alternative hypothesis is true.

  5. Is there a significant difference in the amount of hours slept during the week vs. on the weekend? Hypothesis: The null hypothesis is true.

  6. Is there a significant difference in average sleep quality scores between students who reported having at least one all-nighter (AllNighter=1) and those who didn’t (AllNighter=0)? Hypothesis: The alternative hypothesis is true.

  7. Do students who abstain from alcohol use have significantly better stress scores than those who report heavy alcohol use? Hypothesis: The alternative hypothesis is true.

  8. Is there a significant difference in the amount of anxiety a student has compared to the amount of stress a student has? Hypothesis: The alternative hypothesis is true.

  9. Is there a significant difference between GPA and the number of classes missed? Hypothesis: The alternative hypothesis is true.

  10. Is there a significant difference in the average hours of sleep on weekends between first two year students and other students? Hypothesis: The null hypothesis is true.

Analysis

Q1: Is there a significant difference in the average GPA between male and female college students?

The p-value for the test of GPA by gender is 0.0001243, which is below 0.05. That means there is a significant difference in GPA between male and female students. The 95% confidence interval for the difference in means was from 0.0998 to 0.303. The mean GPA for females was 3.32, and for males it was 3.12.

My original hypothesis was false.

Q2: Is there a significant difference in the average number of early classes between the first two class years and other class years?

After data analysis, the p-value for this test was 0.00004009. This implies that there is a significant difference in the average number of early classes based on class year. The t statistic was 4.1813, which is consistent with this result. The 95% confidence interval was from 0.404 to 1.124. The mean number of early classes for first and second years was 2.07, and the mean for third and fourth years was 1.31.

My original hypothesis was true.
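The Q2 comparison depends on splitting students into two class-year groups. A minimal sketch of that grouping step follows, using only the first six rows of ClassYear and NumEarlyClass shown by head(db) in the Appendix. The labels `first_two` and `other` are assumptions made for illustration, and six rows are far too few for a t-test, so only the group means are computed here.

```r
# First six rows of ClassYear and NumEarlyClass, copied from head(db) in the Appendix
class_year <- c(4, 4, 4, 1, 4, 4)
num_early  <- c(0, 2, 0, 5, 0, 0)

# Hypothetical labels for the two groups compared in Q2
year_group <- ifelse(class_year %in% c(1, 2), "first_two", "other")

# Mean number of early classes within each group
group_means <- tapply(num_early, year_group, mean)
print(group_means)
```

On the full data set, the same grouping variable could be passed to the formula interface of t.test instead of subsetting each group by hand.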

Q3: Do students who identify as “larks” have significantly better cognitive skills (cognition z-score) compared to “owls”?

The p-value for this test came out to be 0.4229. This means there is not a significant difference in the cognition z-score between lark and owl students. The 95% confidence interval was between -0.19 and 0.45. The mean z-score for larks was 0.09, and for owls it was -0.038.

My original hypothesis was true.
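For a two-sided test at the 0.05 level, a 95% confidence interval for the difference in means that contains 0 goes together with a p-value above 0.05. The sketch below checks this for the Q3 and Q1 intervals printed in the Appendix; the helper name `contains_zero` is an assumption made for illustration.

```r
# Hypothetical helper: TRUE when the interval [lower, upper] includes 0
contains_zero <- function(lower, upper) {
  lower <= 0 && upper >= 0
}

# Q3 interval from the Appendix (p-value = 0.4229, not significant)
q3_ci_has_zero <- contains_zero(-0.1893561, 0.4465786)

# Q1 interval from the Appendix (p-value = 0.0001243, significant)
q1_ci_has_zero <- contains_zero(0.09982254, 0.30252780)

print(c(Q3 = q3_ci_has_zero, Q1 = q1_ci_has_zero))
```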

Q4: Is there a significant difference in the average happiness level between students with at least moderate depression and normal depression status?

The test compared the moderate and normal depression groups, with the severe group excluded. The p-value came out to be 0.00008616. This means there is a significant difference in happiness levels between students with moderate and normal depression status. The 95% confidence interval was from -5.82 to -2.12. The mean happiness score was 23.09 for the moderate group and 27.06 for the normal group.

My original hypothesis was true.

Q5: Is there a significant difference in the amount of hours slept during the week vs. on the weekend?

For this question, I decided to perform a paired t-test instead of a two-sample one, since each student has both a weekday and a weekend value. The p-value came out to be 0.001025. This means that there is a significant difference in the number of hours slept on weekends vs. weekdays. The 95% confidence interval is between -0.559 and -0.143. The average difference in hours slept (weekday minus weekend) is -0.351. This means that students sleep about 0.35 hours more on weekends on average.

My original hypothesis was false.
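A paired t-test is equivalent to a one-sample t-test on the row-wise differences. The sketch below illustrates this with the first six rows of WeekdaySleep and WeekendSleep shown by head(db) in the Appendix; the six-row subset is used only to keep the example self-contained, and its results are not those of the full data set.

```r
# First six rows of WeekdaySleep and WeekendSleep, copied from head(db) in the Appendix
weekday <- c(7.70, 6.80, 3.00, 6.77, 6.09, 9.05)
weekend <- c(5.88, 7.25, 10.09, 7.25, 7.00, 9.00)

# Paired t-test on the two columns
paired <- t.test(weekday, weekend, paired = TRUE)

# One-sample t-test on the per-student differences (weekday minus weekend)
one_sample <- t.test(weekday - weekend)

# Both approaches give the same t statistic and p-value
same_t <- isTRUE(all.equal(unname(paired$statistic), unname(one_sample$statistic)))
same_p <- isTRUE(all.equal(paired$p.value, one_sample$p.value))
print(c(same_t = same_t, same_p = same_p))
```

Because the difference is computed as weekday minus weekend, a negative mean difference corresponds to more sleep on weekends.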

Q6: Is there a significant difference in average sleep quality scores between students who reported having at least one all-nighter (AllNighter=1) and those who didn’t (AllNighter=0)?

After analysis, no significant difference in sleep quality was found between students who had pulled an all-nighter and those who had not, because the p-value of the t-test was 0.09479, which is above 0.05. The 95% confidence interval ranged from -1.946 to 0.161. The mean PoorSleepQuality score was 6.14 for students without an all-nighter and 7.03 for students with one.

My original hypothesis was false.

Q7: Do students who abstain from alcohol use have significantly better stress scores than those who report heavy alcohol use?

After performing a t-test on the data, a p-value of 0.5362 was found. This means that there is no significant difference in stress scores between students who abstain from alcohol and those who report heavy use. The 95% confidence interval was between -6.26 and 3.33. The mean stress score for those who abstained was 8.97, and for those who drank heavily, the mean score was 10.44.

My original hypothesis was false.

Q8: Is there a significant difference in the amount of anxiety a student has compared to the amount of stress a student has?

For this test, I used another paired t-test. After computation, a p-value of less than 2.2e-16 was obtained. This means that there is a significant difference between a student’s anxiety score and stress score. The 95% confidence interval was between -4.80 and -3.39. On average, the anxiety scores were 4.09 less than the stress scores.

My original hypothesis was true.
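The bound 2.2e-16 in the Q8 output is not an exact p-value. It is R's double-precision machine epsilon, which is the default cutoff below which format.pval reports a p-value only as an upper bound. A short sketch, where `tiny_p` is a hypothetical value chosen for illustration:

```r
# Machine epsilon for double precision, roughly 2.22e-16
eps <- .Machine$double.eps
print(eps)

# Hypothetical p-value smaller than machine epsilon
tiny_p <- 1e-20
below_eps <- tiny_p < eps

# format.pval reports such values as an upper bound rather than an exact number
print(format.pval(tiny_p, digits = 4))
```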

Q9: Is there a significant difference between GPA and the number of classes missed?

After performing a paired t-test on this question, it was found to have a t-value of 4.93 and a p-value of 1.497e-06. This means that there is a significant difference between a student’s GPA and the number of classes missed. The 95% confidence interval was between 0.621 and 1.45. The mean paired difference (GPA minus classes missed) was 1.03. Because the two variables are measured in different units, this is a difference in raw values and not a change in classes missed per GPA point.

My original hypothesis was true.
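The mean difference reported by a paired t-test is the average of the row-wise differences, which equals the difference between the two column means. The sketch below shows this with the first six rows of GPA and ClassesMissed from head(db) in the Appendix; the six-row subset is only for illustration and does not reproduce the full-data estimate.

```r
# First six rows of GPA and ClassesMissed, copied from head(db) in the Appendix
gpa    <- c(3.60, 3.24, 2.97, 3.76, 3.20, 3.50)
missed <- c(0, 0, 12, 0, 4, 0)

# Average of the per-student differences (GPA minus classes missed)
mean_of_differences <- mean(gpa - missed)

# Difference between the two column means
difference_of_means <- mean(gpa) - mean(missed)

# The two quantities agree up to floating-point error
print(all.equal(mean_of_differences, difference_of_means))
```

This is why the estimate compares the typical size of the two columns and does not describe how classes missed change with GPA.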

Q10: Is there a significant difference in the average hours of sleep on weekends between first two year students and other students?

After using a t-test on the data, a p-value of 0.9618 was found. Because of the high p-value, there is no significant difference in the average number of hours slept on the weekend between first and second year students and third and fourth year students. On average, first and second year students slept 8.214 hours and third and fourth year students slept 8.222 hours. The 95% confidence interval was between -0.350 and 0.333.

My original hypothesis was true.

Summary

In conclusion we learned that

Appendix

db = read.csv("https://www.lock5stat.com/datasets3e/SleepStudy.csv")
head(db)
##   Gender ClassYear LarkOwl NumEarlyClass EarlyClass  GPA ClassesMissed
## 1      0         4 Neither             0          0 3.60             0
## 2      0         4 Neither             2          1 3.24             0
## 3      0         4     Owl             0          0 2.97            12
## 4      0         1    Lark             5          1 3.76             0
## 5      0         4     Owl             0          0 3.20             4
## 6      1         4 Neither             0          0 3.50             0
##   CognitionZscore PoorSleepQuality DepressionScore AnxietyScore StressScore
## 1           -0.26                4               4            3           8
## 2            1.39                6               1            0           3
## 3            0.38               18              18           18           9
## 4            1.39                9               1            4           6
## 5            1.22                9               7           25          14
## 6           -0.04                6              14            8          28
##   DepressionStatus AnxietyStatus Stress DASScore Happiness AlcoholUse Drinks
## 1           normal        normal normal       15        28   Moderate     10
## 2           normal        normal normal        4        25   Moderate      6
## 3         moderate        severe normal       45        17      Light      3
## 4           normal        normal normal       11        32      Light      2
## 5           normal        severe normal       46        15   Moderate      4
## 6         moderate      moderate   high       50        22    Abstain      0
##   WeekdayBed WeekdayRise WeekdaySleep WeekendBed WeekendRise WeekendSleep
## 1      25.75        8.70         7.70      25.75        9.50         5.88
## 2      25.70        8.20         6.80      26.00       10.00         7.25
## 3      27.44        6.55         3.00      28.00       12.59        10.09
## 4      23.50        7.17         6.77      27.00        8.00         7.25
## 5      25.90        8.67         6.09      23.75        9.50         7.00
## 6      23.80        8.95         9.05      26.00       10.75         9.00
##   AverageSleep AllNighter
## 1         7.18          0
## 2         6.93          0
## 3         5.02          0
## 4         6.90          0
## 5         6.35          0
## 6         9.04          0
#Question 1 code:

t.test(db$GPA ~ db$Gender)
## 
##  Welch Two Sample t-test
## 
## data:  db$GPA by db$Gender
## t = 3.9139, df = 200.9, p-value = 0.0001243
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
##  0.09982254 0.30252780
## sample estimates:
## mean in group 0 mean in group 1 
##        3.324901        3.123725
#Question 2 code:

t.test(db$NumEarlyClass[db$ClassYear %in% c(1, 2)],db$NumEarlyClass[db$ClassYear %in% c(3, 4)])
## 
##  Welch Two Sample t-test
## 
## data:  db$NumEarlyClass[db$ClassYear %in% c(1, 2)] and db$NumEarlyClass[db$ClassYear %in% c(3, 4)]
## t = 4.1813, df = 250.69, p-value = 4.009e-05
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.4042016 1.1240309
## sample estimates:
## mean of x mean of y 
##  2.070423  1.306306
#Question 3 code:
t.test(CognitionZscore ~ LarkOwl, data = subset(db, LarkOwl != "Neither"))
## 
##  Welch Two Sample t-test
## 
## data:  CognitionZscore by LarkOwl
## t = 0.80571, df = 75.331, p-value = 0.4229
## alternative hypothesis: true difference in means between group Lark and group Owl is not equal to 0
## 95 percent confidence interval:
##  -0.1893561  0.4465786
## sample estimates:
## mean in group Lark  mean in group Owl 
##         0.09024390        -0.03836735
#Question 4 code:

t.test(Happiness ~ DepressionStatus, data = subset(db, DepressionStatus != "severe") )
## 
##  Welch Two Sample t-test
## 
## data:  Happiness by DepressionStatus
## t = -4.3253, df = 43.992, p-value = 8.616e-05
## alternative hypothesis: true difference in means between group moderate and group normal is not equal to 0
## 95 percent confidence interval:
##  -5.818614 -2.119748
## sample estimates:
## mean in group moderate   mean in group normal 
##               23.08824               27.05742
#Question 5 code:

t.test(db$WeekdaySleep, db$WeekendSleep, paired = TRUE)
## 
##  Paired t-test
## 
## data:  db$WeekdaySleep and db$WeekendSleep
## t = -3.3224, df = 252, p-value = 0.001025
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
##  -0.5594223 -0.1430283
## sample estimates:
## mean difference 
##      -0.3512253
#Question 6 code:

t.test(PoorSleepQuality ~ AllNighter, data = db)
## 
##  Welch Two Sample t-test
## 
## data:  PoorSleepQuality by AllNighter
## t = -1.7068, df = 44.708, p-value = 0.09479
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
##  -1.9456958  0.1608449
## sample estimates:
## mean in group 0 mean in group 1 
##        6.136986        7.029412
#Question 7 code:

t.test(StressScore ~ AlcoholUse, data = subset(db, AlcoholUse %in% c("Abstain", "Heavy")))
## 
##  Welch Two Sample t-test
## 
## data:  StressScore by AlcoholUse
## t = -0.62604, df = 28.733, p-value = 0.5362
## alternative hypothesis: true difference in means between group Abstain and group Heavy is not equal to 0
## 95 percent confidence interval:
##  -6.261170  3.327346
## sample estimates:
## mean in group Abstain   mean in group Heavy 
##              8.970588             10.437500
#Question 8 code:

t.test(db$AnxietyScore, db$StressScore, paired = TRUE)
## 
##  Paired t-test
## 
## data:  db$AnxietyScore and db$StressScore
## t = -11.417, df = 252, p-value < 2.2e-16
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
##  -4.801229 -3.388494
## sample estimates:
## mean difference 
##       -4.094862
#Question 9 code:

t.test(db$GPA, db$ClassesMissed, paired = TRUE)
## 
##  Paired t-test
## 
## data:  db$GPA and db$ClassesMissed
## t = 4.9294, df = 252, p-value = 1.497e-06
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
##  0.6210748 1.4475418
## sample estimates:
## mean difference 
##        1.034308
#Question 10 code:

t.test(db$WeekendSleep[db$ClassYear %in% c(1, 2)],db$WeekendSleep[db$ClassYear %in% c(3, 4)])
## 
##  Welch Two Sample t-test
## 
## data:  db$WeekendSleep[db$ClassYear %in% c(1, 2)] and db$WeekendSleep[db$ClassYear %in% c(3, 4)]
## t = -0.047888, df = 237.36, p-value = 0.9618
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.3497614  0.3331607
## sample estimates:
## mean of x mean of y 
##  8.213592  8.221892