1. What is an estimate for the average combined depressed, anxious, stressed score (DASscore) for this population of college students?
head(`SleepStudy.(1)`)
##   Gender ClassYear LarkOwl NumEarlyClass EarlyClass  GPA ClassesMissed
## 1      0         4 Neither             0          0 3.60             0
## 2      0         4 Neither             2          1 3.24             0
## 3      0         4     Owl             0          0 2.97            12
## 4      0         1    Lark             5          1 3.76             0
## 5      0         4     Owl             0          0 3.20             4
## 6      1         4 Neither             0          0 3.50             0
##   CognitionZscore PoorSleepQuality DepressionScore AnxietyScore StressScore
## 1           -0.26                4               4            3           8
## 2            1.39                6               1            0           3
## 3            0.38               18              18           18           9
## 4            1.39                9               1            4           6
## 5            1.22                9               7           25          14
## 6           -0.04                6              14            8          28
##   DepressionStatus AnxietyStatus Stress DASScore Happiness AlcoholUse Drinks
## 1           normal        normal normal       15        28   Moderate     10
## 2           normal        normal normal        4        25   Moderate      6
## 3         moderate        severe normal       45        17      Light      3
## 4           normal        normal normal       11        32      Light      2
## 5           normal        severe normal       46        15   Moderate      4
## 6         moderate      moderate   high       50        22    Abstain      0
##   WeekdayBed WeekdayRise WeekdaySleep WeekendBed WeekendRise WeekendSleep
## 1      25.75        8.70         7.70      25.75        9.50         5.88
## 2      25.70        8.20         6.80      26.00       10.00         7.25
## 3      27.44        6.55         3.00      28.00       12.59        10.09
## 4      23.50        7.17         6.77      27.00        8.00         7.25
## 5      25.90        8.67         6.09      23.75        9.50         7.00
## 6      23.80        8.95         9.05      26.00       10.75         9.00
##   AverageSleep AllNighter
## 1         7.18          0
## 2         6.93          0
## 3         5.02          0
## 4         6.90          0
## 5         6.35          0
## 6         9.04          0
attach(`SleepStudy.(1)`)
hist(DASScore)

summary(DASScore)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    0.00    7.00   16.00   20.04   28.00   82.00
# We estimate the population mean $\mu$
# 1 - by a point estimate
xbar <- mean(DASScore)  
xbar
## [1] 20.03953
# 2 - by a confidence interval

s <- sd(DASScore)
n <- length(DASScore)

s
## [1] 16.54187
n
## [1] 253

2 - Verify the conditions needed for \(\bar x\) to be approximately normally distributed:

#check sample size and skewness
n>30
## [1] TRUE
# The distribution is right-skewed
hist(DASScore)  
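
The check above relies on the large sample size to offset the right skew. As a rough illustration of why that works, the sketch below resamples a right-skewed vector of the same size and compares the skewness of the raw values with the skewness of the resampled means. The vector `skewed` is simulated and only stands in for `DASScore`; the `skewness` helper is a hypothetical convenience function, not part of the original analysis.

```r
# Hypothetical stand-in for DASScore: a right-skewed sample of the same size.
set.seed(1)
n <- 253
skewed <- rexp(n, rate = 1 / 20)   # mean about 20, strongly right-skewed

# Sample skewness: the third standardized moment (hypothetical helper).
skewness <- function(x) mean((x - mean(x))^3) / sd(x)^3

# Resample means to approximate the sampling distribution of x-bar.
xbars <- replicate(5000, mean(sample(skewed, n, replace = TRUE)))

skew_raw <- skewness(skewed)    # skewness of the individual values
skew_xbar <- skewness(xbars)    # skewness of the sample means
c(raw = skew_raw, xbar = skew_xbar)
```

If the means are far less skewed than the raw values, treating \(\bar x\) as approximately normal is reasonable for a sample of this size.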

  1. Estimate the standard error \(\sigma/\sqrt{n}\) with \(s/\sqrt{n}\). Compute \(t^*\). Compute the confidence interval.
# Compute the standard error
standard_error <- function(s, n) {
  # s: sample standard deviation
  # n: sample size
  return(s / sqrt(n))
}
se <- standard_error(s, n)
se
## [1] 1.039978
# Compute the t* value for 95% confidence interval. 
tstar <- qt(0.975,df = n-1)
tstar
## [1] 1.969422
# Compute the confidence interval (can use the function above!)

ci_function <- function(xbar, s, n, tstar) {
  # xbar: sample mean
  # s: sample standard deviation
  # n: sample size
  # tstar: t* computed using qt() for confidence interval
  lower <- xbar - tstar * standard_error(s, n)
  upper <- xbar + tstar * standard_error(s, n)
  return(c(lower, upper))
}

xbar <- 20.03953
s <- 16.54187
n <- 253
tstar <- 1.969422
ci_function(xbar, s, n, tstar)
## [1] 17.99137 22.08769

With 95% confidence, we estimate that the average combined depressed, anxious, stressed score (DASscore) for this population of college students is between 17.99 and 22.09.
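
As a sanity check on the hand computation, the same interval can be obtained from `t.test()`, which reports a t-based confidence interval for the mean. The sketch below uses a simulated vector `x` as a hypothetical stand-in for `DASScore`, so its numbers will not match the interval above; the point is only that the manual formula and the built-in function agree.

```r
# Hypothetical data standing in for DASScore (the real column is not reproduced here).
set.seed(42)
x <- round(rexp(253, rate = 1 / 20))

n <- length(x)
xbar <- mean(x)
se <- sd(x) / sqrt(n)               # estimated standard error s / sqrt(n)
tstar <- qt(0.975, df = n - 1)      # critical value for 95% confidence

# Interval from the formula xbar +/- tstar * se
manual_ci <- c(xbar - tstar * se, xbar + tstar * se)

# Interval reported by t.test(); as.numeric() drops the conf.level attribute
builtin_ci <- as.numeric(t.test(x, conf.level = 0.95)$conf.int)

manual_ci
builtin_ci
```

With the real data attached, `t.test(DASScore)$conf.int` should reproduce the interval computed by `ci_function()`.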

The bootstrap below gives approximately the same confidence interval.

#Check missing data 
is.na(DASScore)
##   [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [25] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [37] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [49] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [61] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [73] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [85] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [97] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [109] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [121] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [133] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [145] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [157] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [169] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [181] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [193] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [205] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [217] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [229] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [241] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [253] FALSE
sum(is.na(DASScore)) #no missing data 
## [1] 0
#Point estimate and sample size 
n <- length(DASScore)
x.bar <- mean(DASScore)

#Bootstrap 

#10000 simulations 
boot.xbars <- numeric(10000) #preallocating vector
for (b in 1:10000){
  boot.samp <- sample(DASScore,n,replace=TRUE)
  boot.xbars[b] <- mean(boot.samp)
}

se <- sd(boot.xbars)

#95% confidence interval for the mean DASScore 
#Chop 2.5% on each end 
CI.lower <- sort(boot.xbars)[250] #2.5%
CI.upper <- sort(boot.xbars)[9750] #97.5%

#CI
c(CI.lower, CI.upper)
## [1] 18.03953 22.08696

With 95% confidence, we estimate that the average combined depressed, anxious, stressed score (DASscore) for this population of college students is between 18.04 and 22.09.
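
The percentile bootstrap can also be written more compactly with `replicate()` and `quantile()`. This is a minimal sketch of the same idea, not the code that produced the interval above: `x` is a simulated, hypothetical stand-in for `DASScore`, and `quantile()` interpolates between order statistics, so its endpoints can differ very slightly from picking the 250th and 9750th sorted values.

```r
# Hypothetical stand-in for DASScore; replace x with the real column to reuse this.
set.seed(2024)
x <- round(rexp(253, rate = 1 / 20))

# Resample with replacement 10000 times and record each bootstrap mean.
boot_means <- replicate(10000, mean(sample(x, length(x), replace = TRUE)))

# Bootstrap standard error and 95% percentile interval.
boot_se <- sd(boot_means)
boot_ci <- quantile(boot_means, probs = c(0.025, 0.975))

boot_se
boot_ci
```

Setting a seed with `set.seed()` makes the interval reproducible from one run to the next.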

  1. What is an estimate for the proportion of college students who pulled all nighters (AllNighter) for this population? Do the majority of college students in this population pull all nighters?

An estimate for the proportion of college students who pulled all-nighters is about 0.13 (34 of 253 students).

Let \(p\) be the proportion of college students in this population who pull all-nighters. \(H_0: p = 0.5\). \(H_A: p \neq 0.5\). We use a significance level of 0.05.

```r
attach(`SleepStudy.(1)`)
```

```
## The following objects are masked from SleepStudy.(1) (pos = 3):
## 
##     AlcoholUse, AllNighter, AnxietyScore, AnxietyStatus, AverageSleep,
##     ClassesMissed, ClassYear, CognitionZscore, DASScore,
##     DepressionScore, DepressionStatus, Drinks, EarlyClass, Gender, GPA,
##     Happiness, LarkOwl, NumEarlyClass, PoorSleepQuality, Stress,
##     StressScore, WeekdayBed, WeekdayRise, WeekdaySleep, WeekendBed,
##     WeekendRise, WeekendSleep
```

```r
table(AllNighter)
```

```
## AllNighter
##   0   1 
## 219  34
```

```r
prop.table(table(AllNighter))
```

```
## AllNighter
##         0         1 
## 0.8656126 0.1343874
```

```r
n<- length(AllNighter)
n
```

```
## [1] 253
```

```r
k <- sum(AllNighter=="1")
k
```

```
## [1] 34
```

```r
prop.test(k,n)
```

```
## 
##  1-sample proportions test with continuity correction
## 
## data:  k out of n, null probability 0.5
## X-squared = 133.82, df = 1, p-value < 2.2e-16
## alternative hypothesis: true p is not equal to 0.5
## 95 percent confidence interval:
##  0.09609494 0.18412247
## sample estimates:
##         p 
## 0.1343874
```

```r
# Compute the standard error, test statistic, and p-value by hand.
# Define a function to calculate the test statistic
test_statistic <- function(point_estimate, null_value, se) {
  (point_estimate - null_value) / se
}

# One-proportion z-test of H0: p = 0.5, with the standard error computed under H0
p.hat <- k / n
p0 <- 0.5
se.p <- sqrt(p0 * (1 - p0) / n)
z.stat <- test_statistic(point_estimate = p.hat, null_value = p0, se = se.p)
z.stat
2 * pnorm(abs(z.stat), lower.tail = FALSE)  # two-sided p-value
# prop.test() applies a continuity correction, so its X-squared is slightly smaller than z.stat^2

hist(AverageSleep)
```

<img src="Homework10_files/figure-html/unnamed-chunk-5-1.png" width="672" />

```r
AllNighter1 <- factor(AllNighter,levels = c(0,1),labels = c("Not Pulled","pulled"))
barplot(table(AllNighter1),main = "college students Prop.", col="cyan")
```

<img src="Homework10_files/figure-html/unnamed-chunk-5-2.png" width="672" />

About 87% of the students in the sample (219 of 253) did not pull an all-nighter, so the majority of college students in this population did not pull all-nighters.

About 13% (34 of 253) pulled an all-nighter, which is less than a quarter of the sample, as shown in the graph.

We have strong evidence that the majority of college students in this population did not pull all-nighters. The p-value (< 2.2e-16) is far below the 0.05 significance level, so we reject the null hypothesis that the proportion is 0.5, and the 95% confidence interval (0.096, 0.184) lies entirely below 0.5.
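
The question "do the majority pull all-nighters?" can also be framed as a one-sided test. The sketch below is one possible way to do that, using the counts from the table above (34 of 253); the choice of `alternative = "greater"` and the exact binomial comparison are additions for illustration, not part of the original analysis.

```r
k <- 34    # students who pulled an all-nighter (from the table above)
n <- 253   # sample size

# H0: p = 0.5 versus HA: p > 0.5 (a majority pull all-nighters)
majority <- prop.test(k, n, p = 0.5, alternative = "greater")
majority$p.value

# Exact binomial version of the same one-sided test
exact <- binom.test(k, n, p = 0.5, alternative = "greater")
exact$p.value
```

A large p-value here means the data give no support to the claim that more than half of the students pull all-nighters, which is consistent with the conclusion above.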