Elina Azrilyan

Homework 5

November 25th, 2018

Chapter 5 - Inference for Numerical Data

Practice: 5.5, 5.19, 5.45

Graded: 5.6, 5.48

5.5 Working backwards, Part I. A 95% confidence interval for a population mean, μ, is given as (18.985, 21.015). This confidence interval is based on a simple random sample of 36 observations. Calculate the sample mean and standard deviation. Assume that all conditions necessary for inference are satisfied. Use the t-distribution in any calculations.

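# Sample mean: midpoint of the interval
mu<-(21.015+18.985)/2
mu
## [1] 20
# Margin of error: half the interval width
marginoferror<-(21.015-18.985)/2
marginoferror
## [1] 1.015
# Critical t value for 95% confidence with n - 1 = 35 degrees of freedom
t.value <- qt(.975, 35)
t.value
## [1] 2.030108
# Sample standard deviation: s = ME / t* * sqrt(n), with sqrt(36) = 6
round((marginoferror/t.value)*6, 2)
## [1] 3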
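
The same steps can be wrapped in a small function so they apply to any t-based interval. This is only a sketch: `ci_to_summary` and its argument names are hypothetical and not part of the assignment, and it uses the interval endpoints exactly as stated in the problem.

```r
# Hypothetical helper: recover the sample mean, margin of error and
# sample SD from a t-based confidence interval for a mean.
ci_to_summary <- function(lower, upper, n, conf_level = 0.95) {
  xbar   <- (lower + upper) / 2                        # midpoint of the interval
  me     <- (upper - lower) / 2                        # half the interval width
  t_star <- qt(1 - (1 - conf_level) / 2, df = n - 1)   # two-sided critical value
  s      <- me / t_star * sqrt(n)                      # invert ME = t* * s / sqrt(n)
  list(mean = xbar, margin = me, t_star = t_star, sd = s)
}

res <- ci_to_summary(18.985, 21.015, n = 36)
res$mean
res$sd
```

With the stated endpoints, the mean comes out at 20 and the standard deviation very close to 3.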

*5.6 Working backwards, Part II. A 90% confidence interval for a population mean is (65, 77). The population distribution is approximately normal and the population standard deviation is unknown. This confidence interval is based on a simple random sample of 25 observations. Calculate the sample mean, the margin of error, and the sample standard deviation.

mu<-(65+77)/2
mu
## [1] 71
n<-25
marginoferror<-(77-65)/2
marginoferror
## [1] 6
t.value <- qt(.95, 24)
t.value
## [1] 1.710882
# Sample standard deviation: s = ME / t* * sqrt(n)
(marginoferror/t.value)*sqrt(n)
## [1] 17.53481
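
As a sanity check, the interval can be rebuilt from the recovered values. The sketch below plugs the mean and standard deviation found above back into the usual formula (mean plus or minus t* times s / sqrt(n)); the variable names are chosen here for illustration.

```r
# Rebuild the 90% interval from the recovered summary statistics.
xbar <- 71          # sample mean found above
s    <- 17.53481    # sample SD found above
n    <- 25

me <- qt(0.95, df = n - 1) * s / sqrt(n)   # margin of error
ci <- c(lower = xbar - me, upper = xbar + me)
ci
```

If the work is right, the result should reproduce the original interval (65, 77) up to rounding.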

*5.48 Work hours and education. The General Social Survey collects data on demographics, education, and work, among many other characteristics of US residents. Using ANOVA, we can consider educational attainment levels for all 1,172 respondents at once. Below are the distributions of hours worked by educational attainment and relevant summary statistics that will be helpful in carrying out this analysis.

  1. Write hypotheses for evaluating whether the average number of hours worked varies across the five groups.

H0: The average number of hours worked is the same across all five groups.

HA: The average number of hours worked differs for at least one group.

  2. Check conditions and describe any assumptions you must make to proceed with the test.

We assume independence within groups: the observations in each education group are sampled randomly and make up less than 10% of the population for that group. We also assume independence between groups, which is reasonable since the sample is random. Each group has well over 30 observations (the smallest has 97), so moderate departures from normality within groups are not a serious concern. Finally, ANOVA requires roughly constant variability across groups; the group standard deviations range from 13.62 to 18.1, so this assumption appears reasonable.
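
A quick numeric look at the summary statistics can support these assumptions. The sketch below uses the group standard deviations and sample sizes from the problem; the "ratio below 2" threshold is an informal rule of thumb assumed here, not something stated in the exercise.

```r
# Group summary statistics as given in the problem.
sd_by_group <- c(15.81, 14.97, 18.1, 13.62, 15.51)
n_by_group  <- c(121, 546, 97, 253, 155)

# Ratio of largest to smallest SD; values well below 2 suggest
# roughly constant variability (informal rule of thumb).
sd_ratio <- max(sd_by_group) / min(sd_by_group)
sd_ratio

# Smallest group size, to confirm every group is reasonably large.
min(n_by_group)
```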

  3. Below is part of the output associated with this test. Fill in the empty cells.

mu <- c(38.67, 39.6, 41.39, 42.55, 40.85)
sd <- c(15.81, 14.97, 18.1, 13.62, 15.51)
n <- c(121, 546, 97, 253, 155)
df <- data.frame(mu,sd,n)

# Df degree
Df <- 5-1
Df
## [1] 4
#Df Residuals
DfResiduals<- 1172 - 5
DfResiduals
## [1] 1167
#DfTotal
DfTotal <- Df + DfResiduals
DfTotal
## [1] 1171

Sum Sq degree:

SSG <- sum(df$n *(df$mu - 40.45)^2 )
SSG
## [1] 2004.101
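
The value 40.45 used above is the overall mean of hours worked. As a check, it can be recovered as the sample-size-weighted average of the group means; this sketch assumes only the group means and sizes already entered.

```r
# Group means and sample sizes from the problem.
mu <- c(38.67, 39.6, 41.39, 42.55, 40.85)
n  <- c(121, 546, 97, 253, 155)

# The grand mean is the weighted average of the group means.
grand_mean <- sum(n * mu) / sum(n)
round(grand_mean, 2)
```

The weighted average rounds to 40.45, matching the value plugged into the sum of squares.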

Sum Sq Total:

SSG + 267382
## [1] 269386.1

Mean Sq degree and Mean Sq Residuals

MSG <- SSG/Df
MSG
## [1] 501.0251
MSE <- 267382/DfResiduals
MSE
## [1] 229.1191

F-Value

FValue<-MSG/MSE
FValue
## [1] 2.186745
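
To see the filled-in cells side by side, the pieces can be collected into one table. This is a sketch under the same inputs as above: the residual sum of squares 267382 is taken from the given output, and the names `anova_tab`, `k`, `N` and `SSE` are chosen here for illustration.

```r
# Inputs: group means and sizes, grand mean, and the residual sum of
# squares taken from the given output.
mu  <- c(38.67, 39.6, 41.39, 42.55, 40.85)
n   <- c(121, 546, 97, 253, 155)
SSE <- 267382

k   <- length(mu)                  # number of groups
N   <- sum(n)                      # total sample size
SSG <- sum(n * (mu - 40.45)^2)     # between-group sum of squares

anova_tab <- data.frame(
  Df     = c(k - 1, N - k, N - 1),
  SumSq  = c(SSG, SSE, SSG + SSE),
  MeanSq = c(SSG / (k - 1), SSE / (N - k), NA),
  row.names = c("degree", "Residuals", "Total")
)
# F statistic: ratio of between-group to within-group mean squares.
anova_tab$F <- c(anova_tab$MeanSq[1] / anova_tab$MeanSq[2], NA, NA)
print(round(anova_tab, 3))
```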

  4. What is the conclusion of the test?

Answer: The p-value is 0.0682, which is greater than 0.05, so we fail to reject the null hypothesis. The data do not provide convincing evidence that the average number of hours worked differs across the five groups. This is not the same as showing that the averages are equal.
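
The p-value comes from the upper tail of the F distribution with 4 and 1167 degrees of freedom. The sketch below recomputes it from the F statistic found above; because that statistic is built from rounded summary values, the result may differ slightly from the 0.0682 in the given output.

```r
# Upper-tail probability of the F statistic under the null hypothesis.
f_value <- 2.186745
p_value <- pf(f_value, df1 = 4, df2 = 1167, lower.tail = FALSE)
p_value

# Decision at the 5% significance level: TRUE would mean "reject H0".
p_value < 0.05
```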