A 90% confidence interval for a population mean is (65, 77). The population distribution is approximately normal and the population standard deviation is unknown. This confidence interval is based on a simple random sample of 25 observations. Calculate the sample mean, the margin of error, and the sample standard deviation.
n <- 25
df <- n-1
t_stat <- qt(0.95, df)  # a 90% two-sided interval leaves 5% in each tail
confidence_lower <- 65
confidence_upper <- 77
sample_mean <- (confidence_lower + confidence_upper)/2
sample_mean
## [1] 71
margin_of_err <- confidence_upper - sample_mean
margin_of_err
## [1] 6
SE <- margin_of_err/t_stat
SD <- SE * sqrt(n)
round(SD, 2)
## [1] 17.53
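As a sanity check on the steps above, the calculation can be wrapped in a small function and the interval rebuilt from the recovered values. This is only a sketch: `ci_to_summary` is a hypothetical helper name, and it assumes a two-sided t interval with n - 1 degrees of freedom.

```r
# Hypothetical helper: recover summary statistics from a two-sided t interval
ci_to_summary <- function(lower, upper, n, conf_level) {
  t_crit <- qt(1 - (1 - conf_level) / 2, df = n - 1)  # two-sided critical value
  me <- (upper - lower) / 2          # margin of error is half the interval width
  list(mean = (lower + upper) / 2,   # the interval is centered on the sample mean
       me = me,
       sd = me / t_crit * sqrt(n))   # invert ME = t* x s / sqrt(n)
}

res <- ci_to_summary(65, 77, n = 25, conf_level = 0.90)

# Round trip: rebuilding the interval from the recovered values
# should return the original bounds, 65 and 77
rebuilt <- res$mean + c(-1, 1) * qt(0.95, df = 24) * res$sd / sqrt(25)
rebuilt
```

If the rebuilt bounds do not match the interval given in the question, the critical value is the first thing to check.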
SAT scores of students at an Ivy League college are distributed with a standard deviation of 250 points. Two statistics students, Raina and Luke, want to estimate the average SAT score of students at this college as part of a class project. They want their margin of error to be no more than 25 points.
SD <- 250
ME <- 25
zstar <- qnorm(0.05, lower.tail=F)
n = (SD/ME * zstar)^2
n
## [1] 270.5543
Ans: Luke's sample should be larger than Raina's. A 99% confidence level uses a larger z* than a 90% level, and for a fixed margin of error the required sample size grows with the square of z*.
SD <- 250
ME <- 25
zstar <- qnorm(0.005, lower.tail=F)
n = (SD/ME * zstar)^2
n
## [1] 663.4897
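Since a sample size must be a whole number, both results need to be rounded up so the margin of error does not exceed 25 points. A minimal sketch of that step, where `required_n` is a hypothetical helper name and the population standard deviation is assumed known:

```r
# Hypothetical helper: smallest n whose margin of error is at most `me`
required_n <- function(sd, me, conf_level) {
  zstar <- qnorm(1 - (1 - conf_level) / 2)  # two-sided critical value
  ceiling((zstar * sd / me)^2)              # always round up, never down
}

n_raina <- required_n(250, 25, 0.90)  # 271
n_luke  <- required_n(250, 25, 0.99)  # 664
c(n_raina, n_luke)
```

Rounding down would leave the margin of error slightly above the target, so `ceiling` is used rather than `round`.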
The National Center of Education Statistics conducted a survey of high school seniors, collecting test data on reading, writing, and several other subjects. Here we examine a simple random sample of 200 students from this survey. Side-by-side box plots of reading and writing scores as well as a histogram of the differences in scores are shown below.
Ans: There is no clear difference between the average reading and writing scores. The distribution of differences is approximately normal and centered near zero, though it appears to be slightly skewed to the right.
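The scores of different students are independent of one another, but each student's reading and writing scores are paired and so are not independent of each other.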
H0 (null hypothesis): mean diff = 0. H1 (alternative hypothesis): mean diff ≠ 0.
Independence of observations: the data are a simple random sample of 200 students, which is less than 10% of the population.
Observations come from a nearly normal distribution: the plots provided in the text suggest the data are reasonably normally distributed with no outliers, and the sample of 200 pairs is well above 30.
Ans: No, there is no strong evidence to reject the null hypothesis.
SD <- 8.887
SE <- SD/sqrt(200)
t_stat <- (-0.545 - 0)/SE
t_stat
## [1] -0.867274
pvalue <- 2 * pt(-abs(t_stat), df=199)  # two-sided test, so count both tails
round(pvalue, 3)
## [1] 0.387
We may have made a Type 2 error, since we failed to reject the null hypothesis. If the true mean difference is not 0, we have failed to detect it.
Yes. Since we failed to reject the null hypothesis, we expect the confidence interval to contain 0; if 0 were outside the interval, we would have rejected the null hypothesis. Let's verify by calculating the confidence interval.
t_crit <- qt(0.975, df=199)  # critical value for a 95% interval, not the test statistic
ci_lower <- -0.545 - t_crit * SE
ci_upper <- -0.545 + t_crit * SE
round(c(ci_lower, ci_upper), 2)
## [1] -1.78  0.69
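The test and the interval use the same standard error, so they can be computed together and checked for agreement. The following is a sketch under the assumption of a two-sided paired t-test at the 5% level; `paired_t_summary` is a hypothetical helper name, not a base R function.

```r
# Hypothetical helper: paired t-test and confidence interval from summary statistics
paired_t_summary <- function(mean_diff, sd_diff, n, conf_level = 0.95) {
  se <- sd_diff / sqrt(n)
  df <- n - 1
  t_stat <- mean_diff / se                     # null value of the difference is 0
  p_value <- 2 * pt(-abs(t_stat), df)          # two-sided p-value
  t_crit <- qt(1 - (1 - conf_level) / 2, df)   # critical value for the interval
  ci <- mean_diff + c(-1, 1) * t_crit * se
  list(t_stat = t_stat, df = df, p_value = p_value, ci = ci)
}

res <- paired_t_summary(mean_diff = -0.545, sd_diff = 8.887, n = 200)

# The two conclusions should agree: p-value above 0.05 and 0 inside the interval
res$p_value > 0.05
res$ci[1] < 0 && res$ci[2] > 0
```

With the raw scores available, `t.test(read, write, paired = TRUE)` would report the same statistic, p-value, and interval in one call; `read` and `write` are placeholder vector names here.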
Each year the US Environmental Protection Agency (EPA) releases fuel economy data on cars manufactured in that year. Below are summary statistics on fuel efficiency (in miles/gallon) from random samples of cars with manual and automatic transmissions manufactured in 2012. Do these data provide strong evidence of a difference between the average fuel efficiency of cars with manual and automatic transmissions in terms of their average city mileage? Assume that conditions for inference are satisfied.
        Automatic   Manual
Mean    16.12       19.85
SD      3.58        4.51
n       26          26
Automatic_mean <- 16.12
Automatic_sd <- 3.58
Automatic_samples <- 26
Manual_mean <- 19.85
Manual_sd <- 4.51
Manual_samples <- 26
mean_diff <- Manual_mean - Automatic_mean
df <- Automatic_samples + Manual_samples - 2
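# Pooled variance: weight each sample variance by its degrees of freedom (n - 1)
pooled_var <- ((Automatic_samples - 1) * Automatic_sd^2 + (Manual_samples - 1) * Manual_sd^2) / df
# The standard error of the difference divides by the sample sizes, not n - 1
se_diff <- sqrt(pooled_var/Automatic_samples + pooled_var/Manual_samples)
t_val <- mean_diff / se_diff
pval <- 2 * pt(-abs(t_val), df = df)  # two-sided: the question asks about a difference
round(pval, 3)
## [1] 0.002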
Since the p-value is less than 0.05, we reject the null hypothesis. The data provide strong evidence of a difference between the average city fuel efficiency of cars with automatic and manual transmissions.
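The pooled calculation above assumes the two groups share a common variance. As a cross-check, here is a sketch of the unpooled (Welch) version of the same test from the summary statistics. This is a different technique from the one used in the answer, and `welch_t_summary` is a hypothetical helper name.

```r
# Hypothetical helper: Welch (unequal-variance) two-sample t-test from summary statistics
welch_t_summary <- function(m1, s1, n1, m2, s2, n2) {
  v1 <- s1^2 / n1                    # squared standard error of each sample mean
  v2 <- s2^2 / n2
  se <- sqrt(v1 + v2)
  t_stat <- (m1 - m2) / se
  # Welch-Satterthwaite approximation for the degrees of freedom
  df <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))
  p_value <- 2 * pt(-abs(t_stat), df)   # two-sided p-value
  list(t_stat = t_stat, df = df, p_value = p_value)
}

res <- welch_t_summary(19.85, 4.51, 26, 16.12, 3.58, 26)  # manual vs. automatic
res$p_value < 0.05
```

Because both samples have 26 cars, the Welch standard error equals the pooled one and only the degrees of freedom differ, so the conclusion should not change.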