library(DATA606)
A 90% confidence interval for a population mean is (65,77). The population distribution is approximately normal and the population standard deviation is unknown. This confidence interval is based on a simple random sample of 25 observations. Calculate the sample mean, the margin of error, and the sample standard deviation.
up.value <- 77
lo.value <- 65
n <- 25
sample.mean <- (up.value + lo.value) / 2; sample.mean
## [1] 71
margin.error <- up.value - sample.mean; margin.error
## [1] 6
# Critical t value for a 90% interval with n - 1 degrees of freedom
t.value <- abs(qt(0.10/2, df=n-1))
s.deviation <- margin.error * sqrt(n) / t.value; s.deviation
## [1] 17.53481
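As a sanity check, the interval can be rebuilt from the recovered values. This is a minimal sketch, assuming the usual t-interval formula (sample mean plus or minus t* times s/sqrt(n)); the `check.*` names are illustrative and not part of the original solution.

```r
# Rebuild the 90% t-interval from the recovered summary statistics.
check.n <- 25
check.mean <- (77 + 65) / 2                    # midpoint of the interval
check.me <- 77 - check.mean                    # half-width (margin of error)
check.t <- qt(1 - 0.10 / 2, df = check.n - 1)  # critical t for 90% confidence
check.sd <- check.me * sqrt(check.n) / check.t # recovered sample SD
# Recompute the bounds; they should match the stated interval (65, 77).
check.ci <- check.mean + c(-1, 1) * check.t * check.sd / sqrt(check.n)
check.ci
```

If the recovered standard deviation is right, the two bounds agree with the original interval up to floating-point error.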
SAT scores of students at an Ivy League college are distributed with a standard deviation of 250 points. Two statistics students, Raina and Luke, want to estimate the average SAT score of students at this college as part of a class project. They want their margin of error to be no more than 25 points.
s.deviation <- 250
margin.error <- 25
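# 90% confidence level: quantile at 0.90 + (1 - 0.90)/2 = 0.95
z.value <- qnorm(0.95)
# Always round up: the sample size must be at least the computed value
n <- (z.value * s.deviation / margin.error)^2; ceiling(n)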
## [1] 271
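Luke wants to use a 99% confidence interval. Without calculating the actual sample size, determine whether his sample should be larger or smaller than Raina’s, and explain your reasoning.
His sample should be larger. A higher confidence level requires a larger critical value, so with the same standard deviation and the same margin of error, the sample size must increase.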
Calculate the minimum required sample size for Luke.
s.deviation <- 250
margin.error <- 25
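# 99% confidence level: quantile at 0.99 + (1 - 0.99)/2 = 0.995
z.value <- qnorm(0.995)
# Always round up: the sample size must be at least the computed value
n <- (z.value * s.deviation / margin.error)^2; ceiling(n)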
## [1] 664
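Both sample-size calculations follow the same pattern, so they can be wrapped in one function. The sketch below assumes a known population standard deviation and a normal critical value; `min.sample.size` is a hypothetical helper, not something provided by the DATA606 package.

```r
# Hypothetical helper: smallest n with z* * sigma / sqrt(n) <= margin.
min.sample.size <- function(sigma, margin, conf.level) {
  z.star <- qnorm(1 - (1 - conf.level) / 2)  # two-sided critical value
  ceiling((z.star * sigma / margin)^2)       # round up to a whole observation
}

raina.n <- min.sample.size(sigma = 250, margin = 25, conf.level = 0.90)
luke.n <- min.sample.size(sigma = 250, margin = 25, conf.level = 0.99)
c(raina = raina.n, luke = luke.n)
```

Using `ceiling()` rather than `round()` guarantees the margin of error never exceeds the target, even when the fractional part is below 0.5.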
The National Center of Education Statistics conducted a survey of high school seniors, collecting test data on reading, writing, and several other subjects. Here we examine a simple random sample of 200 students from this survey. Side-by-side box plots of reading and writing scores as well as a histogram of the differences in scores are shown below.
Yes, based on the plots there appears to be a difference between the reading and writing scores.
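The reading and writing scores of each student are not independent of each other: both scores come from the same student, so the data are paired. The scores of different students can be treated as independent because the students form a simple random sample.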
H0: there is no difference in the average reading and writing scores
HA: there is a difference in the average reading and writing scores
Independence is met, since the students come from a simple random sample. The sample size is large enough, and the differences show no strong skew.
These data provide the information needed to compute the p-value, which we use to test the hypotheses.
mean.diff <- -0.545
sd.diff <- 8.887
n.diff <- 200
se.diff <- sd.diff/sqrt(n.diff)
t.value <- mean.diff/se.diff; t.value
## [1] -0.867274
# Two-sided p-value, matching the two-sided alternative hypothesis
p.value <- 2 * pt(-abs(t.value), df=n.diff-1); round(p.value, 4)
## [1] 0.3868
The p-value is greater than the significance level of 0.05, so we fail to reject the null hypothesis. Based on these data, there is no strong evidence of a difference between the average reading and writing scores.
We may have made a Type II error: failing to reject H0 when it is false. There may actually be a difference that these data did not detect.
Yes, a confidence interval for the average difference would be expected to include 0, since we failed to reject the null hypothesis of no difference in scores.
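The last answer can be checked directly with a confidence interval for the mean difference. This is a minimal sketch, assuming a 95% confidence level (the source does not state one) and re-entering the summary statistics under illustrative `ci.*` names.

```r
# 95% CI for the mean paired difference; the 95% level is an assumption.
ci.mean.diff <- -0.545
ci.se <- 8.887 / sqrt(200)           # standard error of the mean difference
ci.t <- qt(0.975, df = 200 - 1)      # critical t for a two-sided 95% interval
paired.ci <- ci.mean.diff + c(-1, 1) * ci.t * ci.se
paired.ci
```

The lower bound is negative and the upper bound is positive, so the interval contains 0, in line with failing to reject the null hypothesis.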
Each year the US Environmental Protection Agency (EPA) releases fuel economy data on cars manufactured in that year. Below are summary statistics on fuel efficiency (in miles/gallon) from random samples of cars with manual and automatic transmissions manufactured in 2012. Do these data provide strong evidence of a difference between the average fuel efficiency of cars with manual and automatic transmissions in terms of their average city mileage? Assume that conditions for inference are satisfied.
Conduct a hypothesis test for a difference in city mileage between automatic and manual transmissions.
H0: there is no difference in average fuel efficiency between automatic and manual cars
HA: there is a difference in average fuel efficiency between automatic and manual cars
summary <- c('Mean', 'SD', 'n')
auto <- c(16.12, 3.58, 26)
manual <- c(19.85, 4.51, 26)
df <- data.frame(summary,auto,manual); df
##   summary  auto manual
## 1    Mean 16.12  19.85
## 2      SD  3.58   4.51
## 3       n 26.00  26.00
mean.diff <- df$auto[1] - df$manual[1]
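# Use full column names (df$m relied on partial matching) and the sample sizes from the table
se <- sqrt(df$auto[2]^2/df$auto[3] + df$manual[2]^2/df$manual[3])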
t.value <- mean.diff/se; t.value
## [1] -3.30302
# Two-sided p-value, matching the two-sided alternative hypothesis
p.value <- 2 * pt(-abs(t.value), df=25); round(p.value, 4)
## [1] 0.0029
Because the two-sided p-value of about 0.0029 is well below 0.05, we reject the null hypothesis. The sample data support the alternative hypothesis that there is a difference in average city mileage between automatic and manual transmissions.
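A confidence interval gives the same picture as the test. The sketch below assumes a 95% confidence level (not stated in the source) and the same 25 degrees of freedom used in the test; the `mpg.*` names are illustrative and not part of the original solution.

```r
# 95% CI for the (automatic - manual) difference in mean city mileage.
mpg.diff <- 16.12 - 19.85                  # difference in sample means
mpg.se <- sqrt(3.58^2 / 26 + 4.51^2 / 26)  # standard error of the difference
mpg.t <- qt(0.975, df = 25)                # critical t, df = min(n1, n2) - 1
mpg.ci <- mpg.diff + c(-1, 1) * mpg.t * mpg.se
mpg.ci
```

Both bounds are negative, so the interval excludes 0, which agrees with rejecting the null hypothesis.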