A 90% confidence interval for a population mean is (65, 77). The population distribution is approximately normal and the population standard deviation is unknown. This confidence interval is based on a simple random sample of 25 observations. Calculate the sample mean, the margin of error, and the sample standard deviation.
Sample Mean:
n <- 25
x1 <- 65
x2 <- 77
sm <- (x2 + x1) / 2
sm
## [1] 71
# Sample mean is 71
Margin of Error:
n <- 25
x1 <- 65
x2 <- 77
me <- (x2 - x1) / 2
me
## [1] 6
# Margin of error is 6
Sample Standard Deviation:
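df <- n - 1                  # degrees of freedom for the t distribution
p <- 0.9                     # confidence level
t_index <- p + (1 - p) / 2   # upper quantile: 0.95 for a 90% interval
t_val <- qt(t_index, df)     # critical t value
se <- me / t_val             # margin of error = t_val * se, so se = me / t_val
sd <- se * sqrt(n)           # se = sd / sqrt(n), so sd = se * sqrt(n)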
sd
## [1] 17.53481
# Sample Standard Deviation is 17.53481
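As a quick sanity check, the three answers can be plugged back into the interval formula (sample mean ± t* × s / sqrt(n)) to confirm that they reproduce (65, 77). This is only a minimal sketch; the names `x_bar`, `t_star`, `s`, and `ci` are introduced here for illustration and are not part of the original solution.

```r
# Recompute the three quantities from the reported interval (65, 77)
n <- 25
x_bar <- (65 + 77) / 2            # sample mean
me <- (77 - 65) / 2               # margin of error
t_star <- qt(0.95, df = n - 1)    # critical value for 90% confidence
s <- me / t_star * sqrt(n)        # sample standard deviation

# Rebuild the interval; it should match the one given in the problem
ci <- x_bar + c(-1, 1) * t_star * s / sqrt(n)
ci
```

If the rebuilt interval did not match, one of the three quantities would have to be wrong.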
SAT scores of students at an Ivy League college are distributed with a standard deviation of 250 points. Two statistics students, Raina and Luke, want to estimate the average SAT score of students at this college as part of a class project. They want their margin of error to be no more than 25 points.
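me <- 25     # target margin of error
ssd <- 250   # population standard deviation
z <- 1.65    # approximate critical value for 90% confidence
x <- ((ssd * z) / me)^2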
x
## [1] 272.25
# The sample size should be 273 students.
Luke wants a higher confidence level (99%) with the same margin of error, so he needs to collect a larger sample.
z <- 2.575 # 99% Confidence interval
me <- 25
sd <- 250
n <- ((z * sd) / me ) ^ 2
n
## [1] 663.0625
# Minimum required sample size for Luke is 664 students
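Both sample-size calculations apply the same rule, n >= (z* × sd / ME)^2, with only the confidence level changing. The sketch below wraps that rule in a hypothetical helper, `min_sample_size` (a name made up for this example), and uses `qnorm` for the critical value instead of the table-rounded 1.65 and 2.575. It assumes a two-sided interval and a known population standard deviation.

```r
# Hypothetical helper: smallest n with z* * sigma / sqrt(n) <= me
min_sample_size <- function(sigma, me, conf) {
  z_star <- qnorm(1 - (1 - conf) / 2)   # unrounded two-sided critical value
  ceiling((z_star * sigma / me)^2)      # round up to a whole observation
}

n_90 <- min_sample_size(sigma = 250, me = 25, conf = 0.90)
n_99 <- min_sample_size(sigma = 250, me = 25, conf = 0.99)
c(n_90, n_99)
```

With the unrounded 90% critical value (about 1.645) the requirement comes out at 271 rather than 273, which shows how sensitive the answer is to rounding z. The 99% result stays at 664.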
The National Center of Education Statistics conducted a survey of high school seniors, collecting test data on reading, writing, and several other subjects. Here we examine a simple random sample of 200 students from this survey. Side-by-side box plots of reading and writing scores as well as a histogram of the differences in scores are shown below.
I do not see a clear difference between the average reading and writing scores. The minor differences that do appear are consistent with sampling variability.
The observations are independent across students, but a given student's reading and writing scores are not independent of each other: the two scores are paired.
H0: μ_read−write = 0 (There is no difference in the average scores in reading and writing.)
HA: μ_read−write ≠ 0 (There is a difference in average scores.)
Students must be independent of each other -> Yes
Nearly normal distribution -> Yes
Sample size less than 10% of population -> Yes
sd_diff <- 8.887
mu_diff <- -0.545
n <- 200
se_diff <- sd_diff / sqrt(n)
t_value <- (mu_diff - 0) / se_diff
df <- n - 1
# HA is two-sided, so double the tail probability beyond |t|
p <- 2 * pt(-abs(t_value), df = df)
round(p, 4)
## [1] 0.3868
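The p-value is larger than 0.05, so we fail to reject the null hypothesis. The data do not provide convincing evidence of a difference between the average reading and writing scores.
Since we did not reject the null hypothesis, we may have made a Type II error: failing to detect a difference that actually exists.
Yes, I would expect 0 to be in the confidence interval. We failed to reject H0, under which the average difference is 0, so 0 remains a plausible value for the difference.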
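The expectation about 0 can be checked directly by building a 95% confidence interval for the mean difference from the same summary statistics. This is a rough sketch under the same assumptions as the test; `t_star`, `ci_diff`, and `contains_zero` are names introduced here for illustration.

```r
# Summary statistics for the paired differences (read - write)
sd_diff <- 8.887
mu_diff <- -0.545
n <- 200

se_diff <- sd_diff / sqrt(n)        # standard error of the mean difference
t_star <- qt(0.975, df = n - 1)     # critical value for 95% confidence

# 95% confidence interval for the mean difference
ci_diff <- mu_diff + c(-1, 1) * t_star * se_diff
ci_diff

# TRUE when the interval covers a difference of zero
contains_zero <- ci_diff[1] < 0 && ci_diff[2] > 0
contains_zero
```

An interval that straddles 0 agrees with failing to reject H0 at the 5% level.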
Each year the US Environmental Protection Agency (EPA) releases fuel economy data on cars manufactured in that year. Below are summary statistics on fuel efficiency (in miles/gallon) from random samples of cars with manual and automatic transmissions manufactured in 2012. Do these data provide strong evidence of a difference between the average fuel efficiency of cars with manual and automatic transmissions in terms of their average city mileage? Assume that conditions for inference are satisfied.
The hypotheses for this test are as follows:
H0: The difference in average city mileage is equal to zero.
HA: The difference in average city mileage is NOT equal to zero.
From the text we have as follows:
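x <- 26   # sample size (the same value is used for both groups)
auto_mean <- 16.12
manual_mean <- 19.85
auto_sd <- 3.58
manual_sd <- 4.51
diffOfMeans <- auto_mean - manual_mean
# Standard error of the difference between two independent sample means
sE <- sqrt((auto_sd^2 / x) + (manual_sd^2 / x))
t <- diffOfMeans / sE
# Two-sided p-value; -abs(t) keeps it valid whatever the sign of t
p <- pt(-abs(t), df = x - 1) * 2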
p
## [1] 0.002883615
The p-value is less than 0.05, so we reject H0 in favor of HA. The data provide strong evidence of a difference in average city mileage between cars with automatic and manual transmissions.
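The test above uses df = 25, which matches the conservative convention of taking the smaller sample size minus one. As a rough cross-check, the sketch below also computes the Welch-Satterthwaite degrees of freedom, an alternative approximation that the original solution does not use. All names other than the given summary statistics are introduced here for illustration.

```r
# Summary statistics from the problem
n_auto <- 26
n_manual <- 26
auto_mean <- 16.12
manual_mean <- 19.85
auto_sd <- 3.58
manual_sd <- 4.51

# Per-group variance of the sample mean
v_auto <- auto_sd^2 / n_auto
v_manual <- manual_sd^2 / n_manual
t_stat <- (auto_mean - manual_mean) / sqrt(v_auto + v_manual)

# Welch-Satterthwaite approximation to the degrees of freedom
df_welch <- (v_auto + v_manual)^2 /
  (v_auto^2 / (n_auto - 1) + v_manual^2 / (n_manual - 1))

# Two-sided p-values under both choices of degrees of freedom
p_conservative <- 2 * pt(-abs(t_stat), df = min(n_auto, n_manual) - 1)
p_welch <- 2 * pt(-abs(t_stat), df = df_welch)
c(df_welch = df_welch, p_conservative = p_conservative, p_welch = p_welch)
```

Both choices give a p-value well below 0.05, so the conclusion does not depend on which degrees of freedom are used.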
H0: The means of all groups are equal.
HA: At least one group mean differs from the others.
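Observations are independent of each other: the survey is random and each sample is 10% or less of its population. Sample sizes are greater than 30, so moderate skew is not a concern. ANOVA also requires roughly equal variability across the groups, which we have to assume here.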
Assume a significance level of α = 0.05 (95% confidence). Since p-value = 0.0682 > α, we fail to reject H0.
# Store given values
k <- 5
n <- 1172
MSG <- 501.54
SSE <- 267382
p <- 0.0682
# Find Df
dfG <- k-1
dfE <- n-k
dfT <- dfG + dfE
df <- c(dfG, dfE, dfT)
# Find Sum Sq
SSG <- dfG * MSG
SST <- SSG + SSE
SS <- c(SSG, SSE, SST)
# Find Mean Sq
MSE <- SSE / dfE
MS <- c(MSG, MSE, NA)
# Find F-value
Fv <- MSG / MSE
# Combine all values and display
result <- data.frame(df, SS, MS, c(Fv, NA, NA), c(p, NA, NA))
colnames(result) <- c("Df", "Sum Sq", "Mean Sq", "F Value", "Pr(>F)")
rownames(result) <- c("degree", "Residuals", "Total")
result
##             Df    Sum Sq  Mean Sq  F Value Pr(>F)
## degree       4   2006.16 501.5400 2.188992 0.0682
## Residuals 1167 267382.00 229.1191       NA     NA
## Total     1171 269388.16       NA       NA     NA
Since the p-value of 0.0682 is greater than 0.05, we fail to reject the null hypothesis. The data do not provide convincing evidence that the group means differ.
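The table takes Pr(>F) as given. As a consistency check, the p-value can also be recovered from the F statistic and its degrees of freedom with `pf`. This is a small sketch, assuming the reported MSG and SSE are exact; `p_from_F` is a name introduced here. The result should come out close to the given 0.0682.

```r
# Given values from the ANOVA table
k <- 5
n <- 1172
MSG <- 501.54
SSE <- 267382

dfG <- k - 1              # between-group degrees of freedom
dfE <- n - k              # residual degrees of freedom
Fv <- MSG / (SSE / dfE)   # F statistic = MSG / MSE

# Upper-tail probability of the F distribution
p_from_F <- pf(Fv, df1 = dfG, df2 = dfE, lower.tail = FALSE)
p_from_F
```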