5.6. Working backwards, Part II. A 90% confidence interval for a population mean is (65, 77). The population distribution is approximately normal and the population standard deviation is unknown. This confidence interval is based on a simple random sample of 25 observations. Calculate the sample mean, the margin of error, and the sample standard deviation.

n = 25
df: 25 - 1 =24

Mean is the midpoint in the interval

m <- (65 + 77) / 2
m
## [1] 71

Margin of Error

me <- m - 65 #OR
me <- 77 - m
me
## [1] 6

Standard Deviation

n <- 25
df <- n - 1
# A 90% two-sided interval leaves 5% in each tail, so use the 0.95 quantile
t_star <- qt(0.95, df)
se <- me / t_star
s <- se * sqrt(n); s
## [1] 17.53481
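
The three steps above can be collected into one small function. This is only a sketch: `ci_to_stats()` and its argument names are hypothetical, not part of base R, and it assumes a two-sided t interval, so the critical value is the `1 - (1 - conf_level) / 2` quantile (0.95 for a 90% interval).

```r
# Recover the sample mean, margin of error and sample SD from a two-sided
# t confidence interval. ci_to_stats() is a hypothetical helper.
ci_to_stats <- function(lower, upper, n, conf_level = 0.90) {
  x_bar  <- (lower + upper) / 2                        # midpoint of the interval
  me     <- (upper - lower) / 2                        # half-width of the interval
  t_star <- qt(1 - (1 - conf_level) / 2, df = n - 1)   # two-sided critical value
  se     <- me / t_star                                # standard error
  s      <- se * sqrt(n)                               # sample standard deviation
  list(mean = x_bar, me = me, t_star = t_star, sd = s)
}

res <- ci_to_stats(65, 77, n = 25, conf_level = 0.90)
res$mean  # 71
res$me    # 6
res$sd    # about 17.53
```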

5.14. SAT scores. SAT scores of students at an Ivy League college are distributed with a standard deviation of 250 points. Two statistics students, Raina and Luke, want to estimate the average SAT score of students at this college as part of a class project. They want their margin of error to be no more than 25 points.

  1. Raina wants to use a 90% confidence interval. How large a sample should she collect?

Confidence level = 0.90 => z* = 1.65. The margin of error must satisfy \(1.65 \times \frac{s}{\sqrt{n}} \leq 25\), where s = 250 is the given standard deviation.

me <- 25
me / 1.65
## [1] 15.15152
s <- 250
sqrt_n <- 250 / 15.15152
sqrt_n
## [1] 16.49999
samp_size <- sqrt_n^2
samp_size
## [1] 272.2498

Rounding up, Raina needs at least 273 students.
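
The same calculation can be written as a reusable function. This is a minimal sketch: `min_sample_size()` is a hypothetical name, and it uses the unrounded critical value from `qnorm()`, which gives a slightly smaller answer than the rounded z-score of 1.65 used above.

```r
# Smallest n for which z* * sigma / sqrt(n) is at most the target margin of error.
# min_sample_size() is a hypothetical helper, not part of base R.
min_sample_size <- function(sigma, me, conf_level) {
  z_star <- qnorm(1 - (1 - conf_level) / 2)  # two-sided critical value
  ceiling((z_star * sigma / me)^2)           # always round up
}

n_exact   <- min_sample_size(sigma = 250, me = 25, conf_level = 0.90)
n_rounded <- ceiling((1.65 * 250 / 25)^2)    # same formula with z* rounded to 1.65

n_exact    # 271
n_rounded  # 273
```

Either answer meets the requirement; the rounded z-score simply builds in a small safety margin.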

  2. Luke wants to use a 99% confidence interval. Without calculating the actual sample size, determine whether his sample should be larger or smaller than Raina’s, and explain your reasoning.

Luke’s sample should be larger. A higher confidence level requires a larger critical value z*, and the margin of error is \(z^* \times \frac{s}{\sqrt{n}}\). To keep the margin of error at 25 points with a larger z*, n must increase.
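
The comparison can be made without computing either sample size. Because the required n is proportional to the square of the critical value, the ratio of the two critical values is enough. A short sketch, with variable names chosen here for illustration:

```r
# Two-sided critical values for the two confidence levels.
conf_levels <- c(raina = 0.90, luke = 0.99)
z_star <- qnorm(1 - (1 - conf_levels) / 2)
z_star

# With the same standard deviation and margin of error, n scales with z*^2,
# so this ratio is how many times larger Luke's sample must be.
ratio <- unname((z_star["luke"] / z_star["raina"])^2)
ratio  # about 2.45
```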

  3. Calculate the minimum required sample size for Luke.

Confidence level = 0.99 => 2.58 (z-score)

me <- 25
me / 2.58
## [1] 9.689922
s <- 250
sqrt_n <- 250 / 9.689922
sqrt_n
## [1] 25.8
samp_size <- sqrt_n^2
samp_size
## [1] 665.6401

Rounding up, Luke needs at least 666 students.

5.20. High School and Beyond, Part I.

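  1. Is there a clear difference in the average reading and writing scores? NO. The observed average difference (-0.545 points) is small relative to the standard deviation of the differences (8.887 points).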

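  2. Are the reading and writing scores of each student independent of each other?
    NO. Both scores come from the same student, so they are paired rather than independent. The scores of different students are independent of each other because the sample is random and less than 10% of the population.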

  3. Create hypotheses appropriate for the following research question: is there an evident difference in the average scores of students in the reading and writing exam?

\(H_0: \mu_{diff} = 0\) There is no difference in the average reading and writing scores among the students.
\(H_A: \mu_{diff} \neq 0\) There is a difference in the average reading and writing scores among the students.

  4. Check the conditions required to complete this test.
    Independence: the sample is random and less than 10% of the population, so the differences are independent from one student to the next.
    Nearly normal: there is a slight skew, but we can be lenient because the sample size (200) is well above 30.

  5. The average observed difference in scores is \(\bar{x}_{read-write} = -0.545\), and the standard deviation of the differences is 8.887 points. Do these data provide convincing evidence of a difference between the average scores on the two exams?

N.B. The negative sign means that the average writing score in the sample is slightly higher than the average reading score. By itself this does not prove a difference in the population.

\(T = \frac{-0.545 - 0}{8.887 / \sqrt{200}}\)

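# Observed test statistic (not a critical value)
t_stat <- (-0.545 - 0) / (8.887 / sqrt(200))
t_stat
## [1] -0.867274
df <- 200 - 1
# Two-sided p-value; -abs() keeps the calculation in the lower tail
p_value <- pt(-abs(t_stat), df) * 2
p_value
## [1] 0.3868365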

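The p-value is greater than 0.05, so we fail to reject \(H_0\). The data do not provide convincing evidence of a difference between the average reading and writing scores.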
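
The paired test above needs only three summary statistics, so it can be wrapped in a function. This is a sketch: `paired_t_from_summary()` is a hypothetical helper, and it assumes a two-sided alternative.

```r
# One-sample t-test on paired differences, computed from summary statistics.
# paired_t_from_summary() is a hypothetical helper, not part of base R.
paired_t_from_summary <- function(mean_diff, sd_diff, n, mu0 = 0) {
  se     <- sd_diff / sqrt(n)          # standard error of the mean difference
  t_stat <- (mean_diff - mu0) / se     # observed test statistic
  df     <- n - 1
  p      <- 2 * pt(-abs(t_stat), df)   # two-sided p-value
  c(t = t_stat, df = df, p_value = p)
}

res <- paired_t_from_summary(mean_diff = -0.545, sd_diff = 8.887, n = 200)
round(res, 4)
```

With the raw scores available, `t.test(read, write, paired = TRUE)` would give the same test directly; the vector names `read` and `write` are assumed here.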

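  6. What type of error might we have made? Explain what the error means in the context of the application.

Because we failed to reject \(H_0\), the only error we could have made is a Type 2 error: failing to detect a difference between the average reading and writing scores when a difference actually exists.

  7. Based on the results of this hypothesis test, would you expect a confidence interval for the average difference between the reading and writing scores to include 0? Explain your reasoning.

Yes. We failed to reject \(H_0\) at the 5% significance level, so the corresponding 95% confidence interval would include 0.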
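
The link between the test and the confidence interval can be checked numerically. The sketch below builds a 95% t interval from the summary statistics given in the exercise; the variable names are chosen here for illustration.

```r
# 95% confidence interval for the mean read-write difference.
mean_diff <- -0.545
sd_diff   <- 8.887
n         <- 200

se     <- sd_diff / sqrt(n)
t_star <- qt(0.975, df = n - 1)              # two-sided 95% critical value
ci     <- mean_diff + c(-1, 1) * t_star * se
ci

# TRUE when the interval contains 0, which matches failing to reject H_0
# in a two-sided test at the 5% level.
contains_zero <- ci[1] < 0 && ci[2] > 0
contains_zero
```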

5.32. Fuel efficiency of manual and automatic cars, Part I.

\(H_0: \mu_m = \mu_a\)
\(H_A: \mu_m \neq \mu_a\)

Point estimate: \(\bar{x}_m - \bar{x}_a\)
Standard error: \(SE = \sqrt{\frac{s_m^2}{n_m} + \frac{s_a^2}{n_a}}\)

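# Difference in sample means (named x_diff to avoid masking the mean() function)
x_diff <- 19.85 - 16.12
SE <- sqrt(((3.58^2) / 26) + ((4.51^2) / 26))

# Observed test statistic (not a critical value)
t_stat <- (x_diff - 0) / SE
t_stat
## [1] 3.30302
df <- 26 - 1

# Two-sided p-value
p_value <- pt(t_stat, df, lower.tail = FALSE) * 2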

The p-value is less than 0.05, so we reject \(H_0\). The data provide strong evidence of a difference between the average fuel efficiency of cars with manual and automatic transmissions.
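
The two-sample calculation can also be written as a function that reports the p-value under two common choices of degrees of freedom: the conservative min(n1, n2) - 1 used above, and the Welch-Satterthwaite approximation that `t.test()` uses by default. This is a sketch: `two_sample_t()` is a hypothetical helper, and the inputs are the summary statistics exactly as they appear in the calculation above.

```r
# Two-sample t-test from summary statistics (unequal variances allowed).
# two_sample_t() is a hypothetical helper, not part of base R.
two_sample_t <- function(x1, s1, n1, x2, s2, n2) {
  v1 <- s1^2 / n1
  v2 <- s2^2 / n2
  se     <- sqrt(v1 + v2)
  t_stat <- (x1 - x2) / se
  df_cons  <- min(n1, n2) - 1                                     # conservative df
  df_welch <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))   # Welch-Satterthwaite df
  c(t        = t_stat,
    df_cons  = df_cons,
    p_cons   = 2 * pt(-abs(t_stat), df_cons),
    df_welch = df_welch,
    p_welch  = 2 * pt(-abs(t_stat), df_welch))
}

res <- two_sample_t(19.85, 3.58, 26, 16.12, 4.51, 26)
round(res, 4)
```

Both p-values fall below 0.05, so the conclusion does not depend on which degrees of freedom are used.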

5.48. Work hours and education

  1. Write hypotheses for evaluating whether the average number of hours worked varies across the five groups.

\(H_0: \mu_{<hs} = \mu_{hs} = ... = \mu_{grad}\) The average number of hours worked is the same in all five groups. Any observed difference is due to chance.
\(H_A:\) At least one group mean differs from the others. The average number of hours worked varies by education.

  2. Check conditions and describe any assumptions you must make to proceed with the test.
  3. Below is part of the output associated with this test. Fill in the empty cells.

             Df     Sum Sq        Mean Sq   F-value   \(Pr(>F)\)
  degree     4      2,006.16      501.54    2.19      0.0682
  Residuals  1167   267,382       229.12
  Total      1171   269,388.16
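
Each cell in the ANOVA table follows from the others by simple arithmetic, which the sketch below reproduces. It treats the number of groups (5), the residual degrees of freedom (1167), the Mean Sq for degree (501.54) and the residual Sum Sq (267,382) as the inputs; which cells were originally given is an assumption here.

```r
k        <- 5        # number of education groups
df_group <- k - 1
df_resid <- 1167
df_total <- df_group + df_resid      # degrees of freedom add up

ms_group <- 501.54
ss_group <- ms_group * df_group      # Sum Sq = Mean Sq * Df
ss_resid <- 267382
ss_total <- ss_group + ss_resid      # sums of squares add up

ms_resid <- ss_resid / df_resid      # Mean Sq = Sum Sq / Df
f_stat   <- ms_group / ms_resid      # F = MSG / MSE
p_value  <- pf(f_stat, df_group, df_resid, lower.tail = FALSE)

round(c(df_total = df_total, ss_group = ss_group, ss_total = ss_total,
        ms_resid = ms_resid, f_stat = f_stat), 2)
p_value
```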
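  4. What is the conclusion of the test?

The p-value (0.0682) is greater than 0.05, so we fail to reject \(H_0\). The data do not provide convincing evidence that the average number of hours worked differs across the five education groups.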