5.6

Working backwards, Part II. A 90% confidence interval for a population mean is (65, 77). The population distribution is approximately normal and the population standard deviation is unknown. This confidence interval is based on a simple random sample of 25 observations. Calculate the sample mean, the margin of error, and the sample standard deviation.

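The sample mean is the midpoint of the interval: (65 + 77)/2 = 71.

Because the population standard deviation is unknown, we use the t-distribution with 25 - 1 = 24 degrees of freedom, where t* = 1.711 for 90% confidence. Solving the upper bound of the interval for s:
77 = 71 + 1.711 * s/sqrt(25)
s = ((77 - 71) / 1.711) * sqrt(25)
s = 17.5336

The standard error is SE = s/sqrt(n) = 17.5336/sqrt(25) = 3.50672. The margin of error is ME = t* * SE = 1.711 * 3.50672 = 6, which is half the width of the interval: (77 - 65)/2 = 6.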

Check

71 + 1.711 * 3.50672 = 77
71 - 1.711 * 3.50672 = 65
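The same working-backwards steps can be sketched in R. This is only an illustration: the variable names are ours, and `qt()` returns the critical value at full precision, so the result differs slightly from the hand calculation with the rounded t* = 1.711.

```r
# Recover the summary statistics behind a t-based confidence interval.
lower <- 65
upper <- 77
n <- 25
conf_level <- 0.90

x_bar <- (lower + upper) / 2                        # sample mean: midpoint of the interval
me <- (upper - lower) / 2                           # margin of error: half the width
t_star <- qt(1 - (1 - conf_level) / 2, df = n - 1)  # critical value with df = 24
se <- me / t_star                                   # standard error, s / sqrt(n)
s <- se * sqrt(n)                                   # sample standard deviation

c(mean = x_bar, ME = me, SE = se, s = s)
```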

5.14

5.14 SAT scores. SAT scores of students at an Ivy League college are distributed with a standard deviation of 250 points. Two statistics students, Raina and Luke, want to estimate the average SAT score of students at this college as part of a class project. They want their margin of error to be no more than 25 points. (a) Raina wants to use a 90% confidence interval. How large a sample should she collect? (b) Luke wants to use a 99% confidence interval. Without calculating the actual sample size, determine whether his sample should be larger or smaller than Raina’s, and explain your reasoning. (c) Calculate the minimum required sample size for Luke.

a)

sd <- 250
alpha <- 1 - 0.9

ME = z * SE, and for 90% confidence z = 1.645.
25 = 1.645 * SE
SE = 15.1976 = sd/sqrt(n)
15.1976 = 250/sqrt(n)
n = (250/15.1976)^2 = (1.645 * 250/25)^2
n = 270.60

To keep the margin of error at no more than 25 points, Raina's sample should include at least 271 students.

b)

His sample should be larger. A higher confidence level means a larger critical value z, and since ME = z * sd/sqrt(n), keeping the margin of error at 25 points with a larger z requires a larger n in the denominator.

c)

ME = z * SE, and for 99% confidence z = 2.575.
25 >= 2.575 * SE
SE <= 9.7087, where SE = 250/sqrt(n)
n >= (2.575 * 250/25)^2
n >= 663.06

This time the sample should include at least 664 students.
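As a cross-check on parts (a) and (c), the sample-size formula n >= (z * sd / ME)^2 can be wrapped in a small function. This is a sketch: `min_sample_size` is a hypothetical helper name of our own, and it uses `qnorm()` for the critical value rather than the rounded table values.

```r
# Smallest n such that z * sigma / sqrt(n) <= me.
# `min_sample_size` is a hypothetical helper, not part of the exercise.
min_sample_size <- function(sigma, me, conf_level) {
  z_star <- qnorm(1 - (1 - conf_level) / 2)  # two-sided critical value
  ceiling((z_star * sigma / me)^2)           # always round up to a whole observation
}

n_raina <- min_sample_size(sigma = 250, me = 25, conf_level = 0.90)
n_luke <- min_sample_size(sigma = 250, me = 25, conf_level = 0.99)
c(Raina = n_raina, Luke = n_luke)
```

Comparing the two values also illustrates part (b): raising the confidence level while holding the margin of error fixed increases the required sample size.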

5.20

(a) Is there a clear difference in the average reading and writing scores?

No. The IQRs of both box plots are similar in size, and the histogram shows that the differences are most frequent around 0, so there is no clear difference in the averages.

(b) Are the reading and writing scores of each student independent of each other?

They are not. The observations are paired, because every student has a score for both reading and writing.

(c) Create hypotheses appropriate for the following research question: is there an evident difference in the average scores of students in the reading and writing exam?

H0: Average(read) - Average(write) = 0
Ha: Average(read) - Average(write) != 0

(d) Check the conditions required to complete this test.

The sample is random and large (200 students, well above 30), and the histogram shows that the data are not strongly skewed, so we can proceed with inference.

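N <- 200        # number of students (paired observations)
s <- 8.887      # standard deviation of the differences
M <- -0.545     # mean of the differences
SE <- s/sqrt(N)
t_stat <- (M - 0)/SE
# Critical values for a two-sided test at the 10% level, for reference
qt(c(.05, .95), df = N - 1)
## [1] -1.652547  1.652547
# Two-sided p-value, matching Ha: difference != 0
P <- 2 * pt(-abs(t_stat), df = N - 1)
round(P, 4)
## [1] 0.3868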

Since P > 0.05, we fail to reject H0. The data do not provide convincing evidence of a difference between the average reading and writing scores.

(f) What type of error might we have made? Explain what the error means in the context of the application.

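We may have made a Type II error: failing to reject the null hypothesis when it is actually false. In this context, that would mean concluding there is no difference between the average reading and writing scores when a difference really exists. That said, the p-value is not close to the 0.05 cutoff.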

(g) Based on the results of this hypothesis test, would you expect a confidence interval for the average difference between the reading and writing scores to include 0? Explain your reasoning.

Yes. Since we failed to reject H0, a difference of 0 between the average reading and writing scores is plausible, so a confidence interval at the matching level (95%) would be expected to include 0.
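To illustrate the link between the test and the interval, here is a sketch that computes a 95% confidence interval for the mean difference from the summary statistics used above (n = 200, mean difference -0.545, standard deviation 8.887). The variable names are ours.

```r
# Paired-data summary statistics from the exercise
n <- 200
d_bar <- -0.545   # mean of the differences
s_d <- 8.887      # standard deviation of the differences

se <- s_d / sqrt(n)
t_stat <- d_bar / se
p_value <- 2 * pt(-abs(t_stat), df = n - 1)  # two-sided p-value

t_star <- qt(0.975, df = n - 1)              # critical value for 95% confidence
ci <- d_bar + c(-1, 1) * t_star * se         # lower and upper bounds

round(c(t = t_stat, p = p_value, lower = ci[1], upper = ci[2]), 3)
```

The lower bound is negative and the upper bound is positive, so the interval contains 0, in line with failing to reject H0 at the 5% level.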

5.32

Xm - Xa = 19.85 - 16.12 = 3.73
SE = sqrt(sm^2/nm + sa^2/na) = sqrt((4.51^2)/26 + (3.58^2)/26) = 1.1293

Ho: Average(Manual) = Average(Automatic)
Ha: Average(Manual) != Average(Automatic)

T = (3.73 - 0)/1.1293 = 3.3029

We use the t-distribution with df = 26 - 1 = 25 degrees of freedom. At the 5% significance level (95% confidence), the critical value is t* = 2.060.

Since T = 3.3029 lies outside the interval (-2.060, 2.060), we reject Ho; the two-sided p-value is below 0.01. The data provide strong evidence that the average fuel consumption of manual and automatic cars is different.
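The two-sample calculation can be reproduced from the summary statistics as follows. This is a sketch under the same convention as the hand calculation, df = min(n1, n2) - 1 = 25, which is more conservative than the Welch degrees of freedom `t.test()` would use on raw data. The variable names are ours.

```r
# Summary statistics from the exercise
x_manual <- 19.85; s_manual <- 4.51; n_manual <- 26
x_auto <- 16.12;   s_auto <- 3.58;   n_auto <- 26

diff_means <- x_manual - x_auto
se <- sqrt(s_manual^2 / n_manual + s_auto^2 / n_auto)
t_stat <- (diff_means - 0) / se

df <- min(n_manual, n_auto) - 1           # conservative degrees of freedom
t_star <- qt(0.975, df = df)              # two-sided critical value at the 5% level
p_value <- 2 * pt(-abs(t_stat), df = df)  # two-sided p-value

round(c(diff = diff_means, SE = se, t = t_stat, t_star = t_star, p = p_value), 4)
```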

5.48

(a) Write hypotheses for evaluating whether the average number of hours worked varies across the five groups.

Ho: the average work hours are the same across all five groups.
Ha: the average work hours differ for at least one group.

(b) The observations are taken independently, the distributions are approximately normal, and the variances are similar across groups.

(c)
                Df      Sum Sq   Mean Sq        F   Pr(>F)
    degree       4     2006.16    501.54   2.1899   0.0682
    residuals 1167   267382      229.119
    total     1171   269388.16

(d) Since the p-value (0.0682) > 0.05, we fail to reject Ho. The data do not provide enough evidence that the average work hours differ across the groups.
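The ANOVA table entries can be cross-checked with a short sketch. It assumes that the mean square for degree (501.54) and the residual sum of squares (267382) are the starting values, and that there are 5 groups and 1172 respondents in total (inferred from the total of 1171 degrees of freedom). The variable names are ours.

```r
# Starting values (assumed to be the given entries of the table)
k <- 5                # number of groups
n_total <- 1172       # inferred from total Df = 1171
ms_group <- 501.54    # Mean Sq for degree
ss_resid <- 267382    # Sum Sq for residuals

df_group <- k - 1
df_resid <- n_total - k
ss_group <- ms_group * df_group   # Sum Sq = Mean Sq * Df
ms_resid <- ss_resid / df_resid   # Mean Sq = Sum Sq / Df
f_stat <- ms_group / ms_resid
p_value <- pf(f_stat, df_group, df_resid, lower.tail = FALSE)

round(c(df_group = df_group, df_resid = df_resid, ss_group = ss_group,
        ms_resid = ms_resid, F = f_stat, p = p_value), 4)
```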