Working backwards, Part II. A 90% confidence interval for a population mean is (65, 77). The population distribution is approximately normal and the population standard deviation is unknown. This confidence interval is based on a simple random sample of 25 observations. Calculate the sample mean, the margin of error, and the sample standard deviation.
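The sample mean is the midpoint of the interval: (65 + 77)/2 = 71. Because the population standard deviation is unknown, we use the t-distribution with 24 degrees of freedom, where the critical value is t* = 1.711. Solving the upper bound for s: 77 = 71 + 1.711 * s/sqrt(25), so s = ((77 - 71) / 1.711) * sqrt(25) = 17.5336.
The standard error is SE = s/sqrt(n) = 17.5336/sqrt(25) = 3.50672, and the margin of error is half the interval width: ME = (77 - 65)/2 = 6 = 1.711 * 3.50672.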
Check: 71 + 1.711 * 3.50672 = 77 and 71 - 1.711 * 3.50672 = 65.
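The same back-calculation can be sketched in R. This is only an illustration of the steps above; the variable names (`lower`, `upper`, `x_bar`, `me`, `t_star`) are chosen here and are not part of the exercise. Because `qt()` returns the unrounded critical value, `s` comes out slightly different from the hand calculation that uses 1.711.

```r
# Recover summary statistics from a reported t-interval (illustrative names).
lower <- 65
upper <- 77
n <- 25
conf <- 0.90

x_bar <- (lower + upper) / 2                    # sample mean: midpoint of the interval
me <- (upper - lower) / 2                       # margin of error: half the width
t_star <- qt(1 - (1 - conf) / 2, df = n - 1)    # critical value with 24 df
se <- me / t_star                               # standard error, since ME = t* x SE
s <- se * sqrt(n)                               # sample SD, since SE = s / sqrt(n)

c(mean = x_bar, ME = me, SE = se, s = s)
```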
5.14 SAT scores. SAT scores of students at an Ivy League college are distributed with a standard deviation of 250 points. Two statistics students, Raina and Luke, want to estimate the average SAT score of students at this college as part of a class project. They want their margin of error to be no more than 25 points. (a) Raina wants to use a 90% confidence interval. How large a sample should she collect? (b) Luke wants to use a 99% confidence interval. Without calculating the actual sample size, determine whether his sample should be larger or smaller than Raina’s, and explain your reasoning. (c) Calculate the minimum required sample size for Luke.
sd <- 250
alpha <- 1 - 0.9
ME = z * SE, and for 90% confidence z = 1.645. Setting 25 = 1.645 * SE gives SE = 15.1976 = sd/sqrt(n), so 15.1976 = 250/sqrt(n), sqrt(n) = 250/15.1976 = 16.45, and n = (1.645 * 250/25)^2 = 270.60.
To get a margin of error of no more than 25 points, Raina's sample should include at least 271 students.
Luke's sample needs to be larger. A 99% confidence level has a larger z than a 90% level, and ME = z * sd/sqrt(n). To keep the same margin of error with a larger z, sd/sqrt(n) must be smaller, and we achieve this by increasing n in the denominator.
ME = z * SE, and for 99% confidence z = 2.575. We need 2.575 * SE <= 25, so SE <= 9.7087 = 250/sqrt(n), and n >= (2.575 * 250/25)^2 = 663.06.
This time the sample should include at least 664 students.
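As a cross-check, the sample-size formula n >= (z * sd / ME)^2 can be wrapped in a small R helper. The function name `required_n` is hypothetical, and `qnorm()` is used for the unrounded critical values instead of the table values 1.645 and 2.575.

```r
# Minimum sample size so that z* x sd / sqrt(n) does not exceed the target margin of error.
# `required_n` is an illustrative helper, not a function from any package.
required_n <- function(sd, me, conf) {
  z_star <- qnorm(1 - (1 - conf) / 2)   # two-sided critical value
  ceiling((z_star * sd / me)^2)         # round up: n must be a whole number
}

n_raina <- required_n(sd = 250, me = 25, conf = 0.90)
n_luke <- required_n(sd = 250, me = 25, conf = 0.99)
c(Raina = n_raina, Luke = n_luke)
```

Rounding up rather than to the nearest integer matters here: rounding down would leave the margin of error slightly above 25 points.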
The IQRs of both box plots are similar in size, and as the histogram shows, the differences are most frequently around 0, so there is no clear difference between the average reading and writing scores.
They are not independent. These are paired observations, because every student has both a reading score and a writing score.
H0: Average(read) - Average(write) = 0
Ha: Average(read) - Average(write) != 0
The sample size is at least 30 (200 in this case) and the sample is random. The histogram also shows that the differences are not strongly skewed, so we can proceed with inference.
N <- 200
s <- 8.887
M = -0.545
SE <- s/sqrt(N)
T <- (M-0)/SE
qt(c(.05, .95),df=N-1)
## [1] -1.652547 1.652547
# Two-sided p-value, matching Ha: difference != 0
P <- 2 * pt(-abs(T), df=N-1)
P
## [1] 0.3868364
Since P > 0.05, we fail to reject H0. The data do not provide convincing evidence of a difference between the average reading and writing scores.
If we made an error, it would be a Type II error: failing to reject the null hypothesis when it is actually false. The test statistic is not close to the critical values, though, so the result is not borderline.
Yes. Since we failed to reject the hypothesis that the average difference is 0, we would expect a confidence interval for the average difference between reading and writing scores to include 0.
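That expectation can be checked with a quick sketch that builds a 95% confidence interval from the same summary statistics. The variable names below (`mean_diff`, `s_diff`, `ci`) are illustrative, and the 95% level is an assumption chosen to match the 0.05 significance level used in the test.

```r
# 95% confidence interval for the mean paired difference (read - write).
n <- 200
s_diff <- 8.887       # SD of the differences
mean_diff <- -0.545   # mean of the differences

se <- s_diff / sqrt(n)
t_star <- qt(0.975, df = n - 1)               # two-sided 95% critical value
ci <- mean_diff + c(-1, 1) * t_star * se      # lower and upper bounds
ci
```

The interval runs from roughly -1.78 to 0.69, so it contains 0, in agreement with failing to reject H0.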
Xm - Xa = 19.85 - 16.12 = 3.73
SE = sqrt(sm^2/nm + sa^2/na) = sqrt(4.51^2/26 + 3.58^2/26) = 1.1293
H0: Average(Manual) = Average(Automatic)
Ha: Average(Manual) != Average(Automatic)
T = (3.73 - 0)/1.1293 = 3.302931
We use the t-distribution with df = 26 - 1 = 25 degrees of freedom at the 5% significance level (95% confidence). The two-sided critical value is t* = 2.060.
Since the test statistic T = 3.30 lies beyond the critical value, in the rejection region of the t-distribution, we reject H0. The data provide strong evidence that the average consumption of manual cars and automatic cars is different.
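The two-sample calculation can be reproduced in R from the summary statistics alone. This is a sketch under two assumptions: the pairing of each standard deviation with its group follows the order used in the SE formula above, and the degrees of freedom use the conservative min(n1, n2) - 1 rule from the text. All variable names are illustrative.

```r
# Two-sample t-test from summary statistics (manual vs. automatic).
mean_manual <- 19.85; sd_manual <- 4.51; n_manual <- 26
mean_auto <- 16.12;   sd_auto <- 3.58;   n_auto <- 26

diff <- mean_manual - mean_auto
se <- sqrt(sd_manual^2 / n_manual + sd_auto^2 / n_auto)
t_stat <- (diff - 0) / se                   # test statistic under H0: difference = 0

df <- min(n_manual, n_auto) - 1             # conservative degrees of freedom
t_crit <- qt(0.975, df = df)                # two-sided critical value at alpha = 0.05
p_value <- 2 * pt(-abs(t_stat), df = df)    # two-sided p-value

c(T = t_stat, critical = t_crit, p = p_value)
```

Both routes lead to the same decision: the statistic exceeds the critical value, and the p-value is below 0.05.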
H0: The average work hours are the same across all groups.
Ha: The average work hours differ for at least one group.
The data are taken independently, the distributions follow an approximately normal pattern, and the variances are pretty similar.
| | Df | Sum Sq | Mean Sq | F | Pr(>F) |
| degree | 4 | 2006.16 | 501.54 | 2.1899 | 0.0682 |
| residuals | 1167 | 267382 | 229.119 | | |
| total | 1171 | 269388.16 | | | |
Since the p-value (0.0682) is greater than 0.05, we fail to reject H0. The data do not provide enough evidence that the average work hours differ across the groups.
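The missing entries of the ANOVA table can be filled in from the degrees of freedom and sums of squares, as in the sketch below. The variable names are illustrative; `pf()` gives the upper-tail probability of the F distribution.

```r
# Complete the ANOVA table from Df and Sum Sq.
df_group <- 4;    ss_group <- 2006.16
df_resid <- 1167; ss_resid <- 267382

ms_group <- ss_group / df_group     # Mean Sq for degree
ms_resid <- ss_resid / df_resid     # Mean Sq for residuals
f_stat <- ms_group / ms_resid       # F statistic
p_value <- pf(f_stat, df_group, df_resid, lower.tail = FALSE)

c(MSG = ms_group, MSE = ms_resid, F = f_stat, p = p_value)
```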