Submitted by Zachary Herold Graded Problems: 4.4, 4.14, 4.24, 4.26, 4.34, 4.40, 4.48
The point estimate is the sample mean of 171.1. The median is reported as 170.3.
The point estimate is the sample standard deviation of 9.4. The IQR is 177.8 (Q3) - 163.8 (Q1) = 14.0.
Z.180 <- (180-171.1)/9.4
Z.155 <- (155-171.1)/9.4
print(c(Z.180, Z.155))
## [1] 0.9468085 -1.7127660
print(c(1 - pnorm(Z.180), pnorm(Z.155)))
## [1] 0.1718682 0.0433778
180 cm is about 0.95 standard deviations above the mean; a height at least this large is expected about 17.2% of the time, a fairly typical result. 155 cm is about 1.71 standard deviations below the mean; a height at most this small is expected about 4.3% of the time, a fairly unusual result.
One would expect the results to be different as they are randomly generated. However, since the sample size is large (507 adults), they should be relatively close to the previous results.
The margin of error quantifies the variability of the estimate. It is the critical z value multiplied by the standard error, that is, the sample standard deviation divided by the square root of the sample size (here 9.4 / sqrt(507), about 0.417). With a confidence level of 95% (z = 1.96), we calculate the margin of error as follows:
ME <- 1.96 * 9.4 / sqrt(507)
ME
## [1] 0.8182386
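As a minimal sketch, the multiplier and the standard error can be computed rather than typed in. The variable names below (`x_bar`, `s`, `n`, `z_star`) are illustrative and not part of the original solution; `qnorm(0.975)` is the critical value for a two-sided 95% interval, which is where the 1.96 comes from.

```r
# Summary statistics quoted in the solution above
x_bar <- 171.1   # sample mean height (cm)
s     <- 9.4     # sample standard deviation (cm)
n     <- 507     # sample size

se     <- s / sqrt(n)    # standard error of the sample mean
z_star <- qnorm(0.975)   # two-sided 95% critical value, about 1.96
me     <- z_star * se    # margin of error

ci <- c(lower = x_bar - me, upper = x_bar + me)
print(round(c(SE = se, ME = me), 4))
print(round(ci, 2))
```

Changing `0.975` to `0.95` would give the multiplier for a 90% interval instead (about 1.645).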
To get an estimate of consumer spending, 436 randomly sampled American adults were surveyed. Daily consumer spending for the six-day period after Thanksgiving, spanning the Black Friday weekend and Cyber Monday, averaged $84.71. A 95% confidence interval based on this sample is ($80.31, $89.11). Determine whether the following statements are true or false, and explain your reasoning.
False. The sample consists of the 436 adults. We know the mean for them, and can be 100% sure it is in the confidence interval. Inference is made on the population parameter, not the point estimate. The point estimate is always in the confidence interval.
False. Given the large sample size and the Central Limit Theorem, the sampling distribution of the mean will be nearly normal, so the normal approximation used for the interval is valid.
False. Each random sample will generate a new confidence interval. The confidence interval is not about a sample mean.
True.
True.
False. In the calculation of the standard error, we divide the standard deviation by the square root of the sample size. To cut the SE (or margin of error) to a third, we would need a sample 9 times as large as the initial one.
True. 89.11 - 84.71 = 4.40. This is the upper bound of the confidence interval minus the sample mean (point estimate).
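The arithmetic behind the last two answers can be checked directly. This is a sketch using only the interval and sample size quoted in the problem; the implied standard deviation `s_implied` is back-calculated under the assumption that the interval used a 1.96 multiplier, which the problem does not state explicitly, and `me_for` is a hypothetical helper.

```r
n     <- 436
lower <- 80.31
upper <- 89.11

me <- (upper - lower) / 2   # margin of error: half the interval width

# Assumption: the interval was built as mean +/- 1.96 * s / sqrt(n)
s_implied <- me * sqrt(n) / 1.96

# Hypothetical helper: margin of error for a sample of size m, s held fixed
me_for <- function(m) 1.96 * s_implied / sqrt(m)

print(me)                   # margin of error for n = 436
print(me_for(3 * n) / me)   # tripling n: ratio is 1/sqrt(3), not 1/3
print(me_for(9 * n) / me)   # nine times n: ratio is 1/3
```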
Researchers investigating characteristics of gifted children collected data from schools in a large city on a random sample of thirty-six children who were identified as gifted children soon after they reached the age of four. The following histogram shows the distribution of the ages (in months) at which these children first counted to 10 successfully.
A sample size of 36 is sufficiently large that the distribution of sample means will be nearly normal, allowing the normal approximation to be used.
H0: μ(regular) - μ(gifted) = 0
HA: μ(regular) - μ(gifted) > 0
SE <- 4.31 / sqrt(36)
Z.32 <- ( 32 - 30.69 ) / SE
Z.32
## [1] 1.823666
1 - pnorm(Z.32)
## [1] 0.0341013
Z = 1.82 >>> p-value = 0.03, which is less than 0.10
The data provide convincing evidence that gifted children first count to 10 at a younger average age than the general population.
The p-value of 0.03 is less than the significance level of 0.10. This means that the probability of observing a sample mean at least this far below 32 months, given that the null hypothesis is true, is sufficiently low. So we reject the null hypothesis in favor of the alternative.
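The same test can be wrapped in a small helper. `z_test_mean` is a hypothetical function written for illustration, not part of base R; it assumes the normal approximation is appropriate and takes the summary statistics as given. The statistic is computed here as sample mean minus null value, so evidence that gifted children count earlier shows up in the lower tail.

```r
# Hypothetical helper: z test for one mean from summary statistics.
z_test_mean <- function(x_bar, mu0, s, n,
                        alternative = c("two.sided", "less", "greater")) {
  alternative <- match.arg(alternative)
  se <- s / sqrt(n)           # standard error of the sample mean
  z  <- (x_bar - mu0) / se    # test statistic
  p  <- switch(alternative,
               less      = pnorm(z),
               greater   = pnorm(z, lower.tail = FALSE),
               two.sided = 2 * pnorm(abs(z), lower.tail = FALSE))
  list(z = z, p_value = p)
}

res <- z_test_mean(x_bar = 30.69, mu0 = 32, s = 4.31, n = 36,
                   alternative = "less")
print(res)
print(res$p_value < 0.10)   # compare with the 0.10 significance level
```

The sign of `z` is flipped relative to the calculation above because the difference is taken in the other direction; the p-value is the same by symmetry of the normal distribution.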
age.high <- 30.69 + (1.645 * 4.31 / sqrt(36))
age.low <- 30.69 - (1.645 * 4.31 / sqrt(36))
print(c(age.low, age.high))
## [1] 29.50834 31.87166
Yes, they are consistent: the general-population mean of 32 months lies above the upper bound of the confidence interval (31.87).
H0: μ(regularIQ) - μ(giftedIQ) = 0
HA: μ(regularIQ) - μ(giftedIQ) ≠ 0
SE <- 6.5 / sqrt(36)
Z.118 <- ( 118.2 - 100 ) / SE
Z.118
## [1] 16.8
2 * (1 - pnorm(Z.118))
## [1] 0
Z = 16.8 >>> p-value = approx. 0.00, which is less than 0.10
The data provide convincing evidence that the average IQ of mothers of gifted children is different from that of the general population.
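The printed p-value of 0 is a floating-point artifact: `pnorm(16.8)` rounds to exactly 1, so `1 - pnorm(16.8)` evaluates to 0. As a sketch, asking `pnorm` for the upper tail directly keeps the tiny probability representable. The variable names are illustrative.

```r
se <- 6.5 / sqrt(36)
z  <- (118.2 - 100) / se

p_naive <- 2 * (1 - pnorm(z))                 # loses precision, gives 0
p_tail  <- 2 * pnorm(z, lower.tail = FALSE)   # upper tail computed directly

print(c(naive = p_naive, direct = p_tail))
```

The conclusion is unchanged either way, since both values are far below 0.10.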
Z(118.2) = (118.2 - 100)/6.5 = 2.8
IQ.high <- 118.2 + 1.645 * 6.5 / sqrt (36)
IQ.low <- 118.2 - 1.645 * 6.5 / sqrt (36)
print(c(IQ.low, IQ.high))
## [1] 116.4179 119.9821
The population average IQ of 100 is far outside the confidence interval, so it is not a plausible value for the mean IQ of mothers of gifted children. This agrees with the result of the hypothesis test.
Define the term “sampling distribution” of the mean, and describe how the shape, center, and spread of the sampling distribution of the mean change as sample size increases.
The sampling distribution of the mean is the distribution of the means of all possible random samples of a given size drawn from the same population.
Its shape is expected to be nearly normal if the population itself is normally distributed, or if the observations are independent (a random sample containing less than 10% of the population) and the sample size is at least 30; the shape gets closer to normal as the sample size increases. The center is expected to be close to the population mean regardless of sample size. The spread (the standard error) shrinks as the sample size increases, in proportion to 1/sqrt(n).
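A small simulation can illustrate these claims. This is a sketch under stated assumptions: the population is taken to be exponential with mean 1 (an arbitrary right-skewed choice, not from the exercise), and `sample_means` is a hypothetical helper.

```r
set.seed(1)   # for reproducibility

# Hypothetical helper: draw `reps` samples of size n and return their means.
sample_means <- function(n, reps = 5000) {
  replicate(reps, mean(rexp(n, rate = 1)))   # population mean 1, sd 1
}

sizes   <- c(5, 30, 200)
results <- sapply(sizes, function(n) {
  m <- sample_means(n)
  c(n = n, center = mean(m), spread = sd(m), theory_se = 1 / sqrt(n))
})
print(round(t(results), 3))

# Histograms show the shape becoming more symmetric as n grows
op <- par(mfrow = c(1, 3))
for (n in sizes) {
  hist(sample_means(n), main = paste("n =", n), xlab = "sample mean")
}
par(op)
```

The `center` column should stay near 1 for every sample size, while `spread` tracks `theory_se` and falls as `n` grows.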
A manufacturer of compact fluorescent light bulbs advertises that the distribution of the lifespans of these light bulbs is nearly normal with a mean of 9,000 hours and a standard deviation of 1,000 hours.
Z.10500 <- (10500 - 9000) / 1000
Z.10500
## [1] 1.5
1 - pnorm(Z.10500)
## [1] 0.0668072
About 6.7%, with 10,500 hours about 1.5 standard deviations above the mean.
We should expect the mean lifespan of 15 light bulbs to be centered on the population mean of 9,000 hours, with a standard error of 1000 / sqrt(15), about 258 hours. Since the population is nearly normal, the sampling distribution of the mean should be nearly normal too, despite the small sample size.
SE <- 1000 / sqrt(15)
Z.10500 <- (10500 - 9000) / SE
Z.10500
## [1] 5.809475
1 - pnorm(Z.10500)
## [1] 3.133452e-09
The probability is very close to 0.
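The two probabilities can be put side by side by passing the mean and spread to `pnorm` directly. This is a sketch with illustrative variable names; the Monte Carlo check for a single bulb uses an arbitrary 100,000 draws and is only a rough confirmation.

```r
set.seed(42)
mu    <- 9000   # advertised mean lifespan (hours)
sigma <- 1000   # advertised standard deviation (hours)
n     <- 15     # bulbs per sample

# Single bulb: analytic tail probability and a Monte Carlo check
p_single     <- pnorm(10500, mean = mu, sd = sigma, lower.tail = FALSE)
p_single_sim <- mean(rnorm(1e5, mu, sigma) > 10500)

# Mean of 15 bulbs: same threshold, but the spread is the standard error
p_mean <- pnorm(10500, mean = mu, sd = sigma / sqrt(n), lower.tail = FALSE)

print(c(single = p_single, single_sim = p_single_sim, mean_of_15 = p_mean))
```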
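# Simulate 15 sample means, each from 15 bulbs drawn from N(9000, 1000)
avg <- numeric(15)
for (i in seq_along(avg)) {
  avg[i] <- mean(rnorm(15, mean = 9000, sd = 1000))
}
# Histogram of the simulated sample means (sampling distribution), density scale
hist(avg, breaks = 5, xlim = c(6000, 12000), probability = TRUE)
# Overlay the population density on the same scale
x <- 5000:13000
y <- dnorm(x, mean = 9000, sd = 1000)
lines(x, y, lwd = 2, col = 'blue')
Without a nearly normal population distribution we could not estimate (a), since the normal model would not describe individual lifespans. We could not estimate (c) either, since a sample of 15 is too small to yield a nearly normal sampling distribution when the population is not nearly normal.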
Suppose you conduct a hypothesis test based on a sample where the sample size is n = 50, and arrive at a p-value of 0.08. You then refer back to your notes and discover that you made a careless mistake, the sample size should have been n = 500. Will your p-value increase, decrease, or stay the same? Explain.
The square root of 500 is about 3.16 times the square root of 50, so the standard error shrinks by that factor. Because the standard error is in the denominator of the z score, the z score grows by a factor of about 3.16, assuming the point estimate and standard deviation are unchanged. A larger z score lies further out in the tail, so the p-value decreases: a result this extreme is less likely under the null hypothesis when it comes from the larger sample.
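The effect of the corrected sample size can be checked numerically. This sketch assumes a two-sided z test (the exercise does not say which alternative was used) and that the point estimate and standard deviation are unchanged, so only the standard error changes. The variable names are illustrative.

```r
p_old <- 0.08
n_old <- 50
n_new <- 500

# Assumption: two-sided test, so each tail holds p_old / 2
z_old <- qnorm(1 - p_old / 2)

# z = (estimate - null) / (s / sqrt(n)), so z scales with sqrt(n)
z_new <- z_old * sqrt(n_new / n_old)
p_new <- 2 * pnorm(z_new, lower.tail = FALSE)

print(c(z_old = z_old, z_new = z_new))
print(c(p_old = p_old, p_new = p_new))
```

A one-sided test would start from a different `z_old`, but the scaling by `sqrt(n_new / n_old)` and the direction of the change in the p-value would be the same.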