Chapter 4. Foundations of Inference: 4.4, 4.14, 4.24, 4.26, 4.34, 4.40, 4.48
4.4 Heights of adults.
(a) What is the point estimate of the average height of active individuals? What about the median?
Answer: The point estimate of the average height is the sample mean, 171.1 cm, and the point estimate of the median is 170.3 cm.
(b) What is the point estimate for the standard deviation of the heights of active individuals? What about the IQR?
Answer: The SD of the heights is 9.4 cm, and the IQR is 14 cm (Q3 - Q1 = 177.8 - 163.8).
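(c) Is a person who is 1m 80cm (180cm) tall considered unusually tall? And is a person who is 1m 55cm (155cm) considered unusually short? Explain your reasoning.
Answer: Neither height is unusual. Both 180 cm (Z = 0.95) and 155 cm (Z = -1.71) are within 2 SD of the mean, that is, between 152.3 cm and 189.9 cm. The Z-score of an individual observation uses the SD, not the standard error.
sample_mean <- 171.1
sd <- 9.4
n <- 507
# Z-score of an individual height: distance from the mean in SD units
Z180 <- (180 - sample_mean) / sd
Z180
## [1] 0.9468085
Z155 <- (155 - sample_mean) / sd
Z155
## [1] -1.712766
# Width of the 2-SD band on either side of the mean
2 * sd
## [1] 18.8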
(d) Another random sample of physically active individuals - Would you expect the mean and the sd of this new sample to be the ones given above?
Answer: No. Point estimates that are based on samples only approximate the population parameter, and they vary from one sample to another.
(e) What measure do we use to quantify the variability of such an estimate? Compute this quantity using the data from the original sample under the condition that the data are a simple random sample.
Answer: The standard error (SE) quantifies the variability of the sample mean. Here SE = 9.4/sqrt(507) = 0.4174687.
9.4/sqrt(507)
## [1] 0.4174687
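As a rough check on (d) and (e), the sketch below simulates repeated samples of size 507. It assumes, purely for illustration, that the population of heights is normal with mean 171.1 and SD 9.4 (the sample values stand in for the unknown population parameters). The names `pop_mean`, `pop_sd`, `n_obs`, and `sample_means` are hypothetical.

```r
set.seed(1)  # arbitrary seed so the run is reproducible

# Assumption: stand-in population values taken from the sample summary
pop_mean <- 171.1
pop_sd   <- 9.4
n_obs    <- 507

# Draw 2000 independent samples of size 507 and record each sample mean
sample_means <- replicate(2000, mean(rnorm(n_obs, mean = pop_mean, sd = pop_sd)))

# The sample means differ from one sample to the next ...
range(sample_means)

# ... and their spread should be close to the theoretical SE
se_theory <- pop_sd / sqrt(n_obs)
se_sim    <- sd(sample_means)
c(theory = se_theory, simulated = se_sim)
```

Under these assumptions the simulated spread of the sample means should land near 9.4/sqrt(507), which is what the SE is meant to estimate.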
4.14 Thanksgiving spending, Part I. Answers as follows:
(a) False. Inference is made on the population parameter, not the point estimate. The point estimate is always in the confidence interval.
(b) False. Provided the data distribution is not very strongly skewed (n = 436 in this sample, so we can be slightly lenient with the skew), the sample mean will be nearly normal, which allows the normal approximation method described.
(c) False. The confidence interval is not about a sample mean; it estimates the population parameter.
(d) True.
(e) False. To be more confident that we capture the parameter, we need a wider interval. Think about needing a bigger net to be more sure of catching a fish in a murky lake.
(f) False. In the calculation of the standard error, we divide the standard deviation by the square root of the sample size. To cut the SE (or margin of error) in half, we would need to sample 2^2 =4 times the number of people in the initial sample.
(g) True. The normal model was used to model the sample mean, so the margin of error is half the width of the interval and the sample mean is the midpoint of the interval.
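The reasoning in (e) and (f) can be checked numerically. This is a minimal sketch: n = 436 comes from the exercise, but the sample SD `s_hyp` is a made-up value and `margin` is a hypothetical helper, not a base R function.

```r
# Assumption: hypothetical sample SD in dollars; n = 436 is from the exercise
s_hyp <- 45
n_obs <- 436

# Margin of error of a z-interval at a given confidence level
margin <- function(s, n, conf) qnorm(1 - (1 - conf) / 2) * s / sqrt(n)

# (e) Higher confidence gives a larger margin, hence a wider interval
me_95 <- margin(s_hyp, n_obs, 0.95)
me_99 <- margin(s_hyp, n_obs, 0.99)
c(me_95 = me_95, me_99 = me_99)

# (f) Quadrupling n halves the margin; merely doubling n does not
ratio_4n <- margin(s_hyp, 4 * n_obs, 0.95) / me_95
ratio_2n <- margin(s_hyp, 2 * n_obs, 0.95) / me_95
c(ratio_4n = ratio_4n, ratio_2n = ratio_2n)
```

The ratios do not depend on the assumed SD, since it cancels; only the square root of the sample size matters.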
4.24 Gifted children, Part I.
(a) Are conditions for inference satisfied?
Answer: Yes. The data come from a simple random sample of the population, so the observations are independent, and the sample size (n = 36) is at least 30. Provided the distribution of the variable is not strongly skewed, the sample mean is nearly normal.
(b) Test at the 0.10 significance level. H0: u = 32 months and Ha: u < 32 months
Answer: The test statistic is z = (30.69 - 32) / SE = -1.82. For a one-tailed test at the 0.10 significance level the critical value is -1.28, and z falls below it, so we reject H0.
mean <- 30.69
sd <- 4.31
n <- 36
x <- 32
se <- sd / sqrt(n)
# Sample mean minus the null value, in standard-error units
z <- (mean - x) / se
z
## [1] -1.823666
(c) Interpret the p-value in context of the hypothesis test and the data.
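Answer: The p-value is 0.0341 (3.41%). If the true mean were 32 months, the probability of observing a sample mean of 30.69 months or lower in a random sample of 36 children would be about 3.4%. Since this is less than the significance level of 0.10, we reject H0.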
pnorm(-abs(z))
## [1] 0.0341013
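(d) The 90% CI is 29.50845 to 31.87155.
# A 90% interval leaves 5% in each tail, so the critical value is qnorm(.95), about 1.645
lower <- mean - qnorm(.95) * se
upper <- mean + qnorm(.95) * se
c(lower, upper)
## [1] 29.50845 31.87155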
(e) Answer: Yes, the result in (d) agrees with the hypothesis test: the null value of 32 months is not in the CI, which is consistent with rejecting H0.
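The link between the test in (b) and the interval in (d) can be sketched with a small helper that works from summary statistics. `z_summary` is a hypothetical function written for this illustration, not part of base R, and it assumes the normal approximation holds.

```r
# Hypothetical helper: one-sample z procedures from summary statistics
z_summary <- function(xbar, s, n, mu0, conf = 0.90) {
  se    <- s / sqrt(n)
  z     <- (xbar - mu0) / se           # test statistic
  zstar <- qnorm(1 - (1 - conf) / 2)   # critical value for a two-sided interval
  list(
    z       = z,
    p_lower = pnorm(z),                # one-sided p-value for Ha: mu < mu0
    ci      = c(xbar - zstar * se, xbar + zstar * se)
  )
}

res <- z_summary(xbar = 30.69, s = 4.31, n = 36, mu0 = 32)
res$z        # negative: the sample mean is below 32
res$p_lower  # compare with the significance level 0.10
res$ci       # check whether 32 falls inside
```

Keeping the test statistic, p-value, and interval in one function makes it easy to see that they are all built from the same standard error.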
4.26 Gifted children, Part II. (Mother’s IQ)
(a) Answer: At the significance level of 0.1, H0: u = 100 and Ha: u != 100 (two-tailed test, 5% in each tail).
The one-tail area computed below is about 1.2e-63. The two-sided p-value is twice that, which is still approximately 0 and less than the significance level of 0.1, so we reject H0 that the mean is 100.
mean2 <- 118.2
sd2 <- 6.5
x2 <- 100
se2 <- sd2/sqrt(n)
z2 <- (x2-mean2) / se2
pnorm(-abs(z2))
## [1] 1.22022e-63
(b) 90% CI is 116.4181 to 119.9819
z3<-qnorm(.95)
lower <- mean2 - z3 * se2
upper <- mean2 + z3 * se2
c(lower, upper)
## [1] 116.4181 119.9819
(c) Answer: Yes, the result in (b) agrees with the hypothesis test, since the null value of 100 is not in the CI.
4.34 CLT
Answer: The sampling distribution of the mean is the distribution of the means of simple random samples of a given size drawn from the population. Its center is the population mean, and its shape is approximately normal. As the sample size increases, its standard deviation (the standard error) decreases, so more of the sample means fall close to the center and the tails become thinner.
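A short simulation can illustrate how the spread and shape change with sample size. This is a sketch under an assumed population: an exponential distribution with mean 1 and SD 1, chosen only because it is right-skewed. The names `sim_means`, `sizes`, and `spread` are hypothetical.

```r
set.seed(42)  # arbitrary seed for reproducibility

# Assumption: a right-skewed stand-in population, exponential with mean 1 and SD 1
sim_means <- function(n, reps = 5000) replicate(reps, mean(rexp(n, rate = 1)))

sizes  <- c(5, 30, 200)
spread <- sapply(sizes, function(n) sd(sim_means(n)))

# Simulated spread of the sample means next to the theoretical SE, 1 / sqrt(n)
data.frame(n = sizes, simulated_sd = spread, theoretical_se = 1 / sqrt(sizes))

# Histograms on a shared x-axis: narrower and more symmetric as n grows
par(mfrow = c(3, 1))
for (n in sizes) hist(sim_means(n), breaks = 40, xlim = c(0, 3), main = paste("n =", n))
```

The center stays near the population mean of 1 in every panel, while the spread shrinks roughly in proportion to 1/sqrt(n).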
4.40 CFLBs: the lifespan of light bulbs has u = 9000 hours, sd = 1000 hours
(a) P(X > 10500 hours) = 6.68%
pnorm(10500, mean = 9000, sd = 1000, lower.tail = FALSE)
## [1] 0.0668072
(b) Describe the distribution of the mean lifespan of 15 light bulbs.
Answer: The population distribution is treated as nearly normal (as in part (a)), so the distribution of the mean lifespan of 15 bulbs is also nearly normal even though n < 30, with mean 9000 hours and standard error 1000/sqrt(15) = 258.2 hours.
(c) P(mean of 15 bulbs > 10500) is approximately 0.
mean4 <- 9000
sd4 <- 1000
x4 <- 10500
n4 <- 15
se4 <- sd4/sqrt(n4)
z4 <- (x4-mean4) / se4
pnorm(-abs(z4))
## [1] 3.133452e-09
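(d) Sketch the two distributions (population and sampling) on the same scale.
par(mfrow = c(2, 1))
# Population: individual bulb lifespans, sd = 1000
hist(rnorm(n = 500, mean = 9000, sd = 1000), breaks = 25, xlim = c(5000, 13000))
# Sampling distribution of the mean of 15 bulbs, sd = 1000 / sqrt(15)
hist(rnorm(n = 500, mean = 9000, sd = 1000 / sqrt(15)), breaks = 25, xlim = c(5000, 13000))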

(e) Answer: No, I can’t estimate (a) without a nearly normal population distribution. I also could not estimate (c), since the sample size is not sufficient to yield a nearly normal sampling distribution if the population distribution is not nearly normal.
4.48
Answer: This can be shown with the Z formula. As n increases, the standard error sd/sqrt(n) in the denominator of Z decreases, so Z increases. A larger Z gives a larger P(Z), and the p-value, 1 - P(Z), decreases.
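The argument can be illustrated with numbers. All values below (`xbar_hyp`, `mu0_hyp`, `s_hyp`, and the two sample sizes) are hypothetical and chosen only to show the direction of the effect; the point estimate and SD are held fixed while n changes.

```r
# Assumption: hypothetical summary values; only n changes between rows
xbar_hyp <- 10.3
mu0_hyp  <- 10
s_hyp    <- 1.5
n_values <- c(50, 500)

# Z grows with sqrt(n) because the standard error shrinks
z_values <- (xbar_hyp - mu0_hyp) / (s_hyp / sqrt(n_values))

# Upper-tail p-value, matching 1 - P(Z) in the answer
p_values <- 1 - pnorm(z_values)

data.frame(n = n_values, z = z_values, p_value = p_values)
```

With the same observed difference, the larger sample gives a larger Z and a smaller p-value.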