Problem 4.4 a. The point estimate for the population mean is the sample mean, 171.1. The median is 170.3. b. The point estimate for the standard deviation is 9.4. The IQR is 14:
q1 <- 163.8
q3 <- 177.8
q3 - q1
## [1] 14
x1 <- 180
x2 <- 155
sd <- 9.4
mn <- 171.1
(x1 - mn) / sd
## [1] 0.9468085
(x2 - mn) / sd
## [1] -1.712766
Based on the information above, both z-scores fall within two standard deviations of the mean, so neither height would be considered unusual.
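The same check can be done for both heights at once. This is a small sketch that applies the two-standard-deviation rule of thumb used above; the variable names are illustrative only.

```r
heights <- c(180, 155)
mn <- 171.1   # sample mean
s  <- 9.4     # sample standard deviation

# z-score: how many standard deviations each height is from the mean
z <- (heights - mn) / s

# Rule of thumb used here: "unusual" means more than 2 sd from the mean
unusual <- abs(z) > 2

data.frame(height = heights, z = round(z, 2), unusual = unusual)
```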
No. There is always natural variability in a sample statistic, so a different sample would be expected to give slightly different values. We would be more surprised if the values were exactly the same.
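We use the standard error to quantify the variability. The standard error is the standard deviation of the sampling distribution of the sample mean: it measures how far sample means typically fall from the population mean. We estimate it as the sample standard deviation divided by the square root of the sample size:
sd <- 9.4
n <- 507
sd / sqrt(n)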
## [1] 0.4174687
The standard error of the mean is about 0.417.
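The formula can be checked with a short simulation. This is a minimal sketch that assumes, purely for illustration, a normal population with mean 171.1 and standard deviation 9.4; the seed and the number of repetitions are arbitrary choices.

```r
set.seed(1)            # arbitrary seed, for reproducibility
pop_mean <- 171.1      # assumed population mean (illustrative)
pop_sd   <- 9.4        # assumed population sd (illustrative)
n        <- 507

# Theoretical standard error of the mean
se_formula <- pop_sd / sqrt(n)

# Draw many samples of size n and record each sample mean
sample_means <- replicate(2000, mean(rnorm(n, pop_mean, pop_sd)))

# The sd of the sample means should be close to the formula value
se_simulated <- sd(sample_means)

c(formula = se_formula, simulated = se_simulated)
```

The two numbers should agree closely, which is the sense in which the standard error describes the variability of the sample mean.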
Problem 4.14 a. True, this is the confidence interval constructed with the given information.
This is probably false. The sample size is larger than 30. There is a slight skew to the right but the spread does not look abnormal.
False. The confidence level describes how sure we are that the interval captures the population mean; it is not an exact statement.
This is true. If the interval is properly constructed, we can be 95% confident that the true population mean falls within it: about 95% of intervals built this way from random samples of 436 would capture the true population mean.
True. A 90% confidence interval uses a smaller critical value (1.65 instead of 1.96), so it is narrower; the 95% interval is the wider one and includes more possible values for our estimate.
False. The margin of error shrinks with the square root of the sample size, so cutting it to a third requires a sample 9 times larger, not 3 times larger.
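Because the margin of error is proportional to 1/sqrt(n), the effect of a larger sample can be checked numerically. In this sketch the function name `margin_of_error` and the standard deviation of 40 are hypothetical, chosen only for illustration; the ratios do not depend on that value.

```r
# Margin of error for a mean: critical value times the standard error
margin_of_error <- function(s, n, z_star = 1.96) z_star * s / sqrt(n)

s <- 40     # hypothetical sample sd, for illustration only
n <- 436

me_n  <- margin_of_error(s, n)
me_3n <- margin_of_error(s, 3 * n)   # sample 3 times larger
me_9n <- margin_of_error(s, 9 * n)   # sample 9 times larger

me_n / me_3n   # sqrt(3), about 1.73: not yet a third
me_n / me_9n   # 3: margin of error cut to a third
```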
me <- 4.4    # proposed margin of error
x1 <- 80.31  # lower bound of the interval
x2 <- 89.11  # upper bound of the interval
x1 + me
## [1] 84.71
x2 - me
## [1] 84.71
True. Adding 4.4 to the lower bound and subtracting 4.4 from the upper bound both give the sample mean, 84.71, so the margin of error is 4.4.
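Equivalently, the point estimate and the margin of error can be recovered directly from the interval endpoints. A short sketch, with illustrative variable names:

```r
lower <- 80.31
upper <- 89.11

point_estimate  <- (lower + upper) / 2   # midpoint of the interval
margin_of_error <- (upper - lower) / 2   # half the interval width

point_estimate
margin_of_error
```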
Problem 4.24 a. The sample is random, and 36 children is most likely under 10% of the children in a large city, so the observations can be treated as independent. The sample size is over 30. There doesn’t appear to be any strong skew in the data. Based on this information the conditions for inference are satisfied.
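mu0 <- 32            # null value for the mean
n <- 36              # sample size
avg <- 30.69         # sample mean
sd <- 4.31           # sample standard deviation
SE <- sd / sqrt(n)   # standard error of the mean
z <- (avg - mu0) / SE
p <- pnorm(z)        # lower-tail p-value
p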
## [1] 0.0341013
If the null hypothesis were true, there would be only about a 3.4% chance of observing a sample mean of 30.69 or lower. Therefore, we reject the null hypothesis.
A small p-value indicates strong evidence against the null hypothesis, so we reject the null hypothesis.
x1 <- avg - 1.65 * SE
x2 <- avg + 1.65 * SE
x1
## [1] 29.50475
x2
## [1] 31.87525
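The interval can also be compared with the null value of 32. The sketch below repeats the calculation with the same 1.65 multiplier and checks whether 32 lies inside the interval; the names `null_value` and `contains_null` are illustrative.

```r
null_value <- 32
n   <- 36
avg <- 30.69
s   <- 4.31
SE  <- s / sqrt(n)

# 90% confidence interval, using the same 1.65 multiplier as above
ci <- avg + c(-1, 1) * 1.65 * SE

# Does the interval contain the null value?
contains_null <- ci[1] <= null_value && null_value <= ci[2]

ci
contains_null
```

The interval lies entirely below 32, which is consistent with the small p-value found in the hypothesis test.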
Problem 4.26 a.
x <- 100
n <- 36
avg <- 118.2
sd <- 6.5
SE <- sd/sqrt(n)
z = (avg - x)/SE
z
## [1] 16.8
x1 <- avg - 1.65 * SE
x2 <- avg + 1.65 * SE
x1
## [1] 116.4125
x2
## [1] 119.9875
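The z statistic of 16.8 can be converted into a p-value. This sketch assumes a two-sided alternative (population mean different from 100), which is not stated above; for a one-sided alternative the factor of 2 would be dropped.

```r
null_value <- 100
n   <- 36
avg <- 118.2
s   <- 6.5
SE  <- s / sqrt(n)
z   <- (avg - null_value) / SE

# Two-sided p-value (assumed alternative: mean differs from 100).
# lower.tail = FALSE returns the upper-tail area directly.
p_two_sided <- 2 * pnorm(abs(z), lower.tail = FALSE)
p_two_sided
```

With a test statistic this large the p-value is effectively zero, and the interval computed above does not come close to containing 100.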
Problem 4.34 A sampling distribution is the probability distribution of a statistic computed from a large number of samples drawn from the same population. The sampling distribution of the mean is the distribution of the means of those samples. As the sample size gets larger, its shape becomes more nearly normal and its spread gets smaller.
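A simulation makes both effects visible. This is a minimal sketch that assumes a hypothetical right-skewed population (exponential with mean 1); the sample sizes, seed, number of repetitions, and helper names are all illustrative choices.

```r
set.seed(42)   # arbitrary seed, for reproducibility

# Simulate the sampling distribution of the mean for samples of size n,
# drawn from a hypothetical exponential population with mean 1
sample_mean_dist <- function(n, reps = 5000) {
  replicate(reps, mean(rexp(n, rate = 1)))
}

# Simple moment-based skewness: 0 for a symmetric distribution
skewness <- function(x) mean((x - mean(x))^3) / sd(x)^3

sizes <- c(5, 30, 200)
means_by_n <- lapply(sizes, sample_mean_dist)

spread <- sapply(means_by_n, sd)        # spread of each sampling distribution
skew   <- sapply(means_by_n, skewness)  # skew of each sampling distribution

data.frame(n = sizes, spread = round(spread, 3), skew = round(skew, 3))
```

Both the spread and the skew should shrink as n grows, which is what "more normal with a smaller spread" means in practice.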
Problem 4.40 a. What is the probability that a randomly chosen light bulb lasts more than 10,500 hours?
First, we need the z-score for this value:
z = (x - mu) / sd
Once that is calculated, we subtract the pnorm value from 1, because we are looking for the probability of lasting 10,500 hours or more. Passing mu and sd to pnorm does the standardization for us.
x <- 10500
mu <- 9000
sd <- 1000
p <- 1 - pnorm(x, mu, sd)
prob <- round(p,4)
prob
## [1] 0.0668
Answer: The probability that the randomly chosen light bulb lasts more than 10,500 hours is 6.68%.
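The same probability can be obtained by standardizing by hand, or by asking `pnorm` for the upper tail directly with `lower.tail = FALSE`. A short sketch comparing the two routes (variable names are illustrative):

```r
x  <- 10500
mu <- 9000
s  <- 1000

z <- (x - mu) / s                                     # standardize by hand
p_from_z     <- 1 - pnorm(z)                          # standard normal, upper tail
p_upper_tail <- pnorm(x, mu, s, lower.tail = FALSE)   # same area, no subtraction

round(c(z = z, p_from_z = p_from_z, p_upper_tail = p_upper_tail), 4)
```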
Answer: The population is nearly normal and the standard deviation is known, so the mean lifespan of 15 bulbs is also nearly normally distributed, even though the sample size is small. Its mean is 9,000 hours and its standard error is 1000 / sqrt(15), about 258 hours.
First, we need the z-score of the sample mean. The equation is z = (x - mu) / SE, where the standard error is SE = sd / sqrt(n).
x <- 10500
mu <- 9000
sd <- 1000
n <- 15
smp15 <- sd/sqrt(n)
prob15 <- 1-pnorm(x, mu, smp15)
ans <- round(prob15, 4)
ans
## [1] 0
Answer: The probability that the mean lifespan of 15 randomly chosen light bulbs is more than 10,500 hours is approximately 0 (it rounds to 0 at four decimal places).
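The rounded value hides the fact that the probability is tiny rather than exactly zero. The sketch below prints it without rounding, using `lower.tail = FALSE` so that R does not have to subtract a number very close to 1 from 1; the variable names are illustrative.

```r
mu <- 9000
s  <- 1000
n  <- 15
SE <- s / sqrt(n)   # standard error of the mean, about 258 hours

# Upper-tail probability for the sample mean, without rounding
p_mean <- pnorm(10500, mu, SE, lower.tail = FALSE)

format(p_mean, scientific = TRUE, digits = 3)
```

The result is on the order of 10^-9, so a sample mean above 10,500 hours is practically impossible under this model.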
I used dnorm for plotting because it gives the density.
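s <- seq(6000, 12000, 100)
# Set ylim from the taller (sample mean) curve so the red line is not clipped
plot(s, dnorm(s, mu, sd), type = "l", xlab = "Lifespan of Lightbulbs", ylab = "", col = "blue", ylim = c(0, dnorm(mu, mu, smp15)))
lines(s, dnorm(s, mu, smp15), col = "red")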
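e. No. If the lifespans were skewed, we could not use the normal model for the probability in part (a), and for part (c) a sample of 15 is too small for the sample mean to be nearly normal, so we would need a larger sample size.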
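Problem 4.48 The p-value will decrease as the sample size gets larger. With the same observed difference, a larger n gives a smaller standard error and therefore a larger test statistic.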
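A quick numerical illustration, holding everything fixed except the sample size. All numbers here (null value, sample mean, standard deviation, and the two sample sizes) are hypothetical and chosen only to show the direction of the effect.

```r
# Hypothetical values, for illustration only
null_value <- 50
avg <- 51            # observed sample mean, held fixed
s   <- 5             # sample standard deviation, held fixed
n   <- c(50, 500)    # two sample sizes to compare

SE <- s / sqrt(n)                              # shrinks as n grows
z  <- (avg - null_value) / SE                  # grows as SE shrinks
p  <- 2 * pnorm(abs(z), lower.tail = FALSE)    # two-sided p-value

data.frame(n = n, SE = SE, z = z, p_value = p)
```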