imgage <- "C:/Users/jpsim/Documents/Stat & Probability for Data/heights.png"
include_graphics(imgage)
Solution:
head(bdims$hgt)
## [1] 174.0 175.3 193.5 186.5 187.2 181.5
meanhgt <- mean(bdims$hgt)
meanhgt
## [1] 171.1438
medianhgt <- median(bdims$hgt)
medianhgt
## [1] 170.3
Solution:
sdhgt <- sd(bdims$hgt)
sdhgt
## [1] 9.407205
IQR(bdims$hgt, na.rm = TRUE)
## [1] 14
Q3 <- quantile(bdims$hgt, 0.75)
Q1 <- quantile(bdims$hgt, 0.25)
print (Q3 - Q1)
## 75%
## 14
Solution:
twoSDpos <- meanhgt + 2*sdhgt
twoSDneg <- meanhgt - 2*sdhgt
twoSDpos
## [1] 189.9582
twoSDneg
## [1] 152.3294
oneSDpos <- meanhgt + 1*sdhgt
oneSDneg <- meanhgt - 1*sdhgt
oneSDpos
## [1] 180.551
oneSDneg
## [1] 161.7366
Solution:
hist(bdims$hgt, probability = TRUE)
x <- 140:200
y <- dnorm(x = x, mean = meanhgt, sd = sdhgt)
lines(x = x, y = y, col = "blue")
Solution:
sd_x <- sdhgt/sqrt(nrow(bdims))
sd_x
## [1] 0.4177887
imgage <- "C:/Users/jpsim/Documents/Stat & Probability for Data/thanks.png"
include_graphics(imgage)
a. We are 95% confident that the average spending of these 436 American adults is between $80.31 and $89.11.
Solution:
FALSE. Inference is made about the population parameter, not the sample. The average spending of the 436 sampled adults is known exactly, so the confidence interval describes the population mean.
Solution:
False. The data are only slightly skewed, and with a sample of 436 the sampling distribution of the mean is nearly normal.
d. We are 95% confident that the average spending of all American adults is between $80.31 and $89.11.
Solution:
TRUE. A confidence interval estimates the population parameter: it is centered on the point estimate and gives a plausible range for the average spending of all American adults.
Solution:
TRUE. A lower confidence level gives a narrower interval.
Solution: False. The sample size should be 9 times larger, because the margin of error is proportional to 1/sqrt(n).
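The scaling can be checked numerically. The sketch below is illustrative only: `n = 436` comes from the exercise, while the standard deviation `s` and the helper `margin()` are hypothetical placeholders chosen for the demonstration.

```r
# Hypothetical inputs for illustration; only n = 436 comes from the exercise.
z_star <- qnorm(0.975)   # critical value for 95% confidence
s <- 50                  # assumed sample standard deviation (placeholder)
n <- 436

# Margin of error for a z-based interval as a function of sample size
margin <- function(n) z_star * s / sqrt(n)

ratio_3x <- margin(3 * n) / margin(n)   # 1/sqrt(3): tripling n is not enough
ratio_9x <- margin(9 * n) / margin(n)   # 1/3: nine times the sample is needed
c(ratio_3x, ratio_9x)
```

The ratios do not depend on the placeholder `s`, since it cancels out.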
Solution: TRUE. The margin of error is half the width of the CI: (89.11 - 80.31)/2 = 4.40.
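As a small sketch, the point estimate and margin of error can be recovered from the interval endpoints given in the exercise. The variable names are chosen here for illustration.

```r
# Endpoints of the 95% confidence interval from the exercise
ci_lower <- 80.31
ci_upper <- 89.11

# The point estimate is the midpoint; the margin of error is half the width
point_estimate  <- (ci_upper + ci_lower) / 2
margin_of_error <- (ci_upper - ci_lower) / 2

point_estimate
margin_of_error
```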
imgage <- "C:/Users/jpsim/Documents/Stat & Probability for Data/giftp1.png"
include_graphics(imgage)
Solution: Yes. The sample was random, the sample size is greater than 30, and the distribution is nearly normal with no noticeable skew.
Solution:
\(H_0: \mu = 32\)
\(H_A: \mu < 32\)
Calculating the SE:
given std. Dev. (sd) = 4.31
given n = 36
SE = sd/sqrt(n) = 4.31/sqrt(36) = 0.7183
To calculate Z-score:
\(Z_{30.69}\) = (30.69-32)/0.7183 = -1.82
P(Z<-1.82) = 0.034
Solution: \(\alpha\) = 0.1. Since the p-value of 0.034 is less than 0.1, we reject the null hypothesis H0. At the 10% significance level, the data provide evidence that gifted children first count to 10 successfully before 32 months of age, on average.
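The hand calculation above can be reproduced in R. This is a minimal sketch using the summary statistics given in the exercise; the variable names are chosen here for illustration.

```r
# Summary statistics from the exercise
x_bar <- 30.69   # sample mean age (months)
mu0   <- 32      # null-hypothesis mean
s     <- 4.31    # sample standard deviation
n     <- 36      # sample size

se <- s / sqrt(n)          # standard error of the mean
z  <- (x_bar - mu0) / se   # test statistic

# Lower-tail p-value, because the alternative is mu < 32
p_value <- pnorm(z)

z
p_value
p_value < 0.10   # TRUE means H0 is rejected at alpha = 0.1
```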
d. Calculate a 90% confidence interval for the average age at which gifted children first count to 10 successfully.
Solution:
CIupper=30.69+(1.645*(0.7183))
CIupper
## [1] 31.8716
CIlower=30.69-(1.645*(0.7183))
CIlower
## [1] 29.5084
Solution: Yes. The interval (29.51, 31.87) does not contain 32 months, which agrees with rejecting H0.
imgage <- "C:/Users/jpsim/Documents/Stat & Probability for Data/giftp2.png"
include_graphics(imgage)
Solution:
\(H_0: \mu = 100\)
\(H_A: \mu \neq 100\)
Calculating the SE:
given std. Dev. (sd) = 6.5
given n = 36
SE = sd/sqrt(n) = 6.5/sqrt(36) = 1.083
To calculate Z-score:
\(Z_{118.2}\)= (118.2-100)/1.083 = 16.80
p-value = 2 * P(Z > 16.80) ≈ 0
Mean = 118.2
n = 36
SD = 6.5
SE = SD/sqrt(n)
z_score = (Mean -100)/SE
pnorm(z_score)  # lower-tail area P(Z < z_score); the two-sided p-value is 2 * (1 - pnorm(z_score)), effectively 0
## [1] 1
Solution:
CIupper =118.2+(1.645*(1.083))
CIupper
## [1] 119.9815
CIlower=118.2-(1.645*(1.083))
CIlower
## [1] 116.4185
Solution: \(\alpha\) = 0.1. Since the p-value is approximately 0, we reject the null hypothesis: at the 10% significance level there is strong evidence that the average IQ is not 100. Consistently, the 90% confidence interval (116.42, 119.98) does not contain 100.
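A compact way to get both the two-sided p-value and the 90% interval is sketched below, using the summary statistics from the exercise. Taking the critical value from `qnorm(0.95)` rather than the rounded 1.645 is a choice made here, so the endpoints may differ from the hand calculation in the last decimal places.

```r
# Summary statistics from the exercise
x_bar <- 118.2   # sample mean IQ
mu0   <- 100     # null-hypothesis mean
s     <- 6.5     # sample standard deviation
n     <- 36      # sample size

se <- s / sqrt(n)
z  <- (x_bar - mu0) / se

# Two-sided p-value: area in both tails beyond |z|
p_two_sided <- 2 * pnorm(-abs(z))

# 90% confidence interval using the normal critical value
z_star <- qnorm(0.95)
ci <- x_bar + c(-1, 1) * z_star * se

z
p_two_sided
ci
```

Using `pnorm(-abs(z))` instead of `1 - pnorm(abs(z))` avoids the result rounding to exactly 0 for very large test statistics.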
Define the term “sampling distribution” of the mean, and describe how the shape, center, and spread of the sampling distribution of the mean change as sample size increases.
Solution: The sampling distribution of the mean is the distribution of the sample means obtained by repeatedly drawing samples of size n from the population and computing the mean of each sample. As the sample size increases, the shape becomes more bell-shaped (closer to normal), the center remains at the true population mean, and the spread (the standard error, \(\sigma/\sqrt{n}\)) decreases.
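A small simulation can illustrate this behavior. The sketch below assumes a skewed population (exponential with mean 1 and standard deviation 1), which is an illustrative choice and not part of the question; `sample_means()` is a hypothetical helper.

```r
set.seed(1)

# Draw `reps` samples of size n from an assumed Exp(1) population
# and return the mean of each sample
sample_means <- function(n, reps = 5000) {
  replicate(reps, mean(rexp(n, rate = 1)))
}

sizes <- c(5, 30, 200)
sims  <- lapply(sizes, sample_means)

centers <- sapply(sims, mean)   # should stay near the population mean, 1
spreads <- sapply(sims, sd)     # should shrink roughly like 1/sqrt(n)

rbind(n = sizes, center = centers, spread = spreads,
      theory = 1 / sqrt(sizes))
```

Plotting `hist(sims[[1]])` against `hist(sims[[3]])` would also show the shape moving from right-skewed toward a bell curve.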
Solution:
\(Z_{10500}\) = (10500-9000)/1000 = 1.5
P(X>10500) = 0.0668
mean = 9000
sd = 1000
z_score = (10500 - 9000)/sd
prob = 1-pnorm(z_score)
prob
## [1] 0.0668072
SE_sample = 1000/sqrt(15)
SE_sample
## [1] 258.1989
z_score = (10500 - 9000)/258.2
probability = 1 - pnorm(z_score)
probability
## [1] 3.13392e-09
s <- seq(5000,13000,0.01)
plot(s, dnorm(s, 9000, 1000), type = "l", ylim = c(0, 0.002))
lines(s, dnorm(s,9000, 258.1989), col="blue")
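No. If the population distribution were skewed, these probabilities could not be estimated with the normal model; a sample of only 15 is too small to rely on the Central Limit Theorem for the sample mean.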
Suppose you conduct a hypothesis test based on a sample where the sample size is n = 50, and arrive at a p-value of 0.08. You then refer back to your notes and discover that you made a careless mistake, the sample size should have been n = 500. Will your p-value increase, decrease, or stay the same? Explain.
Solution:
The p-value will decrease. Assuming the sample mean and standard deviation stay the same, a larger n gives a smaller standard error (s/sqrt(n)), so the sampling distribution narrows, the test statistic moves farther from zero, and the tail area shrinks.
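The size of the effect can be sketched numerically. The example assumes a one-sided z-test and an unchanged sample mean and standard deviation; neither assumption is stated in the question.

```r
# Assumption: one-sided z-test, same sample mean and sd for both sample sizes
p_50 <- 0.08
z_50 <- qnorm(1 - p_50)   # test statistic that yields p = 0.08 when n = 50

# The standard error shrinks by sqrt(500/50), so z grows by the same factor
z_500 <- z_50 * sqrt(500 / 50)
p_500 <- 1 - pnorm(z_500)

c(z_50 = z_50, z_500 = z_500)
c(p_50 = p_50, p_500 = p_500)
```

Under these assumptions the p-value with n = 500 is far below 0.08; a two-sided test would show the same direction of change.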