a) within 1.5 standard deviations of the mean
(pnorm(1.5) - pnorm(-1.5))
## [1] 0.8663856
b) more than 2.5 sd above the mean
(1 - pnorm(2.5))
## [1] 0.006209665
c) more than 3.5 sd above or below the mean
pnorm(-3.5) + (1 - pnorm(3.5))
## [1] 0.0004652582
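Because the normal curve is symmetric about its mean, the two-sided quantities above can each be written with a single `pnorm` call. A minimal sketch, assuming two helper functions (`within_k_sd` and `beyond_k_sd` are illustrative names, not part of the exercise):

```r
# Hypothetical helpers, named here for illustration only.
# Proportion of a normal population within k SDs of the mean.
within_k_sd <- function(k) 2 * pnorm(k) - 1

# Proportion more than k SDs away from the mean, in either direction.
beyond_k_sd <- function(k) 2 * pnorm(-k)

within_k_sd(1.5)  # same value as pnorm(1.5) - pnorm(-1.5)
beyond_k_sd(3.5)  # same value as pnorm(-3.5) + (1 - pnorm(3.5))
```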
(a) The 90th percentile of a normal distribution is how many standard deviations above the mean?
qnorm(.9)
## [1] 1.281552
(b) The 10th percentile of a normal distribution is how many sd below the mean?
qnorm(.1)
## [1] -1.281552
The 10th percentile is 1.282 sd below the mean.
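The two percentiles mirror each other because the standard normal distribution is symmetric about zero. A short sketch of that check (the variable names are illustrative):

```r
p <- 0.9
z_upper <- qnorm(p)      # 90th percentile, in SD units above the mean
z_lower <- qnorm(1 - p)  # 10th percentile, in SD units (negative = below)

# Symmetry: the two percentiles are equal in size and opposite in sign.
all.equal(z_upper, -z_lower)

# Round trip: pnorm() undoes qnorm() and recovers the probability.
pnorm(z_upper)
```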
x <- seq(80, 260, length.out = 431)
y <- dnorm(x, mean = 155, sd = 27)
plot(x, y, type = "l", lwd = 2, col = "blue",
     main = "Serum cholesterol, 12 - 14 yr olds (approximated plot)",
     xlab = "serum cholesterol (mg/dl)")
What percentage of 12 to 14-year-olds have serum cholesterol values
(a) 164 or more?
pnorm(164, 155, 27, lower.tail = FALSE)
## [1] 0.3694413
(b) 137 or less?
pnorm(137, 155, 27)
## [1] 0.2524925
(c) 186 or less?
pnorm(186, 155, 27)
## [1] 0.8745463
(d) 100 or more?
1 - pnorm(100, 155, 27)
## [1] 0.9791768
(e) between 159 and 186?
pnorm(186, 155, 27) - pnorm(159, 155, 27)
## [1] 0.3156592
(f) between 100 and 132?
pnorm(132, 155, 27) - pnorm(100, 155, 27)
## [1] 0.176325
(g) between 132 and 159?
pnorm(159, 155, 27) - pnorm(132, 155, 27)
## [1] 0.3617389
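Each of these answers can also be reached by standardizing first, z = (y - mean) / sd, and then using the standard normal curve. A minimal sketch, assuming the same N(155, 27) model (the variable names `mu`, `sigma`, `cuts` and `cdf` are illustrative):

```r
mu <- 155
sigma <- 27

# Standardize 164 and use the standard normal upper tail.
z <- (164 - mu) / sigma
pnorm(z, lower.tail = FALSE)  # same as pnorm(164, 155, 27, lower.tail = FALSE)

# pnorm() is vectorized, so several cut points can be handled at once.
cuts <- c(100, 132, 159, 186)
cdf <- pnorm(cuts, mu, sigma)

# Successive differences give the proportions in 100-132, 132-159, 159-186.
diff(cdf)
```

Working from one vector of cumulative proportions keeps the interval answers consistent with each other, since every interval is a difference of the same `pnorm` values.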
(a) Pr{Y >= 159}
pnorm(159, 155, 27, lower.tail = FALSE)
## [1] 0.4411129
(b) Pr{159 < Y < 186}
pnorm(186, 155, 27) - pnorm(159, 155, 27)
## [1] 0.3156592
(a) The 80th percentile of the serum cholesterol distribution N(155,27)
qnorm(.8, 155, 27)
## [1] 177.7238
(b) the 20th percentile
qnorm(.2, 155, 27)
## [1] 132.2762
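A percentile of any normal distribution is the mean plus the matching standard normal quantile times the sd. A short sketch of that relationship for N(155, 27) (the names `p80` and `p20` are illustrative):

```r
mu <- 155
sigma <- 27

# Percentile = mean + (standard normal quantile) * sd
p80 <- mu + qnorm(0.8) * sigma
p20 <- mu + qnorm(0.2) * sigma
c(p80, p20)  # same values as qnorm(.8, 155, 27) and qnorm(.2, 155, 27)

# The 20th and 80th percentiles sit the same distance either side of the mean.
(p80 + p20) / 2
```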
x <- seq(129, 360, length.out = 10002)
y <- dnorm(x, mean = 245, sd = 40)
plot(x, y, type = "l", lwd = 2, col = "blue",
     main = "Rome marathon run times (approximated plot)",
     xlab = "Final time (minutes)")
(a) What percentage of times were greater than 200 minutes?
1 - pnorm(200, 245, 40)
## [1] 0.8697055
(b) What is the 60th percentile of the times?
qnorm(.6, 245, 40)
## [1] 255.1339
(c) The normal curve approximation is fairly good except around the 240-minute mark. How can we explain this anomalous behavior of the distribution?
A large number of runners finish between 190 and 240 minutes. The mean is pulled upward because the slower runners outweigh the fast runners. If the high and low outliers were eliminated, the curve would probably center closer to 240.
(a) - II Right-skewed data. The upward curve of the quantile plot shows a number of high values and possibly a high outlier.
(b) - III Left-skewed data. The downward curve of the quantile plot shows a number of low values.
(c) - I The approximately straight line of the quantile plot shows that the population data are approximately normal.
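The three shapes described above can be reproduced with simulated data. This is only a sketch: the samples below are generated for illustration (they are not the exercise's data), and the seed and sample size are arbitrary choices.

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible

right_skewed   <- rexp(200)    # long right tail
left_skewed    <- -rexp(200)   # mirror image: long left tail
roughly_normal <- rnorm(200)   # no skew

# Draw the three normal quantile plots side by side.
op <- par(mfrow = c(1, 3))
qqnorm(right_skewed, main = "Right skewed")
qqline(right_skewed)
qqnorm(left_skewed, main = "Left skewed")
qqline(left_skewed)
qqnorm(roughly_normal, main = "Roughly normal")
qqline(roughly_normal)
par(op)  # restore the previous plotting layout
```

With R's default orientation (sample values on the vertical axis), right-skewed data bend upward at the high end, left-skewed data bend downward at the low end, and normal data stay close to the reference line.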
A normal quantile plot was created from the times that it took 166 bicycle riders to complete the stage 11 time trial in the 2001 Tour de France cycling race.
(a) Are the times of the fastest riders better than, worse than, or roughly equal to the times one would expect the fastest riders to have if the data came from a truly normal distribution?
The times of the three fastest riders are better than a truly normal distribution would show. The values fall above the regression line.
(b) Are the times of the slowest riders better than, worse than, or roughly equal to the times one would expect the slowest riders to have if the data came from a truly normal distribution?
The times of the slowest riders are roughly equal to what a truly normal distribution would give. The values fall very close to the regression line.
Resting heart rate was measured for a group of subjects, and then measured again after they drank coffee. The change in heart rate followed a normal distribution, with a mean increase of 7.3 beats per minute and a standard deviation of 11.1. Let Y denote the change in heart rate for a randomly selected person. Find
(a)
paste0("Pr{Y > 10} = ",
       round(pnorm(10, 7.3, 11.1, lower.tail = FALSE), 3))
## [1] "Pr{Y > 10} = 0.404"
(b)
paste0("Pr{Y > 20} = ",
       round(pnorm(20, 7.3, 11.1, lower.tail = FALSE), 3))
## [1] "Pr{Y > 20} = 0.126"
(c)
# Both terms use the same sd of 11.1
pr <- round(pnorm(15, 7.3, 11.1) - pnorm(5, 7.3, 11.1), 3)
paste0("Pr{5 < Y < 15} = ", pr)
## [1] "Pr{5 < Y < 15} = 0.338"
(d)
paste0("Pr{Y < 0} = ",
       round(pnorm(0, 7.3, 11.1), 3))
## [1] "Pr{Y < 0} = 0.255"
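One way to sanity-check a probability like Pr{Y < 0} is a quick simulation. This is a rough sketch, assuming the N(7.3, 11.1) model from the exercise; the seed and the number of draws are arbitrary, and the simulated proportion will only be close to the exact value, not equal to it.

```r
set.seed(42)  # arbitrary seed so the sketch is reproducible

# Simulate many heart-rate changes from the assumed normal model.
y_sim <- rnorm(1e5, mean = 7.3, sd = 11.1)

# Proportion of simulated subjects whose heart rate went down.
sim_prob <- mean(y_sim < 0)

# Exact value from the normal CDF, for comparison.
exact_prob <- pnorm(0, 7.3, 11.1)

c(simulated = sim_prob, exact = exact_prob)
```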
A high outlier would need to be greater than 37.247.
q1 <- qnorm(.25, 7.3, 11.1)
q3 <- qnorm(.75, 7.3, 11.1)
iqr <- q3 - q1
l1 <- c(paste0("upper fence = ", q3 + 1.5 * iqr),
paste0("Q1 = ", q1),
paste0("Q3 = ", q3),
paste0("IQR = ", iqr)
)
l1
## [1] "upper fence = 37.247344908706" "Q1 = -0.186836227176506"
## [3] "Q3 = 14.7868362271765" "IQR = 14.973672454353"
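For a normal distribution the upper fence can also be written in sd units. Q3 lies z = qnorm(0.75) sd above the mean and the IQR is 2z sd wide, so Q3 + 1.5 * IQR is the mean plus 4z sd. A minimal sketch of that shortcut (the names `z_q3` and `upper_fence` are illustrative):

```r
mu <- 7.3
sigma <- 11.1

z_q3 <- qnorm(0.75)  # Q3 of the standard normal, about 0.674

# Q3 + 1.5 * IQR = (mu + z * sigma) + 1.5 * (2 * z * sigma) = mu + 4 * z * sigma
upper_fence <- mu + 4 * z_q3 * sigma
upper_fence

# Proportion of the population expected above the fence (about 0.0035).
pnorm(upper_fence, mu, sigma, lower.tail = FALSE)
```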
If the heart rates follow a normal distribution, which of the following Shapiro–Wilk’s test P-values for a random sample of 15 subjects are consistent with this claim?
(b) P-value = 0.1345. A P-value >= 0.10 shows no compelling evidence of non-normality.
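The decision rule above can be sketched with `shapiro.test`. The sample here is simulated for illustration (it is not the study's data), the seed is arbitrary, and the 0.10 cutoff is the one used in the answer above.

```r
set.seed(2024)  # arbitrary seed so the sketch is reproducible

# Simulated sample of 15 heart-rate changes; NOT the study data.
hr_change <- rnorm(15, mean = 7.3, sd = 11.1)

# Shapiro-Wilk test of the null hypothesis that the data are normal.
sw <- shapiro.test(hr_change)
sw$p.value

# TRUE when the P-value gives no compelling evidence against normality.
consistent_with_normal <- sw$p.value >= 0.10
consistent_with_normal
```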