Area under the curve, Part I. (4.1, p. 142) What percent of a standard normal distribution \(N(\mu=0, \sigma=1)\) is found in each region? Be sure to draw a graph.
Since mu is 0 and sigma is 1, Z and x are equivalent.
# use the DATA606::normalPlot function
Z <- -1.35
Z is less than -1.35.
pnorm(Z)
## [1] 0.08850799
Probability that Z is less than -1.35 is 0.08850799
DATA606::normalPlot(bounds= c(-1e+06,Z))
The percentage found in the region of the standard normal distribution is 8.85%
# use the DATA606::normalPlot function
Z <- 1.48
Z is greater than 1.48.
1 - pnorm(Z)
## [1] 0.06943662
Probability that Z is greater than 1.48 is 0.06943662
DATA606::normalPlot(bounds= c(Z,1e+06))
The percentage found in the region of the standard normal distribution is 6.94%
# use the DATA606::normalPlot function
Z1 <- -0.4
Z2 <- 1.5
Z is greater than -0.4 and less than 1.5.
pnorm(Z2) - pnorm(Z1)
## [1] 0.5886145
Probability that Z is between -0.4 and 1.5 is 0.5886145
DATA606::normalPlot(bounds= c(Z1,Z2))
The percentage found in the region of the standard normal distribution is 58.9%
|Z| > 2, that is, Z < -2 or Z > 2
# use the DATA606::normalPlot function
# the two tails are the complement of the middle region -2 < Z < 2
1 - (pnorm(2) - pnorm(-2))
## [1] 0.04550026
DATA606::normalPlot(bounds= c(-2,2), tails = TRUE)
The percentage found in the region of the standard normal distribution is 4.55%
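The four regions can also be cross-checked in base R without the DATA606 package. The following is a minimal sketch; the helper name `pct_between` and the labels in `regions` are hypothetical and not part of the original solution.

```r
# Hypothetical helper: percent of N(0, 1) lying between lo and hi.
pct_between <- function(lo, hi) {
  100 * (pnorm(hi) - pnorm(lo))
}

regions <- c(
  lower     = pct_between(-Inf, -1.35),  # Z < -1.35
  upper     = pct_between(1.48, Inf),    # Z > 1.48
  middle    = pct_between(-0.4, 1.5),    # -0.4 < Z < 1.5
  two_tails = 100 - pct_between(-2, 2)   # |Z| > 2, complement of -2 < Z < 2
)

round(regions, 2)
```

Using `-Inf` and `Inf` as bounds avoids picking an arbitrary large number such as `1e+06` for an open-ended region.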
Triathlon times, Part I (4.4, p. 142) In triathlons, it is common for racers to be placed into age and gender groups. Friends Leo and Mary both completed the Hermosa Beach Triathlon, where Leo competed in the Men, Ages 30 - 34 group while Mary competed in the Women, Ages 25 - 29 group. Leo completed the race in 1:22:28 (4948 seconds), while Mary completed the race in 1:31:53 (5513 seconds). Obviously Leo finished faster, but they are curious about how they did within their respective groups. Can you help them? Here is some information on the performance of their groups:
Remember: a better performance corresponds to a faster finish.
Answers
a. Men, Ages 30 - 34: X ~ N(mu = 4313, sigma = 583) seconds. Women, Ages 25 - 29: X ~ N(mu = 5261, sigma = 807) seconds.
b. Let X be the finishing time in seconds.
Leo Z-score = (X - mu) / sigma = (4948 - 4313) / 583 = 1.089
Mary Z-score = (X - mu) / sigma = (5513 - 5261) / 807 = 0.3122
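c. Both Z-scores are positive, so both finished slower than the average of their respective groups. Mary ranked better within her group because her Z-score is smaller: her time was 0.31 standard deviations above the mean, while Leo's was 1.09.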
d.
pnorm(-1.089194)
## [1] 0.1380342
Leo was faster than 13.8% of the others in his group.
e.
pnorm(-0.3122677)
## [1] 0.3774185
Mary was faster than 37.7% of the others in her group.
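The Z-scores and percentages in parts b, d, and e can also be computed directly from the raw times. This is a minimal sketch using the group means and standard deviations from part a; the variable names are illustrative and not from the original solution.

```r
# Finishing times (seconds) and group parameters from part a.
time  <- c(Leo = 4948, Mary = 5513)
mu    <- c(Leo = 4313, Mary = 5261)
sigma <- c(Leo = 583,  Mary = 807)

# Z-score: number of standard deviations above (slower than) the group mean.
z <- (time - mu) / sigma

# A faster finish is a smaller time, so "faster than" is the upper tail P(X > time).
pct_faster_than <- 100 * pnorm(time, mean = mu, sd = sigma, lower.tail = FALSE)

round(z, 4)
round(pct_faster_than, 1)
```

Passing `lower.tail = FALSE` gives the same values as `pnorm(-z)`, which is what the answers above rely on through the symmetry of the normal curve.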
f.
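Part b would not change, because a Z-score can be computed for any distribution. The comparison in part c would be less reliable, because equal Z-scores no longer correspond to equal percentiles when the distributions are not normal.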
Parts d and e would be different because pnorm assumes a normal distribution, so the percentiles it returns would not describe the actual finishing times.
Heights of female college students Below are heights of 25 female college students.
\[ \stackrel{1}{54}, \stackrel{2}{55}, \stackrel{3}{56}, \stackrel{4}{56}, \stackrel{5}{57}, \stackrel{6}{58}, \stackrel{7}{58}, \stackrel{8}{59}, \stackrel{9}{60}, \stackrel{10}{60}, \stackrel{11}{60}, \stackrel{12}{61}, \stackrel{13}{61}, \stackrel{14}{62}, \stackrel{15}{62}, \stackrel{16}{63}, \stackrel{17}{63}, \stackrel{18}{63}, \stackrel{19}{64}, \stackrel{20}{65}, \stackrel{21}{65}, \stackrel{22}{67}, \stackrel{23}{67}, \stackrel{24}{69}, \stackrel{25}{73} \]
# Use the DATA606::qqnormsim function
Answers
a. The mean is 61.52 inches and the standard deviation is 4.58 inches, so one standard deviation from the mean spans 56.94 to 66.10 inches.
Two standard deviations span 52.36 to 70.68 inches.
Three standard deviations span 47.78 to 75.26 inches.
17 values fall within one standard deviation of the mean.
24 values fall within two standard deviations.
All 25 values fall within three standard deviations.
17/25 is 68%
24/25 is 96%
25/25 is 100%
The heights do approximately follow the 68-95-99.7% rule.
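The counts above can be reproduced in R. This sketch defines the `heights` vector from the 25 values listed in the problem and computes the sample mean and standard deviation from the data; the names `m`, `s`, and `n_within` are illustrative.

```r
heights <- c(54, 55, 56, 56, 57, 58, 58, 59, 60, 60, 60, 61, 61,
             62, 62, 63, 63, 63, 64, 65, 65, 67, 67, 69, 73)

m <- mean(heights)  # sample mean
s <- sd(heights)    # sample standard deviation

# Count observations within k standard deviations of the mean, for k = 1, 2, 3.
k <- 1:3
n_within <- sapply(k, function(i) sum(abs(heights - m) <= i * s))

data.frame(k = k, count = n_within, percent = 100 * n_within / length(heights))
```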
DATA606::qqnormsim(heights)
The histogram shows a slightly skewed, not completely symmetrical distribution – but close.
The simulated normal quantile plots are similar to the generated qqnorm plot – approximately the same number of heights fall on the trend line.
So the data do appear to generally follow the normal distribution.
Defective rate. (4.14, p. 148) A machine that produces a special type of transistor (a component of computers) has a 2% defective rate. The production is considered a random process where each transistor is independent of the others.
Answers
Expected number of transistors produced until the first defect: 1/p = 1 / 0.02 = 50.
Standard deviation: sqrt((1 - 0.02) / 0.02 ^ 2) = 49.49747 transistors.
The mean and std dev of the wait time until success decrease as the probability of an event increases.
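The two formulas above are the mean and standard deviation of a geometric distribution. A minimal sketch that evaluates them for the 2% rate and for a higher rate follows; the 5% value and the helper name `geom_summary` are assumptions chosen only for illustration.

```r
# Hypothetical helper: mean and sd of the number of trials until the first success.
geom_summary <- function(p) {
  c(mean = 1 / p, sd = sqrt((1 - p) / p^2))
}

rate_2pct <- geom_summary(0.02)
rate_5pct <- geom_summary(0.05)  # assumed higher defect rate, for comparison

rbind(rate_2pct, rate_5pct)
```

Both the mean and the standard deviation shrink as the success probability grows, which is the pattern stated above.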
Male children. While it is often assumed that the probabilities of having a boy or a girl are the same, the actual probability of having a boy is slightly higher at 0.51. Suppose a couple plans to have 3 kids.
Answers
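p <- 0.51   # probability of a boy
n <- 3      # number of children
k <- 2      # number of boys
n_fac <- factorial(n)
k_fac <- factorial(k)
n_k_fac <- factorial(n-k)
# binomial probability of exactly k boys, stored separately so p is not overwritten
binom_prob <- (n_fac / (k_fac * n_k_fac)) * p^(k) * (1 - p)^(n-k)
binom_prob
## [1] 0.382347
38.23% chance of two boys.
# addition rule over the three possible orderings: BBG, BGB, GBB
addition_rule <- p * p * (1-p) + p * (1-p) * p + (1-p) * p * p
addition_rule
## [1] 0.382347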
Answers from a and b match
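As a cross-check, base R's `dbinom` returns the binomial probability directly, and enumerating every birth order confirms the addition rule. This is a sketch; the variable names are illustrative and not from the original solution.

```r
p_boy <- 0.51

# Binomial probability of exactly 2 boys among 3 children.
via_dbinom <- dbinom(2, size = 3, prob = p_boy)

# Enumerate all 2^3 birth orders (1 = boy, 0 = girl).
orders <- expand.grid(c1 = 0:1, c2 = 0:1, c3 = 0:1)
n_boys <- rowSums(orders)

# Probability of each ordering, then sum the orderings with exactly 2 boys.
prob_each <- p_boy^n_boys * (1 - p_boy)^(3 - n_boys)
via_enumeration <- sum(prob_each[n_boys == 2])

c(dbinom = via_dbinom, enumeration = via_enumeration)
```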
Serving in volleyball. (4.30, p. 162) A not-so-skilled volleyball player has a 15% chance of making the serve, which involves hitting the ball so it passes over the net on a trajectory such that it will land in the opposing team’s court. Suppose that her serves are independent of each other.
Answers
a.
choose(9,2)*0.15^3*0.85^7
## [1] 0.03895012
3.895%
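The expression above is a negative binomial probability: the third successful serve arrives on the tenth attempt. Base R's `dnbinom` counts the failures that occur before the target number of successes, so 7 failures before the 3rd success describes the same event. The sketch below compares the two; the variable names are illustrative.

```r
p_serve <- 0.15

# By hand: 2 successes among the first 9 serves, then a success on the 10th.
by_hand <- choose(9, 2) * p_serve^3 * (1 - p_serve)^7

# dnbinom(x, size, prob): probability of x failures before the size-th success.
via_dnbinom <- dnbinom(7, size = 3, prob = p_serve)

all.equal(by_hand, via_dnbinom)
```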
b.
Each serve has a 15% chance of success regardless of previous attempts.
c.
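The answer to part a is a joint probability: it requires exactly two successful serves among the first nine attempts and a success on the tenth. In part b the outcomes of the first nine serves are already known, so only the tenth serve, which is independent of the others, remains uncertain.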