library('DATA606')
normalPlot(mean = 0, sd = 1, bounds=c(-1.13,4), tails = FALSE)
normalPlot(mean = 0, sd = 1, bounds=c(-4,0.18), tails = FALSE)
\(Z > 8\): The probability of a value in a normal distribution falling more than 8 standard deviations above the mean is far below \(0.01\%\), so the area is effectively \(0\).
\(|Z| < 0.5\): The area is \(P(Z < 0.5) - P(Z < -0.5) = 0.6914625 - 0.3085375 \approx 0.3829\), or \(38.3\%\).
normalPlot(mean = 0, sd = 1, bounds=c(-0.5,0.5), tails = FALSE)
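The two areas above can be checked numerically with base R's `pnorm()`. This is a minimal sketch; the variable names `area_z_gt_8` and `area_abs_z_lt_half` are introduced here for illustration only.

```r
# Upper-tail area beyond Z = 8. Using lower.tail = FALSE avoids computing
# 1 - pnorm(8), which loses precision for such a small probability.
area_z_gt_8 <- pnorm(8, lower.tail = FALSE)

# Area between -0.5 and 0.5 under the standard normal curve.
area_abs_z_lt_half <- pnorm(0.5) - pnorm(-0.5)

print(area_z_gt_8)
print(round(area_abs_z_lt_half, 7))
```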
Men, Ages 30-34: \(N(\mu = 4313, \sigma = 583)\), and Women, Ages 25-29: \(N(\mu = 5261, \sigma = 807)\).
\({Z}_{Leo} = \frac{x - \mu}{\sigma} = \frac{4948 - 4313}{583} \approx 1.0892\) and \({Z}_{Mary} = \frac{x - \mu}{\sigma} = \frac{5513 - 5261}{807} \approx 0.3123\); Leo finished the race about 1.09 standard deviations above the mean of his group, while Mary finished about 0.31 standard deviations above the mean of hers.
Because a better performance corresponds to a faster finish, a lower Z-score indicates a better performance. Mary ranked better within her group, since her Z-score (0.31) is lower than Leo’s (1.09).
Leo’s Z-score corresponds to a cumulative probability of \(0.8619672\). Since a higher Z-score corresponds to a slower finish, Leo finished faster than \(1 - 0.8619672 = 0.1380328\), or about \(13.8\%\), of the runners in his group.
Mary’s Z-score corresponds to a cumulative probability of \(0.6225937\). Since a higher Z-score corresponds to a slower finish, Mary finished faster than \(1 - 0.6225937 = 0.3774063\), or about \(37.7\%\), of the runners in her group.
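The Z-scores and percentages above can be reproduced in R. This is a minimal sketch using the finishing times and group parameters quoted in the text; all variable names are illustrative.

```r
# Z-scores from each runner's time and the group mean and standard deviation.
z_leo  <- (4948 - 4313) / 583
z_mary <- (5513 - 5261) / 807

# Upper-tail probability: the share of each group with a slower (larger) time,
# i.e. the share of the group that each runner beat.
share_slower_than_leo  <- pnorm(z_leo,  lower.tail = FALSE)
share_slower_than_mary <- pnorm(z_mary, lower.tail = FALSE)

print(round(c(z_leo = z_leo, z_mary = z_mary), 4))
print(round(c(leo = share_slower_than_leo, mary = share_slower_than_mary), 4))
```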
If the distributions are not nearly normal, the answer to part (b) stays the same, since Z-scores can still be calculated. However, parts (d) and (e) rely on the normal model, so those results would change.
heights <- c(54, 55, 56, 56, 57, 58, 58, 59, 60, 60, 60, 61, 61,
62, 62, 63, 63, 63, 64, 65, 65, 67, 67, 69, 73)
hgt_m <- mean(heights)
hgt_m
## [1] 61.52
hgt_sd <- sd(heights)
hgt_sd
## [1] 4.583667
qqnormsim(heights)
Looking at the QQ plots, the plot for the actual data mostly follows the line, with a few points departing from it at the tails. It looks better than some of the QQ plots for data simulated from a normal distribution. As such, it is reasonable to conclude that the heights are approximately normally distributed.
# Normal probability within one standard deviation of the mean
pnorm(hgt_m + hgt_sd, mean = hgt_m, sd = hgt_sd) -
pnorm(hgt_m - hgt_sd, mean = hgt_m, sd = hgt_sd)
## [1] 0.6826895
# Normal probability within two standard deviations of the mean
pnorm(hgt_m + 2 * hgt_sd, mean = hgt_m, sd = hgt_sd) -
pnorm(hgt_m - 2 * hgt_sd, mean = hgt_m, sd = hgt_sd)
## [1] 0.9544997
# Normal probability within three standard deviations of the mean
pnorm(hgt_m + 3 * hgt_sd, mean = hgt_m, sd = hgt_sd) -
pnorm(hgt_m - 3 * hgt_sd, mean = hgt_m, sd = hgt_sd)
## [1] 0.9973002
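The pnorm() values above are the theoretical 68-95-99.7% probabilities, which hold for any normal distribution. In the sample itself, 17 of the 25 heights (68%) fall within one standard deviation of the mean, 24 of 25 (96%) within two, and all 25 (100%) within three, so the heights follow the 68-95-99.7% rule very closely.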
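The sample proportions can be computed directly from the data rather than from the normal model. This is a minimal sketch that redefines the `heights` vector so it runs on its own; the name `within_k_sd` is introduced here for illustration.

```r
heights <- c(54, 55, 56, 56, 57, 58, 58, 59, 60, 60, 60, 61, 61,
             62, 62, 63, 63, 63, 64, 65, 65, 67, 67, 69, 73)
hgt_m  <- mean(heights)
hgt_sd <- sd(heights)

# Proportion of observed heights within k standard deviations of the mean.
within_k_sd <- sapply(1:3, function(k) mean(abs(heights - hgt_m) <= k * hgt_sd))
names(within_k_sd) <- paste0(1:3, " SD")

print(within_k_sd)
```

Comparing these proportions with 0.68, 0.95 and 0.997 tests the data itself, whereas the pnorm() differences return the same values for any mean and standard deviation.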
\(p = 0.02\)
\(P(10th\ transistor\ is\ the\ first\ with\ a\ defect) = (1 - p)^{n-1} p = (1 - 0.02)^9 * 0.02 = 0.016675\)
\(P(no\ defects\ in\ a\ batch\ of\ 100) = (1 - p)^{100} = 0.98^{100} = 0.1326196\)
\(\mu = \frac{1}{p} = \frac{1}{0.02} = 50\) and \(\sigma = \sqrt{\frac{1-p}{p^2}} = \sqrt{\frac{0.98}{0.0004}} = \sqrt{2450} = 49.4974747\)
If \(p = 0.05\), then \(\mu = \frac{1}{0.05} = 20\) and \(\sigma = \sqrt{\frac{0.95}{0.0025}} = 19.4935887\).
When the probability of an event is higher, the event is more common, so both the expected number of trials until it first occurs and the standard deviation of that wait are lower.
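The geometric-distribution results above can be verified with base R. This is a minimal sketch; the helper names `geom_mean` and `geom_sd` are hypothetical and simply encode the formulas \(\mu = 1/p\) and \(\sigma = \sqrt{(1-p)/p^2}\) used in the text.

```r
p <- 0.02

# dgeom() counts failures before the first success, so "the 10th transistor
# is the first with a defect" corresponds to 9 failures.
p_first_defect_10th <- dgeom(9, prob = p)

# No defects in a batch of 100: zero successes in 100 independent trials.
p_no_defects_100 <- dbinom(0, size = 100, prob = p)

# Mean and standard deviation of the number of trials up to and including
# the first defect.
geom_mean <- function(p) 1 / p
geom_sd   <- function(p) sqrt((1 - p) / p^2)

print(c(p_first_defect_10th, p_no_defects_100))
print(c(mean = geom_mean(0.02), sd = geom_sd(0.02)))
print(c(mean = geom_mean(0.05), sd = geom_sd(0.05)))
```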
If \(p = 0.51\), \(n = 3\) and \(k = 2\), then \(P(two\ boys\ out\ of\ three\ kids) = \frac{n!}{k!(n-k)!} p^k (1-p)^{n-k} = \frac{3!}{2!\,1!} \times 0.51^2 \times 0.49 = 0.382347\).
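The binomial calculation can be checked in R either by writing out the formula with `choose()` or by calling `dbinom()`. This is a minimal sketch with illustrative variable names.

```r
p_boy <- 0.51

# Binomial formula written out: choose(n, k) * p^k * (1 - p)^(n - k).
p_formula <- choose(3, 2) * p_boy^2 * (1 - p_boy)^(3 - 2)

# The same probability from the built-in binomial density.
p_dbinom <- dbinom(2, size = 3, prob = p_boy)

print(c(formula = p_formula, dbinom = p_dbinom))
```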
The three possible orderings are boy-boy-girl, boy-girl-boy and girl-boy-boy:
\(P(two\ boys\ out\ of\ three\ kids)\)
\(=(P(boy)*P(boy)*P(girl))+(P(boy)*P(girl)*P(boy))+(P(girl)*P(boy)*P(boy))\)
\(=3*0.51*0.51*0.49 = 0.382347\)
\(p = 0.15\)
Serves are independent events, so previous outcomes have no effect on future ones. The probability of success on the 10th serve is 0.15.
Part (a) asks for the probability of a specific pattern of successes within 10 serves. Although each serve is independent, all 10 serves must be considered to determine the probability of that pattern. In contrast, part (b) concerns only a single serve, and previous outcomes are irrelevant because the serves are independent.