This document presents homework Assignment 3 for Data 606.
We are asked to find what percent of a standard normal distribution lies in each region below and to draw a graph.
Note that we can pass -Inf and Inf as bounds to the pnorm and normalPlot functions below to compute tail regions.
We also express probabilities in decimal format rather than percentage format; for example, 0.5 means a 50% probability.
pnorm(-Inf) # effectively equals 0
## [1] 0
pnorm(Inf) # effectively equals 1
## [1] 1
normalPlot(bounds=c(-1.13,Inf))
normalPlot(bounds=c(-Inf, 0.18))
normalPlot(bounds=c(8,Inf))
normalPlot(bounds=c(-0.5, 0.5 ) )
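The plots shade each region; the corresponding areas can also be computed directly with pnorm. The following is a minimal sketch using base R only. The variable names (`area_a` through `area_d`) are introduced here for illustration and are not part of the original assignment.

```r
# Areas under the standard normal curve for the four regions plotted above.
# pnorm() is the standard normal CDF when mean and sd are left at defaults.
area_a <- pnorm(-1.13, lower.tail = FALSE)  # Z > -1.13
area_b <- pnorm(0.18)                       # Z < 0.18
area_c <- pnorm(8, lower.tail = FALSE)      # Z > 8, upper tail computed directly
area_d <- pnorm(0.5) - pnorm(-0.5)          # |Z| < 0.5

round(c(a = area_a, b = area_b, c = area_c, d = area_d), 4)
```

Using `lower.tail = FALSE` for the far upper tail avoids the loss of precision that `1 - pnorm(8)` would suffer, since that area is vanishingly small.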
The shorthand for these distributions is \[ N(\mu = 4313, \sigma = 583) \] for Leo’s group and \[ N(\mu = 5261, \sigma = 807) \] for Mary’s group.
The Z-scores for Leo’s and Mary’s times are calculated below:
(zLeo = (4948 - 4313) / 583.0 )
## [1] 1.089194
(zMary = (5513 - 5261 ) / 807 )
## [1] 0.3122677
Leo’s Z-score is 1.089 and Mary’s Z-score is 0.312.
Both racers were slower than average because a positive Z-score denotes a longer running time.
Mary ranks better than Leo in her respective group because her Z-score is less than Leo’s Z-score. A lower Z-score means a faster running time.
Leo finished faster than 13.80% of the male runners in his group, and Mary finished faster than 37.74% of the runners in her group, based on the calculations shown below.
(leo_fraction_faster = 1 - pnorm( zLeo ) )
## [1] 0.1380342
(mary_fraction_faster = 1 - pnorm(zMary))
## [1] 0.3774186
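As a cross-check, the same upper-tail probabilities can be obtained from the raw finishing times without forming Z-scores first. This sketch assumes only the means and standard deviations already used above; the names `leo_check` and `mary_check` are hypothetical.

```r
# Upper-tail probability = fraction of the group with a longer (slower) time.
leo_check  <- pnorm(4948, mean = 4313, sd = 583, lower.tail = FALSE)
mary_check <- pnorm(5513, mean = 5261, sd = 807, lower.tail = FALSE)

c(leo = leo_check, mary = mary_check)
```

Both values should agree with the Z-score route, since standardizing is exactly what pnorm does internally when given a mean and sd.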
Part b would not change. It asks for Z-scores, which are computed the same way regardless of the true probability distribution. However, one could no longer infer a probability from a Z-score because the shape of the distribution is unknown. Thus, the answers to parts c, d, and e could change.
We first load the heights into a vector for analysis.
fheights = as.numeric( c(54,55,56,56,57,58,58,59,60,60,60,61,61,62,62,63,63,63,64,65,65,67,67,69,73 ) )
Next we validate that the count, mean, and standard deviation match the values given in the textbook. They do.
length(fheights)
## [1] 25
mean(fheights)
## [1] 61.52
sd(fheights)
## [1] 4.583667
To test the rule, we first convert all raw heights to Z scores.
(Z = (fheights - mean(fheights) ) / sd(fheights) )
## [1] -1.6406080 -1.4224420 -1.2042761 -1.2042761 -0.9861101 -0.7679442
## [7] -0.7679442 -0.5497782 -0.3316122 -0.3316122 -0.3316122 -0.1134463
## [13] -0.1134463 0.1047197 0.1047197 0.3228856 0.3228856 0.3228856
## [19] 0.5410516 0.7592175 0.7592175 1.1955494 1.1955494 1.6318813
## [25] 2.5045451
P1 = length( subset( Z, Z <= 1 & Z >= -1 ) ) / length( Z )
P2 = length( subset( Z, Z <= 2 & Z >= -2 ) ) / length( Z )
P3 = length( subset( Z, Z <= 3 & Z >= -3 ) ) / length( Z )
df = data.frame( stdev = c( 1, 2, 3), empirical = c( P1, P2, P3), normal = c( .68, .95, .997 ) )
knitr::kable(df, digits = 4, caption="Comparing Females Heights Distribution with 1-3 SDs")
| stdev | empirical | normal |
|---|---|---|
| 1 | 0.68 | 0.680 |
| 2 | 0.96 | 0.950 |
| 3 | 1.00 | 0.997 |
Looking at the table, we conclude that the 68-95-99.7 rule fits the heights data very well.
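The same comparison can be written more compactly, and the rounded 68-95-99.7 figures can be replaced by the exact normal areas. This is a sketch under the assumption that the 25 heights above are the full data set; `h`, `z`, and `k` are names introduced here for illustration.

```r
# Heights copied from the vector defined earlier in this section.
h <- c(54, 55, 56, 56, 57, 58, 58, 59, 60, 60, 60, 61, 61, 62, 62,
       63, 63, 63, 64, 65, 65, 67, 67, 69, 73)
z <- (h - mean(h)) / sd(h)

k <- 1:3
# mean() of a logical vector is the proportion of TRUE values.
empirical <- sapply(k, function(s) mean(abs(z) <= s))
# Exact normal area within k standard deviations of the mean.
normal <- pnorm(k) - pnorm(-k)

data.frame(stdev = k, empirical = empirical, normal = round(normal, 4))
```

With only 25 observations, each data point moves the empirical proportion by 0.04, so agreement to within a few hundredths is about as close as this sample can get.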
We use the formulas and methods for the geometric distribution explained on page 143. The defect rate is \(p = 0.02\) for a machine producing transistors.
p = 0.02
The probability that the first defect is the 10th transistor is \[ Pr[\text{1st defect at 10th transistor}] = p (1-p)^9 = 0.016675 \]
The probability of no defects in a batch of 100 is:
\[ Pr[\text{no defects in 100 transistors}] = (1-p)^{100} = 0.1326196 \]
While a single transistor has a low probability of defect, compounding over a large batch means that at least one defect is common: a batch of 100 is defect-free only about 13% of the time.
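Both probabilities can be reproduced with base R's distribution functions. This is a minimal sketch; the names `first_defect_10th` and `no_defects_100` are hypothetical. Note that dgeom counts failures before the first success, so the 10th transistor being the first defect corresponds to 9 failures.

```r
p <- 0.02

# dgeom(x, prob) = P(x failures before the first success) = (1 - p)^x * p
first_defect_10th <- dgeom(9, prob = p)

# No defects among 100 independent transistors: binomial with zero successes.
no_defects_100 <- dbinom(0, size = 100, prob = p)

c(first_defect_10th = first_defect_10th, no_defects_100 = no_defects_100)
```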
We expect on average \(1/p\) trials to observe a defect. This is 50 trials.
The standard deviation is \[ \sigma = \sqrt{ \frac{1-p}{p^2} } = \sqrt{ \frac{ 0.98}{(0.02)^2} } = 49.4974747 \text{ trials }\]
If another machine has a defect rate \(q = 0.05\) we are asked to repeat the calculations for average trials to first defect and the associated standard deviation. This gives:
q = 0.05
\[ \text{ Expected trials to first defect} = \frac{1}{q} = \frac{1}{0.05} = 20 \text{ trials} \]
\[ \text{ Standard deviation of trials to first defect} = \sqrt{\frac{1-q}{q^2} } = \sqrt{\frac{ 0.95}{0.05^{2}}} = 19.4935887 \text{ trials} \]
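The mean and standard deviation for both machines can be computed side by side. The helper `geom_wait` below is a hypothetical function written for this illustration, not something from the textbook or a package; it simply encodes the two formulas used above.

```r
# Hypothetical helper: mean and SD of the number of trials up to and
# including the first defect, for a geometric distribution with defect
# probability `rate`.
geom_wait <- function(rate) {
  c(mean = 1 / rate, sd = sqrt(1 - rate) / rate)
}

# One row per machine.
rbind(p = geom_wait(0.02), q = geom_wait(0.05))
```

Raising the defect rate from 0.02 to 0.05 cuts both the expected wait and its spread by a factor of roughly 2.5.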
A more mathematical approach is to use the first derivatives of the mean and standard deviation to measure the marginal impact of the defect rate:
\[ \text{marginal sensitivity of wait time} = \frac{d}{dp}\left( \frac{1}{p}\right) = -\frac{1}{p^2} \]
\[ \text{marginal sensitivity of stdev} = \frac{d\sigma}{dp} = \frac{d}{dp}\left( \sqrt{ \frac{1-p}{p^2} } \right) = \frac{d}{dp}\left( (1-p)^{1/2} p^{-1} \right) \]
\[ = -\frac{1}{2}(1-p)^{-1/2}p^{-1} - (1-p)^{1/2}p^{-2} \]
\[ = -p^{-2}(1-p)^{-1/2}\left( \frac{p}{2} + (1-p) \right) \]
\[ = -p^{-2}(1-p)^{-1/2} \, \frac{2-p}{2} \]
\[ \frac{d\sigma}{dp} = -\frac{2-p}{2 p^2 \sqrt{1-p} } < 0 \text{ for all } 0<p<1 \]
We conclude that the first derivatives of both quantities are negative for all \(0 < p < 1\). Thus, increasing the defect rate always decreases the expected wait time and its standard deviation.
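The monotonic behavior can also be checked numerically without relying on the calculus. The sketch below evaluates both quantities on a grid of defect rates and confirms that each one strictly decreases; the grid and the names `p_grid`, `wait_mean`, and `wait_sd` are choices made here for illustration.

```r
# Grid of defect rates strictly inside (0, 1).
p_grid <- seq(0.01, 0.99, by = 0.01)

wait_mean <- 1 / p_grid                 # expected trials to first defect
wait_sd   <- sqrt(1 - p_grid) / p_grid  # standard deviation of that wait

# diff() returns successive differences; all negative means strictly decreasing.
c(mean_decreasing = all(diff(wait_mean) < 0),
  sd_decreasing   = all(diff(wait_sd) < 0))
```

A grid check is not a proof, but it is a quick guard against sign errors in the derivative.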
\[ Pr[\text{2 boys in 3 kids}] = \binom{3}{2}(0.51)^2 (.49) = 0.382347 \]
The sum of the identical probabilities of these 3 scenarios is:
\[ 3 \times (0.51)^2 (0.49) = 3 \times 0.127449 = 0.382347 \]
This confirms that parts a and b give equivalent answers.
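The equivalence of the two approaches can be reproduced by brute force. This sketch enumerates every birth order for three children, sums the probabilities of the orders with exactly two boys, and compares the result with dbinom. It assumes independent births with a 0.51 probability of a boy, as in the calculation above; all variable names are hypothetical.

```r
p_boy <- 0.51

# All 2^3 birth orders for three children: "B" = boy, "G" = girl.
orders <- expand.grid(first = c("B", "G"), second = c("B", "G"),
                      third = c("B", "G"), stringsAsFactors = FALSE)
n_boys <- rowSums(orders == "B")

# Probability of each specific order under independence.
prob_order <- p_boy^n_boys * (1 - p_boy)^(3 - n_boys)

enumerated    <- sum(prob_order[n_boys == 2])       # add up the matching orders
binom_formula <- dbinom(2, size = 3, prob = p_boy)  # binomial formula directly

c(enumerated = enumerated, binom_formula = binom_formula)
```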
For 3 boys among 8 kids, the enumeration approach would be far more tedious, with \[ \binom{8}{3} = 56 \] combinations to enumerate.
n=10
k=3
p=0.15
( answer = choose(n-1,k-1) * p^k * (1-p)^(n-k) )
## [1] 0.03895012
\[ Pr[\text{3rd success on the 10th attempt}] = \binom{n-1}{k-1}p^{k}(1-p)^{n-k}= 0.0389501 \]
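The hand calculation can be cross-checked against R's built-in negative binomial density. A point to watch: dnbinom is parameterized by the number of failures before the target success, not by the total number of trials, so the first argument is n - k. The names `by_hand` and `built_in` are introduced here for illustration.

```r
n <- 10    # trial on which the k-th success occurs
k <- 3     # number of successes
p <- 0.15  # success probability per serve

by_hand <- choose(n - 1, k - 1) * p^k * (1 - p)^(n - k)

# dnbinom(x, size, prob): probability of x failures before the size-th success.
built_in <- dnbinom(n - k, size = k, prob = p)

all.equal(by_hand, built_in)
```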
If she has already made two successful serves in nine attempts, the probability that her 10th serve is successful remains unchanged at 0.15 because the 10th serve is independent.
There is no contradiction between parts a and b because they calculate probabilities of different events. Part a asks for the unconditional probability, before any serves are made, that the third success occurs on the 10th trial. Part b asks for the conditional probability that a single future trial succeeds, given that the first nine trials produced exactly two successes.
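The relationship between the two answers can be made explicit: the part a probability factors into the probability of exactly two successes in the first nine serves times the probability that the 10th serve succeeds. The sketch below assumes independent serves with success probability 0.15; the variable names are hypothetical.

```r
p <- 0.15

# P(exactly 2 successes in the first 9 serves)
two_in_nine <- dbinom(2, size = 9, prob = p)

# Part a: third success lands on the 10th serve.
third_on_tenth <- two_in_nine * p

# Part b: P(10th serve succeeds | exactly 2 successes in the first 9)
cond_prob <- third_on_tenth / two_in_nine

c(part_a = third_on_tenth, part_b = cond_prob)
```

Dividing the joint probability by the probability of the conditioning event recovers 0.15, which is why the two parts give such different numbers without conflicting.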