What percent of a standard normal distribution \(N(\mu = 0, \sigma = 1)\) is found in each region? Be sure to draw a graph.
1- pnorm(-1.13)
## [1] 0.8707619
normalPlot(bounds = c(-1.13,Inf))
pnorm(0.18)
## [1] 0.5714237
normalPlot(bounds = c(-Inf,0.18))
1- pnorm(8)
## [1] 6.661338e-16
normalPlot(bounds = c(8,Inf))
pnorm(0.5) - pnorm(-0.5)
## [1] 0.3829249
normalPlot(bounds = c(-0.5,0.5))
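The upper-tail areas above can also be requested directly through the `lower.tail = FALSE` argument of base R's `pnorm`, which avoids the subtraction `1 - pnorm(z)`. The following is a minimal sketch with illustrative variable names. For the far tail at `z = 8` the subtraction loses precision to floating-point rounding, so the direct call gives a slightly different (and more accurate) tiny value.

```r
# Upper-tail areas computed directly, without subtracting from 1
upper_neg113 <- pnorm(-1.13, lower.tail = FALSE)  # P(Z > -1.13)
upper_8      <- pnorm(8, lower.tail = FALSE)      # P(Z > 8)

# Middle area by symmetry: P(|Z| < 0.5) = 1 - 2 * P(Z > 0.5)
middle_half <- 1 - 2 * pnorm(0.5, lower.tail = FALSE)

upper_neg113
upper_8
middle_half
```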
In triathlons, it is common for racers to be placed into age and gender groups. Friends Leo and Mary both completed the Hermosa Beach Triathlon, where Leo competed in the Men, Ages 30 - 34 group while Mary competed in the Women, Ages 25 - 29 group. Leo completed the race in 1:22:28 (4948 seconds), while Mary completed the race in 1:31:53 (5513 seconds). Obviously Leo finished faster, but they are curious about how they did within their respective groups. Can you help them? Here is some information on the performance of their groups:
Remember: a better performance corresponds to a faster finish.
Men’s group, age 30-34 \(\mu = 4313 s\) , \(\sigma = 583 s\)
Women’s group, age 25-29 \(\mu = 5261 s\) , \(\sigma = 807 s\)
\(Z = \frac{x - \mu}{\sigma}\)
Leo’s \(Z = \frac{4948s-4313s}{583s}\)
(4948-4313)/583
## [1] 1.089194
Mary’s \(Z = \frac{5513s-5261s}{807s}\)
(5513-5261)/807
## [1] 0.3122677
A lower time is a better performance, so a smaller Z-score is better, preferably a negative one. Both finished slower than their group mean, but Mary did better relative to her group since her Z-score (0.31) is lower than Leo’s (1.09).
1- pnorm(1.089194)
## [1] 0.1380342
normalPlot(bounds = c(1.089194,Inf))
1- pnorm(0.3122677)
## [1] 0.3774185
normalPlot(bounds = c(0.3122677,Inf))
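The Z-scores and upper-tail percentages for both racers can be computed in one pass. This is a minimal sketch using only base R; the vector names are illustrative, and the women's mean of 5261 s is the value used in Mary's Z-score calculation. The upper-tail area is the fraction of each group that finished slower, assuming the finishing times are normally distributed.

```r
# Finishing time, group mean, and group SD in seconds
times <- c(Leo = 4948, Mary = 5513)
mu    <- c(Leo = 4313, Mary = 5261)
sigma <- c(Leo = 583,  Mary = 807)

# Standardized finishing times
z <- (times - mu) / sigma

# Fraction of each group with a slower (larger) finishing time
slower <- pnorm(times, mean = mu, sd = sigma, lower.tail = FALSE)

round(z, 4)
round(slower, 4)
```

Under this sketch Leo finished faster than about 13.8% of his group and Mary faster than about 37.7% of hers.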
Yes, any outliers or asymmetry in the data will change the percentages, depending on how asymmetric the data are and on the number of outliers, high or low.
Below are heights of 25 female college students.
fhgt <- c(54,55,56,56,57,58,58,59,60,60,60,61,61,62,62,63,63,63,64,65,65,67,67,69,73)
fhgt
## [1] 54 55 56 56 57 58 58 59 60 60 60 61 61 62 62 63 63 63 64 65 65 67 67
## [24] 69 73
fmn <- mean(fhgt)
fsd <- sd(fhgt)
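# j, k, l count heights within 3, 2, and 1 standard deviations of the mean
j = 0
k = 0
l = 0
for(i in fhgt){
  # the checks are nested because each interval contains the next smaller one
  if(i <= (fmn+3*fsd) & i >= (fmn-3*fsd)){
    j = j+1
    if(i <= (fmn+2*fsd) & i >= (fmn-2*fsd)){
      k = k+1
      if(i <= (fmn+fsd) & i >= (fmn-fsd)){
        l = l+1
      }
    }
  }
}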
print(paste(as.character(j/length(fhgt)),"are within 3 stdev"))
## [1] "1 are within 3 stdev"
print(paste(as.character(k/length(fhgt)),"are within 2 stdev"))
## [1] "0.96 are within 2 stdev"
print(paste(as.character(l/length(fhgt)),"are within 1 stdev"))
## [1] "0.68 are within 1 stdev"
These data seem to follow the 68-95-99.7% rule well.
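The same proportions can be obtained without an explicit loop. The following is a minimal vectorized sketch in base R, with illustrative names `z` and `within`, that measures each height's distance from the mean in standard deviations and then takes the share of heights within 1, 2, and 3 of them.

```r
fhgt <- c(54,55,56,56,57,58,58,59,60,60,60,61,61,62,62,63,63,63,64,65,65,67,67,69,73)

# Distance of each height from the mean, in standard deviations
z <- abs(fhgt - mean(fhgt)) / sd(fhgt)

# Proportion of heights within 1, 2, and 3 standard deviations
within <- sapply(1:3, function(m) mean(z <= m))
names(within) <- c("1 sd", "2 sd", "3 sd")
within
```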
hist(fhgt, xlab = "Female Height in Inches", main = "Histogram of Female Height")
qqnorm(fhgt)
qqline(fhgt)
qqnormsim(fhgt)
For a small data set, it follows a normal distribution fairly well. It would be unreasonable to expect a sample of 25 random people to make a perfect bell curve.
A machine that produces a special type of transistor (a component of computers) has a 2% defective rate. The production is considered a random process where each transistor is independent of the others.
We’ll treat a defect as a success for mathematical reasons: a defect is the event of interest.
n = 10
p= 0.02
q = 0.98
prb = (q^(n-1)*p)
prb
## [1] 0.01667496
There is a 1.667496% chance that the 10th transistor is the first one with a defect.
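As a cross-check, the same geometric probability is available from base R's `dgeom`. This is a minimal sketch; note that `dgeom` counts the number of failures before the first success, so a first defect on the 10th transistor corresponds to 9 non-defective ones. The variable name is illustrative.

```r
p <- 0.02

# 9 non-defective transistors followed by the first defect on the 10th
first_defect_on_10th <- dgeom(9, prob = p)
first_defect_on_10th
```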
n = 100
p= 0.02
q = 0.98
prb = (q^(n))
prb
## [1] 0.1326196
13.26196% chance of no defects in a batch of 100.
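The no-defect probability can be checked two ways in base R: as a binomial probability of zero defects in 100 trials, or as the geometric probability that the first defect comes after the 100th transistor. This is a minimal sketch with illustrative variable names.

```r
p <- 0.02

# Zero defects among 100 independent transistors
no_defect_binom <- dbinom(0, size = 100, prob = p)

# First defect occurs after trial 100, i.e. more than 99 failures come first
no_defect_geom <- pgeom(99, prob = p, lower.tail = FALSE)

no_defect_binom
no_defect_geom
```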
To answer this we need to find the reciprocal of the defect rate of 2%: \(\mu = 1/p = 1/0.02 = \textbf{50}\), \(\sigma = \sqrt{q/p^2} = \sqrt{0.98/0.02^2} = \textbf{49.497}\)
To answer this we need to find the reciprocal of the defect rate of 5%: \(\mu = 1/p = 1/0.05 = \textbf{20}\), \(\sigma = \sqrt{q/p^2} = \sqrt{0.95/0.05^2} = \textbf{19.494}\)
Both the mean, \(\mu\), and the standard deviation, \(\sigma\), decrease as the probability of a defect increases.
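The two calculations can be wrapped in a small helper to compare defect rates side by side. This is a minimal sketch; `geom_summary` is a hypothetical helper name, not a function from any package.

```r
# Mean and SD of the number of trials up to and including the first success
geom_summary <- function(p) {
  c(mean = 1 / p, sd = sqrt((1 - p) / p^2))
}

# One column per defect rate
round(sapply(c("p = 0.02" = 0.02, "p = 0.05" = 0.05), geom_summary), 3)
```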
While it is often assumed that the probabilities of having a boy or a girl are the same, the actual probability of having a boy is slightly higher at 0.51. Suppose a couple plans to have 3 kids.
I use both the R function dbinom and the formula \(P = {n \choose k} p^k q^{n-k}\):
dbinom(2,3,0.51)
## [1] 0.382347
n = 3
k = 2
p= 0.51
q = 0.49
prb = (factorial(n)/(factorial(k)*factorial(n-k))*p^k*q^(n-k))
prb
## [1] 0.382347
\[P(\{f,m,m\} \text{ or } \{m,f,m\} \text{ or } \{m,m,f\}) = P(\{f,m,m\}) + P(\{m,f,m\}) + P(\{m,m,f\})\] \[(0.49 \times 0.51 \times 0.51)+(0.51 \times 0.49 \times 0.51)+(0.51 \times 0.51 \times 0.49) = 0.382347\]
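The enumeration can also be done mechanically. The following is a minimal sketch in base R that lists all 8 possible birth orders with `expand.grid`, computes the probability of each, and sums those with exactly two boys. The column and variable names are illustrative.

```r
p_boy <- 0.51

# All 2^3 = 8 possible orderings of three children
kids <- expand.grid(first = c("m", "f"), second = c("m", "f"),
                    third = c("m", "f"), stringsAsFactors = FALSE)

# Number of boys in each ordering and the probability of that ordering
n_boys <- rowSums(kids == "m")
prob   <- p_boy^n_boys * (1 - p_boy)^(3 - n_boys)

# Total probability of the orderings with exactly two boys
two_boys <- sum(prob[n_boys == 2])
two_boys
```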
With 8 children and 3 boys you would need to enumerate by hand every ordering of 3 boys among 8 children. There are \(8!/(3!(8-3)!) = 56\) such orderings in this case. Using the binomial distribution, you just plug into the formula:
dbinom(3,8,0.51)
## [1] 0.2098355
n = 8
k = 3
p= 0.51
q = 0.49
prb = (factorial(n)/(factorial(k)*factorial(n-k))*p^k*q^(n-k))
prb
## [1] 0.2098355
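The factorial expression in the formula is the binomial coefficient, which base R provides as `choose`. This short sketch, with an illustrative variable name, shows how many orderings a by-hand enumeration would have to cover and rebuilds the probability from that count.

```r
# Number of ways to place 3 boys among 8 births
n_orderings <- choose(8, 3)
n_orderings

# Each such ordering has probability 0.51^3 * 0.49^5
prob_3_of_8 <- n_orderings * 0.51^3 * 0.49^5
prob_3_of_8
```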
A not-so-skilled volleyball player has a 15% chance of making the serve, which involves hitting the ball so it passes over the net on a trajectory such that it will land in the opposing team’s court. Suppose that her serves are independent of each other.
Use the negative binomial distribution, with the third success (k = 3) occurring on the tenth serve (n = 10).
n = 10
k = 3
p= 0.15
q = 0.85
prb = (factorial(n-1)/(factorial(k-1)*factorial(n-k))*p^k*q^(n-k))
prb
## [1] 0.03895012
3.895012 %
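The same value is available from base R's `dnbinom`. This is a minimal sketch; note that `dnbinom` is parameterized by the number of failures before the target number of successes, so the third success on the 10th serve corresponds to 7 failures. The variable name is illustrative.

```r
# 7 missed serves before the 3rd successful serve, i.e. 3rd success on serve 10
third_success_on_10th <- dnbinom(7, size = 3, prob = 0.15)
third_success_on_10th
```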
15%, since the serves are independent.
As given, the prior history does not matter when making a serve, so the 10th serve has a 15% chance of success regardless of what came before. Across all possible outcomes of 10 serves, the 10th serve is successful 15% of the time. Two previous successes is only one scenario out of many.
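The difference between the two answers can be made concrete. In this minimal sketch (variable names are illustrative), the joint probability of two successes in the first nine serves followed by a success on the tenth matches the negative binomial result, while dividing by the probability of the first nine serves recovers the 15% conditional probability.

```r
p <- 0.15

# Probability of exactly 2 successes in the first 9 serves
two_in_first_nine <- dbinom(2, size = 9, prob = p)

# Joint probability: 2 successes in 9 serves AND success on the 10th
joint <- two_in_first_nine * p

# Conditional probability of success on the 10th, given the first 9 serves
conditional <- joint / two_in_first_nine

joint
conditional
```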