Submitted by Zachary Herold
3.2, 3.4, 3.18, 3.22, 3.38, 3.42
Part II. What percent of a standard normal distribution N(μ = 0, σ = 1) is found in each region?
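# Area to the right of -1.13; pnorm() gives cumulative probability, dnorm() only the density height
1 - pnorm(-1.13)
## [1] 0.8707619
cord.x <- c(-1.13, seq(-1.13,3,0.01),3)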
cord.y <- c(0, dnorm(seq(-1.13,3,0.01)),0)
curve(dnorm(x,0,1), xlim = c(-3,3))
polygon(cord.x, cord.y, col= "skyblue")
# Area to the left of 0.18
pnorm(0.18)
## [1] 0.5714237
cord.x <- c(-3, seq(-3,0.18,0.01),0.18)
cord.y <- c(0, dnorm(seq(-3,0.18,0.01)),0)
curve(dnorm(x,0,1), xlim = c(-3,3))
polygon(cord.x, cord.y, col= "skyblue")
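# Upper-tail area beyond 8, computed directly to avoid rounding error in 1 - pnorm(8)
pnorm(8, lower.tail = FALSE)
## [1] 6.220961e-16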
# Area between -0.5 and 0.5
pnorm(0.5) - pnorm(-0.5)
## [1] 0.3829249
cord.x <- c(-0.5, seq(-0.5,0.5,0.01),0.5)
cord.y <- c(0, dnorm(seq(-0.5,0.5,0.01)),0)
curve(dnorm(x,0,1), xlim = c(-3,3))
polygon(cord.x, cord.y, col= "skyblue")
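The shaded areas in this part are cumulative probabilities, so they can be cross-checked by numerically integrating the density. The sketch below does that; the helper names area_between and area_numeric are hypothetical and introduced only for illustration.

```r
# Hypothetical helper: area under the standard normal curve between a and b,
# taken from the cumulative distribution function.
area_between <- function(a, b) pnorm(b) - pnorm(a)

# Hypothetical helper: the same area from numerical integration of the density.
area_numeric <- function(a, b) integrate(dnorm, lower = a, upper = b)$value

area_between(-0.5, 0.5)    # area between -0.5 and 0.5
area_numeric(-0.5, 0.5)    # should agree up to integration error
area_between(-1.13, Inf)   # area to the right of -1.13
area_between(-Inf, 0.18)   # area to the left of 0.18
```

Agreement between the two functions is a quick way to confirm that a cumulative probability, not a density height, is being reported.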
• The finishing times of the Men, Ages 30 - 34 group have a mean of 4313 seconds with a standard deviation of 583 seconds.
Leo completed the race in 1:22:28 (4948 seconds).
• The finishing times of the Women, Ages 25 - 29 group have a mean of 5261 seconds with a standard deviation of 807 seconds.
Mary completed the race in 1:31:53 (5513 seconds).
Men, Ages 30 - 34: N(μ = 4313, σ = 583)
Women, Ages 25 - 29: N(μ = 5261, σ = 807)
print(paste("Leo's Z-score: ", (4948 - 4313) / 583))
## [1] "Leo's Z-score: 1.08919382504288"
print(paste("Mary's Z-score: ", (5513 - 5261) / 807))
## [1] "Mary's Z-score: 0.312267657992565"
The Z-scores indicate that Leo’s time was 1.09 standard deviations above his group’s mean, while Mary’s was only 0.31 standard deviations above hers.
Both runners were slower than their group means, but Mary did better relative to her group, since her time was fewer standard deviations above the mean than Leo’s.
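# Share of each group that the runner finished faster than (upper-tail area), rounded to 4 places
round(1 - pnorm((4948 - 4313) / 583), 4)
## [1] 0.138
round(1 - pnorm((5513 - 5261) / 807), 4)
## [1] 0.3774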
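If the finishing times were not nearly normal, the Z-scores could still be computed, since they only require a mean and a standard deviation. The probability and percentile answers would no longer hold, because the normal probability table cannot be used without a normal model.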
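The Z-score and percentile steps for the two runners can be sketched as follows. The helper name z_score is hypothetical, and the upper-tail probabilities are meaningful only under the assumption that finishing times are approximately normal.

```r
# Hypothetical helper: Z-score of an observation x under N(mu, sigma).
z_score <- function(x, mu, sigma) (x - mu) / sigma

z.leo  <- z_score(4948, mu = 4313, sigma = 583)
z.mary <- z_score(5513, mu = 5261, sigma = 807)

# Share of each group with a slower (larger) time than the runner.
# This step relies on the normal model; the Z-scores above do not.
pnorm(z.leo, lower.tail = FALSE)
pnorm(z.mary, lower.tail = FALSE)

# Equivalent call on the original scale, without standardizing first.
pnorm(4948, mean = 4313, sd = 583, lower.tail = FALSE)
```

Passing mean and sd directly to pnorm() avoids retyping a rounded Z-score by hand.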
Below are heights of 25 female college students.
hgt <- c(54,55,56,56,57,58,58,59,60,60,60,61,61,62,62,63,63,63,64,65,65,67,67,69,73)
mean(hgt)
## [1] 61.52
sd(hgt)
## [1] 4.583667
sum(hgt < mean(hgt) + sd(hgt) & hgt > mean(hgt) - sd(hgt)) /25
## [1] 0.68
sum(hgt < mean(hgt) + 2 * sd(hgt) & hgt > mean(hgt) - 2 * sd(hgt)) /25
## [1] 0.96
sum(hgt < mean(hgt) + 3 * sd(hgt) & hgt > mean(hgt) - 3 * sd(hgt)) /25
## [1] 1
The observed proportions (68%, 96% and 100%) are very close to the 68-95-99.7% rule, so the heights approximately follow it.
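The three proportions above can be computed in one pass and set beside the values a normal model predicts. This is a minimal sketch; the helper name within_k is hypothetical.

```r
hgt <- c(54,55,56,56,57,58,58,59,60,60,60,61,61,62,62,63,63,63,64,65,65,67,67,69,73)

# Hypothetical helper: proportion of observations lying strictly within
# k standard deviations of the sample mean.
within_k <- function(x, k) mean(abs(x - mean(x)) < k * sd(x))

observed <- sapply(1:3, function(k) within_k(hgt, k))
expected <- pnorm(1:3) - pnorm(-(1:3))   # normal-model areas within 1, 2, 3 SDs

round(rbind(observed, expected), 3)
```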
hist(hgt, probability = TRUE, ylim = c(0,0.1))
x <- 50:75
y <- dnorm(x = x, mean = mean(hgt), sd = sd(hgt))
lines(x = x, y = y, col = "blue")
The real data matches well against the normal distribution, as can also be seen from this Q-Q plot. The points on the normal probability plot seem to follow a straight line.
qqnorm(hgt)
qqline(hgt)
A machine that produces a special type of transistor has a 2% defective rate. The production is considered a random process where each transistor is independent of the others.
prob <- 0.02
(1-prob)^9 * (prob)
## [1] 0.01667496
(1-prob)^100
## [1] 0.1326196
1 / (prob)
## [1] 50
What is the standard deviation?
# Standard deviation of a geometric waiting time: sqrt((1 - p) / p^2)
sqrt(1 - prob) / prob
## [1] 49.49747
On average how many transistors would you expect to be produced with this machine before the first with a defect?
prob2 <- 0.05
1 / (prob2)
## [1] 20
What is the standard deviation?
# Standard deviation of a geometric waiting time: sqrt((1 - p) / p^2)
sqrt(1 - prob2) / prob2
## [1] 19.49359
When p is smaller, the event is rarer, so both the expected number of trials until the first success and the standard deviation of the waiting time are higher.
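The comparison between the two defect rates can be sketched with the standard geometric-distribution formulas, mean 1/p and standard deviation sqrt((1 - p) / p^2), where the count includes the trial on which the first defect occurs. The helper name geom_summary is hypothetical.

```r
# Hypothetical helper: mean and standard deviation of the number of trials
# up to and including the first success, for success probability p.
geom_summary <- function(p) c(mean = 1 / p, sd = sqrt(1 - p) / p)

geom_summary(0.02)   # 2% defective rate
geom_summary(0.05)   # 5% defective rate

# Probability that the first defect is the 10th transistor.
# dgeom() counts failures before the first success, hence 9.
dgeom(9, prob = 0.02)
```

Both the mean and the standard deviation shrink as p grows, which is the pattern described above.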
The actual probability of having a boy is slightly higher at 0.51. Suppose a couple plans to have 3 kids.
prob.boy <- .51
choose(3, 2) * prob.boy^2 * (1 - prob.boy)
## [1] 0.382347
(B,B,G), (B,G,B), (G,B,B)
prob.boy <- .51
(prob.boy *prob.boy* (1-prob.boy)) +
(prob.boy * (1-prob.boy) * prob.boy) +
((1-prob.boy) * prob.boy * prob.boy)
## [1] 0.382347
These match.
It is more efficient to apply the rule that counts the number of combinations that can occur (as order is not important here) rather than adding them up separately.
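Both approaches can be reproduced programmatically. The sketch below uses R's built-in dbinom() for the binomial formula and enumerates all eight birth orders for the scenario-by-scenario sum; the variable names kids, n.boys and p.seq are assumptions made for illustration.

```r
prob.boy <- 0.51

# Binomial formula: exactly 2 boys among 3 children.
dbinom(2, size = 3, prob = prob.boy)

# Enumeration: every ordering of 3 children, coded 1 = boy, 0 = girl.
kids   <- expand.grid(rep(list(c(0, 1)), 3))
n.boys <- rowSums(kids)

# Probability of each specific ordering, assuming independent births.
p.seq <- prob.boy^n.boys * (1 - prob.boy)^(3 - n.boys)

# Add up the orderings that contain exactly 2 boys.
sum(p.seq[n.boys == 2])
```

With only three children the enumeration is short, but the number of orderings doubles with each additional child, while the dbinom() call stays the same.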
A not-so-skilled volleyball player has a 15% chance of making the serve.
prob3 <- 0.15
factorial(9)/(factorial(2) * factorial(7)) * (1-prob3)^7 * prob3^3
## [1] 0.03895012
15%. We are only looking at the probability of getting the next serve in.
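In part (b), the first nine serves have already happened, so their outcomes are taken as given rather than treated as random. Because the serves are independent, only the tenth serve’s 15% chance of success matters. In part (a), all ten serves are still uncertain, so the probability of the whole sequence must be computed.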
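The difference between the two parts can be illustrated by splitting the negative binomial probability into its two pieces. This is a sketch using R's built-in dbinom() and dnbinom(); the variable names p.first9 and p.joint are assumptions.

```r
prob3 <- 0.15

# Part (a): 3rd successful serve occurs on the 10th attempt.
# dnbinom() counts failures (7) before the 3rd success.
dnbinom(7, size = 3, prob = prob3)

# Same probability split into two independent pieces:
# exactly 2 successes in the first 9 serves, then a success on the 10th.
p.first9 <- dbinom(2, size = 9, prob = prob3)
p.joint  <- p.first9 * prob3
p.joint

# Part (b): once the first 9 serves are known, only the last factor remains.
p.joint / p.first9
```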