Chapter 3. Distributions of Random Variables
3.2 Area under the curve, Part II - Normal distribution with mean=0, sd=1
sim_norm <- rnorm(n = 100000, mean = 0, sd = 1)
hist(sim_norm, probability = TRUE)
x <- seq(-5, 5, by = 0.01)  # fine grid so the density curve is drawn smoothly
y <- dnorm(x = x , mean = 0, sd = 1)
lines(x = x, y = y, col = "blue")

(a) Z > -1.13
1-pnorm(-1.13)
## [1] 0.8707619
(b) Z < 0.18
pnorm(0.18)
## [1] 0.5714237
(c) Z > 8
pnorm(8, lower.tail = FALSE)  # upper tail computed directly; 1-pnorm(8) loses precision to rounding
## [1] 6.220961e-16
(d) |Z| < 0.5
pnorm(0.5)-pnorm(-0.5)
## [1] 0.3829249
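As a cross-check on parts (a) and (d), the same areas can be approximated by numerically integrating the standard normal density with `integrate()`. This is only a sketch; the names `area_a` and `area_d` are introduced here for illustration and are not part of the original solution.

```r
# Numerically integrate the standard normal density (dnorm defaults: mean = 0, sd = 1)
area_a <- integrate(dnorm, lower = -1.13, upper = Inf)$value  # (a) P(Z > -1.13)
area_d <- integrate(dnorm, lower = -0.5, upper = 0.5)$value   # (d) P(|Z| < 0.5)

# Compare with the pnorm() answers, allowing a small numerical tolerance
all.equal(area_a, 1 - pnorm(-1.13), tolerance = 1e-4)
all.equal(area_d, pnorm(0.5) - pnorm(-0.5), tolerance = 1e-4)
```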
3.4 Triathlon times, Part I
(a) The short-hand for two normal distributions
Answer: Leo finished in 4948 seconds; his group follows N(mean = 4313, sd = 583). Mary finished in 5513 seconds; her group follows N(mean = 5261, sd = 807).
(b) Z-scores for Leo and Mary
Answer: Leo's Z-score is 1.089194, so his time is about 1.09 standard deviations above his group's mean and he ran faster than 13.80342% of the people in his group. Mary's Z-score is 0.3122677, so her time is about 0.31 standard deviations above her group's mean and she ran faster than 37.74186% of the people in her group.
# Z for Leo
x <- 4948
u <- 4313
sema <-583
Z_Leo <- (x-u)/sema
Z_Leo
## [1] 1.089194
1-pnorm(Z_Leo)
## [1] 0.1380342
#Z for Mary
x <- 5513
u <- 5261
sema <-807
Z_Mary <- (x-u)/sema
Z_Mary
## [1] 0.3122677
1-pnorm(Z_Mary)
## [1] 0.3774186
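(c) Who ranked better in their respective group?
Answer: Mary ranked better. Both ran slower than the average time (mean u) of their respective groups, but Mary's Z-score (0.31) is lower than Leo's (1.09), so her time is closer to her group's mean.
(d) Answer: Leo's Z-score is 1.089194, which means Leo ran faster than 13.80342% of the people in his group.
(e) Answer: Mary's Z-score is 0.3122677, which means Mary ran faster than 37.74186% of the people in her group.
(f) Answer: If the distributions of finishing times are not nearly normal, the percentages in (d) and (e) cannot be estimated from the normal model. The Z-scores themselves can still be computed, because they only require the mean and standard deviation.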
# Share of his group that Leo finished faster than (upper tail, no Z-score needed)
x <- 4948
u <- 4313
sema <-583
1-pnorm(x,mean=u,sd=sema)
## [1] 0.1380342
# Share of her group that Mary finished faster than (upper tail, no Z-score needed)
x <- 5513
u <- 5261
sema <-807
1-pnorm(x,mean=u,sd=sema)
## [1] 0.3774186
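The two calculations above repeat the same steps, so they could also be written once with named vectors. The following is a minimal sketch under that assumption; the names `times`, `means`, `sds`, `z` and `faster_than` are illustrative and do not appear in the original solution.

```r
# Finishing times and group parameters from the exercise, stored as named vectors
times <- c(Leo = 4948, Mary = 5513)
means <- c(Leo = 4313, Mary = 5261)
sds   <- c(Leo = 583,  Mary = 807)

# Z-scores: number of standard deviations above (+) or below (-) the group mean
z <- (times - means) / sds

# Share of each group with a longer (slower) time than the athlete
faster_than <- pnorm(z, lower.tail = FALSE)

round(z, 2)
round(faster_than, 4)
```

Because a lower time is better in a race, the upper tail (`lower.tail = FALSE`) is the share of the group each athlete beat.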
3.18 Heights of female college students.
(a) 68-95-99.7% rule
Answer: 68% of the data are within 1 standard deviation of the mean, 96% are within 2, and 100% are within 3, so the heights approximately follow the 68-95-99.7% rule.
hgt <- c(54,55,56,56,57,58,58,59,60,60,60,61,61,62,62,63,63,63,64,65,65,67,67,69,73)
u <-61.5
sema <-4.58
lowbond1 <- u-sema
uperbond1 <- u+sema
c1 <- subset(hgt, hgt>lowbond1 & hgt<uperbond1)
p1 <- length(c1)/25
p1
## [1] 0.68
lowbond2 <- u-2*sema
uperbond2 <- u+2*sema
c2 <- subset(hgt, hgt>lowbond2 & hgt<uperbond2)
p2 <- length(c2)/25
p2
## [1] 0.96
lowbond3 <- u-3*sema
uperbond3 <- u+3*sema
c3 <- subset(hgt, hgt>lowbond3 & hgt<uperbond3)
p3 <- length(c3)/25
p3
## [1] 1
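The three proportions can also be computed in one pass, since the mean of a logical vector is the proportion of `TRUE` values. This sketch reuses `hgt`, `u` and `sema` as defined above; `within_k` is a name introduced here for illustration.

```r
hgt <- c(54,55,56,56,57,58,58,59,60,60,60,61,61,62,62,63,63,63,64,65,65,67,67,69,73)
u <- 61.5
sema <- 4.58

# For k = 1, 2, 3: proportion of heights strictly within k standard deviations of the mean
within_k <- sapply(1:3, function(k) mean(abs(hgt - u) < k * sema))
within_k
```

Using `mean()` on the comparison avoids hard-coding the sample size of 25.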
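(b) Do the heights follow a normal distribution?
Answer: Yes, approximately. The percentages in (a) are close to the 68-95-99.7% rule, and the histogram with the normal curve and the normal Q-Q plot below are used to check the shape.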
hist(hgt, probability = TRUE, ylim = c(0, 0.15))  # ylim belongs to hist(), not lines()
x <- 40:80
y <- dnorm(x = x, mean = u, sd = sema)
lines(x = x, y = y, col = "blue")

qqnorm(hgt)
qqline(hgt)

3.22 Defective rate
Geometric distribution: P(defective)=0.02
p <- 0.02
u <- 1/p
sd <- sqrt((1-p)/(p^2))
(a) The probability that the 10th transistor produced is the first with a defect
p*((1-p)^9)
## [1] 0.01667496
(b) P(no defective transistors in a batch of 100)
p^0*((1-p)^100)
## [1] 0.1326196
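Parts (a) and (b) can be checked against R's built-in distribution functions. In this sketch a "success" is a defective transistor; `prob_a` and `prob_b` are names introduced here for illustration.

```r
p <- 0.02

# dgeom(x, prob) is P(x failures before the first success),
# so x = 9 means nine good transistors followed by the first defect
prob_a <- dgeom(9, prob = p)

# No defect among 100 independent transistors: zero successes in 100 binomial trials
prob_b <- dbinom(0, size = 100, prob = p)

prob_a
prob_b
```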
(c) Expected number of transistors produced before the first with a defect
u
## [1] 50
(d) With P(defective) = 0.05, expected number produced before the first with a defect, and its standard deviation
p2 <- 0.05
u2 <-1/p2
u2
## [1] 20
sd2 <- sqrt((1-p2)/(p2^2))
sd2
## [1] 19.49359
(e) How does increasing the probability of an event affect the mean and sd of the wait time until success?
Answer: When the probability of an event increases, the mean 1/p decreases, because the first defect tends to appear after fewer samples. The sd, sqrt(1-p)/p, decreases too, because p is in the denominator and becomes larger while the numerator shrinks. As a result, the wait time until success is shorter and less variable when p increases.
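The trend described in (e) can be illustrated by tabulating the geometric mean and sd for several values of p. The values 0.02 and 0.05 come from the exercise; 0.10 is an extra value assumed here only to extend the pattern, and the names `p_values`, `wait_mean` and `wait_sd` are illustrative.

```r
# 0.02 and 0.05 are from the exercise; 0.10 is an assumed extra value
p_values <- c(0.02, 0.05, 0.10)

# Geometric distribution: mean = 1/p, sd = sqrt(1 - p) / p
wait_mean <- 1 / p_values
wait_sd   <- sqrt(1 - p_values) / p_values

data.frame(p = p_values, mean = wait_mean, sd = wait_sd)
```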
3.38 Male Children
Binomial model: actual P(male child) = 0.51
p <- 0.51
q <- 1-p
(a) Two boys among three children
#C(3,2)p^2*q
3*p^2*q
## [1] 0.382347
(b) Let b = boy and g = girl.
All possible orderings of two boys among three children
Answer: {bbg, bgb, gbb}
Addition rule:
p^2*q+p*q*p+q*p^2
## [1] 0.382347
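(c) Three boys among eight kids
Answer: The approach in (b) counts each ordering of the outcome separately. With eight children there are C(8,3) = 56 orderings with exactly three boys, so listing them is tedious; the binomial formula from (a) counts them directly.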
#C(8,3)p^3*q^(8-3)
((8*7*6)/(3*2*1))*p^3*q^5
## [1] 0.2098355
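The hand calculations in (a) and (c) can be checked with `choose()` and `dbinom()`. This is a small sketch; `prob_2_of_3`, `n_orderings` and `prob_3_of_8` are names introduced here for illustration.

```r
p <- 0.51

# (a) Exactly two boys among three children
prob_2_of_3 <- dbinom(2, size = 3, prob = p)

# (c) Number of orderings with three boys among eight children, and the probability
n_orderings <- choose(8, 3)
prob_3_of_8 <- dbinom(3, size = 8, prob = p)

prob_2_of_3
n_orderings
prob_3_of_8
```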
3.42 Serving in volleyball
Negative binomial
p <- 0.15
q <- 1-p
(a) Probability that her 3rd successful serve comes on the 10th try
#C(9,2)p^3*q^(10-3)
(9*8)/(2*1)*p^3*q^7
## [1] 0.03895012
(b) The probability that her 10th serve will be successful
p
## [1] 0.15
(c) Why do the answers to (a) and (b) differ?
Answer: In (a), the probability covers every scenario in which exactly two of the first nine serves succeed and the 10th serve is her 3rd success. In (b), only the single 10th serve is considered; because serves are independent, its probability of success is 15% regardless of the earlier results.
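The difference between (a) and (b) can be made concrete in R. In this sketch, `dnbinom()` gives the answer to (a) directly, and the same value is rebuilt as "two successes in the first nine serves" times "success on the 10th serve"; the names `prob_a` and `prob_a_alt` are illustrative.

```r
p <- 0.15

# dnbinom(x, size, prob): P(x failures before the size-th success),
# so 7 failures before the 3rd success means the 3rd success is on serve 10
prob_a <- dnbinom(7, size = 3, prob = p)

# Same event split in two: exactly 2 successes in serves 1 to 9, then a success on serve 10
prob_a_alt <- dbinom(2, size = 9, prob = p) * p

prob_a
all.equal(prob_a, prob_a_alt)
```

The factor `p` at the end of `prob_a_alt` is the answer to (b); the rest of the product is what makes (a) much smaller.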