library(DATA606)
##
## Welcome to CUNY DATA606 Statistics and Probability for Data Analytics
## This package is designed to support this course. The text book used
## is OpenIntro Statistics, 3rd Edition. You can read this by typing
## vignette('os3') or visit www.OpenIntro.org.
##
## The getLabs() function will return a list of the labs available.
##
## The demo(package='DATA606') will list the demos that are available.
normalPlot(mean = 0, sd = 1, bounds = c(-1.13, 4)) #(a) Z > -1.13
normalPlot(mean = 0, sd = 1, bounds = c(-4, 0.18)) #(b) Z < 0.18
normalPlot(mean = 0, sd = 1, bounds = c(8, 10)) #(c) Z > 8
normalPlot(mean = 0, sd = 1, bounds = c(-0.5, 0.5)) #(d) |Z| < 0.5
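The shaded areas in the plots above correspond to probabilities that can also be computed numerically. A minimal sketch using base R's `pnorm`; the variable names are illustrative and not part of the original solution:

```r
# Areas under the standard normal curve for parts (a)-(d)
p_a <- pnorm(-1.13, lower.tail = FALSE)  # (a) P(Z > -1.13)
p_b <- pnorm(0.18)                       # (b) P(Z < 0.18)
p_c <- pnorm(8, lower.tail = FALSE)      # (c) P(Z > 8), effectively zero
p_d <- pnorm(0.5) - pnorm(-0.5)          # (d) P(|Z| < 0.5)

# Collect the four areas in one named vector, rounded for display
round(c(a = p_a, b = p_b, c = p_c, d = p_d), 4)
```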
Men, Ages 30-34: N(mean = 4313, sd = 583)
Women, Ages 25-29: N(mean = 5261, sd = 807)
Leo’s Z-score: 1.0892
Mary’s Z-score: 0.3123
These Z-scores tell me how well each runner performed compared to others in their respective group.
Both Z-scores are positive, so both finishing times are above the group mean: Leo is about 1.09 standard deviations slower than average, while Mary is only about 0.31 standard deviations slower, which is close to average.
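The Z-scores can be reproduced from the group parameters. This is a sketch under one assumption: Leo's finishing time of 4948 seconds does not appear in the text above and is back-calculated from his reported Z-score (4313 + 1.0892 × 583 ≈ 4948). Mary's time of 5513 seconds is the value used in the `pnorm` call for her group.

```r
# Z-score: how many standard deviations an observation lies from its group mean
z_score <- function(x, mean, sd) (x - mean) / sd

leo_time  <- 4948  # assumption: recovered from the reported Z-score, not stated in the text
mary_time <- 5513  # value used in the pnorm() call for Mary

z_leo  <- z_score(leo_time,  mean = 4313, sd = 583)
z_mary <- z_score(mary_time, mean = 5261, sd = 807)

round(c(leo = z_leo, mary = z_mary), 4)
```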
# Leo finished faster than the runners in the right tail of his group's distribution
pnorm(1.0892, lower.tail = FALSE) # using Z-score
## [1] 0.1380328
# Mary finished faster than the runners in the right tail of her group's distribution
pnorm(q = 5513, mean = 5261, sd = 807, lower.tail = FALSE) # using observed time, mean and sd
## [1] 0.3774186
pnorm(0.3123, lower.tail = FALSE) # same calculation using the rounded Z-score
## [1] 0.3774063
The answers to parts (b) and (c) would not change, since Z-scores can be calculated for distributions that are not normal. However, parts (d) and (e) could not be answered, because without a normal model we cannot use the normal probability table to calculate probabilities and percentiles.
height <- c(54,55,56,56,57,58,58,59,60,60,60,61,61,62,62,63,63,63,64,65,65,67,67,69,73)
# Probability of falling within 1 SD of the mean under a normal model N(61.52, 4.58)
pnorm(61.52+4.58, mean = 61.52, sd = 4.58) - pnorm(61.52-4.58, mean = 61.52, sd = 4.58)
## [1] 0.6826895
# Within 2 SDs of the mean
pnorm(61.52+2*4.58, mean = 61.52, sd = 4.58) - pnorm(61.52-2*4.58, mean = 61.52, sd = 4.58)
## [1] 0.9544997
# Within 3 SDs of the mean
pnorm(61.52+3*4.58, mean = 61.52, sd = 4.58) - pnorm(61.52-3*4.58, mean = 61.52, sd = 4.58)
## [1] 0.9973002
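The heights approximately follow the 68-95-99.7% rule: 17 of the 25 heights (68%) fall within 1 standard deviation of the mean, 24 (96%) within 2, and all 25 (100%) within 3.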
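The `pnorm` values above are the theoretical percentages for any normal distribution, so a direct check needs the observed proportions from the data. A minimal sketch; the helper name `within_k_sd` is illustrative:

```r
height <- c(54,55,56,56,57,58,58,59,60,60,60,61,61,62,62,63,63,63,64,65,65,67,67,69,73)

m <- mean(height)  # sample mean, 61.52
s <- sd(height)    # sample standard deviation, about 4.58

# Proportion of observed heights lying within k standard deviations of the mean
within_k_sd <- function(k) mean(abs(height - m) <= k * s)

props <- sapply(1:3, within_k_sd)
props  # observed proportions: 0.68 0.96 1.00
```

These observed proportions can then be compared with the theoretical 0.683, 0.954, and 0.997.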
3.18
qqnormsim(height)
The distribution is unimodal and symmetric. The points on the normal probability plot seem to follow a straight line with one outlier on the upper right, but not too extreme compared with the simulated plots.
(0.98)^9 * 0.02 # 9 fine transistors followed by 1 defect
## [1] 0.01667496
(.98)^100 # 100 transistors in a row with no defect, each with prob. of .98
## [1] 0.1326196
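Both probabilities above are geometric-distribution quantities, so they can be cross-checked with base R's `dgeom` and `pgeom`. A sketch only; note that R's geometric functions count the number of failures before the first success, so the arguments are shifted by one relative to the trial number:

```r
p_defect <- 0.02

# 10th transistor is the first defect: 9 non-defective transistors come first
p_tenth <- dgeom(9, prob = p_defect)

# No defect among the first 100: more than 99 non-defective transistors before the first defect
p_none_100 <- pgeom(99, prob = p_defect, lower.tail = FALSE)

c(tenth_is_first_defect = p_tenth, no_defect_in_100 = p_none_100)
```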
1/.02 #expected value is 1/p
## [1] 50
sqrt((1-.02)/(.02)^2) #sd is square root of (1-p)/p^2
## [1] 49.49747
On average, the first defect is expected at the 50th transistor produced, with a standard deviation of about 49.5 transistors.
1/.05 #expected value is 1/p
## [1] 20
sqrt((1-.05)/(.05)^2) #sd is square root of (1-p)/p^2
## [1] 19.49359
On average, the first defect is expected at the 20th transistor produced, with a standard deviation of about 19.5 transistors.
When the probability of an event increases, both the expected waiting time until the first success and its standard deviation decrease.
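The comparison between the two defect rates can be put side by side with a small helper. This is an illustrative sketch; `geom_summary` is a hypothetical name, and the formulas are the ones used in the calculations above (mean 1/p, standard deviation sqrt((1-p)/p^2)):

```r
# Mean and standard deviation of the waiting time until the first success
geom_summary <- function(p) {
  c(mean = 1 / p, sd = sqrt((1 - p) / p^2))
}

# One column per defect probability
res <- sapply(c(p_0.02 = 0.02, p_0.05 = 0.05), geom_summary)
round(res, 2)
```

Both rows shrink as p grows from 0.02 to 0.05, which matches the conclusion above.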
choose(3,2)*(.51)^2*(.49) #2 out of 3 with prob. of having a boy, 1 with prob. of having a girl
## [1] 0.382347
bbg <- (.51)*(.51)*(.49) #BBG
bgb <- (.51)*(.49)*(.51) #BGB
gbb <- (.49)*(.51)*(.51) #GBB
bbg+bgb+gbb #Add up the probabilities from all scenarios
## [1] 0.382347
Choosing 3 out of 8 generates many more scenarios (56 combinations), so calculating the probability of each scenario and then adding them up would be far more tedious.
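The binomial formula avoids listing scenarios by hand, and base R's `dbinom` applies it directly. A minimal sketch; the variable names are illustrative:

```r
p_boy <- 0.51

# 2 boys out of 3 children: same value as the manual BBG + BGB + GBB sum
p_2_of_3 <- dbinom(2, size = 3, prob = p_boy)

# Number of distinct orderings with 3 boys among 8 children
n_scenarios <- choose(8, 3)

# 3 boys out of 8 children, without enumerating the 56 orderings
p_3_of_8 <- dbinom(3, size = 8, prob = p_boy)

c(p_2_of_3 = p_2_of_3, n_scenarios = n_scenarios, p_3_of_8 = p_3_of_8)
```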
# for the 10th try to be the 3rd success, she must have exactly 2 successes in the first 9 tries, followed by a success on the 10th
choose(9, 2) * (.15)^2 * (.85)^7 * (.15)
## [1] 0.03895012
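This is a negative binomial probability, so it can be cross-checked with base R's `dnbinom`. A sketch only; `dnbinom` is parameterized by the number of failures before the target number of successes, so the 10th attempt being the 3rd success corresponds to 7 failures:

```r
p_serve <- 0.15

# 3rd success on the 10th attempt = 7 failures before the 3rd success
p_third_on_tenth <- dnbinom(7, size = 3, prob = p_serve)

# Manual formula used above, for comparison
p_manual <- choose(9, 2) * p_serve^2 * (1 - p_serve)^7 * p_serve

c(dnbinom = p_third_on_tenth, manual = p_manual)
```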
15%, since her serves are independent of each other.
In part (b), the outcome of the previous 9 attempts is already given, so the prob. of 2 successes out of 9 = 1; whereas in part (a), we need to calculate the probability of 2 successes in 9 attempts, and that prob. = choose(9, 2) * (.15)^2 * (.85)^7 = 26%. Thus the prob. in (a) is less than that in (b).
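The relationship between the two parts can be made explicit by splitting part (a) into its two factors. A minimal sketch with illustrative variable names:

```r
p_serve <- 0.15

# Probability of exactly 2 successful serves in the first 9 attempts (about 26%)
p_two_of_nine <- dbinom(2, size = 9, prob = p_serve)

# Part (a): 2 successes in 9 attempts AND a success on the 10th
p_part_a <- p_two_of_nine * p_serve

# Part (b): the first 9 attempts are given, so only the 10th serve is uncertain
p_part_b <- p_serve

c(two_of_nine = p_two_of_nine, part_a = p_part_a, part_b = p_part_b)
```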