knitr::opts_knit$set(root.dir = "c:/users/Michael/DROPBOX/priv/CUNY/MSDS/201902-Spring/DATA606-Jason/Homework")
##setwd("c:/users/Michael/DROPBOX/priv/CUNY/MSDS/201902-Spring/DATA606-Jason/Homework")
In triathlons, it is common for racers to be placed into age and gender groups.
Friends Leo and Mary both completed the Hermosa Beach Triathlon, where
• Leo competed in the Men, Ages 30 - 34 group, while
• Mary competed in the Women, Ages 25 - 29 group.
• Leo completed the race in 1:22:28 (4948 seconds), while
• Mary completed the race in 1:31:53 (5513 seconds).
Obviously Leo finished faster, but they are curious about how they did within their respective groups. Can you help them?
Here is some information on the performance of their groups:
• The finishing times of the Men, Ages 30 - 34 group have a mean of 4313 seconds with a standard deviation of 583 seconds.
• The finishing times of the Women, Ages 25 - 29 group have a mean of 5261 seconds with a standard deviation of 807 seconds.
• The distributions of finishing times for both groups are approximately Normal.
Remember: a better performance corresponds to a faster finish.
\[ N_{\text{men}}(\mu=4313, \sigma=583): \quad f_{\text{men}}(x) = \frac{1}{583\sqrt{2\pi}} \, e^{-\frac{(x-4313)^{2}}{2 \cdot 583^{2}}} \]
\[ N_{\text{women}}(\mu=5261, \sigma=807): \quad f_{\text{women}}(x) = \frac{1}{807\sqrt{2\pi}} \, e^{-\frac{(x-5261)^{2}}{2 \cdot 807^{2}}} \]
leo_zscore <- (4948-4313)/583
print(paste("Leo's Z-score is " , leo_zscore))
## [1] "Leo's Z-score is 1.08919382504288"
mary_zscore <- (5513-5261)/807
print(paste("Mary's Z-score is " , mary_zscore))
## [1] "Mary's Z-score is 0.312267657992565"
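Both z-scores follow the same pattern, so the calculation can be wrapped in a small helper. This is only a minimal sketch: the function name `z_score` and the variables `leo_z` and `mary_z` are hypothetical and not part of the original homework.

```r
# Hypothetical helper (not in the original): standardize an observation
# against a group's mean and standard deviation.
z_score <- function(x, mu, sigma) {
  (x - mu) / sigma
}

leo_z  <- z_score(4948, mu = 4313, sigma = 583)   # Men, Ages 30 - 34
mary_z <- z_score(5513, mu = 5261, sigma = 807)   # Women, Ages 25 - 29

# Both z-scores are positive, so both racers were slower than their group mean.
# Because faster is better, the smaller z-score marks the better relative result.
print(round(c(Leo = leo_z, Mary = mary_z), 3))
```

Since a lower finishing time is better, a z-score closer to zero (or negative) indicates a stronger performance relative to the group.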
leo_quantile = 1 - pnorm(leo_zscore)
leo_percentile = round(100*leo_quantile,2)
resultleo = pnorm(q = leo_zscore, mean = 0, sd = 1, lower.tail = FALSE)
pctresultleo = as.character(paste0(round(100*resultleo,2),"%"))
x <- seq(-4,4,length=200)
y <- dnorm(x,mean=0, sd=1)
plot(x, y, type = "l", lwd = 2,
xlim = c(-4,4),
ylab='', xlab='fast <------ z-score: men age 30-34 ------> slow', yaxt='n')
lb <- leo_zscore; ub <- 10
i <- x >= lb & x <= ub
polygon(c(lb,x[i],ub), c(0,y[i],0), col="yellow")
text(1.6, .05, pctresultleo)
mary_quantile = 1 - pnorm(mary_zscore)
mary_percentile = round(100*mary_quantile,2)
resultmary = pnorm(q = mary_zscore, mean = 0, sd = 1, lower.tail = FALSE)
pctresultmary = as.character(paste0(round(100*resultmary,2),"%"))
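The same upper-tail shares can also be computed without standardizing first, because `pnorm` accepts `mean` and `sd` directly. The sketch below assumes the group parameters given in the problem statement; the names `leo_slower` and `mary_slower` are illustrative only.

```r
# Share of each group that finished slower than the racer
# (upper tail of the finishing-time distribution, in seconds).
leo_slower  <- pnorm(4948, mean = 4313, sd = 583, lower.tail = FALSE)
mary_slower <- pnorm(5513, mean = 5261, sd = 807, lower.tail = FALSE)

# Cross-check against the standardized (z-score) calculation.
stopifnot(isTRUE(all.equal(leo_slower,  1 - pnorm((4948 - 4313) / 583))))
stopifnot(isTRUE(all.equal(mary_slower, 1 - pnorm((5513 - 5261) / 807))))

# A larger share of slower finishers means a better rank within the group.
if (mary_slower > leo_slower) {
  message("Mary ranked better within her group than Leo did within his.")
}
```

Working on the original scale avoids an intermediate rounding step and makes the units (seconds) explicit.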
x <- seq(-4,4,length=200)
y <- dnorm(x,mean=0, sd=1)
plot(x, y, type = "l", lwd = 2,
xlim = c(-4,4),
ylab='', xlab='fast <------ z-score: women age 25-29 ------> slow', yaxt='n')
lb <- mary_zscore; ub <- 10
i <- x >= lb & x <= ub
polygon(c(lb,x[i],ub), c(0,y[i],0), col="yellow")
text(1.0, .05, pctresultmary)
Below are heights of 25 female college students:
54, 55, 56, 56, 57, 58, 58, 59, 60, 60, 60, 61, 61, 62, 62, 63, 63, 63, 64, 65, 65, 67, 67, 69, 73
heights = c(54, 55, 56, 56, 57, 58, 58, 59, 60, 60, 60, 61, 61, 62, 62, 63, 63, 63, 64, 65, 65, 67, 67, 69, 73)
mu = mean(heights)
sigma = sd(heights)
sd1 = c(mu-sigma,mu+sigma)
sd2 = c(mu-2*sigma,mu+2*sigma)
sd3 = c(mu-3*sigma,mu+3*sigma)
sd1heights = heights[heights>sd1[1]&heights<sd1[2]]
sd1pct = length(sd1heights)/length(heights)
print(paste0("One sd : ", as.character(round(100*sd1pct,2)),"%"))
## [1] "One sd : 68%"
sd2heights = heights[heights>sd2[1]&heights<sd2[2]]
sd2pct = length(sd2heights)/length(heights)
print(paste0("Two sd : ", as.character(round(100*sd2pct,2)),"%"))
## [1] "Two sd : 96%"
sd3heights = heights[heights>sd3[1]&heights<sd3[2]]
sd3pct = length(sd3heights)/length(heights)
print(paste0("Three sd : ", as.character(round(100*sd3pct,3)),"%"))
## [1] "Three sd : 100%"
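The three nearly identical blocks above can be condensed into one function and compared against the 68-95-99.7 rule. This is a sketch rather than the author's method; `within_k_sd`, `observed`, and `expected` are hypothetical names introduced here.

```r
heights <- c(54, 55, 56, 56, 57, 58, 58, 59, 60, 60, 60, 61, 61, 62, 62,
             63, 63, 63, 64, 65, 65, 67, 67, 69, 73)

# Hypothetical helper: proportion of observations strictly within
# k sample standard deviations of the sample mean.
within_k_sd <- function(x, k) {
  mean(abs(x - mean(x)) < k * sd(x))
}

observed <- sapply(1:3, function(k) within_k_sd(heights, k))
expected <- pnorm(1:3) - pnorm(-(1:3))   # shares implied by a Normal model

print(round(data.frame(k = 1:3, observed = observed, expected = expected), 4))
```

Observed shares that sit close to the Normal-model shares are consistent with the heights being approximately Normal, although 25 observations give only a coarse check.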
prob_each_defective = 0.02
prob_each_not_defective = 1 - prob_each_defective
prob_nine_not_defective = prob_each_not_defective^9
prob_tenth_is_first_defect = prob_nine_not_defective * prob_each_defective
prob_tenth_is_first_defect
## [1] 0.01667496
prob_100_not_defective = prob_each_not_defective^100
prob_100_not_defective
## [1] 0.1326196
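Both probabilities are instances of the geometric distribution, so they can be cross-checked with R's built-in `dgeom` and `pgeom`. A minimal sketch, assuming independent transistors with a 2% defect rate; the variable names are illustrative.

```r
p_defect <- 0.02  # probability that any one transistor is defective

# dgeom(x, prob) gives P(x failures before the first success), so
# "the 10th transistor is the first defect" corresponds to x = 9.
p_first_defect_on_10th <- dgeom(9, prob = p_defect)

# No defects among 100 transistors means more than 99 non-defective
# transistors precede the first defect: P(X > 99).
p_no_defect_in_100 <- pgeom(99, prob = p_defect, lower.tail = FALSE)

print(c(p_first_defect_on_10th, p_no_defect_in_100))
```

Note that R parameterizes the geometric distribution by the number of failures before the first success, not by the trial number on which the success occurs, hence the shift by one.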
mu02 = 1 / prob_each_defective
mu02
## [1] 50
# Standard deviation of the geometric waiting time: sqrt((1 - p) / p^2)
sigma02 = sqrt(prob_each_not_defective / prob_each_defective^2)
sigma02
## [1] 49.49747
prob_defective = 0.05
prob_not_defective = 1 - prob_defective
mu05 = 1 / prob_defective
mu05
## [1] 20
# Standard deviation of the geometric waiting time: sqrt((1 - p) / p^2)
sigma05 = sqrt(prob_not_defective / prob_defective^2)
sigma05
## [1] 19.49359
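For a geometric waiting time (the number of transistors produced up to and including the first defect), the standard results are a mean of 1/p and a standard deviation of sqrt((1 - p) / p^2). The simulation below is a sketch for checking these values empirically; the seed, the number of simulations, and the function names `simulate_wait` and `theory` are arbitrary choices made for illustration.

```r
set.seed(606)  # arbitrary seed, chosen only for reproducibility

simulate_wait <- function(p, n_sims = 1e5) {
  # rgeom() counts non-defective items before the first defect,
  # so add 1 to include the defective item itself.
  draws <- rgeom(n_sims, prob = p) + 1
  c(mean = mean(draws), sd = sd(draws))
}

# Textbook mean and standard deviation of the geometric waiting time.
theory <- function(p) c(mean = 1 / p, sd = sqrt((1 - p) / p^2))

print(rbind(simulated_02 = simulate_wait(0.02), theory_02 = theory(0.02)))
print(rbind(simulated_05 = simulate_wait(0.05), theory_05 = theory(0.05)))
```

Raising the defect probability from 0.02 to 0.05 shortens both the expected wait and its spread, which the simulated and theoretical rows should both reflect.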
While it is often assumed that the probabilities of having a boy or a girl are the same, the actual probability of having a boy is slightly higher at 0.51.
Suppose a couple plans to have 3 kids.
p_boy = 0.51
p_girl = 1- p_boy
n = 3
k = 2
p_two_boys_of_three = choose(n,k) * (p_boy^k) * p_girl^(n-k)
p_two_boys_of_three
## [1] 0.382347
p_girl_first = p_girl * p_boy * p_boy
p_girl_middle = p_boy * p_girl * p_boy
p_girl_last = p_boy * p_boy * p_girl
p_sum = p_girl_first + p_girl_middle + p_girl_last
p_sum
## [1] 0.382347
p_two_boys_of_three == p_sum
## [1] TRUE
choose(8,3)
## [1] 56
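The agreement between the binomial formula and the case-by-case sum can also be confirmed with `dbinom` and a full enumeration of birth orders. This is a sketch under the stated assumption that births are independent with P(boy) = 0.51; all variable names are illustrative.

```r
p_boy <- 0.51

# Binomial probability of exactly 2 boys among 3 children.
p_formula <- dbinom(2, size = 3, prob = p_boy)

# Enumerate all 2^3 birth orders and sum the probabilities of those with 2 boys.
orders <- expand.grid(first = c("B", "G"), second = c("B", "G"),
                      third = c("B", "G"), stringsAsFactors = FALSE)
n_boys <- rowSums(orders == "B")
p_each <- p_boy^n_boys * (1 - p_boy)^(3 - n_boys)
p_enumerated <- sum(p_each[n_boys == 2])

# Floating-point results are safer to compare with a tolerance than with `==`.
stopifnot(isTRUE(all.equal(p_formula, p_enumerated)))

# Listing the cases for 3 boys among 8 children would need choose(8, 3)
# orderings, which is why the formula scales better than enumeration by hand.
n_orderings <- ncol(combn(8, 3))
print(c(p_formula = p_formula, n_orderings = n_orderings))
```

With only three children the enumeration has three qualifying orders, but the count grows quickly as the family size increases.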
A not-so-skilled volleyball player has a 15% chance of making the serve, which involves hitting the ball so it passes over the net on a trajectory such that it will land in the opposing team’s court.
Suppose that her serves are independent of each other.
p_success = .15
p_fail = 1 - p_success
n = 9
k = 2
p_two_of_nine_successes = choose(n,k) * p_success^k * p_fail^(n-k)
p_two_of_nine_successes
## [1] 0.2596674
### dbinom:
dbinom(x = 2, size = 9, prob = .15)
## [1] 0.2596674
p_third_success_on_tenth_trial = p_two_of_nine_successes * p_success
p_third_success_on_tenth_trial
## [1] 0.03895012
### dnbinom (Negative Binomial):
dnbinom(x = 10-3, size = 3, prob = .15)
## [1] 0.03895012
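The reasoning "exactly two successes in the first nine serves, then a success on the tenth" generalizes to the k-th success on the n-th trial. The helper below is a hypothetical sketch of that negative binomial formula (the name `p_kth_success_on_nth` is not from the original), checked against `dnbinom`.

```r
# Hypothetical helper: P(the k-th success occurs on trial n), for
# independent trials with success probability p.
p_kth_success_on_nth <- function(n, k, p) {
  # The first n - 1 trials hold exactly k - 1 successes, then trial n succeeds.
  choose(n - 1, k - 1) * p^k * (1 - p)^(n - k)
}

by_hand  <- p_kth_success_on_nth(n = 10, k = 3, p = 0.15)
built_in <- dnbinom(x = 10 - 3, size = 3, prob = 0.15)  # x = number of failures

stopifnot(isTRUE(all.equal(by_hand, built_in)))
print(by_hand)
```

`dnbinom` takes the number of failures (here 10 - 3 = 7) rather than the trial number, which is the main point to watch when switching between the hand formula and the built-in function.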