In this project, students will demonstrate their understanding of probability and the normal and binomial distributions.
IQ scores are approximately normally distributed with: X ∼ N(μ=100,σ=15)
pnorm(q = 65, mean = 100, sd = 15, lower.tail = FALSE)
## [1] 0.9901847
99.02% of the population has an IQ greater than 65.
qnorm(p = 0.05, mean = 100, sd = 15, lower.tail = FALSE)
## [1] 124.6728
An IQ score of approximately 124.7 marks the cutoff for the top 5% of the population. In other words, it is the 95th percentile of IQ scores.
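Because `qnorm()` is the inverse of `pnorm()`, the cutoff can be checked by feeding it back into `pnorm()`. The following is a minimal sketch; the variable names `cutoff`, `upper_area`, and `same_cutoff` are illustrative and not part of the assignment.

```r
# Cutoff for the top 5% of IQ scores (upper tail area of 0.05).
cutoff <- qnorm(p = 0.05, mean = 100, sd = 15, lower.tail = FALSE)

# Feeding the cutoff back into pnorm() should recover the 0.05 upper-tail area.
upper_area <- pnorm(q = cutoff, mean = 100, sd = 15, lower.tail = FALSE)

# The same cutoff expressed as the 95th percentile (lower tail area of 0.95).
same_cutoff <- qnorm(p = 0.95, mean = 100, sd = 15)

print(c(cutoff = cutoff, upper_area = upper_area, same_cutoff = same_cutoff))
```

Using `lower.tail = FALSE` with `p = 0.05` and using the default lower tail with `p = 0.95` are two ways of asking for the same point on the curve.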
Recall our definition: A value is considered unusual if it lies more than two standard deviations from the mean.
lower_bound <- 100 - 2*15
upper_bound <- 100 + 2*15
The “usual” range of IQ values lies between 70 and 130. Any value below 70 or above 130 is considered unusual.
pnorm(q = upper_bound, mean = 100, sd = 15) - pnorm(q = lower_bound, mean = 100, sd = 15)
## [1] 0.9544997
1 - 0.9544997
## [1] 0.0455003
4.55% of the population has IQ values outside the “usual” range. That is, about 4.55% of people have an IQ below 70 or above 130.
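As a cross-check, the same tail area can be computed on the standard normal scale, since "more than two standard deviations from the mean" does not depend on the particular mean or standard deviation. This is a small sketch; the names `p_unusual_z` and `p_unusual_iq` are illustrative.

```r
# Probability of lying more than 2 SDs from the mean, on the z scale.
# The normal curve is symmetric, so the two tails are equal: double the lower tail.
p_unusual_z <- 2 * pnorm(q = -2)

# The same quantity on the IQ scale, adding the area below 70 and the area above 130.
p_unusual_iq <- pnorm(q = 70, mean = 100, sd = 15) +
  pnorm(q = 130, mean = 100, sd = 15, lower.tail = FALSE)

print(c(z_scale = p_unusual_z, iq_scale = p_unusual_iq))
```

Both approaches should agree with the value obtained by subtracting the middle area from 1.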
Two students took different standardized tests.
Alex took the SAT and scored 1650. Taylor took the ACT and scored 27.
Assume the distributions:
SAT ∼ N(1500, 300) and ACT ∼ N(21, 5), where the second parameter is the standard deviation.
meanSAT <- 1500
meanACT <- 21
sdSAT <- 300
sdACT <- 5
(1650 - meanSAT)/sdSAT
## [1] 0.5
(27 - meanACT)/sdACT
## [1] 1.2
Relative to other test-takers, Taylor performed better. Taylor’s z-score is 1.2, meaning the score of 27 is 1.2 standard deviations above the mean ACT score. In contrast, Alex’s z-score is 0.5, meaning the score of 1650 is only 0.5 standard deviations above the mean SAT score. Because Taylor’s z-score is higher, Taylor scored further above the average of their own test than Alex did.
Comparing raw scores alone would be misleading because ACT and SAT scores are measured on different scales. The means and the spreads of the two tests are vastly different. Comparing z-scores is more reasonable because a z-score takes both the mean and the spread of each test into account.
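One way to make the z-score comparison more concrete is to convert each z-score into a percentile with `pnorm()`. This is a sketch under the normality assumption stated above; the names `z_alex`, `z_taylor`, `pct_alex`, and `pct_taylor` are illustrative.

```r
# z-scores, computed the same way as in the report.
z_alex   <- (1650 - 1500) / 300
z_taylor <- (27 - 21) / 5

# Percentile = percentage of test-takers expected to score below each student,
# assuming each test's scores are normally distributed.
pct_alex   <- pnorm(q = z_alex) * 100
pct_taylor <- pnorm(q = z_taylor) * 100

print(round(c(Alex = pct_alex, Taylor = pct_taylor), 1))
```

Under these assumptions, Alex lands at roughly the 69th percentile and Taylor at roughly the 88th, which matches the ordering of the z-scores.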
You are taking a 15-question multiple-choice quiz. Each question has 5 options (a, b, c, d, e), and you randomly guess on every question.
15 * 0.2
## [1] 3
The probability of answering one question correctly is 0.2, or 1/5. Multiplying the total number of questions by the probability of answering one question correctly yields an expected value of 3 questions answered correctly. This expected value is the mean of the binomial distribution, n × p.
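The shortcut n × p can be checked against the definition of an expected value, which sums each possible outcome weighted by its probability. This is a minimal sketch; the names `n`, `p`, `k`, `ev_definition`, and `sd_binom` are illustrative, and the standard deviation line uses the standard binomial formula, which the assignment does not ask for.

```r
n <- 15   # number of questions
p <- 0.2  # probability of guessing one question correctly
k <- 0:n  # every possible number of correct answers

# Expected value from the definition: sum of k * P(X = k).
ev_definition <- sum(k * dbinom(x = k, size = n, prob = p))

# Standard deviation of a binomial distribution: sqrt(n * p * (1 - p)).
sd_binom <- sqrt(n * p * (1 - p))

print(c(expected_value = ev_definition, sd = sd_binom))
```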
dbinom(x = 15, size = 15, prob = 0.2)
## [1] 3.2768e-11
The probability of getting every question correct is 3.2768e-11, so getting exactly 15 successes in 15 trials by guessing is extremely unlikely.
dbinom(x = 15, size = 15, prob = 0.8)
## [1] 0.03518437
The probability of getting every question incorrect is 0.03518437, which is higher than the probability of answering every question correctly. This is because the probability of answering a single question incorrectly is 4/5, or 0.8, which is much higher than the probability of answering it correctly.
dbinom(x = 0, size = 15, prob = 0.2)
## [1] 0.03518437
If you substitute “x = 0” and “prob = 0.2” into the original function, you arrive at the same probability. Getting 0 questions correct is the same event as getting all 15 incorrect, and the probability of answering a question correctly, 0.2 or 1/5, is the complement of the probability of answering it incorrectly, 0.8 or 4/5.
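The same relationship holds for every possible score, not only for 0 correct: k correct answers with success probability 0.2 is the same event as 15 - k incorrect answers with success probability 0.8. The sketch below checks this for all outcomes at once; the names `correct_side` and `incorrect_side` are illustrative.

```r
n <- 15
k <- 0:n  # number of correct answers

# P(exactly k correct), counting correct answers as successes.
correct_side <- dbinom(x = k, size = n, prob = 0.2)

# P(exactly n - k incorrect), counting incorrect answers as successes.
incorrect_side <- dbinom(x = n - k, size = n, prob = 0.8)

# all.equal() compares the two vectors up to floating-point tolerance.
print(all.equal(correct_side, incorrect_side))
```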
dbinom(x = 10, size = 15, prob = 0.2)
## [1] 0.000100764
The probability of getting exactly 10 questions correct, or the probability of 10 successes in 15 trials, is 0.000100764.
pbinom(q = 9, size = 15, prob = 0.2, lower.tail = FALSE)
## [1] 0.0001132257
The probability of getting 10 or more correct answers is 0.0001132257.
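The choice of `q = 9` with `lower.tail = FALSE` can be confirmed by adding up the individual probabilities of 10, 11, ..., 15 correct answers. This is a small sketch; the names `tail_sum` and `tail_pbinom` are illustrative.

```r
# P(X >= 10) as an explicit sum of P(X = 10) + ... + P(X = 15).
tail_sum <- sum(dbinom(x = 10:15, size = 15, prob = 0.2))

# P(X > 9), which is the same event for a whole-number count.
tail_pbinom <- pbinom(q = 9, size = 15, prob = 0.2, lower.tail = FALSE)

print(c(sum_of_dbinom = tail_sum, pbinom_upper_tail = tail_pbinom))
```

With `lower.tail = FALSE`, `pbinom()` returns P(X > q), which is why the cutoff is 9 rather than 10.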
Based on the calculated probabilities, I do not believe this claim. The probability of guessing randomly and getting exactly 10 out of 15 answers correct is 0.000100764, or about 0.01%, and the probability of getting 10 or more correct is only 0.0001132257. Both are far below 1%, so answering 10 out of 15 questions correctly by random guessing is highly unlikely.
15 * 0.8
## [1] 12
pbinom(q = 11, size = 15, prob = 0.2, lower.tail = FALSE)
## [1] 1.011253e-06
Achieving an 80% on this quiz requires answering at least 12 out of 15 questions correctly. The probability of scoring 80% or higher by random guessing is therefore 1.011253e-06, so earning a passing grade this way is highly unlikely.
A company schedules 10 employees for a shift. Each employee independently shows up with probability: p = 0.85
Let X = number of employees who show up
The company needs at least 8 workers to operate normally.
pbinom(q = 7, size = 10, prob = 0.85, lower.tail = TRUE)
## [1] 0.1798035
Out of the 10 employees scheduled for a shift, the probability that fewer than 8 employees show up is 0.1798035 or approximately 18%.
1 - pbinom(q = 7, size = 10, prob = 0.85, lower.tail = TRUE)
## [1] 0.8201965
The company needs at least 8 employees to show up for a shift to operate normally. Taking the complement of the probability that fewer than 8 employees show up yields 0.8201965, so the probability that 8 or more employees show up for this shift is approximately 82%.
Another way to calculate this without using the complement is:
pbinom(q = 7, size = 10, prob = 0.85, lower.tail = FALSE)
## [1] 0.8201965
Setting “lower.tail = FALSE” in the original function yields the probability of 8 or more employees showing up for this shift directly.
In the context of scheduling workers, the company needs at least 8 employees to show up in order to operate normally. If 10 employees are scheduled for a shift, the probability that enough of them show up is about 82%, and the probability that the company is short-staffed is about 18%.
pbinom(q = 7, size = 12, prob = 0.85, lower.tail = FALSE)
## [1] 0.9760781
pbinom(q = 7, size = 12, prob = 0.85, lower.tail = TRUE)
## [1] 0.02392191
If management wants at least a 95% chance of having enough workers, they should schedule more than 10 employees. Increasing the number of employees scheduled for a shift increases the probability that 8 or more show up, because more trials give more opportunities to reach 8 successes. Based on the calculations above, scheduling 12 employees raises the probability of 8 or more employees showing up to approximately 97.6%, and the probability of fewer than 8 showing up drops to approximately 2.4%.
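Rather than trying one shift size at a time, the calculation can be repeated over a range of candidate sizes to find the smallest one that meets the 95% target. This is a sketch under the same assumptions as above (each employee shows up independently with probability 0.85); the names `needed`, `p_show`, `target`, `candidates`, `p_enough`, and `smallest_n`, and the search range of 8 to 15, are illustrative choices.

```r
needed <- 8      # employees required to operate normally
p_show <- 0.85   # probability that each scheduled employee shows up
target <- 0.95   # desired probability of having enough workers

# Candidate numbers of employees to schedule (range chosen for illustration).
candidates <- 8:15

# P(X >= 8) for each candidate size; pbinom() is vectorized over `size`.
p_enough <- pbinom(q = needed - 1, size = candidates, prob = p_show,
                   lower.tail = FALSE)

# Smallest candidate whose probability meets or exceeds the target.
smallest_n <- candidates[which(p_enough >= target)[1]]

print(data.frame(scheduled = candidates, p_enough = round(p_enough, 4)))
print(smallest_n)
```

In this search, 11 scheduled employees falls short of the target (about 93%), so 12 is the smallest shift size that reaches 95%.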
ACT scores are approximately normally distributed where: X ∼ N(21, 5)
a. Use R to simulate 10,000 ACT scores.
set.seed(123)
ACT <- rnorm(n = 10000, mean = 21, sd = 5)
hist(ACT)
mean(ACT > 30) * 100
## [1] 3.54
pnorm(q = 30, mean = 21, sd = 5, lower.tail = FALSE)
## [1] 0.03593032
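In the simulation, 3.54% of the 10,000 scores are above 30. The theoretical probability of getting an ACT score above 30 is 0.03593032, or 3.59%.
The two values are similar but not identical because the theoretical probability is the exact area under the N(21, 5) curve above 30, whereas the simulated value is the proportion of a finite random sample of 10,000 scores that exceed 30. A random sample never matches the population distribution exactly, so the simulated proportion varies slightly from one simulation (or seed) to the next, and it tends to get closer to the theoretical value as the number of simulated scores grows.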
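The size of this sampling variability can be illustrated by repeating the simulation many times and looking at how much the simulated proportion moves around. This is a sketch, not part of the assignment: the seed, the choice of 200 repetitions, and the names `p_theory`, `sim_props`, and `se_theory` are all illustrative.

```r
set.seed(2024)  # arbitrary seed, chosen only so this sketch is reproducible

# Theoretical probability of an ACT score above 30.
p_theory <- pnorm(q = 30, mean = 21, sd = 5, lower.tail = FALSE)

# Repeat the 10,000-score simulation 200 times, recording the
# proportion of scores above 30 in each repetition.
sim_props <- replicate(200, mean(rnorm(n = 10000, mean = 21, sd = 5) > 30))

# Standard error of a sample proportion: sqrt(p * (1 - p) / n).
se_theory <- sqrt(p_theory * (1 - p_theory) / 10000)

print(c(theory = p_theory, mean_of_sims = mean(sim_props),
        sd_of_sims = sd(sim_props), se_theory = se_theory))
```

The standard error here is about 0.0019, so a gap of roughly 0.05 percentage points between a single simulation and the theoretical value is well within ordinary sampling variation.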
Create your own real-world situation that could be modeled using either a binomial distribution or a normal distribution.
Your problem must include:
* A description of the situation
* Identification of reasonable parameters (mean, sd OR n, p)
* One probability calculation in R
* A written interpretation of the result
Examples might include:
* basketball free throws
* weather events
* exam scores
* products being defective
Based on short-term data, assume Micron Technology (MU) stock has a 70% probability of a price increase on any given trading day. Treating the trading days as independent, so that the number of up days follows a binomial distribution with n = 22 and p = 0.7, what is the probability that MU has a price increase on 15 or more days of a 22-trading-day period?
pbinom(q = 14, size = 22, prob = 0.7, lower.tail = FALSE)
## [1] 0.6712507
The probability that Micron Technology (MU) has a price increase on 15 or more of the 22 trading days is approximately 67.1%.