In this project, students will demonstrate their understanding of probability and the normal and binomial distributions.
IQ scores are approximately normally distributed: X ~ N(mu = 100, sigma = 15)
#Proportion of the population with IQ greater than 65
pnorm(65, mean = 100, sd = 15, lower.tail = FALSE)
## [1] 0.9901847
The proportion of the population with an IQ greater than 65 is approximately 0.99018. This means about 99.02% of people have an IQ above 65.
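As a cross-check, the same proportion can be reached by standardizing the cutoff first or by using the complement rule. This is a minimal sketch that assumes only the N(100, 15) model stated above; the variable names `z`, `p_above`, and `p_complement` are illustrative.

```r
# Standardize the cutoff, then use the standard normal distribution
z <- (65 - 100) / 15                      # about 2.33 standard deviations below the mean
p_above <- pnorm(z, lower.tail = FALSE)   # P(Z > z)

# Complement rule on the original scale: P(X > 65) = 1 - P(X <= 65)
p_complement <- 1 - pnorm(65, mean = 100, sd = 15)

p_above
p_complement
```

Both values should agree with the result of the `pnorm()` call with `lower.tail = FALSE`.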
# IQ score for top 5% of the population
qnorm(.95, mean = 100, sd = 15)
## [1] 124.6728
The IQ score that marks the cutoff for the top 5% of the population is approximately 124.67. A person must score about 125 or higher to be in the top 5% of IQ scores.
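The cutoff can also be rebuilt from the standard normal quantile, which makes the link between `qnorm()` and `pnorm()` explicit. This is a sketch under the same N(100, 15) assumption; `z95`, `cutoff`, and `top_share` are illustrative names.

```r
# The 95th percentile of N(100, 15) is mean + z * sd, where z is the
# 95th percentile of the standard normal distribution (about 1.645)
z95 <- qnorm(0.95)
cutoff <- 100 + z95 * 15

# Sanity check: the area above the cutoff should be 0.05
top_share <- pnorm(cutoff, mean = 100, sd = 15, lower.tail = FALSE)

cutoff
top_share
```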
Recall our definition: A value is considered unusual if it lies more than two standard deviations from the mean.
#Usual range
lower <- 100 - (2*15)
upper <- 100 + (2*15)
lower
## [1] 70
upper
## [1] 130
The usual range is from 70 to 130.
#Probability outside this range
pnorm(q= 70, mean = 100, sd = 15) + pnorm(q= 130, mean = 100, sd = 15, lower.tail = FALSE)
## [1] 0.04550026
The proportion of the population that falls outside of this range is about 4.55%.
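Because the range is defined as two standard deviations on either side of the mean, the same proportion applies to any normal distribution. The sketch below checks this on the standard normal scale and again on the IQ scale; the variable names are illustrative.

```r
# By symmetry, P(|Z| > 2) = 2 * P(Z < -2) for a standard normal Z
p_outside <- 2 * pnorm(-2)

# The same area on the IQ scale, written as 1 - P(70 <= X <= 130)
p_inside_iq <- pnorm(130, mean = 100, sd = 15) - pnorm(70, mean = 100, sd = 15)
p_outside_iq <- 1 - p_inside_iq

p_outside
p_outside_iq
```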
Two students took different standardized tests.
Alex took the SAT and scored 1650. Taylor took the ACT and scored 27.
Assume the distributions:
SAT ~ N(mean = 1500, sd = 300) and ACT ~ N(mean = 21, sd = 5)
# Z-scores
zalex <- (1650 - 1500)/ 300
ztaylor <- (27 - 21)/ 5
zalex
## [1] 0.5
ztaylor
## [1] 1.2
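Which student performed better relative to other test-takers? Taylor performed better relative to other test-takers because Taylor's Z-score (1.2) was higher than Alex's (0.5). Taylor scored 1.2 standard deviations above the ACT mean, while Alex scored 0.5 standard deviations above the SAT mean.
Explain why comparing the raw scores alone would be misleading. Comparing the raw scores would be misleading because the SAT and ACT are scored on different scales, with different means and standard deviations. Converting to Z-scores expresses each score as a number of standard deviations from its own test's mean, which makes the two scores comparable.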
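One way to make the comparison more concrete is to turn each Z-score into a percentile with `pnorm()`. This is a sketch that assumes the normal models given for the two tests; the names `z_alex`, `z_taylor`, `pct_alex`, and `pct_taylor` are illustrative.

```r
# Z-scores recomputed here so the block runs on its own
z_alex <- (1650 - 1500) / 300
z_taylor <- (27 - 21) / 5

# pnorm() of a Z-score gives the share of test-takers scoring below it
pct_alex <- pnorm(z_alex)
pct_taylor <- pnorm(z_taylor)

# Percentiles, rounded to one decimal place
round(100 * c(Alex = pct_alex, Taylor = pct_taylor), 1)
```

Under these assumptions Alex lands near the 69th percentile and Taylor near the 88th, which matches the ordering of the Z-scores.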
You are taking a 15-question multiple choice quiz. Each question has 5 options (a, b, c, d, e), and you randomly guess on every question.
#Expected number of correct answers
15 * (1/5)
## [1] 3
The expected number of correct answers is 3.
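The shortcut n * p can be checked against the definition of an expected value, the sum of each outcome times its probability. The sketch below does that for the quiz and also reports the binomial standard deviation as a rough measure of spread; the variable names are illustrative.

```r
n <- 15      # number of questions
p <- 1/5     # chance of guessing one question correctly
x <- 0:n     # every possible number of correct answers

# Definition of the expected value: sum of x * P(X = x)
expected <- sum(x * dbinom(x, size = n, prob = p))

# Standard deviation of a binomial count: sqrt(n * p * (1 - p))
spread <- sqrt(n * p * (1 - p))

expected
spread
```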
#Probability of getting every question correct
dbinom(x = 15, size = 15, prob = 1/5)
## [1] 3.2768e-11
The probability of getting all 15 questions correct is about 3.28e-11, which is essentially zero.
#Probability of getting every question incorrect
dbinom(x= 0, size = 15, prob = 1/5)
## [1] 0.03518437
The probability of getting all 15 questions incorrect is about 0.035, or roughly 3.5%.
#Probability of getting exactly 10 questions correct
dbinom(x = 10, size = 15, prob = 1/5)
## [1] 0.000100764
The probability of getting exactly 10 questions correct is about 0.0001, or roughly 1 in 10,000.
#Probability of getting 10 or more questions correct
pbinom(q = 9, size = 15, prob = 1/5, lower.tail = FALSE)
## [1] 0.0001132257
Getting 10 or more questions correct by guessing is very unlikely: the probability is about 0.00011.
Based on the probability above, getting 10 or more correct by guessing alone is extremely unlikely. Because the event is so improbable, I wouldn’t believe the student’s claim of guessing.
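The choice of `q = 9` in the `pbinom()` call can be confirmed by adding up the individual probabilities for 10 through 15 correct. The sketch below also shows the off-by-one result that `q = 10` would give; the variable names are illustrative.

```r
# Add the probabilities of exactly 10, 11, ..., 15 correct answers
tail_sum <- sum(dbinom(x = 10:15, size = 15, prob = 1/5))

# With lower.tail = FALSE, pbinom() returns P(X > q), so q = 9 gives P(X >= 10)
tail_pbinom <- pbinom(q = 9, size = 15, prob = 1/5, lower.tail = FALSE)

# A common off-by-one mistake: q = 10 gives P(X >= 11) instead
tail_off_by_one <- pbinom(q = 10, size = 15, prob = 1/5, lower.tail = FALSE)

c(sum_of_dbinom = tail_sum, pbinom_q9 = tail_pbinom, pbinom_q10 = tail_off_by_one)
```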
#Probability of scoring at least 80% (12 or more correct out of 15)
pbinom(q = 11, size = 15, prob = 1/5, lower.tail = FALSE)
## [1] 1.011253e-06
The probability of getting at least 12 correct, the minimum needed for a passing grade of 80%, is about 0.000001 (roughly 1 in a million), so passing by random guessing is extremely unlikely.
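The threshold of 12 comes from 80% of 15 questions. A small sketch of that arithmetic, with the passing percentage kept as a variable so it can be changed, might look like this; `n_questions`, `pass_percent`, `min_correct`, and `p_pass` are illustrative names.

```r
n_questions <- 15
pass_percent <- 80

# Smallest whole number of correct answers that reaches the passing percentage
min_correct <- ceiling(n_questions * pass_percent / 100)

# P(X >= min_correct) = P(X > min_correct - 1) when guessing with p = 1/5
p_pass <- pbinom(q = min_correct - 1, size = n_questions, prob = 1/5, lower.tail = FALSE)

min_correct
p_pass
```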
A company schedules 10 employees for a shift. Each employee independently shows up with probability p = 0.85.
Let X = number of employees who show up
The company needs at least 8 workers to operate normally.
#Probability that fewer than 8 employees show up
pbinom(q= 7, size = 10, prob = 0.85)
## [1] 0.1798035
#Probability that at least 8 employees show up
pbinom(q= 7, size = 10, prob = 0.85, lower.tail = FALSE)
## [1] 0.8201965
This probability, about 0.82, is the likelihood that at least 8 employees show up, meaning the company can operate normally.
Since the probability of having at least 8 workers is about 82%, which is below 95%, the company should schedule more employees to increase the likelihood of having enough workers.
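How many more employees would be enough can be estimated by repeating the calculation for larger schedules. This sketch assumes that every additional employee also shows up independently with probability 0.85, and the search range of 10 to 20 is an arbitrary choice made for illustration.

```r
p_show <- 0.85        # show-up probability from the problem
needed <- 8           # workers required to operate normally
target <- 0.95        # desired probability of having enough workers
candidates <- 10:20   # hypothetical schedule sizes to try

# P(X >= 8) = P(X > 7) for each candidate schedule size
p_enough <- pbinom(q = needed - 1, size = candidates, prob = p_show, lower.tail = FALSE)

# Smallest schedule size whose probability meets the target
min_staff <- candidates[which(p_enough >= target)[1]]

data.frame(scheduled = candidates, p_enough = round(p_enough, 4))
min_staff
```

Under these assumptions the search returns 12: scheduling 11 employees gives a probability of about 0.93, and scheduling 12 gives about 0.98.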
ACT scores are approximately normally distributed, where X ~ N(21, 5).
a. Use R to simulate 10,000 ACT scores.
#Simulate 10,000 ACT scores from a normal distribution
set.seed(1)
scores <- rnorm(n=10000, mean = 21, sd = 5)
#Calculate the proportion of simulated scores greater than 30
mean(scores > 30)
## [1] 0.0375
#Calculate the theoretical probability of scoring above 30
pnorm(q= 30, mean = 21, sd = 5, lower.tail = FALSE)
## [1] 0.03593032
The simulated proportion (0.0375) and the theoretical probability (about 0.0359) are similar because the simulation approximates the true probability. The small difference is due to random sampling variation.
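One way to judge whether the gap is the size randomness would produce is to compare it with the standard error of a simulated proportion, sqrt(p * (1 - p) / n). The sketch below takes the simulated value 0.0375 from the output above as given rather than re-running the simulation; the variable names are illustrative.

```r
p_theory <- pnorm(q = 30, mean = 21, sd = 5, lower.tail = FALSE)
n_sim <- 10000

# Typical size of the random error in a proportion estimated from n_sim draws
se <- sqrt(p_theory * (1 - p_theory) / n_sim)

observed <- 0.0375   # simulated proportion reported above

# Gap between simulation and theory, measured in standard errors
gap_in_se <- (observed - p_theory) / se

se
gap_in_se
```

Here the gap is less than one standard error, which is consistent with ordinary simulation noise.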
Create your own real-world situation that could be modeled using either a binomial distribution or a normal distribution.
Your problem must include:
* A description of the situation
* Identification of reasonable parameters (mean, sd OR n, p)
* One probability calculation in R
* A written interpretation of the result
Examples might include:
* basketball free throws
* weather events
* exam scores
* products being defective
Example: Basketball Free Throws
A basketball player makes free throws with a probability of 0.8. The player takes 10 shots.
Parameters: n = 10, p = 0.8
What is the probability the basketball player makes exactly 8 shots?
#Probability of making exactly 8 out of 10 free throw shots
dbinom(x = 8, size = 10, prob = 0.8)
## [1] 0.3019899
The probability that the basketball player makes exactly 8 out of 10 free throws is about 0.302. This means there is about a 30.2% chance the player makes exactly 8 out of 10 shots.
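The exact answer can be checked with a quick simulation, in the same spirit as the ACT simulation earlier. This sketch assumes each shot is independent with the same 0.8 success probability; the seed and the number of simulated sets of 10 shots are arbitrary choices.

```r
set.seed(2)          # arbitrary seed, used only for reproducibility
n_sets <- 100000     # hypothetical number of simulated sets of 10 shots

# Each draw is the number of made shots in one set of 10 free throws
made <- rbinom(n = n_sets, size = 10, prob = 0.8)

# Share of simulated sets with exactly 8 makes, next to the exact probability
sim_prop <- mean(made == 8)
exact <- dbinom(x = 8, size = 10, prob = 0.8)

c(simulated = sim_prop, exact = exact)
```

The simulated share should land close to the exact value, with small differences from run to run when the seed changes.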