This project will demonstrate your understanding of the normal and binomial probability distributions in R and RStudio.
Assume IQ scores are normally distributed with a mean of 100 and a standard deviation of 15. If a person is randomly selected, find each of the requested probabilities. Here, x denotes the IQ of the randomly selected person.
#First assign values to the mean and standard deviation.
mean_IQ <- 100
sd_IQ <- 15
#Then use pnorm. Setting lower.tail = FALSE returns the upper-tail probability, P(x > 65).
pnorm(65, mean_IQ, sd_IQ, lower.tail = FALSE)
## [1] 0.9901847
The probability that a randomly selected person has an IQ over 65 is 99.02%.
#Again, use pnorm to find the percentage.
pnorm(150, mean_IQ, sd_IQ, lower.tail = TRUE)
## [1] 0.9995709
The probability that a randomly selected person has an IQ less than 150 is 99.96%.
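As a cross-check, the same two probabilities can be obtained by standardizing each IQ to a z-score and calling pnorm with its default mean of 0 and standard deviation of 1. This is only a sketch; the names z_65, z_150, p_over_65, and p_under_150 are introduced here for illustration and are not part of the original solution.

```r
# Mean and standard deviation as given in the question
mean_IQ <- 100
sd_IQ <- 15

# Convert each IQ to a z-score: z = (x - mu) / sigma
z_65 <- (65 - mean_IQ) / sd_IQ
z_150 <- (150 - mean_IQ) / sd_IQ

# Use the standard normal distribution (pnorm defaults: mean = 0, sd = 1)
p_over_65 <- pnorm(z_65, lower.tail = FALSE)
p_under_150 <- pnorm(z_150)

# Both comparisons should return TRUE
isTRUE(all.equal(p_over_65, pnorm(65, mean_IQ, sd_IQ, lower.tail = FALSE)))
isTRUE(all.equal(p_under_150, pnorm(150, mean_IQ, sd_IQ)))
```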
Assume the same mean and standard deviation of IQ scores that were described in question 2.
#To find the score that delineates the top 5%, use qnorm.
qnorm(0.05, mean_IQ, sd_IQ, lower.tail = FALSE)
## [1] 124.6728
To qualify for the gifted program, a student must have an IQ higher than 124.67.
#Since 125 is close to the value found in part a, I'm guessing that the percentage is close to 5%.
pnorm(125, mean_IQ, sd_IQ, lower.tail = FALSE)
## [1] 0.04779035
The probability that a randomly selected person has an IQ score greater than 125 is 4.78%.
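Since qnorm and pnorm are inverses of each other, feeding the cutoff back into pnorm should recover the 5% tail area. The sketch below illustrates that round trip; cutoff and tail_area are names chosen here for illustration.

```r
mean_IQ <- 100
sd_IQ <- 15

# IQ score that separates the top 5% from the rest
cutoff <- qnorm(0.05, mean_IQ, sd_IQ, lower.tail = FALSE)

# Upper-tail probability at that cutoff; this should come back as 0.05
tail_area <- pnorm(cutoff, mean_IQ, sd_IQ, lower.tail = FALSE)
isTRUE(all.equal(tail_area, 0.05))

# The same cutoff is the 95th percentile measured from the lower tail
isTRUE(all.equal(cutoff, qnorm(0.95, mean_IQ, sd_IQ)))
```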
The z-score is calculated with the formula \(z = \frac{(x - \mu)}{\sigma}\)
(140 - mean_IQ)/sd_IQ
## [1] 2.666667
The z-score for an IQ of 140 is +2.67.
We mentioned in week 6 that a data value is considered “unusual” if it lies more than two standard deviations from the mean. Is an IQ of 140 considered unusual?
Yes. A z-score of 2.67 places an IQ of 140 more than two standard deviations above the mean, so it is considered unusual.
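The two-standard-deviation rule can be written as a small function. Note that is_unusual() is a hypothetical helper defined here for illustration; it is not part of base R or of the original solution.

```r
mean_IQ <- 100
sd_IQ <- 15

# Hypothetical helper: TRUE when x lies more than `cutoff`
# standard deviations from the mean, in either direction
is_unusual <- function(x, mu, sigma, cutoff = 2) {
  abs((x - mu) / sigma) > cutoff
}

is_unusual(140, mean_IQ, sd_IQ)  # TRUE: z is about 2.67
is_unusual(125, mean_IQ, sd_IQ)  # FALSE: z is about 1.67
```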
What is the probability of getting an IQ greater than 140?
#This calls for pnorm, again.
pnorm(140, mean_IQ, sd_IQ, lower.tail = FALSE)
## [1] 0.003830381
The probability of an IQ greater than 140 is 0.38%.
You are taking a 15-question multiple choice quiz. Each question has 5 options (a,b,c,d,e), and you randomly guess on every question.
#The expected value is calculated by multiplying the number of questions by the probability that any one question is correct.
15*0.2
## [1] 3
I expect 3 questions of the 15 to be answered correctly.
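The shortcut of multiplying the number of questions by the probability of a correct guess can be checked against the definition of expected value, the sum of each outcome times its probability. This is a sketch; n_questions, p_correct, and expected_correct are illustrative names.

```r
n_questions <- 15
p_correct <- 0.2

# All possible numbers of correct answers: 0, 1, ..., 15
x <- 0:n_questions

# Expected value from the definition: sum of x * P(X = x)
expected_correct <- sum(x * dbinom(x, size = n_questions, prob = p_correct))

# Should agree with the shortcut n * p
isTRUE(all.equal(expected_correct, n_questions * p_correct))
```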
#To find the probability of answering every question correctly, use dbinom with a probability of 0.2.
dbinom(15,15,0.2)
## [1] 3.2768e-11
For all intents and purposes, this is a probability of zero.
#Since an incorrect answer (probability 0.8) is much more likely than a correct one, I expect the probability of all incorrect answers to be noticeably greater than zero. Here I count an incorrect answer as the "success", so I need 15 of them.
dbinom(15,15,0.8)
## [1] 0.03518437
#An alternate method is to ask for 0 correct answers with a probability of 0.2.
dbinom(x = 0, size = 15, prob = 0.2)
## [1] 0.03518437
The probability of answering ALL questions INCORRECTLY is 3.52%.
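Both results can also be reproduced from the binomial formula, P(X = k) = choose(n, k) * p^k * (1 - p)^(n - k). In the sketch below, binom_pmf() is a hypothetical helper written only to compare against dbinom.

```r
n_questions <- 15
p_correct <- 0.2

# Hypothetical helper implementing the binomial probability formula
binom_pmf <- function(k, n, p) {
  choose(n, k) * p^k * (1 - p)^(n - k)
}

all_correct <- binom_pmf(15, n_questions, p_correct)  # reduces to 0.2^15
all_wrong <- binom_pmf(0, n_questions, p_correct)     # reduces to 0.8^15

# Both comparisons should return TRUE
isTRUE(all.equal(all_correct, dbinom(15, n_questions, p_correct)))
isTRUE(all.equal(all_wrong, dbinom(0, n_questions, p_correct)))
```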
Consider again the 15-question multiple choice quiz in which each question has 5 options (a,b,c,d,e) and you randomly guess on every question.
0.6*15
## [1] 9
60% of 15 is 9.
#A passing grade of 60% requires at least 9 correct answers, so failing means answering at most 8 questions correctly.
pbinom(q = 8, size = 15, prob = 0.2)
## [1] 0.999215
The probability of failure is 99.92%.
#A grade of 80% requires at least 12 correct answers.
#First I'll calculate this as the complement of answering at most 11 questions correctly.
1 - pbinom(q = 11, size = 15, prob = 0.2, lower.tail = TRUE)
## [1] 1.011253e-06
#A check is made by setting lower.tail = FALSE, which returns the probability of more than 11 correct answers directly.
pbinom(q = 11, size = 15, prob = 0.2, lower.tail = FALSE)
## [1] 1.011253e-06
There is a 0.0001% chance of maintaining a passing grade.
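A third way to reach the same number is to add the individual probabilities of 12, 13, 14, and 15 correct answers, which shows what pbinom is accumulating. This is a sketch; p_at_least_12 is an illustrative name.

```r
# dbinom is vectorized, so 12:15 gives all four probabilities at once
p_at_least_12 <- sum(dbinom(12:15, size = 15, prob = 0.2))

# Should agree with the upper-tail call to pbinom
isTRUE(all.equal(p_at_least_12,
                 pbinom(q = 11, size = 15, prob = 0.2, lower.tail = FALSE)))
```

For a tail probability this small, lower.tail = FALSE is generally the safer choice than subtracting from 1, because the subtraction can lose precision when the result is close to zero.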
Suppose you own a catering company. You hire local college students as servers. They are not the most reliable employees: there is an 80% chance that any one server will actually show up for a scheduled event. For a wedding scheduled on Saturday, you need at least 5 servers.
dbinom(x = 5,size = 5,prob = 0.8)
## [1] 0.32768
If 5 employees are scheduled, the probability that all 5 will show up is 32.77%.
#First, I'll answer this by adding together the probability of 5 employees coming to work and the probability of 6 employees coming to work and the probability of 7 employees coming to work.
dbinom(x = 5,size = 7, prob = 0.8) + dbinom(x = 6,size = 7, prob = 0.8) + dbinom(x = 7,size = 7, prob = 0.8)
## [1] 0.851968
#Let's see if I get the same answer with pbinom
pbinom(q = 4, size = 7,prob = 0.8,lower.tail = FALSE)
## [1] 0.851968
When I increase the number of employees scheduled to 7, the probability that 5 or more people show up is 85.20%.
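A simulation offers a rough sanity check on this figure. The sketch below draws many hypothetical Saturdays with 7 scheduled servers and counts how often at least 5 show up. The seed and the number of trials are arbitrary choices made here, and the estimate will only be approximately equal to the exact value.

```r
set.seed(1)        # arbitrary seed, so the run is reproducible
n_trials <- 100000 # arbitrary number of simulated events

# Number of servers who show up in each simulated event
shows <- rbinom(n_trials, size = 7, prob = 0.8)

# Fraction of events with at least 5 servers present
simulated <- mean(shows >= 5)
exact <- pbinom(q = 4, size = 7, prob = 0.8, lower.tail = FALSE)

# The two values should be close, though not identical
c(simulated = simulated, exact = exact)
```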
#I'll write a loop that finds the smallest number of servers that have to be scheduled to achieve a 99% probability that at least 5 will show up.
#Let n be the number of employees to be scheduled. I'll increase n by one for each iteration and calculate a new probability, P. The initial conditions will be set to the values calculated in part b.
n <- 7
P <- 0.85
while (P < 0.99) {
  n <- n + 1
  P <- pbinom(q = 4, size = n, prob = 0.8, lower.tail = FALSE)
  cat("When n = ",n,", P = ",P,"\n")
}
## When n = 8 , P = 0.9437184
## When n = 9 , P = 0.9804186
## When n = 10 , P = 0.9936306
From this result, we see that 10 employees must be scheduled for the probability that at least 5 of them show up to reach 99%. Seems like a good time to find more reliable employees!
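Because pbinom accepts a vector for size, the same search can be done without a loop. This is a sketch of that alternative; the candidate range 5:20 is an assumption made here, chosen wide enough to contain the answer.

```r
# Candidate numbers of servers to schedule (assumed search range)
candidates <- 5:20

# Probability that at least 5 show up, for each candidate
p_enough <- pbinom(q = 4, size = candidates, prob = 0.8, lower.tail = FALSE)

# Smallest candidate whose probability reaches 99%
min_servers <- candidates[which(p_enough >= 0.99)[1]]
min_servers  # 10, matching the loop
```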