This project will demonstrate an understanding of the normal and binomial probability distributions in R and RStudio.
Assume IQ scores are normally distributed with a mean of 100 and a standard deviation of 15. If a person is randomly selected, find each of the requested probabilities. Here, x denotes the IQ of the randomly selected person.
# Using the pnorm function to find the probability that x will be less than 65, and subtracting the value from 1 to find its complement (The complement is the probability that x will be greater than 65)
1-pnorm(65, 100, 15)
## [1] 0.9901847
The probability that a randomly selected person will have an IQ score greater than 65 is .9902, or 99.02%.
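As a cross-check on the complement approach, the sketch below computes the same upper-tail probability two other ways: with `lower.tail = FALSE`, and by converting 65 to a z-score and using the standard normal. The names `mu`, `sigma`, `p_above_65`, `z_65`, and `p_above_65_z` are illustrative and are not used elsewhere in this project.

```r
# Values restated from the problem; the variable names are illustrative
mu <- 100
sigma <- 15

# Upper tail requested directly instead of computing 1 - pnorm(...)
p_above_65 <- pnorm(65, mean = mu, sd = sigma, lower.tail = FALSE)

# Same probability from the standard normal after converting 65 to a z-score
z_65 <- (65 - mu) / sigma
p_above_65_z <- pnorm(z_65, lower.tail = FALSE)

# Both approaches should agree with 1 - pnorm(65, 100, 15)
all.equal(p_above_65, p_above_65_z)
all.equal(p_above_65, 1 - pnorm(65, mu, sigma))
```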
# Using the pnorm function to find the probability that x will be less than 150
pnorm(150, 100, 15)
## [1] 0.9995709
The probability that a randomly selected person will have an IQ score less than 150 is .9996, or 99.96%.
Assume the same mean and standard deviation of IQ scores described in question 1.
# Storing the values for mean and standard deviation into the environment
IQ_mean <- 100
IQ_sd <- 15
# Using qnorm to find the score that separates the bottom 95% from the top 5%
qnorm(.95, IQ_mean, IQ_sd)
## [1] 124.6728
The minimum IQ score needed to qualify for the special program is approximately 124.67.
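Because `qnorm` is the inverse of `pnorm`, one way to sanity-check the cutoff is to feed it back into `pnorm` and confirm that 95% of the distribution lies below it. This is a minimal sketch; `cutoff` and `p_below_cutoff` are illustrative names.

```r
# Mean and standard deviation as stored earlier in the project
IQ_mean <- 100
IQ_sd <- 15

# Score that separates the bottom 95% from the top 5%
cutoff <- qnorm(0.95, IQ_mean, IQ_sd)

# Round trip: the proportion below the cutoff should come back as 0.95
p_below_cutoff <- pnorm(cutoff, IQ_mean, IQ_sd)
p_below_cutoff
```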
# Using the pnorm function with "lower.tail = FALSE" to find the probability that a randomly selected person will have an IQ greater than 125
pnorm(125, IQ_mean, IQ_sd, lower.tail = FALSE)
## [1] 0.04779035
The probability that a randomly selected person will have an IQ score greater than 125 is .0478, or 4.78%.
# Plugging values into the z-score formula for an IQ of 140
(140-IQ_mean)/IQ_sd
## [1] 2.666667
The z-score for an IQ of 140 is 2.67.
Yes, an IQ score of 140 is considered “unusual” because 140 is 2.67 standard deviations above the mean, and 2.67 > 2.
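The "more than 2 standard deviations from the mean" rule used above can be wrapped in a small helper so it can be applied to other scores. This is only a sketch: `is_unusual` is a hypothetical function written for illustration, not part of base R, and the cutoff of 2 is the convention this answer relies on.

```r
# Hypothetical helper: flags a value whose z-score exceeds the threshold in absolute value
is_unusual <- function(x, mu, sigma, threshold = 2) {
  z <- (x - mu) / sigma
  abs(z) > threshold
}

# An IQ of 140 has a z-score of about 2.67, so it is flagged
is_unusual(140, mu = 100, sigma = 15)

# An IQ of 125 has a z-score of about 1.67, so it is not
is_unusual(125, mu = 100, sigma = 15)
```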
# Using the pnorm function with "lower.tail = FALSE" to find the probability of a random individual having an IQ above 140
pnorm(140, IQ_mean, IQ_sd, lower.tail = FALSE)
## [1] 0.003830381
The probability of getting an IQ greater than 140 is .0038, or 0.38%.
You are taking a 15-question multiple-choice quiz. Each question has 5 options (a, b, c, d, e), and you randomly guess on every question.
# Multiplying the number of questions by the probability of guessing an answer correctly
15*.2
## [1] 3
I would expect to see 3 questions answered correctly on average.
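The shortcut n × p can be checked against the definition of an expected value, which weights each possible number of correct answers by its probability. The sketch below assumes the same binomial model as above; `n_questions`, `p_guess`, `outcomes`, and `expected_correct` are illustrative names.

```r
# Quiz setup from the text: 15 questions, 1-in-5 chance of guessing each correctly
n_questions <- 15
p_guess <- 0.2

# Every possible number of correct answers, 0 through 15
outcomes <- 0:n_questions

# Expected value from the definition: sum of outcome * P(outcome)
expected_correct <- sum(outcomes * dbinom(outcomes, size = n_questions, prob = p_guess))

# Should match the shortcut n * p
all.equal(expected_correct, n_questions * p_guess)
```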
# Using the dbinom function to find the probability of the specific instance where 15/15 answers are correct
dbinom(x = 15, size = 15, prob = .2)
## [1] 3.2768e-11
The probability of getting every question correct is 3.2768e-11, or .000000000032768, which is approximately 0%.
# Using the dbinom function to find the probability of the specific instance where 0/15 answers are correct
dbinom(x = 0, size = 15, prob = .2)
## [1] 0.03518437
The probability of getting every question incorrect is .0352, or 3.52%.
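Both extreme cases have simple closed forms, since there is only one way to get all 15 right or all 15 wrong: the probabilities reduce to 0.2^15 and 0.8^15. The following sketch compares those powers with the `dbinom` results; the variable names are illustrative.

```r
# All 15 guesses correct: a single arrangement, each with probability 0.2
p_all_correct <- 0.2^15

# All 15 guesses incorrect: a single arrangement, each with probability 0.8
p_all_incorrect <- 0.8^15

# Each should agree with the corresponding dbinom call
all.equal(p_all_correct, dbinom(x = 15, size = 15, prob = 0.2))
all.equal(p_all_incorrect, dbinom(x = 0, size = 15, prob = 0.2))
```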
Consider again the 15-question multiple-choice quiz in which each question has 5 options (a, b, c, d, e) and you randomly guess on every question.
# Finding 60% of 15
.6*15
## [1] 9
One would need to get 9 questions out of 15 correct to score exactly 60%.
# Using pbinom to find the cumulative probability of scoring either 60% or lower
pbinom(q = 9, size = 15, prob = .2)
## [1] 0.9998868
The probability of failing (scoring 60% or lower) is .9999, making failure very likely.
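The result depends on whether a score of exactly 60% counts as failing. The calculation above includes it (9 or fewer correct). As a hedged comparison, the sketch below also computes the alternative reading in which exactly 60% passes, so failing means 8 or fewer correct. That alternative is an assumption for illustration, not something the question states, and the variable names are illustrative.

```r
# Quiz setup from the text: 15 questions, 1-in-5 chance per guess
n_questions <- 15
p_guess <- 0.2

# Reading used above: 9 or fewer correct (60% or lower) counts as failing
p_fail_incl_60 <- pbinom(q = 9, size = n_questions, prob = p_guess)

# Alternative reading (assumption): exactly 60% passes, so failing means 8 or fewer correct
p_fail_below_60 <- pbinom(q = 8, size = n_questions, prob = p_guess)

# The two readings differ by exactly P(X = 9)
p_fail_incl_60 - p_fail_below_60
dbinom(x = 9, size = n_questions, prob = p_guess)
```

Under either reading, failure remains overwhelmingly likely, because the probability of exactly 9 correct guesses is very small.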
# Calculating 79% of 15 to get a value of q just below the 80% cutoff (12 correct), so that the upper tail still includes a score of exactly 80%
.79*15
## [1] 11.85
# Using the pbinom function with the "lower.tail = FALSE" argument to find the probability of scoring an 80% or higher
pbinom(q = 11.85, size = 15, prob = .2, lower.tail = FALSE)
## [1] 1.011253e-06
The probability of maintaining a passing grade is 1.011253e-06, or .000001011253, which is approximately 0%.
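Since the number of correct answers is a whole number, the upper tail beyond 11.85 is the same as the probability of 12 or more correct. The sketch below shows two equivalent ways to write that, using an integer cutoff of 11 and summing the individual probabilities for 12 through 15. The variable names are illustrative.

```r
# Upper tail as computed above, with a non-integer cutoff
p_pass_original <- pbinom(q = 11.85, size = 15, prob = 0.2, lower.tail = FALSE)

# Same tail with an integer cutoff: P(X > 11) is P(X >= 12)
p_pass_integer <- pbinom(q = 11, size = 15, prob = 0.2, lower.tail = FALSE)

# Same tail as a sum of the individual probabilities for 12, 13, 14, and 15 correct
p_pass_summed <- sum(dbinom(x = 12:15, size = 15, prob = 0.2))

# All three should agree
all.equal(p_pass_original, p_pass_integer)
all.equal(p_pass_integer, p_pass_summed)
```

Writing the cutoff as 11 makes the "strictly greater than" behavior of `lower.tail = FALSE` easier to see than a fractional value does.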
Suppose you own a catering company. You hire local college students as servers. Not being the most reliable employees, there is an 80% chance that any one server will actually show up for a scheduled event. For a wedding scheduled on Saturday, you need at least 5 servers.
# Using the dbinom function to find the probability of the specific instance where 5/5 employees come to work
dbinom(x = 5, size = 5, prob = .8)
## [1] 0.32768
The probability of all 5 employees coming to work is .3277.
# Using the pbinom function with "lower.tail = FALSE" to find the cumulative probability that at least 5/7 employees come to work
pbinom(q = 4, size = 7, prob = .8, lower.tail=FALSE)
## [1] 0.851968
The probability of at least 5 employees out of 7 showing up to work is approximately .8520.
# Using the pbinom function with a range of sizes to compare the probability that at least five employees show up, and creating a data frame to see which value corresponds to which scenario
data.frame(employees_scheduled=c(5:15), prob = pbinom(q=4, size = c(5:15), prob = .8, lower.tail = FALSE))
##    employees_scheduled      prob
## 1                    5 0.3276800
## 2                    6 0.6553600
## 3                    7 0.8519680
## 4                    8 0.9437184
## 5                    9 0.9804186
## 6                   10 0.9936306
## 7                   11 0.9980346
## 8                   12 0.9994188
## 9                   13 0.9998340
## 10                  14 0.9999540
## 11                  15 0.9999875
# The data frame shows that when size = 10, the probability of at least 5 employees being present is .9936, the first value above .99, so at least ten employees must be scheduled. To check this, the next line of code tests the specific case of scheduling 10 employees when at least 5 need to show up
pbinom(q=4, size=10, prob = .8, lower.tail=FALSE)
## [1] 0.9936306
In order to be 99% confident that at least 5 employees will show up to work, a minimum of 10 employees should be scheduled.
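Rather than reading the answer off the table, the smallest qualifying schedule size can also be picked out in code. This is a minimal sketch under the same assumptions (each server shows up independently with probability .8); `n_scheduled`, `p_enough`, and `min_needed` are illustrative names.

```r
# Candidate numbers of employees to schedule
n_scheduled <- 5:15

# Probability that at least 5 show up (more than 4) for each candidate size
p_enough <- pbinom(q = 4, size = n_scheduled, prob = 0.8, lower.tail = FALSE)

# Smallest schedule size that reaches the 99% target
min_needed <- min(n_scheduled[p_enough >= 0.99])
min_needed
```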