Purpose

In this project, Colin will demonstrate his understanding of probability and of the normal and binomial distributions, while trying not to leave any instructions in or be ridiculous.


Question 1

Assume IQ scores are normally distributed with a mean of 100 and a standard deviation of 15. If a person is randomly selected, find each of the requested probabilities. Here, x denotes the IQ of the randomly selected person.

q1_mean <- 100
q1_sd <- 15
  1. P(x > 65)
(1 - pnorm(65, q1_mean, q1_sd)) * 100
## [1] 99.01847
  1. P(x < 150)
pnorm(150, q1_mean, q1_sd) * 100
## [1] 99.95709
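
As a cross-check, both probabilities can be reproduced from z-scores on the standard normal, and the upper tail can be requested directly through the `lower.tail` argument of `pnorm()`. This is only a sketch; the names `z_65`, `z_150`, `p_above_65`, and `p_below_150` are introduced here for illustration.

```r
q1_mean <- 100
q1_sd <- 15

# Standardize the cutoffs: z = (x - mean) / sd
z_65 <- (65 - q1_mean) / q1_sd
z_150 <- (150 - q1_mean) / q1_sd

# Upper tail P(x > 65), without computing 1 - pnorm() by hand
p_above_65 <- pnorm(z_65, lower.tail = FALSE)

# Lower tail P(x < 150) on the standard normal scale
p_below_150 <- pnorm(z_150)

p_above_65 * 100
p_below_150 * 100
```

Both values should agree with the answers computed from the raw scores.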

Question 2

Assume the same mean and standard deviation of IQ scores that were described in question 1.

  1. A high school offers a special program for gifted students. In order to qualify, students must have IQ scores in the top 5%. What is the minimum qualifying IQ?
qnorm(0.95, q1_mean, q1_sd)
## [1] 124.6728
  1. If one person is randomly selected, what is the probability that their IQ score is greater than 110?
(1 - pnorm(110, q1_mean, q1_sd)) * 100
## [1] 25.24925
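
Since `qnorm()` is the inverse of `pnorm()`, the qualifying cutoff can be sanity-checked with a round trip. The sketch below assumes the same mean and standard deviation; `cutoff` is a name made up for this example.

```r
q1_mean <- 100
q1_sd <- 15

# Cutoff for the top 5% (the 95th percentile)
cutoff <- qnorm(0.95, q1_mean, q1_sd)

# Feeding the cutoff back into pnorm() should return 0.95
pnorm(cutoff, q1_mean, q1_sd)

# Equivalent: ask for the upper-tail quantile directly
qnorm(0.05, q1_mean, q1_sd, lower.tail = FALSE)
```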

Question 3

  1. Still using the mean and standard deviation from question 1, what is the z-score for an IQ of 140?
(140 - q1_mean) / q1_sd
## [1] 2.666667
  1. We mentioned in week 6 that a data value is considered “unusual” if it lies more than two standard deviations from the mean. Is an IQ of 140 considered unusual?
q3b_unusual <- abs(140 - q1_mean) > 2 * q1_sd
q3b_unusual
## [1] TRUE
  1. What is the probability of getting an IQ greater than 140?
# Upper-tail probability for a raw score of 140
q3c_answer <- 1 - pnorm(140, q1_mean, q1_sd)
round(q3c_answer * 100, 2)
## [1] 0.38
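
The tail probability for an IQ of 140 can be computed either from the z-score or from the raw score, and the two routes should agree. The sketch below shows both; `z_140`, `p_from_z`, and `p_from_raw` are illustrative names. Note that `pnorm()` expects a numeric value, so a logical such as `TRUE` would be silently coerced to 1.

```r
q1_mean <- 100
q1_sd <- 15

z_140 <- (140 - q1_mean) / q1_sd

# Upper tail from the z-score on the standard normal
p_from_z <- pnorm(z_140, lower.tail = FALSE)

# Upper tail from the raw score, passing mean and sd explicitly
p_from_raw <- pnorm(140, q1_mean, q1_sd, lower.tail = FALSE)

# Both are well under 1%
round(c(p_from_z, p_from_raw) * 100, 2)
```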

Question 4

You are taking a 15-question multiple-choice quiz. Each question has 5 options (a, b, c, d, e), and you randomly guess on every question.

  1. How many questions do you expect to answer correctly on average?
q4a_answer <- 15 * (1 / 5)
q4a_answer
## [1] 3
  1. What is the probability that you get every question correct?
q4b_answer <- (1 / 5)^15
q4b_answer
## [1] 3.2768e-11
floor(q4b_answer)
## [1] 0
  1. What is the probability that you get every question incorrect?
q4c_answer <- ((5 - 1)/5)^15
q4c_answer * 100
## [1] 3.518437
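
The hand calculations above are special cases of the binomial distribution with n = 15 and p = 1/5, so `dbinom()` should return the same values. A minimal sketch, with `n_questions` and `p_correct` as names chosen for this example:

```r
n_questions <- 15
p_correct <- 1 / 5

# Expected number correct: n * p
n_questions * p_correct

# P(all 15 correct) and P(all 15 incorrect) from the binomial pmf
p_all_correct <- dbinom(15, n_questions, p_correct)
p_all_incorrect <- dbinom(0, n_questions, p_correct)

p_all_correct
p_all_incorrect * 100
```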

Question 5

Consider again the 15-question multiple-choice quiz in which each question has 5 options (a, b, c, d, e) and you randomly guess on every question.

  1. How many questions does one need to answer correctly in order to score exactly 60%?
q5a_answer <- 15 * 0.60
q5a_answer
## [1] 9
  1. If a grade of 60% or lower is considered failing, then what is the probability of you failing?
# "60% or lower" includes exactly 9 correct, so this is P(X <= 9)
q5b_answer <- pbinom(q5a_answer, 15, 1 / 5)
q5b_answer * 100
## [1] 99.98868
  1. If you need a grade of 80% or higher on this quiz to maintain a passing grade, what is the probability of you maintaining that passing grade?
# 80% of 15 questions is 12 correct, so this is P(X >= 12) = 1 - P(X <= 11)
q5c_answer <- 1 - pbinom(11, 15, 1 / 5)
q5c_answer
## [1] 1.011253e-06
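
With cumulative binomial probabilities, the boundary count matters: "60% or lower" includes exactly 9 correct, and "80% or higher" starts at 12 correct. The sketch below spells out both cutoffs; the variable names are illustrative, not part of the assignment.

```r
n_questions <- 15
p_correct <- 1 / 5

# Number correct needed for 60% and for 80%
n_for_60 <- round(n_questions * 0.60)  # 9
n_for_80 <- round(n_questions * 0.80)  # 12

# P(X <= 9): summing the pmf gives the same value as pbinom(9, ...)
p_fail <- sum(dbinom(0:n_for_60, n_questions, p_correct))

# P(X >= 12) = P(X > 11), taken from the upper tail
p_keep_passing <- pbinom(n_for_80 - 1, n_questions, p_correct, lower.tail = FALSE)

p_fail
p_keep_passing
```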

Question 6

Suppose you own a catering company and hire local college students as servers. They are not the most reliable employees: there is an 80% chance that any one server will actually show up for a scheduled event. For a wedding scheduled on Saturday, you need at least 5 servers.

  1. Suppose you schedule 5 employees. What is the probability that all 5 come to work?
q6a_answer <- dbinom(5, 5, 4/5)
q6a_answer * 100
## [1] 32.768
  1. Suppose you schedule 7 employees. What is the probability that at least 5 come to work?
q6b_answer <- 1 - pbinom(4, 7, 4/5)
q6b_answer * 100
## [1] 85.1968
  1. It is really important that you have at least 5 servers show up! How many employees should you schedule in order to be 99% confident that at least 5 show up? Hint: there is no single formula for the answer here, so maybe use some kind of trial and error method.
for(q6c_answer in 5:20) {
  yes_theyre_here <- 1 - pbinom(4, q6c_answer, 4/5)
  if (yes_theyre_here >= 0.99) {
    break
  }
}
q6c_answer
## [1] 10
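
Because `pbinom()` is vectorized over its `size` argument, the same trial-and-error search can be written without a loop. This is a sketch of an alternative, not a replacement for the approach above; `candidates`, `needed`, `p_show`, and `p_enough` are hypothetical names, and the 5 to 20 range simply mirrors the loop.

```r
p_show <- 4 / 5
needed <- 5
candidates <- 5:20  # same search range as the loop

# P(at least 5 show up) for each possible number of employees scheduled
p_enough <- pbinom(needed - 1, candidates, p_show, lower.tail = FALSE)

# Smallest schedule size that reaches 99%
min_scheduled <- candidates[which(p_enough >= 0.99)[1]]
min_scheduled
```

Printing `round(p_enough, 4)` alongside `candidates` also shows how quickly the probability climbs as more employees are scheduled.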

Question 7

  1. Generate a random sample of 10,000 numbers from a normal distribution with a mean of 51 and a standard deviation of 7. Store that data in an object called rand_nums.
rand_nums <- rnorm(10000, 51, 7)
  1. Create a histogram of that random sample.
hist(rand_nums, main = "Question 7", xlab = "Value", breaks = 30, col = "lightblue", border = "black")
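
Because `rnorm()` draws fresh values on every knit, the sample and anything computed from it will change from run to run. If a reproducible sample is wanted, a seed can be set first. The sketch below assumes an arbitrary seed (2024 is a placeholder, not a value from the assignment) and uses a separate object, `rand_nums_demo`, so it does not overwrite `rand_nums`.

```r
# Fix the random number generator state so repeated runs give the same draws
set.seed(2024)  # arbitrary placeholder seed
rand_nums_demo <- rnorm(10000, mean = 51, sd = 7)

# The sample mean and standard deviation should land near 51 and 7
mean(rand_nums_demo)
sd(rand_nums_demo)
```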

Question 8

  1. How many values in your rand_nums vector are below 40?
q8a_answer <- sum(rand_nums < 40)
q8a_answer
## [1] 580
  1. For a theoretical normal distribution, how many of those 10,000 values would you expect to be below 40?
q8b_answer <- pnorm(40, 51, 7) * 10000
q8b_answer
## [1] 580.4157
  1. Is your answer in part a reasonably close to your answer in part b?
q8a_answer
## [1] 580
q8b_answer
## [1] 580.4157

That looks reasonably close, even after knitting this document a few dozen times. The percent difference between the two backs this up:

((q8a_answer - q8b_answer)/q8b_answer) * 100
## [1] -0.07161569

Reasonably close.
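
One way to put a number on "reasonably close" is to treat the count of values below 40 as a binomial random variable with n = 10,000 and p = P(x < 40). Its standard deviation is then sqrt(n * p * (1 - p)), which comes to roughly 23 here, so counts within a few dozen of the expected value are unsurprising. A minimal sketch, with illustrative variable names:

```r
n_draws <- 10000
p_below_40 <- pnorm(40, 51, 7)

# Expected count and binomial standard deviation of that count
expected_count <- n_draws * p_below_40
sd_count <- sqrt(n_draws * p_below_40 * (1 - p_below_40))

expected_count
sd_count

# Range of about 2 standard deviations on either side of the expected count
expected_count + c(-2, 2) * sd_count
```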