Instructions

  1. Update the author line at the top to have your name in it.
  2. You must knit this document to an html file and publish it to RPubs. Once you have published your project to the web, you must copy the web url link into the appropriate Course Project assignment in MyOpenMath before 11:59pm on the due date.
  3. Answer all the following questions completely. Some may ask for written responses.
  4. Use R chunks for code to be evaluated where needed and always comment all of your code so the reader can understand what your code aims to accomplish.
  5. Proofread your knitted document before publishing it to ensure it looks the way you want it to. Tip: Use double spaces at the end of a line to create a line break and make sure text does not have a header label that isn’t supposed to.
  6. DELETE ALL THESE INSTRUCTIONS BEFORE PUBLISHING YOUR DOCUMENT.

Purpose

In this project, students will demonstrate their understanding of probability and the normal and binomial distributions.


Question 1

Assume IQ scores are normally distributed with a mean of 100 and a standard deviation of 15. If a person is randomly selected, find each of the requested probabilities. Here, x denotes the IQ of the randomly selected person.

  1. P(x > 65)
# Using the pnorm function with lower.tail = FALSE to find the upper-tail probability P(x > 65) directly (this equals 1 minus the probability that x is less than 65)

# first assign values to the mean and standard deviation
IQ_mean <- 100
IQ_sd <- 15

# Then use the pnorm function
pnorm(65, mean = IQ_mean, sd = IQ_sd, lower.tail = FALSE)
## [1] 0.9901847

The probability that a randomly selected person has an IQ score greater than 65 is 0.9902, or 99.02%.

  1. P(x < 150)
# Using the pnorm function to find the probability that x will be less than 150
pnorm(150, mean = 100, sd = 15, lower.tail = TRUE)
## [1] 0.9995709

The probability that a randomly selected person has an IQ score less than 150 is 0.9996, or 99.96%.
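
As a cross-check on the two probabilities above, the same upper-tail value can be reached by first converting the IQ score to a z-score and then using the standard normal distribution. This is only a sketch; the names `z_65`, `p_via_z`, and `p_direct` are made up for illustration.

```r
# Mean and standard deviation stated in Question 1
IQ_mean <- 100
IQ_sd <- 15

# Convert an IQ of 65 to a z-score, then use the standard normal (mean 0, sd 1)
z_65 <- (65 - IQ_mean) / IQ_sd
p_via_z <- 1 - pnorm(z_65)

# Same probability computed directly on the IQ scale
p_direct <- pnorm(65, mean = IQ_mean, sd = IQ_sd, lower.tail = FALSE)

# The two routes should agree up to floating-point error
all.equal(p_via_z, p_direct)
```

Both routes describe the same area under the curve, so either one can be used to sanity-check the other.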

Question 2

Assume the same mean and standard deviation of IQ scores that was described in question 1.

  1. A high school offers a special program for gifted students. In order to qualify, students must have IQ scores in the top 5%. What is the minimum qualifying IQ?
# Using the qnorm function to find the score that separates the bottom 95% from the top 5%
qnorm( 0.95, mean = IQ_mean, sd = IQ_sd)
## [1] 124.6728

The minimum IQ score needed to qualify for the special program is approximately 124.67.

  1. If one person is randomly selected, what is the probability that their IQ score is greater than 110?
# Using the pnorm function to find the probability of the IQ score greater than 110
pnorm( 110, mean = IQ_mean, sd = IQ_sd, lower.tail = FALSE)
## [1] 0.2524925

The probability that a person will have an IQ score greater than 110 is 0.2525, or 25.25%.
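
Because `qnorm` and `pnorm` are inverses of each other, the cutoff can be checked by feeding it back into `pnorm` and confirming that about 5% of the distribution lies above it. A minimal sketch, with `cutoff` and `upper_share` as illustrative names:

```r
# Mean and standard deviation stated in Question 1
IQ_mean <- 100
IQ_sd <- 15

# Score that separates the bottom 95% from the top 5%
cutoff <- qnorm(0.95, mean = IQ_mean, sd = IQ_sd)

# Share of the distribution above that score; this should come back as 0.05
upper_share <- pnorm(cutoff, mean = IQ_mean, sd = IQ_sd, lower.tail = FALSE)
upper_share
```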

Question 3

  1. Still using the mean and standard deviation from question 1, what is the z-score for an IQ of 140?
# Using the z-score formula, (value - mean) / standard deviation, with an IQ of 140
(140 - IQ_mean)/IQ_sd
## [1] 2.666667

The z-score for an IQ of 140 is approximately 2.67.

  1. We mentioned in week 6 that a data value is considered “unusual” if it lies more than two standard deviations from the mean. Is an IQ of 140 considered unusual?

Yes. An IQ score of 140 is considered “unusual” because it is 2.67 standard deviations above the mean, and 2.67 is greater than 2.

  1. What is the probability of getting an IQ greater than 140?
# Using the pnorm function with "lower.tail = FALSE" to find the probability of a random individual having an IQ greater than 140
pnorm(140, mean = IQ_mean, sd = IQ_sd, lower.tail = FALSE)
## [1] 0.003830381

The probability of getting an IQ greater than 140 is 0.00383, or 0.38%.
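
The two-standard-deviation rule used in this question can be wrapped in a small helper so that several scores are checked at once. The function name `is_unusual` and its `threshold` argument are hypothetical, introduced only for this sketch.

```r
# Flags values whose z-score is more than `threshold` standard deviations
# from the mean, in either direction (hypothetical helper)
is_unusual <- function(x, mean, sd, threshold = 2) {
  z <- (x - mean) / sd
  abs(z) > threshold
}

# Check three of the IQ scores that appear in Questions 1 to 3
unusual_flags <- is_unusual(c(65, 110, 140), mean = 100, sd = 15)
unusual_flags
```

Taking the absolute value of the z-score means scores far below the mean are flagged as well as scores far above it.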

Question 4

You are taking a 15-question multiple choice quiz and each question has 5 options (a,b,c,d,e) and you randomly guess every question.

  1. How many questions do you expect to answer correctly on average?
# Multiplying the number of questions by the probability of guessing an answer correctly
15*0.2
## [1] 3

I would expect to see 3 questions answered correctly on average.

  1. What is the probability that you get every question correct?
# Using the dbinom function to find the probability of the specific instance where 15/15 answers are correct
dbinom(x = 15, size = 15, prob = 0.2)
## [1] 3.2768e-11

The probability of getting every question correct is 3.2768e-11, or 0.000000000032768, which is approximately 0%.

  1. What is the probability that you get every question incorrect?
# Using the dbinom function to find the probability of the specific instance where 0/15 answers are correct
dbinom(x = 0, size = 15, prob = 0.2)
## [1] 0.03518437

The probability of getting every question incorrect is 0.0352, or 3.52%.
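
The three answers in this question can also be reproduced from first principles: the expected value as a probability-weighted sum over every possible score, and the all-correct and all-incorrect cases as simple powers. A sketch, with illustrative variable names:

```r
n_questions <- 15
p_correct <- 1 / 5   # one correct option out of five

# Expected value from its definition: sum of k * P(X = k) over all k
k <- 0:n_questions
expected_correct <- sum(k * dbinom(k, size = n_questions, prob = p_correct))

# All correct: 15 independent correct guesses in a row
all_right <- p_correct^n_questions

# All incorrect: 15 independent wrong guesses in a row
all_wrong <- (1 - p_correct)^n_questions

c(expected = expected_correct, all_right = all_right, all_wrong = all_wrong)
```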

Question 5

Consider again the 15-question multiple choice quiz in which each question has 5 options (a,b,c,d,e) and you randomly guess every question.

  1. How many questions does one need to answer correctly in order to score exactly a 60%?
# Finding 60% of 15
0.6*15
## [1] 9

One would need to get 9 questions out of 15 correct to score exactly a 60%.

  1. If a grade of 60% or lower is considered failing, then what is the probability of you failing?
# Using pbinom to find the cumulative probability of scoring either 60% or lower
pbinom(q = 9, size = 15, prob = 0.2)
## [1] 0.9998868

The probability of failing is 0.9999, making failure very likely.

  1. If you need a grade of 80% or higher on this quiz to maintain a passing grade, what is the probability of you maintaining that passing grade?
# Calculating 80% of 15 to find the minimum number of correct answers needed to pass
0.8*15
## [1] 12
# Using the pbinom function with the "lower.tail = FALSE" argument to find the probability of scoring an 80% or higher; P(X >= 12) equals P(X > 11), so q is set to 11
pbinom(q = 11, size = 15, prob = 0.2, lower.tail = FALSE)
## [1] 1.011253e-06

The probability of maintaining a passing grade is 1.011253e-06, or 0.000001011253, which is approximately 0%.
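
An "at least" probability for a binomial count can be checked by adding up the individual probabilities for each qualifying score. The sketch below assumes that 80% of 15 questions means at least 12 correct; the variable names are illustrative.

```r
n_questions <- 15
p_correct <- 0.2
min_correct <- 12   # 80% of 15 questions

# Add up P(X = 12), P(X = 13), P(X = 14), and P(X = 15)
p_sum <- sum(dbinom(min_correct:n_questions, size = n_questions, prob = p_correct))

# Upper tail from pbinom: P(X > 11) is the same event as P(X >= 12)
p_tail <- pbinom(min_correct - 1, size = n_questions, prob = p_correct,
                 lower.tail = FALSE)

# The two values should agree up to floating-point error
all.equal(p_sum, p_tail)
```

With `lower.tail = FALSE`, `pbinom` returns the probability of strictly more than `q` successes, which is why `q` is one less than the minimum passing score.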

Question 6

Suppose you own a catering company. You hire local college students as servers. Not being the most reliable employees, there is an 80% chance that any one server will actually show up for a scheduled event. For a wedding scheduled on Saturday, you need at least 5 servers.

  1. Suppose you schedule 5 employees, what is the probability that all 5 come to work?
# Using the dbinom function to find the probability of the specific instance where 5/5 employees come to work
dbinom(x = 5, size = 5, prob = 0.8)
## [1] 0.32768

The probability of all 5 employees coming to work is 0.3277.

  1. Suppose you schedule 7 employees, what is the probability that at least 5 come to work?
# Using the pbinom function with lower.tail = FALSE to find the cumulative probability that at least 5 of the 7 employees come to work, which is P(X > 4)
pbinom(q = 4, size = 7, prob = 0.8, lower.tail=FALSE)
## [1] 0.851968

The probability of at least 5 employees out of 7 showing up to work is approximately 0.8520.

  1. It is really important that you have at least 5 servers show up! How many employees should you schedule in order to be 99% confident that at least 5 show up? Hint: there is no single formula for the answer here, so maybe use some kind of trial and error method.
# Scheduling 7 people gives only about an 85% chance that at least five show up, and that needs to reach 99%, so I tried larger numbers: 8, 9, and 10 people
pbinom(4, 8, 0.8, lower.tail = FALSE)
## [1] 0.9437184
pbinom(4, 9, 0.8, lower.tail = FALSE)
## [1] 0.9804186
pbinom(4, 10, 0.8, lower.tail = FALSE)
## [1] 0.9936306

In order to be 99% confident that at least five people show up, ten employees will have to be scheduled.
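
The trial-and-error search can also be done in one step, because `pbinom` accepts a vector for `size`. The sketch below assumes an upper search limit of 20 employees, which is an arbitrary choice, and uses illustrative variable names.

```r
p_show <- 0.8     # chance that any one server shows up
needed <- 5       # servers required at the wedding
target <- 0.99    # desired confidence

# Candidate staff sizes to try (the upper limit of 20 is an assumption)
candidates <- needed:20

# P(at least 5 show up) for each candidate size, i.e. P(X > 4)
p_enough <- pbinom(needed - 1, size = candidates, prob = p_show,
                   lower.tail = FALSE)

# Smallest candidate size that reaches the target
min_staff <- candidates[which(p_enough >= target)[1]]
min_staff
```

If no candidate reached the target, `min_staff` would be `NA`, which would signal that the search range needs to be widened.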

Question 7

  1. Generate a random sample of 10,000 numbers from a normal distribution with mean of 51 and standard deviation of 7. Store that data in object called rand_nums.
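# Drawing 10,000 values from a normal distribution with mean 51 and standard deviation 7 (left unrounded so the sample stays continuous)
rand_nums <- rnorm(10000, mean = 51, sd = 7)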
  1. Create a histogram of that random sample.
# Plotting a histogram of the 10,000 sampled values
hist(rand_nums)
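
For a reproducible version of this step, a seed can be set before sampling, and the theoretical density can be drawn over the histogram to see how closely the sample follows it. This is a sketch only: the seed value 123, the number of breaks, and the plot labels are arbitrary choices, not part of the assignment.

```r
# Fix the random number generator so the same sample is drawn on every knit
set.seed(123)  # arbitrary seed, chosen only for illustration

# Draw the sample
rand_nums <- rnorm(10000, mean = 51, sd = 7)

# Histogram on the density scale so it is comparable to the normal curve
hist(rand_nums, freq = FALSE, breaks = 40,
     main = "Random sample from N(51, 7)", xlab = "Value")

# Overlay the theoretical normal density
curve(dnorm(x, mean = 51, sd = 7), add = TRUE, lwd = 2)
```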

Question 8

  1. How many values in your rand_nums vector are below 40?
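# Counting the values in rand_nums that are below 40: the comparison returns TRUE or FALSE for each value, and sum() counts the TRUEs
sum(rand_nums < 40)

The exact count depends on the random draw, so it changes each time the document is knitted. As a proportion of the 10,000 values, it should come out near 0.058, or about 5.8%.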

  1. For a theoretical normal distribution, how many of those 10,000 values would you expect to be below 40?
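# Multiplying the sample size by the theoretical probability that a value falls below 40
10000 * pnorm(40, mean = 51, sd = 7)
## [1] 580.4157

For a theoretical normal distribution, I would expect about 580 of the 10,000 values to be below 40.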

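  1. Is your answer in part a reasonably close to your answer in part b?

Yes. Sampling variation means the two numbers will rarely match exactly, but with 10,000 draws the sample count should land close to the theoretical expectation.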
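
One way to judge "reasonably close" is to treat the count below 40 as a binomial variable and express the gap between the observed and expected counts in standard deviations. The sketch below draws its own sample with an arbitrary seed, so its observed count will differ from the one in this report; all variable names are illustrative.

```r
set.seed(123)  # arbitrary seed, chosen only for illustration
n_draws <- 10000
sample_vals <- rnorm(n_draws, mean = 51, sd = 7)

# Observed count of sampled values below 40
observed_below <- sum(sample_vals < 40)

# Theoretical probability and expected count below 40
p_below <- pnorm(40, mean = 51, sd = 7)
expected_below <- n_draws * p_below

# Standard deviation of the count, treating it as Binomial(n_draws, p_below)
sd_below <- sqrt(n_draws * p_below * (1 - p_below))

# Gap between observed and expected, measured in standard deviations
abs(observed_below - expected_below) / sd_below
```

A gap of less than about two standard deviations would be in line with ordinary sampling variation.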