Project #3 - Normal and Binomial Distributions

Purpose

In this project, students will demonstrate their understanding of probability and the normal and binomial distributions.

Question 1

IQ scores are approximately normally distributed with: X ∼ N(μ=100,σ=15)

What proportion of the population has an IQ greater than 65? Interpret the result in context in a complete sentence.

pnorm(q = 65, mean = 100, sd = 15, lower.tail = FALSE)

## [1] 0.9901847

99.02% of the population has an IQ greater than 65.

What IQ score represents the top 5% of the population? Explain in a sentence what this value means in plain language.

qnorm(p = 0.05, mean = 100, sd = 15, lower.tail = FALSE)

## [1] 124.6728

An IQ score of approximately 124.7 marks the cutoff for the top 5% of the population. In plain language, it is the 95th percentile: about 95% of people score below 124.7 and only about 5% score above it.
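As a quick sanity check, `pnorm()` and `qnorm()` are inverses of each other, so feeding the cutoff back into `pnorm()` should return the 5% upper-tail area. A minimal sketch, where the variable names `cutoff`, `upper_area`, and `cutoff_lower` are only illustrative:

```r
# Cutoff for the top 5% of IQ scores (upper-tail quantile)
cutoff <- qnorm(p = 0.05, mean = 100, sd = 15, lower.tail = FALSE)

# Feeding the cutoff back into pnorm() should recover the 5% upper-tail area
upper_area <- pnorm(q = cutoff, mean = 100, sd = 15, lower.tail = FALSE)

# The same cutoff expressed as a lower-tail quantile (95th percentile)
cutoff_lower <- qnorm(p = 0.95, mean = 100, sd = 15)

print(c(cutoff = cutoff, upper_area = upper_area, cutoff_lower = cutoff_lower))
```

Asking for the upper 5% with `lower.tail = FALSE` and asking for the lower 95% with the default `lower.tail = TRUE` are two ways of requesting the same value.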

Question 2

Recall our definition: A value is considered unusual if it lies more than two standard deviations from the mean.

Find the IQ values that mark the lower and upper bounds of the “usual” range.

lower_bound <- 100 - 2*15
upper_bound <- 100 + 2*15

The “usual” range of IQ values lies between 70 and 130. Any value below 70 or above 130 is considered unusual.

What proportion of the population falls outside this range?

pnorm(q = upper_bound, mean = 100, sd = 15) - pnorm(q = lower_bound, mean = 100, sd = 15)

## [1] 0.9544997

1 - 0.9544997

## [1] 0.0455003

About 4.55% of the population has an IQ outside the “usual” range, that is, an IQ below 70 or above 130.
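The same proportion can be reached without subtracting from 1. The sketch below is an alternative route rather than the required method: it adds the two tail areas directly and also uses the symmetry of the standard normal curve, where "unusual" means |z| > 2. The names `outside_by_symmetry` and `outside_by_tails` are illustrative.

```r
# By symmetry, the area below z = -2 equals the area above z = +2
outside_by_symmetry <- 2 * pnorm(q = -2)

# Same quantity on the IQ scale: area below 70 plus area above 130
outside_by_tails <- pnorm(q = 70, mean = 100, sd = 15) +
  pnorm(q = 130, mean = 100, sd = 15, lower.tail = FALSE)

print(c(outside_by_symmetry = outside_by_symmetry,
        outside_by_tails = outside_by_tails))
```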

Question 3

Two students took different standardized tests.

Alex took the SAT and scored 1650. Taylor took the ACT and scored 27.

Assume the distributions:
SAT ∼ N(1500, 300), ACT ∼ N(21, 5)

Compute the z-score for each student.

meanSAT <- 1500
meanACT <- 21
sdSAT <- 300
sdACT <- 5
(1650 - meanSAT)/sdSAT

## [1] 0.5

(27 - meanACT)/sdACT

## [1] 1.2

Which student performed better relative to other test-takers?

Relative to other test-takers, Taylor performed better. Taylor’s z-score is 1.2, meaning the score of 27 is 1.2 standard deviations above the mean ACT score. In contrast, Alex’s z-score is 0.5, meaning the score of 1650 is only 0.5 standard deviations above the mean SAT score. Because Taylor’s z-score is higher, Taylor scored further above the average ACT test-taker than Alex did above the average SAT test-taker.

Explain why comparing the raw scores alone would be misleading.

Comparing raw scores alone would be misleading because the SAT and ACT are scored on different scales. Their means and standard deviations are very different, so a 1650 and a 27 are not directly comparable. Comparing z-scores is more reasonable because a z-score takes each test’s mean and spread into account.
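One way to put both students on a common scale is to convert each z-score to a percentile rank with `pnorm()`. This is a sketch that assumes both score distributions are exactly normal. The helper `z_score()` is a hypothetical convenience function, not part of the assignment.

```r
# Hypothetical helper: standardize a raw score
z_score <- function(x, mean, sd) (x - mean) / sd

z_alex <- z_score(1650, mean = 1500, sd = 300)   # SAT
z_taylor <- z_score(27, mean = 21, sd = 5)       # ACT

# Percentile rank: percent of test-takers scoring below each student
pct_alex <- pnorm(z_alex) * 100
pct_taylor <- pnorm(z_taylor) * 100

print(round(c(Alex = pct_alex, Taylor = pct_taylor), 1))
```

Under these assumptions, Taylor lands at a higher percentile than Alex, which agrees with the z-score comparison.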

Question 4

You are taking a 15-question multiple-choice quiz. Each question has 5 options (a, b, c, d, e), and you randomly guess on every question.

How many questions do you expect to answer correctly on average?

15 * 0.2

## [1] 3

The probability of answering any one question correctly is 1/5 = 0.2. Multiplying the number of questions by this probability gives an expected value of 3 correct answers. This is the mean of a binomial distribution, μ = np, with n = 15 and p = 0.2.
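The shortcut n × p can be checked against the definition of expected value, the sum of each outcome times its probability. The sketch below does that and also computes the standard deviation, which the question does not ask for. Variable names are illustrative.

```r
n <- 15    # number of questions
p <- 0.2   # probability of guessing one question correctly
x <- 0:n   # every possible number of correct answers

# Expected value from the definition: sum of x * P(X = x)
ev_definition <- sum(x * dbinom(x, size = n, prob = p))

# Shortcut formulas for a binomial distribution
ev_formula <- n * p
sd_formula <- sqrt(n * p * (1 - p))

print(c(ev_definition = ev_definition,
        ev_formula = ev_formula,
        sd_formula = sd_formula))
```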

What is the probability that you get every question correct?

dbinom(x = 15, size = 15, prob = 0.2)

## [1] 3.2768e-11

The probability of getting every question correct is 3.2768e-11, which indicates the probability of exactly 15 successes in 15 trials is extremely low.

What is the probability that you get every question incorrect?

dbinom(x = 15, size = 15, prob = 0.8)

## [1] 0.03518437

The probability of getting every question incorrect is 0.03518437, or about 3.5%, which is much higher than the probability of answering every question correctly. This is because the probability of answering a question incorrectly is 4/5 = 0.8, four times the probability of answering it correctly.

dbinom(x = 0, size = 15, prob = 0.2)

## [1] 0.03518437

Substituting “x = 0” and “prob = 0.2” into the original function gives the same probability, because getting all 15 questions incorrect is the same event as getting 0 questions correct. The probability of answering a question incorrectly, 0.8, is the complement of the probability of answering it correctly, 0.2.

What is the probability of getting exactly 10 questions correct?

dbinom(x = 10, size = 15, prob = 0.2)

## [1] 0.000100764

The probability of getting exactly 10 questions correct, or the probability of 10 successes in 15 trials, is 0.000100764.

What is the probability of getting 10 or more correct answers?

pbinom(q = 9, size = 15, prob = 0.2, lower.tail = FALSE)

## [1] 0.0001132257

The probability of getting 10 or more correct answers is 0.0001132257.
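The call above uses `q = 9` because `lower.tail = FALSE` returns P(X > q), and P(X > 9) is the same as P(X ≥ 10) for a whole-number count. As a check, the sketch below adds up the individual probabilities for 10 through 15 correct answers. The variable names are illustrative.

```r
# Upper tail via pbinom(): P(X > 9) = P(X >= 10)
tail_pbinom <- pbinom(q = 9, size = 15, prob = 0.2, lower.tail = FALSE)

# Same probability by summing P(X = 10), P(X = 11), ..., P(X = 15)
tail_sum <- sum(dbinom(x = 10:15, size = 15, prob = 0.2))

print(c(tail_pbinom = tail_pbinom, tail_sum = tail_sum))
```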

Suppose a student claims they guessed randomly but got 10 out of 15 correct. Based on your probability above, do you believe this claim? Explain your reasoning. (There is no single correct answer, but your reasoning must use the probability you calculated.)

Based on the calculated probability, I do not believe this claim. The probability of getting 10 or more questions correct by random guessing is 0.0001132257, or about 0.011% (roughly 1 in 8,800). Because a result at least this good is so unlikely under pure guessing, it is far more plausible that the student knew at least some of the answers.

If you need a grade of 80% or higher on this quiz to maintain a passing grade, what is the probability of you maintaining that passing grade?

15 * 0.8

## [1] 12

pbinom(q = 11, size = 15, prob = 0.2, lower.tail = FALSE)

## [1] 1.011253e-06

Achieving an 80% on this quiz requires answering at least 12 out of 15 questions correctly. The probability of scoring 80% or higher by random guessing is 1.011253e-06, or about 1 in a million, so maintaining a passing grade this way is extremely unlikely.

Question 5

A company schedules 10 employees for a shift. Each employee independently shows up with probability: p = 0.85

Let X = number of employees who show up

The company needs at least 8 workers to operate normally.

What is the probability that fewer than 8 employees show up?

pbinom(q = 7, size = 10, prob = 0.85, lower.tail = TRUE)

## [1] 0.1798035

Out of the 10 employees scheduled for a shift, the probability that fewer than 8 employees show up is 0.1798035 or approximately 18%.

What is the probability the company has enough workers for this shift?

1 - pbinom(q = 7, size = 10, prob = 0.85, lower.tail = TRUE)

## [1] 0.8201965

The company needs at least 8 employees to show up for a shift to operate normally. Taking the complement of the probability that fewer than 8 employees show up gives 0.8201965. This means the probability that 8 or more employees show up for this shift is 0.8201965, or approximately 82%.

Another way to calculate this without using the complement is:

pbinom(q = 7, size = 10, prob = 0.85, lower.tail = FALSE)

## [1] 0.8201965

Setting “lower.tail = FALSE” in the original function returns P(X > 7), which for a whole-number count is the probability that 8 or more employees show up for this shift.

Explain what this probability means in the context of scheduling workers.

In the context of scheduling workers, the company needs at least 8 employees to show up for a shift in order to operate normally. If 10 employees are scheduled, there is about an 82% chance that enough of them show up and about an 18% chance that the company is short-staffed. In other words, the company should expect to be short-staffed on roughly 1 out of every 5 or 6 such shifts.

Management wants at least a 95% chance of having enough workers. Should they schedule more than 10 employees? Explain your reasoning.

pbinom(q = 7, size = 12, prob = 0.85, lower.tail = FALSE)

## [1] 0.9760781

pbinom(q = 7, size = 12, prob = 0.85, lower.tail = TRUE)

## [1] 0.02392191

If management wants at least a 95% chance of having enough workers, they should schedule more than 10 employees, since scheduling 10 gives only about an 82% chance. Increasing the number of employees scheduled for a shift increases the probability that 8 or more show up, because more trials give more opportunities to reach 8 successes. Scheduling 11 employees raises the probability to only about 93%, which is still short of the target. Based on the calculations above, scheduling 12 employees raises the probability of 8 or more showing up to approximately 97.6%, and the probability of fewer than 8 showing up drops to approximately 2.4%.
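Rather than testing one schedule size at a time, the same `pbinom()` call can be evaluated over a range of sizes to find the smallest one that meets the 95% target. This is a sketch: the upper limit of 15 employees and all variable names are arbitrary choices for illustration.

```r
p_show <- 0.85   # probability each employee shows up
needed <- 8      # workers required to operate normally
target <- 0.95   # management's desired probability

# Candidate schedule sizes (upper limit of 15 is an arbitrary choice)
n_scheduled <- 10:15

# P(X >= 8) for each size; lower.tail = FALSE gives P(X > 7)
p_enough <- pbinom(q = needed - 1, size = n_scheduled, prob = p_show,
                   lower.tail = FALSE)

print(data.frame(n_scheduled, p_enough = round(p_enough, 4)))

# Smallest schedule size whose probability meets the target
min_n <- min(n_scheduled[p_enough >= target])
print(min_n)
```

This works because `pbinom()` is vectorized over `size`, so one call returns the probability for every candidate schedule.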

Question 6

ACT scores are approximately normally distributed where: X ∼ N(21,5)

Use R to simulate 10,000 ACT scores.

set.seed(123)
ACT <- rnorm(n = 10000, mean = 21, sd = 5)

Find what percent of your simulated ACT scores were above 30.

hist(ACT)

mean(ACT > 30) * 100

## [1] 3.54

Now compute the theoretical probability of getting an ACT score above 30 using pnorm().

pnorm(q = 30, mean = 21, sd = 5, lower.tail = FALSE)

## [1] 0.03593032

The theoretical probability of getting an ACT score above 30 is 0.03593032 or 3.59%.

Compare the two values. Why are they similar but not identical?

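The two values are similar because both come from the same N(21, 5) distribution: the simulated scores are random draws from it, so the proportion above 30 in the simulation estimates the theoretical probability. They are not identical because the simulation is a finite random sample of 10,000 scores, and the proportion in any one sample varies slightly from sample to sample. With a different seed the simulated percent would change a little, and with more simulated scores it would tend to fall closer to 3.59%.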
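The size of this sample-to-sample variation can be seen by repeating the simulation many times. The sketch below is an illustration only: the seed `2024` and the choice of 200 repetitions are arbitrary assumptions, and the variable names are hypothetical.

```r
# Theoretical probability of an ACT score above 30
p_theory <- pnorm(q = 30, mean = 21, sd = 5, lower.tail = FALSE)

# Repeat the 10,000-score simulation 200 times (200 is an arbitrary choice)
set.seed(2024)  # arbitrary seed, used only for reproducibility
sim_props <- replicate(200, mean(rnorm(n = 10000, mean = 21, sd = 5) > 30))

# The simulated proportions scatter around the theoretical value
print(range(sim_props))

# Expected size of that scatter: standard error of a sample proportion
se_theory <- sqrt(p_theory * (1 - p_theory) / 10000)
print(c(p_theory = p_theory, se_theory = se_theory, sd_sim = sd(sim_props)))
```

The standard deviation of the simulated proportions should be close to the theoretical standard error, which is why a single simulation lands near 3.59% but rarely exactly on it.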

Question 7

Create your own real-world situation that could be modeled using either a binomial distribution or a normal distribution.

Your problem must include:

* A description of the situation
* Identification of reasonable parameters (mean, sd OR n, p)
* One probability calculation in R
* A written interpretation of the result

Examples might include:

* basketball free throws
* weather events
* exam scores
* products being defective

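Suppose that, based on short-term data, Micron Technology (MU) stock has a 70% probability of its price rising on any given trading day. Let X = the number of days with a price increase over a 22-trading-day period, so X ∼ Binomial(n = 22, p = 0.7), assuming each day’s outcome is independent of the others. What is the probability that MU will have 15 or more days with a price increase?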

pbinom(q = 14, size = 22, prob = 0.7, lower.tail = FALSE)

## [1] 0.6712507

The probability that Micron Technology (MU) has a price increase on 15 or more of the 22 trading days is approximately 67.1%.