Project #3 - Normal and Binomial Distributions

Purpose

In this project, students will demonstrate their understanding of probability and the normal and binomial distributions.

Question 1

IQ scores are approximately normally distributed with: X ∼ N(μ=100,σ=15)

What proportion of the population has an IQ greater than 65? Interpret the result in context in a complete sentence.

# Finds percentage of those with IQ above 65
pnorm(q = 65, mean = 100, sd = 15, lower.tail = FALSE)

## [1] 0.9901847

99% of the population has an IQ greater than 65.

What IQ score represents the top 5% of the population? Explain in a sentence what this value means in plain language.

# Finds the IQ value at the 95th percentile (the cutoff for the top 5%)
qnorm(p = 0.95, mean = 100, sd = 15)

## [1] 124.6728

An IQ of about 124.67 is the cutoff for the top 5% of the population: only 5% of people score higher than this, and 95% score lower.
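
As a quick check on the cutoff, qnorm() and pnorm() are inverses of each other, so feeding the cutoff back into pnorm() should return an upper-tail area of 0.05. This is a minimal sketch; the names `cutoff` and `upper_tail` are illustrative and not part of the original work.

```r
# Cutoff for the top 5% (the 95th percentile), as computed above
cutoff <- qnorm(p = 0.95, mean = 100, sd = 15)

# Area above the cutoff; this should come back as 0.05
upper_tail <- pnorm(q = cutoff, mean = 100, sd = 15, lower.tail = FALSE)
upper_tail
```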

Question 2

Recall our definition: A value is considered unusual if it lies more than two standard deviations from the mean.

Find the IQ values that mark the lower and upper bounds of the “usual” range.

# Finds the values two SDs above and below mean
100 + (15*2)

## [1] 130

100 - (15*2)

## [1] 70

Anything above 130 or below 70 is considered an unusual IQ.

What proportion of the population falls outside this range?

# Finds percentage of total unusual IQ values
pnorm(q = 70, mean = 100, sd = 15) + pnorm(q = 130, mean = 100, sd = 15, lower.tail = FALSE)

## [1] 0.04550026

4.6% of the population has an unusual IQ.
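
The same proportion can be reached through z-scores. The bounds 70 and 130 sit exactly 2 standard deviations from the mean, and the normal curve is symmetric, so the two tails have equal area. A short sketch, with `z_low`, `z_high`, and `outside` as illustrative names:

```r
# Bounds of the "usual" range expressed as z-scores
z_low  <- (70 - 100) / 15   # -2
z_high <- (130 - 100) / 15  #  2

# By symmetry both tails have the same area, so double the lower tail
# of the standard normal distribution
outside <- 2 * pnorm(z_low)
outside
```

This should agree with the two-tail sum computed above, since standardizing does not change the tail areas.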

Question 3

Two students took different standardized tests.

Alex took the SAT and scored 1650. Taylor took the ACT and scored 27.

Assume the distributions:
SAT ∼ N(μ=1500, σ=300)
ACT ∼ N(μ=21, σ=5)

Compute the z-score for each student.

# Finds the z-score for each
(1650 - 1500) / 300

## [1] 0.5

(27 - 21) / 5

## [1] 1.2

Which student performed better relative to other test-takers?

Taylor performed better relative to other test-takers. A z-score of 1.2 places Taylor further above the ACT mean, measured in standard deviations, than a z-score of 0.5 places Alex above the SAT mean.
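
One way to make the comparison concrete is to convert each z-score into a percentile with pnorm(). This is a sketch that assumes both score distributions really are normal, as the question states; the variable names are illustrative.

```r
# z-scores from the calculation above
z_alex   <- (1650 - 1500) / 300  # SAT
z_taylor <- (27 - 21) / 5        # ACT

# Proportion of test-takers each student outscored
pct_alex   <- pnorm(z_alex)
pct_taylor <- pnorm(z_taylor)
round(c(Alex = pct_alex, Taylor = pct_taylor), 4)
```

Under these assumptions Alex outscored roughly 69% of SAT takers, while Taylor outscored roughly 88% of ACT takers.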

Explain why comparing the raw scores alone would be misleading.

The two tests use very different scales. SAT scores have a mean of 1500 and a standard deviation of 300, while ACT scores have a mean of 21 and a standard deviation of 5. Alex scored 150 points above average compared to Taylor’s 6 points, but 150 SAT points is only half a standard deviation, whereas 6 ACT points is 1.2 standard deviations.

Question 4

You are taking a 15-question multiple-choice quiz. Each question has 5 options (a, b, c, d, e), and you randomly guess on every question.

How many questions do you expect to answer correctly on average?

You expect 1 in 5 guesses to be correct, so about 15 × 0.2 = 3 questions correct on average.
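
The expected value can be checked in R two ways: with the binomial shortcut n × p, and from the definition of an expected value as the sum of each outcome times its probability. A minimal sketch, with illustrative variable names:

```r
n <- 15   # number of questions
p <- 0.2  # chance of guessing one question correctly (1 of 5 options)

# Shortcut for the mean of a binomial distribution
expected_shortcut <- n * p

# Same value from the definition: sum of x * P(X = x) over x = 0, ..., 15
x <- 0:n
expected_definition <- sum(x * dbinom(x, size = n, prob = p))

c(expected_shortcut, expected_definition)
```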

What is the probability that you get every question correct?

# Probability of getting 15/15
0.2^15

## [1] 3.2768e-11

dbinom(x = 15, size = 15, prob = 0.2)

## [1] 3.2768e-11

Approximately 3.2768e-9%.

What is the probability that you get every question incorrect?

# Probability of getting 0/15
dbinom(x = 0, size = 15, prob = 0.2)

## [1] 0.03518437

Approximately 3.5%.

What is the probability of getting exactly 10 questions correct?

# Probability of getting 10/15
dbinom(x = 10, size = 15, prob = 0.2)

## [1] 0.000100764

Approximately 0.01%.

What is the probability of getting 10 or more correct answers?

# Probability of getting 10 or more correct
pbinom(q = 9, size = 15, prob = 0.2, lower.tail = FALSE)

## [1] 0.0001132257

1 - pbinom(q = 9, size = 15, prob = 0.2)

## [1] 0.0001132257

Approximately 0.011%.
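
A third way to get "10 or more" is to add up the individual probabilities for 10 through 15 correct. The sketch below compares that sum with the pbinom() upper tail; `tail_sum` and `tail_cdf` are illustrative names.

```r
# Add up P(X = 10), P(X = 11), ..., P(X = 15)
tail_sum <- sum(dbinom(x = 10:15, size = 15, prob = 0.2))

# Upper tail from pbinom(); q = 9 because lower.tail = FALSE gives P(X > 9)
tail_cdf <- pbinom(q = 9, size = 15, prob = 0.2, lower.tail = FALSE)

# TRUE if the two approaches agree up to floating-point tolerance
all.equal(tail_sum, tail_cdf)
```

The q = 9 detail matters: using q = 10 with lower.tail = FALSE would give P(X > 10) and leave out the case of exactly 10 correct.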

Suppose a student claims they guessed randomly but got 10 out of 15 correct. Based on your probability above, do you believe this claim? Explain your reasoning. (There is no single correct answer, but your reasoning must use the probability you calculated.)

It’s possible that someone could get 10/15 by guessing, although it is not very realistic. The probability of 10 or more correct is about 0.011%, or roughly a 1 in 9,000 chance, so it can definitely happen to SOMEONE, just not very often at all.

If you need a grade of 80% or higher on this quiz to maintain a passing grade, what is the probability of you maintaining that passing grade?

# Probability of scoring 80% or higher
15 * 0.8

## [1] 12

pbinom(q = 11, size = 15, prob = 0.2, lower.tail = FALSE)

## [1] 1.011253e-06

A grade of 80% requires at least 12 of 15 correct. If guessing, the probability of maintaining a passing grade is 1.011253e-04% (very small).
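
The passing threshold does not always land on a whole number of questions, so a slightly more general version rounds up with ceiling() before taking the upper tail. This is a sketch with hypothetical variable names; the percentage is kept as a whole number (80) to avoid floating-point surprises when rounding.

```r
n_questions <- 15
pass_pct    <- 80  # passing grade, in percent

# Smallest whole number of correct answers that reaches the passing grade
min_correct <- ceiling(n_questions * pass_pct / 100)

# P(X >= min_correct) is the same as P(X > min_correct - 1)
p_pass <- pbinom(q = min_correct - 1, size = n_questions, prob = 0.2,
                 lower.tail = FALSE)
p_pass
```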

Question 5

A company schedules 10 employees for a shift. Each employee independently shows up with probability: p = 0.85

Let X = number of employees who show up

The company needs at least 8 workers to operate normally.

What is the probability that fewer than 8 employees show up?

# Probability that fewer than 8 employees show up
pbinom(q = 7, size = 10, prob = 0.85)

## [1] 0.1798035

There is an 18% chance that fewer than 8 (7 or fewer) show up.

What is the probability the company has enough workers for this shift?

# Probability of 8 or more employees
pbinom(q = 7, size = 10, prob = 0.85, lower.tail = FALSE)

## [1] 0.8201965

There is an 82% chance that the company has enough workers.

Explain what this probability means in the context of scheduling workers.

With 10 employees scheduled, the company will have enough workers on about 82% of shifts. On the other 18% or so, too few show up, so there is a real risk of being short-staffed.

Management wants at least a 95% chance of having enough workers. Should they schedule more than 10 employees? Explain your reasoning.

They should schedule more than 10 employees. The current 82% chance falls short of the 95% target, and scheduling extra employees raises the probability that at least 8 show up.
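
How many employees would be enough can be explored by repeating the same pbinom() calculation for larger schedule sizes. This is a sketch that assumes each employee still shows up independently with probability 0.85; the candidate range 10 to 15 and the variable names are chosen only for illustration.

```r
p_show   <- 0.85   # chance each employee shows up
needed   <- 8      # workers required to operate normally
n_values <- 10:15  # candidate schedule sizes (illustrative range)

# P(X >= 8) for each schedule size, computed as P(X > 7)
p_enough <- pbinom(q = needed - 1, size = n_values, prob = p_show,
                   lower.tail = FALSE)
names(p_enough) <- n_values
round(p_enough, 4)

# Smallest schedule size that reaches the 95% target
min(n_values[p_enough >= 0.95])
```

Under these assumptions, scheduling 11 employees gives roughly a 93% chance of having enough workers, and 12 is the first schedule size that clears 95%.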

Question 6

ACT scores are approximately normally distributed where: X ∼ N(21,5)

Use R to simulate 10,000 ACT scores.

# Generates 10,000 ACT scores
ACTscores <- rnorm(n = 10000, mean = 21, sd = 5)

Find what percent of your simulated ACT scores were above 30

pnorm(q = 30, mean = mean(ACTscores), sd = sd(ACTscores), lower.tail = FALSE)

## [1] 0.03564294

About 3.56% of scores are above 30, based on a normal curve that uses the simulated scores’ mean and standard deviation.
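
The pnorm() call above plugs the sample mean and standard deviation into a normal curve. A more direct reading of "percent of simulated scores above 30" is to count how many simulated values exceed 30. The sketch below does that; set.seed(123) is an arbitrary choice added only so the run is reproducible, and `sim_scores` is a fresh simulation, so its result will not match the unseeded figures above exactly.

```r
set.seed(123)  # arbitrary seed, only for reproducibility
sim_scores <- rnorm(n = 10000, mean = 21, sd = 5)

# sim_scores > 30 is TRUE/FALSE for each score;
# the mean of a logical vector is the proportion of TRUE values
prop_above_30 <- mean(sim_scores > 30)
prop_above_30
```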

Now compute the theoretical probability of getting an ACT score above 30 using pnorm().

pnorm(q = 30, mean = 21, sd = 5, lower.tail = FALSE)

## [1] 0.03593032

About 3.59% of scores are above 30.

Compare the two values. Why are they similar but not identical?

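The simulated scores are a random sample, so their mean and standard deviation differ slightly from the theoretical values of 21 and 5. That sampling variability produces a slightly different probability, though with 10,000 scores the two values stay close.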
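
Repeating the simulation several times makes this variability visible. The sketch below reruns the 10,000-score simulation 200 times and records the proportion above 30 each time; the seed and the number of repetitions are arbitrary choices for illustration.

```r
set.seed(2024)  # arbitrary seed, only for reproducibility

# Repeat the 10,000-score simulation 200 times,
# recording the proportion of scores above 30 in each run
props <- replicate(200, mean(rnorm(n = 10000, mean = 21, sd = 5) > 30))

theoretical <- pnorm(q = 30, mean = 21, sd = 5, lower.tail = FALSE)

range(props)  # individual runs scatter around the theoretical value
mean(props)   # their average sits close to it
theoretical
```

Each run gives a slightly different proportion, but the runs cluster around the theoretical probability.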

Question 7

Create your own real-world situation that could be modeled using either a binomial distribution or a normal distribution.

Your problem must include:
* A description of the situation
* Identification of reasonable parameters (mean, sd OR n, p)
* One probability calculation in R
* A written interpretation of the result

Examples might include:
* basketball free throws
* weather events
* exam scores
* products being defective

A pizza place will give you the pizza for free if it takes longer than 25 minutes to deliver. Delivery to your home takes an average of 16 minutes, with a standard deviation of 5 minutes, and delivery times are assumed to be approximately normally distributed. What is the probability of getting a free pizza?

# Probability of delivery taking longer than 25 minutes
pnorm(q = 25, mean = 16, sd = 5, lower.tail = FALSE)

## [1] 0.03593032

There is approximately a 3.59% chance of getting a free pizza, or about 1 in every 28 orders.
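
The "orders per free pizza" figure comes from taking the reciprocal of the probability. A short sketch, with `p_free` and `z_pizza` as illustrative names:

```r
# Probability a delivery takes longer than 25 minutes
p_free <- pnorm(q = 25, mean = 16, sd = 5, lower.tail = FALSE)

# On average, one free pizza per this many orders
1 / p_free

# The cutoff is 1.8 standard deviations above the mean, the same z-score as
# an ACT score of 30 in Question 6, which is why the two probabilities match
z_pizza <- (25 - 16) / 5
z_pizza
```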