Project #3 - Normal and Binomial Distributions

Purpose

In this project, students will demonstrate their understanding of probability and the normal and binomial distributions.

Question 1

IQ scores are approximately normally distributed with: X ∼ N(μ=100,σ=15)

What proportion of the population has an IQ greater than 65? Interpret the result in context in a complete sentence.

pnorm(q=65, mean=100, sd=15, lower.tail=FALSE)

## [1] 0.9901847

Using the normal distribution function pnorm() with a mean IQ of 100 and a standard deviation of 15, and taking the upper tail (lower.tail=FALSE) to exclude scores below 65, we find that just over 99% of the population has an IQ score above 65.
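As a cross-check, the same tail probability can be obtained by standardizing first. The sketch below converts 65 to a z-score and evaluates the standard normal upper tail; the variable names are illustrative only.

```r
# Parameters from the problem statement
mu <- 100
sigma <- 15

# Standardize the cutoff: z = (x - mu) / sigma
z <- (65 - mu) / sigma

# Upper-tail probability on the standard normal scale
p_z <- pnorm(z, lower.tail = FALSE)

# Same probability computed directly on the IQ scale
p_direct <- pnorm(65, mean = mu, sd = sigma, lower.tail = FALSE)

print(c(z = z, p_z = p_z, p_direct = p_direct))
```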

What IQ score represents the top 5% of the population? Explain in a sentence what this value means in plain language.

qnorm(p=0.95, mean=100, sd=15)

## [1] 124.6728

The qnorm() function returns the value at a given percentile of the distribution. Setting p to 0.95 gives the score at the 95th percentile. With a mean of 100 and a standard deviation of 15, the IQ score at the 95th percentile is 124.67.
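One way to check a quantile is to feed it back into pnorm(), since qnorm() and pnorm() are inverses of each other. A minimal sketch, with illustrative variable names:

```r
mu <- 100
sigma <- 15

# Score at the 95th percentile
cutoff <- qnorm(0.95, mean = mu, sd = sigma)

# Feeding the cutoff back into pnorm() should recover 0.95
p_below <- pnorm(cutoff, mean = mu, sd = sigma)

# Equivalent call phrased as "top 5%" using the upper tail
cutoff_upper <- qnorm(0.05, mean = mu, sd = sigma, lower.tail = FALSE)

print(c(cutoff = cutoff, p_below = p_below, cutoff_upper = cutoff_upper))
```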

The score of about 124.67 is the minimum required to be in the top 5% of IQ scores.

Question 2

Recall our definition: A value is considered unusual if it lies more than two standard deviations from the mean.

Find the IQ values that mark the lower and upper bounds of the “usual” range.

#lower range 
100-30

## [1] 70

#upper range 
100+30

## [1] 130
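The same bounds can be written in terms of the parameters rather than hard-coded numbers, which makes the two-standard-deviation rule explicit. A small sketch with illustrative names:

```r
mu <- 100
sigma <- 15
k <- 2  # number of standard deviations that defines "unusual"

# Lower and upper bounds of the "usual" range: mu - k*sigma and mu + k*sigma
usual_range <- mu + c(-1, 1) * k * sigma
print(usual_range)
```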

Multiplying the standard deviation of 15 by two, then subtracting the result from and adding it to the mean of 100, gives the lower and upper bounds of the “usual” IQ range: 70 and 130.

b. What proportion of the population falls outside this range?

pnorm(130,100,15)-pnorm(70,100,15)

## [1] 0.9544997

1 - 0.9544997

## [1] 0.0455003

Using pnorm to calculate the proportion of the population that is inside the usual range of values, and subtracting from the whole proportion (1) shows that 4.55% of individuals have IQ scores outside this range.
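Because the normal distribution is symmetric and the bounds sit exactly two standard deviations either side of the mean, the outside proportion can also be computed as twice one tail. A sketch under that assumption:

```r
mu <- 100
sigma <- 15

# Proportion outside the usual range, as 1 minus the proportion inside
outside <- 1 - (pnorm(130, mu, sigma) - pnorm(70, mu, sigma))

# By symmetry, this equals twice the lower tail below 70
outside_sym <- 2 * pnorm(70, mu, sigma)

# On the standard normal scale, 70 corresponds to z = -2
outside_z <- 2 * pnorm(-2)

print(c(outside = outside, outside_sym = outside_sym, outside_z = outside_z))
```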

Question 3

Two students took different standardized tests.

Alex took the SAT and scored 1650. Taylor took the ACT and scored 27.

Assume the distributions:
SAT∼N(1500,300)
ACT∼N(21,5)

Compute the z-score for each student.

(1650-1500)/300

## [1] 0.5

(27-21)/5

## [1] 1.2

The z-score is calculated by subtracting the mean from the data value and dividing by the standard deviation. SAT scores have a mean of 1500 and a standard deviation of 300, and ACT scores have a mean of 21 and a standard deviation of 5.

b. Which student performed better relative to other test-takers?

Taylor performed better relative to other test-takers: Taylor's score is 1.2 standard deviations above the mean, while Alex's score is only 0.5 standard deviations above the mean.

c. Explain why comparing the raw scores alone would be misleading.

Comparing raw scores would be misleading because the two tests use different scales. SAT scores have a much larger mean and spread (standard deviation of 300) than ACT scores (standard deviation of 5), so a raw score only has meaning relative to its own test's distribution.
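The comparison can be wrapped in a small helper. The function name z_score is hypothetical, introduced only for illustration; converting each z-score to a percentile with pnorm() puts both students on a common scale.

```r
# Hypothetical helper: number of standard deviations x lies from the mean
z_score <- function(x, mu, sigma) {
  (x - mu) / sigma
}

z_alex <- z_score(1650, mu = 1500, sigma = 300)   # SAT
z_taylor <- z_score(27, mu = 21, sigma = 5)       # ACT

# Proportion of test-takers scoring below each student
pct_alex <- pnorm(z_alex)
pct_taylor <- pnorm(z_taylor)

print(c(z_alex = z_alex, z_taylor = z_taylor))
print(round(c(pct_alex = pct_alex, pct_taylor = pct_taylor), 3))
```

Under these assumptions Alex lands at roughly the 69th percentile and Taylor at roughly the 88th.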

Question 4

You are taking a 15-question multiple choice quiz and each question has 5 options (a,b,c,d,e) and you randomly guess every question.

n=15 p=0.2 p’=0.8

a. How many questions do you expect to answer correctly on average?
E(X)=np

15*0.2

## [1] 3

n represents the 15 questions on the quiz, and there is a 20% chance of answering a question correctly by guessing, so we expect to get three questions correct if all 15 questions are answered by guessing.
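The formula E(X)=np can be checked against the definition of an expected value, the sum of each outcome times its probability. A minimal sketch:

```r
n <- 15
p <- 0.2

# Shortcut formula for the binomial mean
expected_formula <- n * p

# Definition: sum over all outcomes x of x * P(X = x)
x <- 0:n
expected_sum <- sum(x * dbinom(x, size = n, prob = p))

print(c(expected_formula = expected_formula, expected_sum = expected_sum))
```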
b. What is the probability that you get every question correct?

dbinom(x=15, size=15, prob=0.2)

## [1] 3.2768e-11

dbinom() gives the probability of an exact number of successes, using the probability of getting a single question right (prob), the total number of questions (size), and the number of correct answers (x). Getting every question correct by guessing is practically impossible: the probability is about 3.3 × 10^-11.

What is the probability that you get every question incorrect?

dbinom(x=0, size=15, prob=0.2)

## [1] 0.03518437

Setting x=0 gives the probability of getting every question incorrect, about 3.5%. This is far more likely than getting every question correct, although still unlikely.

What is the probability of getting exactly 10 questions correct?

dbinom(x=10, size=15, prob=0.2)

## [1] 0.000100764

Setting x=10 gives the probability of getting exactly 10 questions correct, about 0.01%.
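dbinom() evaluates the binomial formula P(X = x) = choose(n, x) p^x (1-p)^(n-x). The sketch below writes that formula out by hand; binom_pmf is a hypothetical name used only for this illustration.

```r
# Hypothetical helper implementing the binomial probability formula directly
binom_pmf <- function(x, n, p) {
  choose(n, x) * p^x * (1 - p)^(n - x)
}

n <- 15
p <- 0.2

# Compare the hand-written formula with dbinom() for 0, 10, and 15 correct
by_hand <- binom_pmf(c(0, 10, 15), n, p)
built_in <- dbinom(c(0, 10, 15), size = n, prob = p)

print(rbind(by_hand, built_in))
```

For x = 15 the formula reduces to 0.2^15, and for x = 0 it reduces to 0.8^15.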

What is the probability of getting 10 or more correct answers?

1-pbinom(q=9, size=15, prob=0.2)

## [1] 0.0001132257

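This is the probability of getting at least 10 of the 15 questions correct, about 0.011%. It is computed as 1 minus the probability of getting 9 or fewer correct.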
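“At least 10” is an upper-tail probability, and there are a few equivalent ways to get it in R. A sketch comparing them:

```r
n <- 15
p <- 0.2

# Complement of "9 or fewer"
tail_complement <- 1 - pbinom(9, size = n, prob = p)

# Upper tail requested directly: P(X > 9)
tail_upper <- pbinom(9, size = n, prob = p, lower.tail = FALSE)

# Sum of the individual probabilities for 10, 11, ..., 15
tail_sum <- sum(dbinom(10:n, size = n, prob = p))

print(c(tail_complement, tail_upper, tail_sum))
```

Note the cutoff of 9 rather than 10: pbinom(q) includes q itself, so P(X ≥ 10) is 1 − P(X ≤ 9).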

Suppose a student claims they guessed randomly but got 10 out of 15 correct. Based on your probability above, do you believe this claim? Explain your reasoning. (There is no single correct answer, but your reasoning must use the probability you calculated.)

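0.03518437/0.000100764

## [1] 349.176

You are almost 350 times more likely to get every question wrong than to get exactly 10 of 15 correct by guessing. Since the probability of getting 10 or more correct by guessing is only about 0.0001, the claim is hard to believe.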

If you need a grade of 80% or higher on this quiz to maintain a passing grade, what is the probability of you maintaining that passing grade?

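1-pbinom(q=11, size=15, prob=0.2)

## [1] 1.011253e-06

A grade of 80% means at least 12 of the 15 questions correct, so this is 1 minus the probability of getting 11 or fewer correct. The probability of passing by guessing is about 0.0001%.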
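The cutoff for a passing grade can be derived rather than counted by hand. The sketch below assumes “80% or higher” means at least 80% of the 15 questions, which works out to 12 correct answers, so the complement has to be taken at 11.

```r
n <- 15
p <- 0.2

# Smallest number of correct answers that reaches 80%
# (1200 / 100 is exact, which avoids floating-point rounding surprises)
min_correct <- ceiling(n * 80 / 100)

# P(X >= min_correct) = P(X > min_correct - 1)
p_pass <- pbinom(min_correct - 1, size = n, prob = p, lower.tail = FALSE)

# Cross-check by summing the individual probabilities
p_pass_sum <- sum(dbinom(min_correct:n, size = n, prob = p))

print(min_correct)
print(c(p_pass = p_pass, p_pass_sum = p_pass_sum))
```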

Question 5

A company schedules 10 employees for a shift. Each employee independently shows up with probability: p = 0.85

Let X = number of employees who show up

The company needs at least 8 workers to operate normally.

What is the probability that fewer than 8 employees show up?

pbinom(q=7, size=10, prob=0.85)

## [1] 0.1798035

The probability that 7 or fewer employees show up is approximately 18%.

b. What is the probability the company has enough workers for this shift?

1-pbinom(q=7, size=10, prob=0.85)

## [1] 0.8201965

Subtracting the probability that fewer than 8 employees show up from 1 (100%) gives about 82%.

c. Explain what this probability means in the context of scheduling workers.

About 82 percent of the time, enough employees will show up for the company to operate normally, while about 18 percent of the time it will be understaffed.

d. Management wants at least a 95% chance of having enough workers. Should they schedule more than 10 employees? Explain your reasoning.

The current probability of 0.82 is below management's target of 0.95. The risk of understaffing is still too high, so more than 10 employees should be scheduled.
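To see how many employees would need to be scheduled, the same calculation can be repeated for larger shift sizes. This is a sketch that assumes each scheduled employee still shows up independently with probability 0.85; the variable names are illustrative.

```r
p_show <- 0.85
needed <- 8
target <- 0.95

# Candidate numbers of scheduled employees
scheduled <- 10:15

# P(at least `needed` show up) for each candidate size
p_enough <- pbinom(needed - 1, size = scheduled, prob = p_show, lower.tail = FALSE)

print(data.frame(scheduled, p_enough = round(p_enough, 4)))

# Smallest schedule size that meets the target
min_scheduled <- scheduled[which(p_enough >= target)[1]]
print(min_scheduled)
```

Under these assumptions, scheduling 11 employees still falls short of the target (about 0.93), while scheduling 12 meets it (about 0.98).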

Question 6

ACT scores are approximately normally distributed where: X ∼ N(21,5)

a. Use R to simulate 10,000 ACT scores.

ACT<-rnorm(n=10000, mean=21,sd=5)
hist(ACT)

Find what percent of your simulated ACT scores were above 30

mean(ACT)

## [1] 20.9982

sd(ACT)

## [1] 5.063722

pnorm(q=30, mean=21.05569, sd= 5.01006 )

## [1] 0.9628912

1-pnorm(q=30, mean=21.05569, sd= 5.01006 )

## [1] 0.03710876

First, find the probability of an ACT score below 30, then subtract that value from 1 (100%) to get the probability of a score above 30: about 3.7%.
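The percent of simulated scores above 30 can also be read directly off the simulated values, without going through pnorm(). A minimal sketch; the seed value is arbitrary and only there to make the run reproducible, so its numbers will not match the output shown in this report.

```r
set.seed(123)  # arbitrary seed, chosen only for reproducibility

ACT_sim <- rnorm(n = 10000, mean = 21, sd = 5)

# ACT_sim > 30 is a logical vector; its mean is the proportion of TRUEs
prop_above_30 <- mean(ACT_sim > 30)

# Theoretical value for comparison
theoretical <- pnorm(30, mean = 21, sd = 5, lower.tail = FALSE)

print(c(simulated = prop_above_30, theoretical = theoretical))
```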

Now compute the theoretical probability of getting an ACT score above 30 using pnorm().

1-pnorm(q=30,mean=21,sd=5)

## [1] 0.03593032

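Compare the two values. Why are they similar but not identical?

The simulated scores are a random sample, so their mean and standard deviation differ slightly from the mean (21) and standard deviation (5) used to generate them. The estimate based on the sample therefore lands close to, but not exactly on, the theoretical value.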

Question 7

Create your own real-world situation that could be modeled using either a binomial distribution or a normal distribution.

Your problem must include:
* A description of the situation
* Identification of reasonable parameters (mean, sd OR n, p)
* One probability calculation in R
* A written interpretation of the result

Examples might include:
* basketball free throws
* weather events
* exam scores
* products being defective

Wilyer Abreu averages a home run in about 20 percent of his at bats, and in a typical game he gets five at bats.

We can model the number of home runs he’ll hit in that game.

A binomial distribution fits because:
* There is a fixed number of trials (n=5 at bats)
* Each at bat is either a home run or not
* The probability is constant (p=0.2)

Probability he hits at least 3 home runs in a game?

1-pbinom(2, size=5, prob=0.2)

## [1] 0.05792

There is about a 5.8 percent chance Abreu will hit 3 or more home runs in 5 at bats, which is unlikely, but can be achieved if he has a statistically unusual game.
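Listing the whole distribution makes it easier to see which outcomes contribute to a tail probability. A sketch using the same assumed parameters (n=5, p=0.2):

```r
n <- 5
p <- 0.2

# Probability of each possible number of home runs, 0 through 5
home_runs <- 0:n
prob <- dbinom(home_runs, size = n, prob = p)
print(data.frame(home_runs, prob))

# pbinom(q) covers 0..q, so "at least 3" is the complement of "2 or fewer",
# while the complement of "3 or fewer" is "more than 3" (4 or 5)
p_at_least_3 <- 1 - pbinom(2, size = n, prob = p)
p_more_than_3 <- 1 - pbinom(3, size = n, prob = p)
print(c(p_at_least_3 = p_at_least_3, p_more_than_3 = p_more_than_3))
```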