Project #3 - Normal and Binomial Distributions

Purpose

In this project, students will demonstrate their understanding of probability and the normal and binomial distributions.

Question 1

Assume IQ scores are normally distributed with a mean of 100 and a standard deviation of 15. If a person is randomly selected, find each of the requested probabilities. Here, x denotes the IQ of the randomly selected person.

P(x > 65)

#Probability of a randomly selected person scoring greater than 65 IQ score.  
pnorm(q = 65, mean = 100, sd = 15, lower.tail = FALSE)

## [1] 0.9901847

The probability is 99.02%, rounded to two decimal places.

P(x < 150)

#Probability of a randomly selected person scoring less than 150 IQ score. 
pnorm(q = 150, mean = 100, sd = 15)

## [1] 0.9995709

99.96% probability rounded to two decimal places.
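
As a cross-check on the two results above, here is a minimal sketch that assumes only base R's pnorm. It illustrates that an upper-tail probability equals one minus the lower-tail probability, and that standardizing to a z-score first gives the same answer. The variable names are illustrative and not part of the assignment.

```r
# Parameters from Question 1
iq_mean <- 100
iq_sd <- 15

# P(x > 65): upper tail directly, and as the complement of the lower tail
upper_direct <- pnorm(q = 65, mean = iq_mean, sd = iq_sd, lower.tail = FALSE)
upper_complement <- 1 - pnorm(q = 65, mean = iq_mean, sd = iq_sd)

# P(x < 150): the same probability computed on the standard normal scale
lower_direct <- pnorm(q = 150, mean = iq_mean, sd = iq_sd)
lower_standardized <- pnorm(q = (150 - iq_mean) / iq_sd)

# Both comparisons should report TRUE
all.equal(upper_direct, upper_complement)
all.equal(lower_direct, lower_standardized)
```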

Question 2

Assume the same mean and standard deviation of IQ scores that was described in question 1.

A high school offers a special program for gifted students. In order to qualify, students must have IQ scores in the top 5%. What is the minimum qualifying IQ?

#To find the minimum qualifying IQ in the top 5%.
qnorm(p = 0.05, mean = 100, sd = 15, lower.tail = FALSE)

## [1] 124.6728

The minimum qualifying IQ is about 124.67.

If one person is randomly selected, what is the probability that their IQ score is greater than 110?

#To find the probability that a randomly selected person will score an IQ greater than 110.
pnorm(q = 110, mean = 100, sd = 15, lower.tail = FALSE)

## [1] 0.2524925

This probability is 25.25%, rounded to two decimal places.
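
One way to sanity-check a qnorm result is to feed the cutoff back into pnorm and confirm that it recovers the tail probability. The sketch below assumes the same mean and standard deviation as above; `cutoff`, `tail_prob`, and `cutoff_alt` are illustrative names.

```r
# Cutoff for the top 5% of IQ scores
cutoff <- qnorm(p = 0.05, mean = 100, sd = 15, lower.tail = FALSE)

# Round trip: the upper-tail probability at the cutoff should be 0.05
tail_prob <- pnorm(q = cutoff, mean = 100, sd = 15, lower.tail = FALSE)

# Equivalent formulation: the top 5% begins at the 95th percentile
cutoff_alt <- qnorm(p = 0.95, mean = 100, sd = 15)

tail_prob
all.equal(cutoff, cutoff_alt)
```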

Question 3

Still using the mean and standard deviation from question 1, what is the z-score for an IQ of 140?
The z-score is the data value minus the mean, divided by the standard deviation, so for an IQ of 140 we can find it as follows.

#Store the mean and standard deviation of the IQ scores.
IQ_mean <- 100
IQ_sd <- 15

#Use stored values to compute the z-score.
(140 - IQ_mean)/IQ_sd

## [1] 2.666667

We mentioned in week 6 that a data value is considered “unusual” if it lies more than two standard deviations from the mean. Is an IQ of 140 considered unusual?
With a mean of 100 and a standard deviation of 15, two standard deviations above the mean is 100 + 2 * 15 = 130. Since 140 > 130 (equivalently, the z-score of 2.67 is greater than 2), an IQ of 140 is considered unusual.
What is the probability of getting an IQ greater than 140?

#To find the probability of getting an IQ greater than 140.
pnorm(q = 140, mean = 100, sd = 15, lower.tail = FALSE)

## [1] 0.003830381

The probability of getting an IQ greater than 140 is about 0.383%.
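
The z-score calculation and the "more than two standard deviations" rule can be wrapped in small helper functions. This is a minimal sketch; `z_score` and `is_unusual` are hypothetical helper names, and the default cutoff of 2 is taken from the rule quoted in the question.

```r
# z-score: how many standard deviations a value lies from the mean
z_score <- function(x, mu, sigma) {
  (x - mu) / sigma
}

# A value is flagged as unusual if it lies more than `cutoff`
# standard deviations from the mean, in either direction
is_unusual <- function(x, mu, sigma, cutoff = 2) {
  abs(z_score(x, mu, sigma)) > cutoff
}

z_140 <- z_score(140, mu = 100, sigma = 15)
z_140
is_unusual(140, mu = 100, sigma = 15)  # 140 is beyond 130, so TRUE
is_unusual(125, mu = 100, sigma = 15)  # 125 is within 70 to 130, so FALSE
```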

Question 4

You are taking a 15-question multiple-choice quiz. Each question has 5 options (a, b, c, d, e), and you randomly guess on every question.

How many questions do you expect to answer correctly on average?

#Expected number correct = n * p, the mean of a binomial distribution.
15 * .2

## [1] 3

What is the probability that you get every question correct?

#To find the probability of 15/15.
dbinom(x = 15, size = 15, prob = .2)

## [1] 3.2768e-11

What is the probability that you get every question incorrect?

#To find the probability of 0/15.
dbinom(x = 0, size = 15, prob = .2)

## [1] 0.03518437

There is about a 3.52% chance that every question is answered incorrectly.
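
The dbinom values above can be reproduced from the binomial probability formula, choose(n, k) * p^k * (1 - p)^(n - k). The sketch below assumes n = 15 and p = 0.2 as in the question; `binom_pmf` is a hypothetical helper written only for this comparison.

```r
n <- 15
p <- 0.2

# Binomial probability of exactly k successes, written out by hand
binom_pmf <- function(k, n, p) {
  choose(n, k) * p^k * (1 - p)^(n - k)
}

# Hand-written formula versus dbinom for "all correct" and "all incorrect"
all.equal(binom_pmf(15, n, p), dbinom(x = 15, size = n, prob = p))
all.equal(binom_pmf(0, n, p), dbinom(x = 0, size = n, prob = p))

# The expected value n * p is also the probability-weighted sum of outcomes
expected_correct <- sum(0:n * dbinom(x = 0:n, size = n, prob = p))
expected_correct
```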

Question 5

Consider again the 15-question multiple-choice quiz in which each question has 5 options (a, b, c, d, e) and you randomly guess on every question.

How many questions does one need to answer correctly in order to score exactly 60%?

#We can find this simply by multiplying the percentage by the number of questions.
15 * .6

## [1] 9

If a grade of 60% or lower is considered failing, then what is the probability of you failing?

#To find the probability of scoring 60% or lower (9 or fewer correct).
pbinom(q = 9, size = 15, prob = .2)

## [1] 0.9998868

If you need a grade of 80% or higher on this quiz to maintain a passing grade, what is the probability of you maintaining that passing grade?

#To find the probability of scoring an 80% or higher.

#First let's find the number of questions that corresponds to a score of 80%...
15 * .8

## [1] 12

#At least 12 correct means more than 11 correct, so use q = 11 with lower.tail = FALSE.
pbinom(q = 11, size = 15, prob = .2, lower.tail = FALSE)

## [1] 1.011253e-06
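
Because pbinom with lower.tail = FALSE returns P(X > q), an "at least k correct" probability corresponds to q = k - 1. The sketch below applies this to the 80% requirement on a 15-question quiz, which works out to 12 correct answers. The helper `prob_at_least` is a hypothetical name, and the result is cross-checked against a direct sum of dbinom terms.

```r
n_questions <- 15
p_guess <- 0.2

# Number of correct answers needed for a score of 80% or higher
needed <- ceiling(80 * n_questions / 100)

# P(X >= k) expressed as P(X > k - 1)
prob_at_least <- function(k, n, p) {
  pbinom(q = k - 1, size = n, prob = p, lower.tail = FALSE)
}

p_pass <- prob_at_least(needed, n_questions, p_guess)

# Cross-check: add up the individual probabilities for 12, 13, 14, 15 correct
p_pass_sum <- sum(dbinom(x = needed:n_questions, size = n_questions, prob = p_guess))

needed
p_pass
all.equal(p_pass, p_pass_sum)
```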

Question 6

Suppose you own a catering company. You hire local college students as servers. Not being the most reliable employees, there is an 80% chance that any one server will actually show up for a scheduled event. For a wedding scheduled on Saturday, you need at least 5 servers.

Suppose you schedule 5 employees. What is the probability that all 5 come to work?

#To find the probability that all 5 show up to work. 
dbinom(x = 5, size = 5, prob = .8)

## [1] 0.32768

Suppose you schedule 7 employees. What is the probability that at least 5 come to work?

#To find the probability that at least 5 of the 7 come to work. 
pbinom(q = 4, size = 7, prob = .8, lower.tail = FALSE)

## [1] 0.851968

It is really important that you have at least 5 servers show up! How many employees should you schedule in order to be 99% confident that at least 5 show up? Hint: there is no single formula for the answer here, so maybe use some kind of trial and error method.

#We will use trial and error: starting from 5 scheduled employees, increase the number scheduled until the probability that at least 5 show up reaches 99%.

# Define the parameters
p <- 0.8  # Probability that each server shows up
min_servers_needed <- 5  # Minimum number of servers needed
confidence_level <- 0.99  # Confidence level

# Function to calculate the probability of at least k successes in n trials
prob_at_least_k <- function(k, n, p) {
  sum(dbinom(k:n, n, p))
}

# Function to find the number of employees needed to be confident that at least k servers show up
find_min_employees <- function(k, p, confidence_level) {
  n <- k  # Start with the minimum number of servers needed
  while (TRUE) {
    probability <- prob_at_least_k(k, n, p)
    if (probability >= confidence_level) {
      return(n)
    }
    n <- n + 1
  }
}

# Calculate the number of employees needed
min_employees <- find_min_employees(min_servers_needed, p, confidence_level)
cat("You should schedule at least", min_employees, "employees to be 99% confident that at least", min_servers_needed, "servers show up.")

## You should schedule at least 10 employees to be 99% confident that at least 5 servers show up.

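Scheduling 10 employees gives a probability of about 99.4% that at least 5 servers show up. With only 9 scheduled, that probability is about 98.0%, which falls short of the 99% target.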
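
The trial-and-error search can also be laid out as a table, since pbinom accepts a vector for its size argument. This is a minimal sketch; the range of candidate schedule sizes (5 through 12) and the column names are arbitrary choices made for illustration.

```r
p_show <- 0.8   # probability that each server shows up
needed <- 5     # minimum number of servers required

# Candidate numbers of employees to schedule
candidates <- 5:12

# P(at least 5 show up) for each candidate, computed as P(X > 4)
prob_enough <- pbinom(q = needed - 1, size = candidates, prob = p_show,
                      lower.tail = FALSE)

results <- data.frame(scheduled = candidates,
                      prob_at_least_5 = round(prob_enough, 4))
print(results)

# Smallest schedule size that reaches the 99% target
min_scheduled <- candidates[which(prob_enough >= 0.99)[1]]
min_scheduled
```

Under these assumptions, the first row to reach 0.99 is the one for 10 scheduled employees, which agrees with the loop-based search.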

Question 7

Generate a random sample of 10,000 numbers from a normal distribution with mean of 51 and standard deviation of 7. Store that data in object called rand_nums.

#Function to generate the random sample. 
rand_nums <- rnorm(n = 10000, mean = 51, sd = 7)

Create a histogram of that random sample.

#Function to create histogram of rand_nums.
hist(rand_nums)

Question 8

How many values in your rand_nums vector are below 40?

# Threshold
threshold <- 40

# Count the number of values in rand_nums below the threshold
count_below_threshold <- sum(rand_nums < threshold)

# Output the result
cat("Number of values below the threshold:", count_below_threshold)

## Number of values below the threshold: 599

For a theoretical normal distribution, how many of those 10,000 values would you expect to be below 40?
A value of 40 is (40 - 51)/7 ≈ -1.57 standard deviations from the mean, and about 5.8% of a normal distribution lies below that point, so I would expect roughly 580 of the 10,000 values to be below 40.
Is your answer in part a reasonably close to your answer in part b?
Yes. The 599 values observed in part a are close to the roughly 580 expected in part b; the difference is small enough to be explained by random sampling variation.
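
The theoretical count asked about in this question can be computed directly with pnorm rather than estimated. The sketch below assumes the same mean of 51 and standard deviation of 7 used to generate rand_nums. The spread calculation treats the count as a binomial variable, which is an added assumption used only to judge what "reasonably close" means.

```r
n_values <- 10000

# Theoretical proportion of a Normal(51, 7) distribution below 40
p_below <- pnorm(q = 40, mean = 51, sd = 7)

# Expected number of the 10,000 values that fall below 40
expected_count <- n_values * p_below

# Typical run-to-run variation in that count (binomial standard deviation)
count_sd <- sqrt(n_values * p_below * (1 - p_below))

p_below
expected_count
count_sd
```

Under these assumptions the expected count comes out near 580, with a standard deviation of roughly 23, so an observed count within a few dozen of 580 is consistent with the theoretical distribution.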