HW 3 - Distributions

Preparing Workspace

rm(list = ls()) # Clear environment
gc()            # Clear unused memory

##          used (Mb) gc trigger (Mb) limit (Mb) max used (Mb)
## Ncells 535960 28.7    1196600   64         NA   669428 35.8
## Vcells 993890  7.6    8388608   64      16384  1851741 14.2

cat("\f")       # Clear the console

if(!is.null(dev.list())) dev.off() # Clear all plots

## null device 
##           1

Setting WD

setwd("/Users/josephmancuso/Documents/BC/Spring'24/Week 3/HW")
getwd()

## [1] "/Users/josephmancuso/Documents/BC/Spring'24/Week 3/HW"

Question 1: A researcher wishes to conduct a study of the color preferences of new car buyers. Suppose that 50% of this population prefers the color red. If 20 buyers are randomly selected, what is the probability that between 9 and 12 (both inclusive) buyers would prefer red?

#defining parameters
n.1 <- 20 # number of observations
p.red <- .5 #probability of success
x.1 <- 9:12 # number of successes

#plotting distribution
plot(x = 0:n.1, 
     y = dbinom(x = 0:n.1, size = n.1, prob = p.red),
     type = "h",
     col = "green",
     main = "Binomial Distribution - Car Color Preferences",
     sub = "n = 20, p = .5",
     xlab = "Buyers Who Prefer Red Cars",
     ylab = "Probability"
     )

#calculating probabilites for each value of x between 9-12
probabilities.1 <- dbinom(x = x.1,
                          size = n.1,
                          prob = p.red)

round(sum(probabilities.1),4)

## [1] 0.6167

There is a .6167 probability that between 9 and 12 people prefer a red car from a random sample of 20 new car buyers. Probability was computed using the Probability Mass Function.

Question 2: A quality control inspector has drawn a sample of 13 light bulbs from a recent production lot. Suppose 20% of the bulbs in the lot are defective. What is the probability that less than 6 but more than 3 bulbs from the sample are defective?

#defining parameters
n.2 <- 13 #number of observations
p.defective <- .2 #probability of success
x.2 <- 4:5 # number of successes

#plotting distribution
plot(x = 0:n.2, 
     y = dbinom(x = 0:n.2, size = n.2, prob = p.defective),
     type = "h",
     col = "green",
     main = "Binomial Distribution - Defective Lightbulbs",
     sub = "n = 13, p = .2",
     xlab = "Number of Defective Lightbulbs",
     ylab = "Probability"
     )

#calculating probabilites for each value of x less than 6 and more than 3
probabilities.2 <- dbinom(x = x.2,
                          size = n.2,
                          prob = p.defective)

round(sum(probabilities.2),4)

## [1] 0.2226

There is a .2226 probability that more than 3 and less than 6 lightbulbs are defective from the production lot. Probability was computed using the Probability Mass Function.

Question 3: The auto parts department of an automotive dealership sends out a mean of 4.2 special orders daily. What is the probability that, for any day, the number of special orders sent out will be no more than 3?

#defining parameters
lambda <- 4.2

#plotting distribution
plot(x = 0:12, 
     y = dpois(x = 0:12, lambda = lambda),
     type = "h",
     col = "green",
     main = "Poisson Distribution - Daily Special Orders",
     sub = "lambda = 4.2",
     xlab = "Number of Special Orders",
     ylab = "Probability"
     )

#calculating probabilities
probabilities.3 <- dpois(x = 0:3, lambda = lambda)
round(sum(probabilities.3),4)

## [1] 0.3954

There is a .3954 probability that there will be no more than 3 special orders in a given day. Probability was computed using the Probability Mass Function.

Question 4: A pharmacist receives a shipment of 17 bottles of a drug and has 3 of the bottles tested. If 6 of the 17 bottles are contaminated, what is the probability that less than 2 of the tested bottles are contaminated?

#defining parameters
shipment.size <- 17
q <- 1 #number of shipments
m <- 6 #number of contaminated bottles (successes)
n.3 <- shipment.size - m #number of uncontaminated bottles (failures)
k <- 3 #number of bottles tested (sample size)

#plotting distribution
plot(x = 0:k, 
     y = dhyper(x = 0:k,
                m = m,
                n = n.3,
                k = k),
     type = "h",
     col = "green",
     main = "Hypergeometric Distribution - Contaminated bottles",
     xlab = "Number of Contaminated Bottles",
     ylab = "Probability"
     )

#calculating probabilities
probabilities.4 <- dhyper(x = 0:1,
                          m = m,
                          n = n.3,
                          k = k)

round(sum(probabilities.4),4)

## [1] 0.7279

There is a .7279 probability that less than 2 of the tested bottles will be contaminated. Probability was computed using the Probability Mass Function.

Question 5: A town recently dismissed 6 employees in order to meet their new budget reductions. The town had 6 employees over 50 years of age and 19 under 50. If the dismissed employees were selected at random, what is the probability that more than 1 employee was over 50?

#defining parameters
total.employees <- 19 + 6
m <- 6 #employees 50+ years old (successes)
n.4 <- total.employees - m #employees under 50 years old (failures)
k <- 6 #number of dismissed employees (sample size)

#plotting distributions
plot(x = 0:k, 
     y = dhyper(x = 0:k,
                m = m,
                n = n.4,
                k = k),
     type = "h",
     col = "green",
     main = "Hypergeometric Distribution - Dismissed Employees",
     xlab = "Number of Dismissed Employees over 50 Years Old",
     ylab = "Probability"
     )

#calculating probabilty that more than one dimissed employee was over 50 years old
probabilities.5 <- dhyper(x = 2:6,
                          m = m,
                          n = n.4,
                          k = k)

round(sum(probabilities.5),4)

## [1] 0.4529

There is a .4529 probability that more than one of the six dismissed employees was over the age of 50. Probability was computed using the Probability Mass Function.

Question 6: The weights of steers in a herd are distributed normally. The variance is 90,000 and the mean steer weight is 800 lbs. Find the probability that the weight of a randomly selected steer is between 1040 and 1460 lbs.

#defining parameters
mu.1 <- 800 #mean
std.dev.1 <- sqrt(90000) #squaring variance to determine standard deviation

#plotting distribution
x <- seq(from =  mu.1 - 3*std.dev.1,
         to   =  mu.1 + 3*std.dev.1,
         length.out =  1000)

plot(x = x, 
     y = dnorm(x = x, mean = mu.1, sd = std.dev.1),
     type = 'l', 
     col  = 'Green', 
     lwd  = 2, 
     main = "Normal Distribution - Steer Weight",
     sub = "mean = 800, sd = 300",
     xlab = "Steer Weight (pounds)",
     ylab = "Probability",
     )
     abline(v = mu.1, col = 'red', lty = 2)

#calculating probabilities
upper.range <- 1460
lower.range <- 1040

probabilities.6 <- pnorm(q = upper.range, mean = mu.1,sd = std.dev.1) - pnorm(q = lower.range, mean = mu.1,sd = std.dev.1)

round(sum(probabilities.6),4)

## [1] 0.198

There is a .198 probability of randomly selecting a steer between 1,040 and 1,460 pounds. Probability was computed using the Cumulative Density Function given we are attempting to solve for a random variable, X, that falls within the interval 1040 to 1460 pounds.

Question 7: The diameters of ball bearings are distributed normally. The mean diameter is 106 millimeters and the standard deviation is 4 millimeters. Find the probability that the diameter of a selected bearing is between 103 and 111 millimeters.

#defining parameters
mu.2 <- 106 #mean
std.dev.2 <- 4

#plotting distribution
x <- seq(from =  mu.2 - 3*std.dev.2,
         to   =  mu.2 + 3*std.dev.2,
         length.out =  1000)
  
plot(x = x, 
     y = dnorm(x = x, mean = mu.2, sd = std.dev.2),
     type = 'l', 
     col  = 'Green', 
     lwd  = 2, 
     main = "Normal Distribution - Ball Bearing Diameter",
     sub = "mean = 106, sd = 4",
     xlab = "Ball Bearing Diameter (mm)",
     ylab = "Probability",
     )
     abline(v = mu.2, col = 'red', lty = 2)

#calculating probabilities
upper.range <- 111
lower.range <- 103

probabilities.7 <- pnorm(q = upper.range,mean = mu.2,sd = std.dev.2) - pnorm(q = lower.range,mean = mu.2,sd = std.dev.2)

round(sum(probabilities.7),4)

## [1] 0.6677

There is a .6677 probability of randomly selecting a ball bearing with a diameter between 103 and 111 millimeters. Probability was computed using the Cumulative Density Function given we are attempting to solve for a random variable, X, that falls within the interval 103 to 111 millimeters.

Question 8: The lengths of nails produced in a factory are normally distributed with a mean of 3.34 centimeters and a standard deviation of 0.07 centimeters. Find the two lengths that separate the top 3% and the bottom 3%. These lengths could serve as limits used to identify which nails should be rejected.

Round your answer to the nearest hundredth (2 decimal places), if necessary. You will have to use the quantile function1, qnorm() here. In fact, we have seen a little bit of quintiles already when we talked about median and boxplots.

#defining parameters
mu.3 <- 3.34 
std.dev.3 <- .07

#plotting distribution
x <- seq(from =  mu.3 - 3*std.dev.3,
         to   =  mu.3 + 3*std.dev.3,
         length.out =  1000)

plot(x = x, 
     y = dnorm(x = x, mean = mu.3, sd = std.dev.3),
     type = 'l', 
     col  = 'Green', 
     lwd  = 2, 
     main = "Normal Distribution - Nail Length",
     sub = "mean = 3.34, sd = .07",
     xlab = "Nail Length (cm)",
     ylab = "Probability",
     )
     abline(v = mu.3, col = 'red', lty = 2)
     abline(v = qnorm(p = .03, mean = mu.3, sd = std.dev.3), col = "blue")
     abline(v = qnorm(p = .97, mean = mu.3, sd = std.dev.3), col = "blue")

#calculating 3rd percentile
p.third.percentile <- .03
third.percentile <- qnorm(p = p.third.percentile, mean = mu.3, sd = std.dev.3)

#calculating 97th percentile
p.ninetyseventh.percentile <- .97
ninetyseventh.percentile <- qnorm(p = p.ninetyseventh.percentile, mean = mu.3, sd = std.dev.3)

#calculating limits
p <- round(c(third.percentile, ninetyseventh.percentile),2)
print(p)

## [1] 3.21 3.47

The top 3% and bottom 3% of nails produced by the factory are 3.21cm and 3.47cm respectively. The factory should reject any nail less than 3.21cm in length and over 3.47cm in length.

Question 9: A psychology professor assigns letter grades on a test according to the following scheme.

A: Top 9% of scores

B: Scores below the top 9% and above the bottom 63%

C: Scores below the top 37% and above the bottom 17%

D: Scores below the top 83% and above the bottom 8%

F: Bottom 8% of scores

Scores on the test are normally distributed with a mean of 75.8 and a standard deviation of 8.1.

Find the minimum score required for an A grade. Round your answer to the nearest whole number, if necessary.

#defining parameters
mu.4 <- 75.8 
std.dev.4 <- 8.1

#plotting distribution
x <- seq(from =  mu.4 - 3*std.dev.4,
         to   =  mu.4 + 3*std.dev.4,
         length.out =  1000)

plot(x = x, 
     y = dnorm(x = x, mean = mu.4, sd = std.dev.4),
     type = 'l', 
     col  = 'Green', 
     lwd  = 2, 
     main = "Normal Distribution - Psychology Test Scores",
     sub = "mean = 75.8, sd = 8.1",
     xlab = "Test Scores",
     ylab = "Probability",
     )
     abline(v = mu.4, col = 'red', lty = 2)

#calculating minimum score
A.score <- .91

minimum.score <- qnorm(p = A.score, mean = mu.4, sd = std.dev.4)
round(minimum.score,0)

## [1] 87

The minimum score required for an A is 87.

Question 10: Consider the probability that exactly 96 out of 155 computers will not crash in a day. Assume the probability that a given computer will not crash in a day is 61%. Approximate the (binomial) probability using the normal distribution.

#defining parameters
mu.5 <- .61 * 155 #mean = n * p
std.dev.5 <- sqrt((mu.5*(1-.61))) #sd = square root of n*p(1-p)

#plotting distribution
x <- seq(from =  mu.5 - 3*std.dev.5,
         to   =  mu.5 + 3*std.dev.5,
         length.out =  1000)

plot(x = x, 
     y = dnorm(x = x, mean = mu.5, sd = std.dev.5),
     type = 'l', 
     col  = 'Green', 
     lwd  = 2, 
     main = "Normal Distribution - Computer Performance",
     sub = "mean = 94.55 sd = 6.07",
     xlab = "Number of Working Computers",
     ylab = "Probability",
     )
     abline(v = mu.5, col = 'red', lty = 2)

#calculating probability
x <- 96
probability.10 <- dnorm(x = x, mean = mu.5, sd = std.dev.5)

round(probability.10,4)

## [1] 0.0639

There is a .0639 probability that exactly 96 out of 155 computers will not crash in a given day. Probability was computed using the Probability Density Function given we are attempting to solve for a specific value of the random variable, X.