Q 1: The 2010 U.S. Census found the chance of a household being a certain size. The data is in the table ("Households by age," 2013). Draw a histogram of the probability distribution.

Size of household   1      2      3      4      5     6     7 or more
Probability         26.7%  33.6%  15.8%  13.7%  6.3%  2.4%  1.5%

  1. State the random variable of interest for the context given in the question
  1. Draw a histogram with the household sizes (the x values) on the horizontal axis and the probabilities on the vertical axis, labeling the entire graph appropriately.
# Household sizes
x = c("1","2","3","4","5","6","7 or more")

# Probabilities (convert % to decimals)
p = c(26.7, 33.6, 15.8, 13.7, 6.3, 2.4, 1.5) / 100

# Treat "7 or more" as 7 so the sizes can be used numerically
# (this simplification slightly understates the mean and variance)
x_num = as.numeric(replace(x, x == "7 or more", "7"))

# Draw histogram (bar plot for discrete distribution)
bp = barplot(p,
        names.arg = x,
        xlab = "Household size",
        ylab = "Probability",
        main = "Probability Distribution: Household Size (2010 U.S. Census)",
        ylim = c(0, max(p) + 0.05),  # add extra space at the top
        col = "lightblue")

# Add probability labels on top
text(x = bp, y = p, labels = round(p, 3), pos = 3)
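
As a quick sanity check on the table, one can confirm that the probabilities form a valid distribution. This is a minimal sketch that re-enters the percentages from the question; the name `p_check` is introduced here for illustration only.

```r
# Probabilities from the table, converted from percent to decimals
p_check = c(26.7, 33.6, 15.8, 13.7, 6.3, 2.4, 1.5) / 100

# A valid discrete distribution has non-negative probabilities that sum to 1;
# all.equal() compares with a small tolerance to absorb floating-point error
all(p_check >= 0)
isTRUE(all.equal(sum(p_check), 1))
```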

  1. Find the mean of the distribution given above.
mean_x = sum(x_num * p)
mean_x
## [1] 2.525
  1. Find the variance of the distribution above.
variance_x = sum((x_num - mean_x)^2 * p)
variance_x
## [1] 2.023375
  1. Find the standard deviation of the random variable.
sd_x = sqrt(variance_x)
sd_x
## [1] 1.422454
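
The variance can be cross-checked with the shortcut formula Var(X) = E[X^2] - (E[X])^2. The sketch below assumes, as in the code above, that "7 or more" is treated as 7; the names `x_chk`, `p_chk`, `mu`, `var_shortcut`, and `var_definition` are illustrative.

```r
# Household sizes ("7 or more" treated as 7) and their probabilities
x_chk = 1:7
p_chk = c(26.7, 33.6, 15.8, 13.7, 6.3, 2.4, 1.5) / 100

mu = sum(x_chk * p_chk)                       # E[X]
var_shortcut = sum(x_chk^2 * p_chk) - mu^2    # E[X^2] - (E[X])^2
var_definition = sum((x_chk - mu)^2 * p_chk)  # E[(X - mu)^2]

# Both forms should agree up to floating-point error
all.equal(var_shortcut, var_definition)
sqrt(var_shortcut)
```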

Q 2: Eyeglassomatic manufactures eyeglasses for different retailers. The number of days it takes to fix defects in an eyeglass and the probability that it will take that number of days are in the table.

Table 1: Number of Days to Fix Defects

Number of Days   Probability
1                24.90%
2                10.80%
3                9.10%
4                12.30%
5                13.30%
6                11.40%
7                7.00%
8                4.60%
9                1.90%
10               1.30%
11               1.00%
12               0.80%
13               0.60%
14               0.40%
15               0.20%
16               0.20%
17               0.10%
18               0.10%

  1. State the random variable.
  1. Draw a histogram of the number of days to fix defects.
# x as number of days to Fix Defects
x = c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18)

# Probabilities (convert % to decimals)
p = c(24.90, 10.80, 9.10, 12.30, 13.30, 11.40, 7.00, 4.60, 1.90, 1.30, 1.00, 0.80, 0.60, 0.40, 0.20, 0.20, 0.10, 0.10
) / 100

# Draw histogram (bar plot for discrete distribution)
bp = barplot(p,
             names.arg = x,
             xlab = "Number of days to fix defects",
             ylab = "Probability",
             main = "Probability Distribution: Number of Days to Fix Defects",
             ylim = c(0, max(p) + 0.05),  # add extra space at the top
             col = "lightblue")

# Add probability labels on top
text(x = bp, y = p, labels = round(p, 2), pos = 3, cex = 0.8)

  1. Find the mean number of days to fix defects.
mean_x = sum(x * p)
mean_x
## [1] 4.175
  1. Find the variance for the number of days to fix defects.
variance_x = sum((x - mean_x)^2 * p)
variance_x
## [1] 8.414375
  1. Find the standard deviation for the number of days to fix defects.
sd_x = sqrt(variance_x)
sd_x
## [1] 2.900754
  1. Find the probability that a lens will take at least 16 days to fix the defect.
prob_at_least_16 <- sum(p[x >= 16])
prob_at_least_16
## [1] 0.004
  1. Is it unusual for a lens to take 16 days to fix a defect?
  1. If it does take 16 days for eyeglasses to be repaired, what would you think?
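
One common textbook convention, assumed here rather than stated in the question, calls an event unusual when its probability is below 0.05. The sketch below applies that assumed cutoff to the probability of a repair taking at least 16 days; `threshold` and `is_unusual` are illustrative names.

```r
# Days to fix a defect and their probabilities, from Table 1
days = 1:18
p_days = c(24.90, 10.80, 9.10, 12.30, 13.30, 11.40, 7.00, 4.60, 1.90,
           1.30, 1.00, 0.80, 0.60, 0.40, 0.20, 0.20, 0.10, 0.10) / 100

# P(X >= 16): add the probabilities for 16, 17, and 18 days
prob_at_least_16 = sum(p_days[days >= 16])

# Assumed cutoff for "unusual" (a convention, not given in the question)
threshold = 0.05
is_unusual = prob_at_least_16 < threshold
is_unusual
```

Under that assumed cutoff, a 16-day repair would count as unusual, so one reasonable reading is that something atypical happened with that particular repair.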

Q 3: A forested nature reserve has 13 bird-viewing platforms scattered throughout a large block of land. The naturalists claim that at any point in time, there is a 75 percent chance of seeing birds at each platform. Suppose you walk through the reserve and visit every platform. If you assume that all relevant conditions are satisfied, let X be a binomial random variable representing the total number of platforms at which you see birds.

  1. Visualize the probability mass function of the binomial distribution of interest.
# number of platforms
x = c(0,1,2,3,4,5,6,7,8,9,10,11,12,13)

# probability of success (single-trial success probability)
p = 0.75 

# Probabilities for each outcome
prob = dbinom(x, size = 13, prob = p)

# Plot the PMF
bp = barplot(prob,
        names.arg = x,
        xlab = "Number of Platforms with Birds (X)",
        ylab = "Probability",
        main = "PMF of X ~ Binomial(13, 0.75)",
        ylim = c(0, 1.1 * max(prob)), # add extra space to the top
        col = "lightblue")

# Optionally, add probability values above each bar:
text(x = bp, y = prob, label = round(prob, 3), pos = 3, cex = 0.7)

  1. What is the probability you see birds at all sites?
prob_x_13 = dbinom(x=13, size = 13, prob = 0.75)
round(prob_x_13, 5)
## [1] 0.02376
  1. What is the probability you see birds at more than 9 platforms?
prob_x_10_13 = sum(dbinom(x= 10:13, size = 13, prob = 0.75))
round(prob_x_10_13, 5)
## [1] 0.58425
  1. What is the probability of seeing birds at between 8 and 11 platforms (inclusive)? Confirm your answer by using only the d-function and then again using only the p-function.
# using only the d-function
prob_x_8_11_d = sum(dbinom(x= 8:11, size = 13, prob = p))
round(prob_x_8_11_d, 5)
## [1] 0.79308
# using only the p-function: P(8 <= X <= 11) = P(X <= 11) - P(X <= 7)
prob_x_8_11_p = pbinom(11, size = 13, prob = p) - pbinom(7, size = 13, prob = p)
round(prob_x_8_11_p, 5)
## [1] 0.79308
  1. Say that, before your visit, you decide that if you see birds at fewer than 9 sites, you’ll make a scene and demand your entry fee back. What’s the probability of your embarrassing yourself in this way?
prob_embarrassing = sum(dbinom(x=0:8, size = 13, prob = 0.75))
round(prob_embarrassing, 5)
## [1] 0.20604
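
The tail probabilities in this question can also be obtained from the cumulative function alone. A minimal sketch, with `n_platforms` and `p_birds` as illustrative names for the parameters given in the question:

```r
n_platforms = 13
p_birds = 0.75

# P(X < 9) = P(X <= 8): the lower tail up to and including 8
prob_fewer_than_9 = pbinom(8, size = n_platforms, prob = p_birds)

# P(X > 9): with lower.tail = FALSE, pbinom() returns P(X > q)
prob_more_than_9 = pbinom(9, size = n_platforms, prob = p_birds,
                          lower.tail = FALSE)

round(c(prob_fewer_than_9, prob_more_than_9), 5)
```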
  1. Simulate realizations of X that represent 10 different visits to the reserve; store your resulting vector as an object.
visits = 10
X_sim = rbinom(visits, size = 13, prob = 0.75)
X_sim
##  [1]  8 10 11 11  9  8 11  7 12 12
  1. Compute the mean and standard deviation of the distribution of interest.
#mean
mean_x = sum(x * prob)
mean_x
## [1] 9.75
#Standard deviation
variance_x = sum((x - mean_x)^2 * prob)
sd_x = sqrt(variance_x)
sd_x
## [1] 1.561249
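
For a binomial random variable the same values follow from the closed-form expressions, mean = n * p and sd = sqrt(n * p * (1 - p)). A short sketch as a cross-check; the variable names are illustrative.

```r
n_platforms = 13
p_birds = 0.75

mean_closed = n_platforms * p_birds                     # n * p
sd_closed = sqrt(n_platforms * p_birds * (1 - p_birds)) # sqrt(n * p * (1 - p))

c(mean = mean_closed, sd = sd_closed)
```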

Q 4: Every Saturday, at the same time, an individual stands by the side of a road and tallies the number of cars going by within a 120-minute window. Based on previous knowledge, she believes that the mean number of cars going by during this time is exactly 107. Let X represent the appropriate Poisson random variable of the number of cars passing her position in each Saturday session.

  1. What is the probability that more than 100 cars pass her on any given Saturday?
# Parameters
lambda = 107

# Probability X > 100
prob_more_than_100 = 1 - ppois(100, lambda = lambda)
round(prob_more_than_100, 5)
## [1] 0.73191
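
An equivalent way to get the upper tail is to ask `ppois()` for it directly with `lower.tail = FALSE`, which returns P(X > q). A minimal sketch, with `lambda_cars` as an illustrative name:

```r
lambda_cars = 107

# P(X > 100) computed two ways
upper_direct = ppois(100, lambda = lambda_cars, lower.tail = FALSE)
upper_complement = 1 - ppois(100, lambda = lambda_cars)

# The two should agree up to floating-point error
all.equal(upper_direct, upper_complement)
round(upper_direct, 5)
```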
  1. Determine the probability that no cars pass.
no_car = dpois(0, lambda = lambda)  # P(X = 0); equal to ppois(0, lambda) since X cannot be negative
round(no_car, 5)
## [1] 0
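
The rounded result is 0, but the exact probability is positive, just extremely small. For a Poisson variable, P(X = 0) = exp(-lambda). The sketch below shows the unrounded value; `lambda_cars` and `no_car_exact` are illustrative names.

```r
lambda_cars = 107

# Exact P(X = 0) from the mass function
no_car_exact = dpois(0, lambda = lambda_cars)

# Matches the closed form exp(-lambda)
all.equal(no_car_exact, exp(-lambda_cars))

# Scientific notation shows the magnitude that rounding hides
format(no_car_exact, scientific = TRUE)

# On the log scale the value is simply -lambda
dpois(0, lambda = lambda_cars, log = TRUE)
```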
  1. Plot the relevant Poisson mass function over the values in 60 ≤ x ≤ 150.
# Parameters
lambda = 107
x = 60:150

# PMF
pmf = dpois(x, lambda = lambda)

# Bar plot
bp = barplot(pmf,
              names.arg = x,
              xlab = "Number of Cars (X)",
              ylab = "Probability",
              main = "Poisson PMF: Number of Cars (lambda = 107)",
              col = "lightblue",
              border = NA,
              ylim = c(0, max(pmf)*1.1))

  1. Simulate 260 results from this distribution (about five years of weekly Saturday monitoring sessions). Plot the simulated results using hist; use xlim to set the horizontal limits from 60 to 150. Compare your histogram to the shape of your mass function from (c).
#Simulate 260 weekly sessions
set.seed(123)  # for reproducibility
lambda <- 107
n_weeks <- 260

X_sim <- rpois(n_weeks, lambda = lambda)
head(X_sim)  # preview first few simulated values
## [1] 101 119  89 108 124 111
#Plot histogram of simulated results
hist(X_sim,
     breaks = seq(min(X_sim) - 0.5, max(X_sim) + 0.5, by = 1),  # one bin per integer count, so a PMF scaled by the sample size is comparable
     xlim = c(60, 150),   # horizontal limits
     col = "lightgreen",
     border = "black",
     xlab = "Number of Cars",
     ylab = "Frequency",
     main = "Simulated Weekly Car Counts (260 Weeks)")

#Compare to the Poisson PMF from part (c)
# Compute PMF
x <- 60:150
pmf <- dpois(x, lambda = lambda)

# Scale PMF to match histogram
pmf_scaled <- pmf * n_weeks  # multiply by total simulated samples

# Overlay PMF
lines(x, pmf_scaled, type = "h", col = "red", lwd = 2)
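
A numerical companion to the visual comparison: a Poisson distribution has mean and variance both equal to lambda, so the sample mean and sample variance of the simulated counts should both land near 107. A minimal sketch, assuming the same seed as above; `sim_cars` is an illustrative name.

```r
set.seed(123)  # same seed as the simulation above, for reproducibility
lambda_cars = 107

# 260 simulated Saturday sessions
sim_cars = rpois(260, lambda = lambda_cars)

# Both summaries should be in the neighborhood of lambda
c(sample_mean = mean(sim_cars), sample_var = var(sim_cars))
```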

Q 5: A researcher is waiting outside of a library to ask people if they support a certain law. The probability that a given person supports the law is p = 0.2.

  1. What is the probability that the fourth person the researcher talks to is the first person to support the law?
p = 0.2
k = 4

prob_X4 = dgeom(k-1, prob = p)  # R counts number of failures before first success
round(prob_X4, 4)
## [1] 0.1024
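
The same value follows from the geometric formula P(X = k) = (1 - p)^(k - 1) * p, where k is the trial on which the first success occurs. A short cross-check against `dgeom()`; the variable names are illustrative.

```r
p_support = 0.2
k_first = 4

# Three non-supporters followed by one supporter
prob_formula = (1 - p_support)^(k_first - 1) * p_support

# dgeom() takes the number of failures before the first success, hence k - 1
prob_dgeom = dgeom(k_first - 1, prob = p_support)

all.equal(prob_formula, prob_dgeom)
prob_formula
```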
  1. Plot the probability mass function for the above geometric distribution.
# Parameters
p = 0.2
k = 1:10  # trial numbers

# Probability that the k-th person is the first supporter
pmf = (1 - p)^(k - 1) * p

# Bar plot
bp = barplot(pmf,
              names.arg = k,
              xlab = "Trial number (first success occurs)",
              ylab = "Probability",
              main = "Geometric PMF: First Supporter",
              ylim = c(0, 1.1 * max(pmf)), # add extra space to the top
              col = "lightblue")

# Add probability labels
text(x = bp, y = pmf, labels = round(pmf, 3), pos = 3, cex = 0.8)
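
Two checks on the plotted values may be useful. The hand-written formula should match `dgeom()`, and because the geometric distribution has no upper limit, trials 1 to 10 cover only part of the total probability. A minimal sketch with illustrative names:

```r
p_support = 0.2
k_trials = 1:10

# Hand-written PMF and the built-in equivalent (failures = k - 1)
pmf_manual = (1 - p_support)^(k_trials - 1) * p_support
pmf_builtin = dgeom(k_trials - 1, prob = p_support)
all.equal(pmf_manual, pmf_builtin)

# Probability mass shown in the plot: 1 - P(no supporter in 10 people)
covered = sum(pmf_manual)
all.equal(covered, 1 - (1 - p_support)^10)
round(covered, 4)
```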

Q 6: A deck of cards contains 20 cards: 6 red cards and 14 black cards. 5 cards are drawn randomly without replacement.

  1. What is the probability that exactly 4 red cards are drawn?
# Parameters
N <- 20    # total cards
K <- 6     # red cards
n <- 5     # cards drawn
x <- 4     # red cards desired

# Hypergeometric probability
prob_4_red <- dhyper(x, m = K, n = N - K, k = n)
round(prob_4_red, 5)
## [1] 0.01354
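
The hypergeometric probability can be reproduced by counting: choose 4 of the 6 red cards and 1 of the 14 black cards, out of all ways to choose 5 of the 20 cards. A short sketch using `choose()`; the variable names are illustrative.

```r
# Ways to pick 4 red from 6 and 1 black from 14
favorable = choose(6, 4) * choose(14, 1)

# Ways to pick any 5 cards from 20
total = choose(20, 5)

prob_counting = favorable / total

# Should match the built-in hypergeometric mass function
all.equal(prob_counting, dhyper(4, m = 6, n = 14, k = 5))
round(prob_counting, 5)
```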
  1. Plot the probability mass function of the number of red cards drawn.
# Parameters
N <- 20    # total cards
K <- 6     # red cards
n <- 5     # cards drawn

# Possible values of X (number of red cards drawn)
x <- 0:5

# Hypergeometric probabilities
pmf <- dhyper(x, m = K, n = N - K, k = n)

# Bar plot of PMF
bp <- barplot(pmf,
              names.arg = x,
              xlab = "Number of Red Cards Drawn",
              ylab = "Probability",
              main = "Hypergeometric PMF: Red Cards Drawn",
              ylim = c(0, 1.1 * max(pmf)), # add extra space to the top
              col = "lightblue")

# Add probability labels above bars
text(x = bp, y = pmf, labels = round(pmf, 3), pos = 3, cex = 0.8)
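
As a check on the plotted distribution, the probabilities over 0 to 5 should sum to 1, and the mean should equal n * K / N, the standard hypergeometric mean. A minimal sketch with illustrative names:

```r
N_cards = 20   # total cards
K_red = 6      # red cards
n_drawn = 5    # cards drawn

x_red = 0:5
pmf_red = dhyper(x_red, m = K_red, n = N_cards - K_red, k = n_drawn)

# The support 0..5 covers every possible outcome, so the PMF sums to 1
all.equal(sum(pmf_red), 1)

# Mean from the PMF versus the closed form n * K / N
mean_pmf = sum(x_red * pmf_red)
mean_closed = n_drawn * K_red / N_cards
c(mean_pmf = mean_pmf, mean_closed = mean_closed)
```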