Q 1: The 2010 U.S. Census found the chance of a household being a certain size. The data is in the table (“Households by age,” 2013). Draw a histogram of the probability distribution.
Size of household   1       2       3       4       5      6      7 or more
Probability         26.7%   33.6%   15.8%   13.7%   6.3%   2.4%   1.5%
# Household sizes
x = c("1","2","3","4","5","6","7 or more")
# Probabilities (convert % to decimals)
p = c(26.7, 33.6, 15.8, 13.7, 6.3, 2.4, 1.5) / 100
# Treat "7 or more" as exactly 7 so the mean and variance can be computed (an approximation)
x_num = as.numeric(replace(x, x == "7 or more", "7"))
# Draw histogram (bar plot for discrete distribution)
bp = barplot(p,
names.arg = x,
xlab = "Household size",
ylab = "Probability",
main = "Probability Distribution: Household Size (2010 U.S. Census)",
ylim = c(0, max(p) + 0.05), # add extra space at the top
col = "lightblue")
# Add probability labels on top
text(x = bp, y = p, labels = round(p, 3), pos = 3)
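# Mean (expected value): sum of x * P(x)
mean_x = sum(x_num * p)
mean_x
## [1] 2.525
# Variance: sum of (x - mean)^2 * P(x)
variance_x = sum((x_num - mean_x)^2 * p)
variance_x
## [1] 2.023375
# Standard deviation: square root of the variance
sd_x = sqrt(variance_x)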
sd_x
## [1] 1.422454
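As a quick sanity check on the values above, the following sketch confirms that the probabilities sum to 1 and recomputes the variance with the shortcut formula Var(X) = E[X^2] - (E[X])^2. The names `sizes`, `probs`, `mean_check`, and `var_check` are introduced here for illustration only, and "7 or more" is again treated as 7.

```r
# Illustrative check; variable names are assumptions, not part of the original solution
sizes <- 1:7                      # "7 or more" treated as 7
probs <- c(26.7, 33.6, 15.8, 13.7, 6.3, 2.4, 1.5) / 100

# A valid probability distribution must sum to 1 (up to floating-point rounding)
isTRUE(all.equal(sum(probs), 1))

# Shortcut formula: Var(X) = E[X^2] - (E[X])^2
mean_check <- sum(sizes * probs)
var_check  <- sum(sizes^2 * probs) - mean_check^2
c(mean = mean_check, variance = var_check, sd = sqrt(var_check))
```

Both routes to the variance should agree up to floating-point error, which makes this a cheap way to catch a mistyped probability.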
Q 2: Eyeglassomatic manufactures eyeglasses for different retailers. The number of days it takes to fix defects in an eyeglass and the probability that it will take that number of days are in the table.
Table 1: Number of Days to Fix Defects
Number of Days   Probabilities
1                24.90%
2                10.80%
3                9.10%
4                12.30%
5                13.30%
6                11.40%
7                7.00%
8                4.60%
9                1.90%
10               1.30%
11               1.00%
12               0.80%
13               0.60%
14               0.40%
15               0.20%
16               0.20%
17               0.10%
18               0.10%
# x: number of days to fix defects
x = 1:18
# Probabilities (convert % to decimals)
p = c(24.90, 10.80, 9.10, 12.30, 13.30, 11.40, 7.00, 4.60, 1.90,
1.30, 1.00, 0.80, 0.60, 0.40, 0.20, 0.20, 0.10, 0.10) / 100
# Draw histogram (bar plot for discrete distribution)
bp = barplot(p,
names.arg = x,
xlab = "Number of days to fix defects",
ylab = "Probability",
main = "Probability Distribution: Number of Days to Fix Defects",
ylim = c(0, max(p) + 0.05), # add extra space at the top
col = "lightblue")
# Add probability labels on top
text(x = bp, y = p, labels = round(p, 2), pos = 3, cex = 0.8)
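# Mean (expected value): sum of x * P(x)
mean_x = sum(x * p)
mean_x
## [1] 4.175
# Variance: sum of (x - mean)^2 * P(x)
variance_x = sum((x - mean_x)^2 * p)
variance_x
## [1] 8.414375
# Standard deviation: square root of the variance
sd_x = sqrt(variance_x)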
sd_x
## [1] 2.900754
# P(X >= 16): probability that a fix takes at least 16 days
prob_at_least_16 <- sum(p[x >= 16])
prob_at_least_16
## [1] 0.004
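The same tail probability can also be reached through the complement rule, P(X >= 16) = 1 - P(X <= 15). The sketch below builds the cumulative distribution with `cumsum()`; the names `days`, `probs`, `cdf`, and `p_at_least_16` are illustrative and not part of the original solution.

```r
# Illustrative check; variable names are assumptions, not part of the original solution
days  <- 1:18
probs <- c(24.90, 10.80, 9.10, 12.30, 13.30, 11.40, 7.00, 4.60, 1.90,
           1.30, 1.00, 0.80, 0.60, 0.40, 0.20, 0.20, 0.10, 0.10) / 100

# Cumulative distribution: cdf[i] = P(X <= days[i])
cdf <- cumsum(probs)

# Complement rule: P(X >= 16) = 1 - P(X <= 15)
p_at_least_16 <- 1 - cdf[days == 15]
p_at_least_16
```

Having `cdf` available also makes any other "at most" or "at least" question about this table a one-line lookup.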
Q 3: A forested nature reserve has 13 bird-viewing platforms scattered throughout a large block of land. The naturalists claim that at any point in time, there is a 75 percent chance of seeing birds at each platform. Suppose you walk through the reserve and visit every platform. If you assume that all relevant conditions are satisfied, let X be a binomial random variable representing the total number of platforms at which you see birds.
# possible values of X (number of platforms at which birds are seen)
x = 0:13
# probability of success (single-trial success probability)
p = 0.75
# Probabilities for each outcome
prob = dbinom(x, size = 13, prob = p)
# Plot the PMF
bp = barplot(prob,
names.arg = x,
xlab = "Number of Platforms with Birds (X)",
ylab = "Probability",
main = "PMF of X ~ Binomial(13, 0.75)",
ylim = c(0, 1.1 * max(prob)), # add extra space to the top
col = "lightblue")
# Optionally, add probability values above each bar:
text(x = bp, y = prob, labels = round(prob, 3), pos = 3, cex = 0.7)
# P(X = 13): birds are seen at every platform
prob_x_13 = dbinom(x = 13, size = 13, prob = 0.75)
round(prob_x_13, 5)
## [1] 0.02376
# P(X >= 10): birds are seen at 10 or more platforms
prob_x_10_13 = sum(dbinom(x = 10:13, size = 13, prob = 0.75))
round(prob_x_10_13, 5)
## [1] 0.58425
# P(8 <= X <= 11) using only the d-function
prob_x_8_11_d = sum(dbinom(x= 8:11, size = 13, prob = p))
round(prob_x_8_11_d, 5)
## [1] 0.79308
# Same probability using only the p-function: P(X <= 11) - P(X <= 7)
prob_x_8_11_p = pbinom(11, size = 13, prob = p) - pbinom(7, size = 13, prob = p)
round(prob_x_8_11_p, 5)
## [1] 0.79308
# P(X <= 8): birds are seen at 8 or fewer platforms
prob_embarrassing = sum(dbinom(x = 0:8, size = 13, prob = 0.75))
round(prob_embarrassing, 5)
## [1] 0.20604
# Simulate 10 visits to the reserve (no seed is set, so results vary between runs)
visits = 10
X_sim = rbinom(visits, size = 13, prob = 0.75)
X_sim
## [1] 8 10 11 11 9 8 11 7 12 12
# Mean: sum of x * P(X = x)
mean_x = sum(x * prob)
mean_x
## [1] 9.75
# Standard deviation: square root of the variance
variance_x = sum((x - mean_x)^2 * prob)
sd_x = sqrt(variance_x)
sd_x
## [1] 1.561249
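The mean and standard deviation above were computed directly from the PMF. For a binomial random variable they also follow from the closed forms E[X] = np and SD(X) = sqrt(np(1 - p)), which the sketch below evaluates as a cross-check. It also shows the upper tail P(X >= 10) obtained from the p-function with `lower.tail = FALSE`. The variable names are illustrative and not part of the original solution.

```r
# Illustrative check; variable names are assumptions, not part of the original solution
n_platforms <- 13
p_birds <- 0.75

# Closed-form moments of a Binomial(n, p) random variable
mean_formula <- n_platforms * p_birds
sd_formula   <- sqrt(n_platforms * p_birds * (1 - p_birds))
c(mean = mean_formula, sd = sd_formula)

# Upper tail P(X >= 10) = P(X > 9), taken from the p-function
p_at_least_10 <- pbinom(9, size = n_platforms, prob = p_birds, lower.tail = FALSE)
p_at_least_10
```

Note that `lower.tail = FALSE` returns P(X > q), so the cutoff passed to `pbinom()` is 9, not 10.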
Q 4: Every Saturday, at the same time, an individual stands by the side of a road and tallies the number of cars going by within a 120-minute window. Based on previous knowledge, she believes that the mean number of cars going by during this time is exactly 107. Let X represent the appropriate Poisson random variable of the number of cars passing her position in each Saturday session.
# Parameters
lambda = 107
# Probability X > 100
prob_more_than_100 = 1 - ppois(100, lambda = lambda)
round(prob_more_than_100, 5)
## [1] 0.73191
# P(X = 0): no cars pass during the session
no_car = dpois(0, lambda = lambda)
round(no_car, 5)
## [1] 0
# Parameters
lambda = 107
x = 60:150
# PMF
pmf = dpois(x, lambda = lambda)
# Bar plot
bp = barplot(pmf,
names.arg = x,
xlab = "Number of Cars (X)",
ylab = "Probability",
main = "Poisson PMF: Number of Cars (lambda = 107)",
col = "lightblue",
border = NA,
ylim = c(0, max(pmf)*1.1))
# Simulate 260 weekly sessions
set.seed(123) # for reproducibility
lambda <- 107
n_weeks <- 260
X_sim <- rpois(n_weeks, lambda = lambda)
head(X_sim) # preview first few simulated values
## [1] 101 119 89 108 124 111
# Plot histogram of simulated results
hist(X_sim,
breaks = seq(min(X_sim) - 0.5, max(X_sim) + 0.5, by = 1), # one bin per integer count, so bar heights are comparable to n_weeks * PMF
xlim = c(60, 150), # horizontal limits
col = "lightgreen",
border = "black",
xlab = "Number of Cars",
ylab = "Frequency",
main = "Simulated Weekly Car Counts (260 Weeks)")
# Compare to the Poisson PMF from part (c)
# Compute PMF
x <- 60:150
pmf <- dpois(x, lambda = lambda)
# Scale PMF to match histogram
pmf_scaled <- pmf * n_weeks # multiply by total simulated samples
# Overlay PMF
lines(x, pmf_scaled, type = "h", col = "red", lwd = 2)
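Besides the visual overlay, a numerical comparison is possible: a Poisson random variable has mean and variance both equal to lambda, so the sample mean and sample variance of the simulated counts should both land near 107. The sketch below repeats the simulation under the same seed and also shows the upper tail P(X > 100) computed with `lower.tail = FALSE`. The names `lambda_cars`, `sim_counts`, and `p_gt_100` are illustrative and not part of the original solution.

```r
# Illustrative check; variable names are assumptions, not part of the original solution
set.seed(123)                     # same seed as the simulation above
lambda_cars <- 107
sim_counts  <- rpois(260, lambda = lambda_cars)

# For a Poisson distribution, mean and variance both equal lambda
c(sample_mean = mean(sim_counts), sample_var = var(sim_counts), lambda = lambda_cars)

# P(X > 100) without subtracting from 1
p_gt_100 <- ppois(100, lambda = lambda_cars, lower.tail = FALSE)
p_gt_100
```

With only 260 draws the sample variance in particular can sit noticeably away from 107, so a moderate gap is not evidence against the model.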
Q 5: A researcher is waiting outside of a library to ask people if they support a certain law. The probability that a given person supports the law is p = 0.2.
# P(X = 4): the first supporter is the 4th person asked
p = 0.2
k = 4
prob_X4 = dgeom(k-1, prob = p) # R counts number of failures before first success
round(prob_X4, 4)
## [1] 0.1024
# Parameters
p = 0.2
k = 1:10 # trial numbers
# Probability that the k-th person is the first supporter
pmf = (1 - p)^(k - 1) * p
# Bar plot
bp = barplot(pmf,
names.arg = k,
xlab = "Trial on which the first success occurs",
ylab = "Probability",
main = "Geometric PMF: First Supporter",
ylim = c(0, 1.1 * max(pmf)), # add extra space to the top
col = "lightblue")
# Add probability labels
text(x = bp, y = pmf, labels = round(pmf, 3), pos = 3, cex = 0.8)
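The hand-written formula (1 - p)^(k - 1) * p and R's `dgeom()` describe the same distribution once the shift is accounted for: `dgeom()` counts failures before the first success, so trial k corresponds to k - 1 failures. The sketch below checks that the two agree for k = 1, ..., 10 and evaluates the expected number of people asked, 1/p. It also computes how much probability the first 10 trials cover, since the bar plot shows only part of an infinite support. The variable names are illustrative and not part of the original solution.

```r
# Illustrative check; variable names are assumptions, not part of the original solution
p_support <- 0.2
k <- 1:10

# Formula for "first success on trial k" versus dgeom() with k - 1 failures
pmf_formula <- (1 - p_support)^(k - 1) * p_support
pmf_dgeom   <- dgeom(k - 1, prob = p_support)
isTRUE(all.equal(pmf_formula, pmf_dgeom))

# Expected number of people asked until the first supporter: 1 / p
expected_trials <- 1 / p_support
expected_trials

# Probability that the first supporter is among the first 10 people asked
p_within_10 <- pgeom(10 - 1, prob = p_support)
p_within_10
```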
Q 6: A deck of cards contains 20 cards: 6 red cards and 14 black cards.
Five cards are drawn at random without replacement.
# Parameters
N <- 20 # total cards
K <- 6 # red cards
n <- 5 # cards drawn
x <- 4 # red cards desired
# Hypergeometric probability
prob_4_red <- dhyper(x, m = K, n = N - K, k = n)
round(prob_4_red, 5)
## [1] 0.01354
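The `dhyper()` result can be reproduced by counting: there are choose(6, 4) ways to pick 4 red cards, choose(14, 1) ways to pick the one black card, and choose(20, 5) possible hands in total. The sketch below compares that count-based value with `dhyper()`. The names `p4_manual` and `p4_dhyper` are illustrative and not part of the original solution.

```r
# Illustrative check; variable names are assumptions, not part of the original solution
total_cards <- 20
red_cards   <- 6
black_cards <- total_cards - red_cards
drawn       <- 5

# P(X = 4) by counting favorable hands over all possible hands
p4_manual <- choose(red_cards, 4) * choose(black_cards, drawn - 4) /
  choose(total_cards, drawn)

# Same probability from the built-in hypergeometric PMF
p4_dhyper <- dhyper(4, m = red_cards, n = black_cards, k = drawn)

isTRUE(all.equal(p4_manual, p4_dhyper))
```

In `dhyper()`, `m` is the number of red cards, `n` the number of black cards (not the deck size), and `k` the number of cards drawn, which is an easy place to mix up arguments.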
# Parameters
N <- 20 # total cards
K <- 6 # red cards
n <- 5 # cards drawn
# Possible values of X (number of red cards drawn)
x <- 0:5
# Hypergeometric probabilities
pmf <- dhyper(x, m = K, n = N - K, k = n)
# Bar plot of PMF
bp <- barplot(pmf,
names.arg = x,
xlab = "Number of Red Cards Drawn",
ylab = "Probability",
main = "Hypergeometric PMF: Red Cards Drawn",
ylim = c(0, 1.1 * max(pmf)), # add extra space to the top
col = "lightblue")
# Add probability labels above bars
text(x = bp, y = pmf, labels = round(pmf, 3), pos = 3, cex = 0.8)