Problem 1
1.(Bayesian):
A new credit scoring system has been developed to predict the
likelihood of loan defaults. The system has a 90% sensitivity, meaning
that it correctly identifies 90% of those who will default on their
loans. It also has a 95% specificity, meaning that it correctly
identifies 95% of those who will not default. The default rate among
borrowers is 2%. Given these prevalence, sensitivity, and specificity
estimates, what is the probability that a borrower flagged by the system
as likely to default will actually default? If the average loss per
defaulted loan is $200,000 and the cost to run the credit scoring test
on each borrower is $500, what is the total first-year cost for
evaluating 10,000 borrowers?
With the given information, we can proceed to perform the
calculations:
\[
P(\text{default} \mid \text{positive test}) = \frac{P(\text{positive
test} \mid \text{default}) \times P(\text{default})}{P(\text{positive
test})}
\]
\[
P(\text{positive test}) = P(\text{positive test} \mid \text{default})
\times P(\text{default}) + P(\text{positive test} \mid \text{no
default}) \times P(\text{no default})
\]
\[
P(\text{positive test}) = 0.9 \times 0.02 + (1 - 0.95) \times (1 - 0.02)
= 0.018 + 0.05 \times 0.98 = 0.018 + 0.049 = 0.067
\]
\[
P(\text{default} \mid \text{positive test}) = \frac{0.9 \times
0.02}{0.067} = \frac{0.018}{0.067} \approx 0.269 \text{ or about }
26.9\%
\]
# to get the results in R
# Given values
sensitivity <- 0.90
specificity <- 0.95
default_rate <- 0.02
# Complementary probabilities
no_default_rate <- 1 - default_rate
false_positive_rate <- 1 - specificity
# Total probability of a positive test
p_positive_test <- (sensitivity * default_rate) + (false_positive_rate * no_default_rate)
# Probability of default given a positive test
p_default_given_positive <- (sensitivity * default_rate) / p_positive_test
# Display result
p_default_given_positive
## [1] 0.2686567
In the second part I will calculate the total first year cost:
\[
\begin{align*}
\text{Cost to run tests} &= 10{,}000 \text{ borrowers} \times \$500
= \$5{,}000{,}000 \\
\text{Expected number of positive tests} &= 10{,}000 \times 0.067 =
670 \text{ borrowers} \\
\text{True defaulters among positives} &= 670 \times 0.269 = 180
\text{ borrowers} \\
\text{Expected loss from defaults} &= 180 \times \$200{,}000 =
\$36{,}000{,}000 \\
\text{Total first-year cost} &= \$5{,}000{,}000 + \$36{,}000{,}000 =
\$41{,}000{,}000
\end{align*}
\]
# Cost calculations
num_borrowers <- 10000
test_cost_per_borrower <- 500
loss_per_default <- 200000
# Total testing cost
total_test_cost <- num_borrowers * test_cost_per_borrower
# Expected number of positive tests
expected_positives <- num_borrowers * p_positive_test
# Expected number of true defaulters among positives
expected_defaulters <- expected_positives * p_default_given_positive
# Expected loss from defaults
expected_loss <- expected_defaulters * loss_per_default
# Total first-year cost
total_first_year_cost <- total_test_cost + expected_loss
# Display results
total_first_year_cost
## [1] 4.1e+07
2.(Binomial):
The probability that a stock will pay a dividend in any given quarter
is 0.7. What is the probability that the stock pays dividends exactly 6
times in 8 quarters? What is the probability that it pays dividends 6 or
more times? What is the probability that it pays dividends fewer than 6
times? What is the expected number of dividend payments over 8 quarters?
What is the standard deviation?
The probability mass function for a binomial distribution is given
by: \[
P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}
\]
Where: - \(n\) = number of trials -
\(k\) = number of successes - \(p\) = probability of success on each
trial
For \(n = 8\) and \(p = 0.7\):
- \(P(X = 6) = \binom{8}{6} (0.7)^6
(0.3)^2\)
- \(P(X = 7) = \binom{8}{7} (0.7)^7
(0.3)^1\)
- \(P(X = 8) = \binom{8}{8} (0.7)^8
(0.3)^0\)
The cumulative probability for \(X \geq
6\) is:
\[
P(X \geq 6) = P(X = 6) + P(X = 7) + P(X = 8)
\]
The expected value \(E(X)\) and
standard deviation \(\sigma\) are
calculated as:
\[
E(X) = n \times p
\]
\[
\sigma = \sqrt{n \times p \times (1-p)}
\]
#code in R
# Parameters
n <- 8
p <- 0.7
# Probability of exactly 6 successes
P_X_equals_6 <- dbinom(6, size = n, prob = p)
# Probability of exactly 7 successes
P_X_equals_7 <- dbinom(7, size = n, prob = p)
# Probability of exactly 8 successes
P_X_equals_8 <- dbinom(8, size = n, prob = p)
# Cumulative probability of 6 or more successes
P_X_geq_6 <- P_X_equals_6 + P_X_equals_7 + P_X_equals_8
# Alternatively, using pbinom for cumulative probability
P_X_geq_6_alt <- pbinom(5, size = n, prob = p, lower.tail = FALSE)
# Expected value
expected_value <- n * p
# Standard deviation
standard_deviation <- sqrt(n * p * (1 - p))
# Results
P_X_equals_6
## [1] 0.2964755
P_X_equals_7
## [1] 0.1976503
P_X_equals_8
## [1] 0.05764801
P_X_geq_6
## [1] 0.5517738
P_X_geq_6_alt
## [1] 0.5517738
expected_value
## [1] 5.6
standard_deviation
## [1] 1.296148
3.(Poisson):
A financial analyst notices that there are an average of 12 trading
days each month when a certain stock’s price increases by more than 2%.
What is the probability that exactly 4 such days occur in a given month?
What is the probability that more than 12 such days occur in a given
month? How many such days would you expect in a 6-month period? What is
the standard deviation of the number of such days? If an investment
strategy requires at least 70 days of such price increases in a year for
profitability, what is the percent utilization and what are your
recommendations?
\[
P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}
\]
\[
P(X > 12) = 1 - P(X \leq 12)
\]
\[
\text{Expected number in 6 months} = 6 \times 12 = 72 \text{ days}
\]
\[
\sigma = \sqrt{\lambda} = \sqrt{12}
\]
# Code in R
# Parameter
lambda <- 12
# Probability of X = 4
p_x_equals_4 <- dpois(4, lambda = lambda)
# Probability of X > 12
p_x_gt_12 <- 1 - ppois(12, lambda = lambda)
# Expected number in 6 months
expected_6_months <- 6 * lambda
# Standard deviation
std_dev <- sqrt(lambda)
# Display results
p_x_equals_4
## [1] 0.005308599
p_x_gt_12
## [1] 0.4240348
expected_6_months
## [1] 72
std_dev
## [1] 3.464102
4.(Hypergeometric):
A hedge fund has a portfolio of 25 stocks, with 15 categorized as
high-risk and 10 as low-risk. The fund manager randomly selects 7 stocks
to closely monitor. If the manager selected 5 high-risk stocks and 2
low-risk stocks, what is the probability of selecting exactly 5
high-risk stocks if the selection was random? How many high-risk and
low-risk stocks would you expect to be selected?
For this problem, I have 15 high-risk stocks and 10 low-risk stocks
in portfolio, 7 stocks randomly selected and 5 high-risk stocks
selected
First I will calculate the probability of selecting exactly 5
high-risk stocks. \[
P(X=5)=(155)×(102)(257)=3003×45480700=135135480700≈0.2811P\\(X = 5) =
\frac{\binom{15}{5} \times \binom{10}{2}}{\binom{25}{7}} = \frac{3003
\times 45}{480700}\\ = \frac{135135}{480700} \approx
0.2811P(X=5)\\=(725)(515)×(210)=4807003003×45=480700135135≈0.2811
\] Calculate the expected number of high-risk stocks to be
selected. \[
E[Xhigh-risk]=n×KN=7×1525=7×0.6=4.2E[X_{\text{high-risk}}]\\ = n \times
\frac{K}{N} = 7 \times \frac{15}{25} = 7 \times 0.6 =
4.2E[Xhigh-risk]=n×NK=7×2515=7×0.6=4.2
\] Finally, I will calculate the expected number of low-risk
stocks to be selected. \[
E[Xlow-risk]=n×N−KN=7×1025=7×0.4=2.8E[X_{\text{low-risk}}]\\ = n \times
\frac{N-K}{N} = 7 \times \frac{10}{25}\\ = 7 \times 0.4 =
2.8E[Xlow-risk]=n×NN−K=7×2510=7×0.4=2.8
\]
# Code in R
# Given information
high_risk <- 15
low_risk <- 10
total_stocks <- high_risk + low_risk # Total number of stocks
selected <- 7 # Number of stocks selected
high_risk_selected <- 5 # Number of high-risk stocks selected
# Calculate probability of exactly 5 high-risk stocks
p_exactly_5_high_risk <- dhyper(high_risk_selected, high_risk, low_risk, selected)
# Expected number of high-risk and low-risk stocks
expected_high_risk <- selected * (high_risk / total_stocks)
expected_low_risk <- selected * (low_risk / total_stocks)
cat("Probability of selecting exactly 5 high-risk stocks:", round(p_exactly_5_high_risk, 4), "\n")
## Probability of selecting exactly 5 high-risk stocks: 0.2811
cat("Expected number of high-risk stocks selected:", round(expected_high_risk, 4), "\n")
## Expected number of high-risk stocks selected: 4.2
cat("Expected number of low-risk stocks selected:", round(expected_low_risk, 4), "\n\n")
## Expected number of low-risk stocks selected: 2.8
5.(Geometric):
The probability that a bond defaults in any given year is 0.5%. A
portfolio manager holds this bond for 10 years. What is the probability
that the bond will default during this period? What is the probability
that it will default in the next 15 years? What is the expected number
of years before the bond defaults? If the bond has already survived 10
years, what is the probability that it will default in the next 2
years?
For this exercise, I have a probability of bond default in a year =
0.5% = 0.005,bond held for 10 years
First, I have to calculate the probability of default during the
10-year period. \[
P(default in 10 years)=1−P(no default in 10 years)=1−(1−p)10=1−(0.995)\\10=1−0.9512=0.0488P(\text{default
in 10 years}) = 1 - P(\text{no default in 10 years})\\ = 1 - (1-p)^{10}
= 1 - (0.995)^{10} = 1 - 0.9512 =
0.0488P(default in 10 years)\\=1−P(no default in 10 years)=1−(1−p)10=1−(0.995)10=1−0.9512=0.0488
\] Now I will calculate the probability of default during the
next 15 years. \[
P(default in 15 years)=1−(1−p)15=1−(0.995)15=1−0.9277=0.0723P(\text{default
in 15 years})\\ = 1 - (1-p)^{15} = 1 - (0.995)^{15} = 1 - 0.9277 =
0.0723P(default in 15 years)\\=1−(1−p)15=1−(0.995)15=1−0.9277=0.0723
\] Calculate the expected number of years before the bond
defaults. \[
E[X]=1p=10.005=200 yearsE[X]\\ = \frac{1}{p} = \frac{1}{0.005} = 200
\text{ years}E[X]=p1=0.0051=200 years
\] Finally, I’m going to calculate the probability of default in
the next 2 years after surviving 10 years using the property of the
geometric distribution: \[
P(default in next 2 years∣survived 10 years)=1−(1−p)2=1−(0.995)2=1−0.9900\\=0.0100P(\text{default
in next 2 years} | \text{survived 10 years})\\ = 1 - (1-p)^2 = 1 -
(0.995)^2 = 1 - 0.9900 =
0.0100P(default in next 2 years∣survived 10 years)\\=1−(1−p)2=1−(0.995)2=1−0.9900=0.0100
\]
# To solve in R
# Given information
p_default_year <- 0.005
years_held_10 <- 10
years_held_15 <- 15
# Calculate default probabilities
p_default_10 <- 1 - (1 - p_default_year)^years_held_10
p_default_15 <- 1 - (1 - p_default_year)^years_held_15
# Expected years before default = 1/p
expected_years <- 1 / p_default_year
# Probability of default in next 2 years, given survived 10 years
p_default_next_2 <- 1 - (1 - p_default_year)^2
cat("Probability of bond default in 10 years:", round(p_default_10, 4), "\n")
## Probability of bond default in 10 years: 0.0489
cat("Probability of bond default in 15 years:", round(p_default_15, 4), "\n")
## Probability of bond default in 15 years: 0.0724
cat("Expected number of years before default:", round(expected_years, 2), "\n")
## Expected number of years before default: 200
cat("Probability of default in next 2 years after surviving 10 years:", round(p_default_next_2, 4), "\n\n")
## Probability of default in next 2 years after surviving 10 years: 0.01
6.(Poisson):
A high-frequency trading algorithm experiences a system failure about
once every 1500 trading hours. What is the probability that the
algorithm will experience more than two failures in 1500 hours? What is
the expected number of failures?
Based on the given information, I have an average of 1 system failure
per 1500 trading hours
The first step is to calculate the probability of more than 2
failures in 1500 hours. \[
P(X>2)=1−P(X≤2)=1−(P(X=0)+P(X=1)+P(X=2))\\
P(X > 2) = 1 - P(X \leq 2) = 1 - (P(X=0) + P(X=1) + P(X=2))\\
P(X>2)=1−P(X≤2)=1−(P(X=0)+P(X=1)+P(X=2))\\
=1−(e−1×100!+e−1×111!+e−1×122!)\\= 1 - \left(\frac{e^{-1} \times
1^0}{0!} + \frac{e^{-1} \times 1^1}{1!} + \frac{e^{-1} \times\\
1^2}{2!}\right)=1−(0!e−1×10+1!e−1×11+2!e−1×12)\\
=1−(e−1+e−1+e−12)= 1 - (e^{-1} + e^{-1} +
\frac{e^{-1}}{2})=1−(e−1+e−1+2e−1)\\
=1−e−1(1+1+12)= 1 - e^{-1}(1 + 1 + \frac{1}{2})=1−e−1(1+1+21)\\
=1−e−1×2.5= 1 - e^{-1} \times 2.5=1−e−1×2.5\\
=1−0.3679×2.5=1−0.9198=0.0802= 1 - 0.3679 \times 2.5 = 1 - 0.9198\\
= 0.0802=1−0.3679×2.5=1−0.9198=0.0802
\] Now I need to find the expected number of failures is simply
the lambda value.
\[E[X]=λ=1E[X] = \lambda =
1E[X]=λ=1\]
# R code
# Given values
lambda_failures <- 1
# Calculate probability of more than 2 failures
p_more_than_2 <- ppois(2, lambda_failures, lower.tail = FALSE) # P(X > 2) = 1 - P(X ≤ 2)
# Expected number of failures
expected_failures <- lambda_failures
cat("Probability of more than 2 failures in 1500 hours:", round(p_more_than_2, 4), "\n")
## Probability of more than 2 failures in 1500 hours: 0.0803
cat("Expected number of failures in 1500 hours:", expected_failures, "\n\n")
## Expected number of failures in 1500 hours: 1
7.(Uniform Distribution):
An investor is trying to time the market and is monitoring a stock
that they believe has an equal chance of reaching a target price between
20 and 60 days. What is the probability that the stock will reach the
target price in more than 40 days? If it hasn’t reached the target price
by day 40, what is the probability that it will reach it in the next 10
days? What is the expected time for the stock to reach the target
price?
Based on the given values, stock has equal chance of reaching target
price between 20 and 60 days, the uniform distribution over [20, 60]
First, I will calculate the probability that the stock will reach the
target price in more than 40 days. \[
P(X>40)=b−40b−a=60−4060−20=2040\\
=0.5P(X > 40) = \frac{b - 40}{b - a} = \frac{60 - 40}{60 - 20}\\ =
\frac{20}{40} = 0.5P(X>40)=b−ab−40=60−2060−40=4020=0.5
\] Now I’m going to calculate the conditional probability that
the stock will reach the target in the next 10 days, given it hasn’t
reached it by day 40. \[
P(40<X≤50∣X>40)=P(40<X≤50)\\P(X>40)=50−4060−40=1020\\=0.5P(40
< X \leq 50 | X > 40) = \frac{P(40 < X \leq 50)}{P(X > 40)}
= \frac{50 - 40}{60 - 40} = \frac{10}{20}\\ =
0.5P(40<X≤50∣X>40)=P(X>40)P(40<X≤50)=60−4050−40=2010=0.5
\] Finally, I will calculate the expected time for the stock to
reach the target price. \[
E[X]=a+b2=20+602=40 daysE[X]\\ = \frac{a + b}{2} = \frac{20 + 60}{2} =
40 \text{ days}E[X]=2a+b=220+60=40 days
\]
#R code
# Given Values
a <- 20
b <- 60
# Calculate probability of more than 40 days
p_more_than_40 <- (b - 40) / (b - a) # P(X > 40) = (60 - 40) / (60 - 20)
# Calculate conditional probability
p_next_10_given_after_40 <- (50 - 40) / (b - 40)
# Expected time
expected_days <- (a + b) / 2
cat("Probability stock reaches target price after 40 days:", round(p_more_than_40, 4), "\n")
## Probability stock reaches target price after 40 days: 0.5
cat("Probability stock reaches target in next 10 days after day 40:", round(p_next_10_given_after_40, 4), "\n")
## Probability stock reaches target in next 10 days after day 40: 0.5
cat("Expected time for stock to reach target price:", expected_days, "days\n\n")
## Expected time for stock to reach target price: 40 days
8.(Exponential Distribution):
A financial model estimates that the lifetime of a successful
start-up before it either goes public or fails follows an exponential
distribution with an expected value of 8 years. What is the expected
time until the start-up either goes public or fails? What is the
standard deviation? What is the probability that the start-up will go
public or fail after 6 years? Given that the start-up has survived for 6
years, what is the probability that it will go public or fail in the
next 2 years?
With the given values, I can say that the exponential distribution
with expected value of 8 years rate parameter λ = 1/8
First I need to find the expected time is the given mean. \[
E[X]=1λ=8 yearsE[X] = \frac{1}{\lambda} = 8 \text{ years}E[X]=λ1=8 years
\] Now I have to get the standard deviation equals the mean.
\[
σ=1λ=8 years\sigma = \frac{1}{\lambda} = 8 \text{ years}σ=λ1=8 years
\] Calculate the probability that the startup will go public or
fail after 6 years. \[
P(X>6)=e−λ×6=e−68=e−0.75=0.4724P(X > 6)\\ = e^{-\lambda \times 6}
= e^{-\frac{6}{8}} = e^{-0.75} =
0.4724P(X>6)=e−λ×6=e−86=e−0.75=0.4724
\] The final step would be the calculation of the probability
that, given the startup has survived for 6 years, it will go public or
fail in the next 2 years. \[
P(X≤2∣X>6)=P(X≤2)=1−e−λ×2=1−e−28=1−e−0.25\\=1−0.7788=0.2212p (X \leq
2 | X > 6) = P(X \leq 2) = 1 - e^{-\lambda \times 2} = 1 -
e^{-\frac{2}{8}} = 1 - e^{-0.25} \\= 1 - 0.7788 =
0.2212P(X≤2∣X>6)=P(X≤2)=1−e−λ×2\\=1−e−82=1−e−0.25=1−0.7788=0.2212
\]
# R code
# Given information
lambda <- 1/8
# Expected time and standard deviation
expected_lifetime <- 1 / lambda
sd_lifetime <- 1 / lambda
# Probability of startup going public or failing after 6 years
p_after_6 <- exp(-lambda * 6)
# Conditional probability of going public or failing in next 2 years
p_next_2_given_survived_6 <- 1 - exp(-lambda * 2)
cat("Expected lifetime of startup:", expected_lifetime, "years\n")
## Expected lifetime of startup: 8 years
cat("Standard deviation of lifetime:", sd_lifetime, "years\n")
## Standard deviation of lifetime: 8 years
cat("Probability startup goes public or fails after 6 years:", round(p_after_6, 4), "\n")
## Probability startup goes public or fails after 6 years: 0.4724
cat("Probability startup goes public or fails in next 2 years after surviving 6 years:", round(p_next_2_given_survived_6, 4), "\n")
## Probability startup goes public or fails in next 2 years after surviving 6 years: 0.2212
Problem 2
# Helper function for combinations
comb <- function(n, k) {
return(factorial(n) / (factorial(k) * factorial(n - k)))
}
1.(Product Selection):
A company produces 5 different types of green pens and 7 different
types of red pens. The marketing team needs to create a new promotional
package that includes 5 pens. How many different ways can the package be
created if it contains fewer than 2 green pens?
I need to breakdown this problem in parts, first I need to find the
number of ways to create a package with 5 pens where fewer than 2 green
pens are used. “Fewer than 2 green pens” means either 0 or 1 green pen.
\[
Case 1: 0\ green\ pens, 5\ red\ pens\\ Ways\ to\ select\ 5\ red\ pens\
from\ 7\ types: C(7,5) = 21
\] \[
Case 2: 1\ green\ pen,\ 4\ red\ pens\\
Ways\ to\ select\ 1\ green\ pen\ from\ 5\ types: C(5,1) = 5\\
Ways\ to\ select\ 4\ red\ pens\ from\ 7\ types: C(7,4) = 35\\
Total\ for\ this\ case: 5 × 35 = 175
\] \[
Total\ number\ of\ ways = 21 + 175 = 196\\
Answer: 196
\]
# R code
# Case 1: 0 green pens, 5 red pens
case1 <- comb(7, 5) # Ways to select 5 red pens from 7 types
# Case 2: 1 green pen, 4 red pens
case2 <- comb(5, 1) * comb(7, 4) # Ways to select 1 green pen and 4 red pens
# Total number of ways
total_ways_p1 <- case1 + case2
cat("Problem 2.1 Answer:", total_ways_p1, "\n\n")
## Problem 2.1 Answer: 196
3.(Marketing Campaign Outcomes):
A marketing campaign involves three stages: first, a customer is sent
5 email offers; second, the customer is targeted with 2 different online
ads; and third, the customer is presented with 3 personalized product
recommendations. If the email offers, online ads, and product
recommendations are selected randomly, how many different possible
outcomes are there for the entire campaign?
First, I need to find the total number of possible outcomes across
all three stages.
\[
Stage\ 1: 5\ email\ offers\\
Stage\ 2: 2\ online\ ads\\
Stage\ 3: 3\ personalized\ product\ recommendations\\
Total\ outcomes = 5 × 2 × 3 = 30\\
Answer: 30
\]
#rcode
total_outcomes_p3 <- 5 * 2 * 3
cat("Problem 2.3 Answer:", total_outcomes_p3, "\n\n")
## Problem 2.3 Answer: 30
4.(Product Defect Probability):
A quality control team draws 3 products from a batch of size N
without replacement. What is the probability that at least one of the
products drawn is defective if the defect rate is known to be
consistent?
This problem has a defect rate denoted as p, the probability of a
non-defective product would be 1-p. To find a solution: I have to use
this formula: \[
(1-p)*(1-p)*(1-p)^3 = 1-(1-p)^3
\]
prob_at_least_one_defective <- function(p) {
1 - (1 - p)^3
}
prob_at_least_one_defective
## function (p)
## {
## 1 - (1 - p)^3
## }
5.(Business Strategy Choices):
A business strategist is choosing potential projects to invest in,
focusing on 17 high-risk, high-reward projects and 14 low-risk,
steady-return projects.
Step 1: How many different combinations of 5 projects can the
strategist select?
based on the given information, the total number of ways to select 5
projects from 31 projects (17 high-risk + 14 low-risk) \[
C(31,5) = 169,911\\
Answer: 169,911
\]
# R code
total_projects <- 17 + 14
step1 <- comb(total_projects, 5)
cat("Step 1 Answer:", step1, "\n")
## Step 1 Answer: 169911
Step 2: How many different combinations of 5 projects can the
strategist select if they want at least one low-risk project?
For step 2, I need to find the number of ways to select 5 projects
with at least one low-risk project. \[
Total\ ways - Ways\ with\ no\ low/risk\ projects\\
C(31,5) - C(17,5) = 169,911 - 6,188 = 163,723\\
Answer: 163,723
\]
# R code
step2 <- step1 - comb(17, 5)
cat("Step 2 Answer:", step2, "\n\n")
## Step 2 Answer: 163723
6.(Event Scheduling):
A business conference needs to schedule 9 different keynote sessions
from three different industries: technology, finance, and healthcare.
There are 4 potential technology sessions, 104 finance sessions, and 17
healthcare sessions to choose from. How many different schedules can be
made? Express your answer in scientific notation rounding to the
hundredths place.
The total number of ways to schedule 9 sessions from all available
sessions: \[
C(4+104+17,9) = C(125,9)\\
C(125,9) = 125!/(9! × 116!) ≈ 1.9906 × 10^14\\
Answer: 1.99 × 10^14
\]
# R code
total_sessions <- 4 + 104 + 17
ways_p6 <- comb(total_sessions, 9)
scientific_notation_p6 <- format(ways_p6, scientific = TRUE, digits = 3)
cat("Problem 2.6 Answer:", scientific_notation_p6, "\n\n")
## Problem 2.6 Answer: 1.53e+13
7.(Book Selection for Corporate Training):
An HR manager needs to create a reading list for a corporate
leadership training program, which includes 13 books in total. The books
are categorized into 6 novels, 6 business case studies, 7 leadership
theory books, and 5 strategy books.
Step 1: If the manager wants to include no more than 4 strategy
books, how many different reading schedules are possible? Express your
answer in scientific notation rounding to the hundredths place.
I have the given values: Total books = 6 novels + 6 business cases +
7 leadership + 5 strategy = 24 books, now I need to select 13 books with
≤ 4 strategy books: \[
= C(19,13) + C(19,12)×C(5,1) + C(19,11)×C(5,2) + C(19,10)×C(5,3) +
C(19,9)×C(5,4)\\
= 27,132 + 50,388×5 + 75,582×10 + 92,378×10 + 92,378×5\\
= 27,132 + 251,940 + 755,820 + 923,780 + 461,890\\
= 2,420,562 ≈ 2.42 × 10^6\\
Answer: 2.42 × 10^6
\]
# Step 1 R code
novels <- 6
business <- 6
leadership <- 7
strategy <- 5
non_strategy <- novels + business + leadership
# Calculate ways to include 0, 1, 2, 3, or 4 strategy books
ways_step1 <- comb(non_strategy, 13) +
comb(non_strategy, 12) * comb(strategy, 1) +
comb(non_strategy, 11) * comb(strategy, 2) +
comb(non_strategy, 10) * comb(strategy, 3) +
comb(non_strategy, 9) * comb(strategy, 4)
scientific_notation_step1 <- format(ways_step1, scientific = TRUE, digits = 3)
cat("Step 1 Answer:", scientific_notation_step1, "\n")
## Step 1 Answer: 2.42e+06
Step 2: If the manager wants to include all 6 business case studies,
how many different reading schedules are possible? Express your answer
in scientific notation rounding to the hundredths place.
For step 2, I need the number of ways to include all 6 business case
studies Since all 6 business cases must be included, we need to select
the remaining 7 books from the other categories: \[
= C(6+7+5,7) = C(18,7) = 31,824\\
Answer: 3.18 × 10^4
\]
remaining_books <- novels + leadership + strategy
ways_step2 <- comb(remaining_books, 7)
scientific_notation_step2 <- format(ways_step2, scientific = TRUE, digits = 3)
cat("Step 2 Answer:", scientific_notation_step2, "\n\n")
## Step 2 Answer: 3.18e+04
8.(Product Arrangement):
A retailer is arranging 10 products on a display shelf. There are 5
different electronic gadgets and 5 different accessories. What is the
probability that all the gadgets are placed together and all the
accessories are placed together on the shelf? Express your answer as a
fraction or a decimal number rounded to four decimal places.
For this problem, I have 10 positions with 5 gadgets and 5
accessories, the gadgets and the accessories must be placed together.
This reduces to arranging 2 blocks (1 gadget block, 1 accessory block),
and then arranging items within each block. \[
Ways\ to\ arrange\ 2\ blocks: 2! = 2\\
Ways\ to\ arrange\ 5\ gadgets: 5! = 120\\
Ways\ to\ arrange\ 5\ accessories: 5! = 120\\
Total\ arrangements\ with\ blocks\ together: 2 × 120 × 120 = 28,800\\
Total\ possible\ arrangements\ of\ 10\ items: 10! = 3,628,800\\
Probability = 28,800/3,628,800 = 1/126 ≈ 0.0079\\
Answer: 0.0079
\]
# R code
block_arrangements <- factorial(2)
# Ways to arrange items within each block
gadget_arrangements <- factorial(5)
accessory_arrangements <- factorial(5)
# Total arrangements with blocks together
favorable_outcomes <- block_arrangements * gadget_arrangements * accessory_arrangements
# Total possible arrangements of 10 items
total_outcomes <- factorial(10)
# Probability
probability_p8 <- favorable_outcomes / total_outcomes
cat("Problem 2.8 Answer:", round(probability_p8, 4), "\n\n")
## Problem 2.8 Answer: 0.0079
9.(Expected Value of a Business Deal):
A company is evaluating a deal where they either gain $4 for every
successful contract or lose $16 for every unsuccessful contract. A
“successful” contract is defined as drawing a queen or lower from a
standard deck of cards. (Aces are considered the highest card in the
deck.)
Step 1: Find the expected value of the deal. Round your answer to
two decimal places. Losses must be expressed as negative values.
to find the expected value I have to calculate the “successful”
contract, which means drawing a queen or lower from a standard deck of
cards.
\[Jacks,\ Queens,\ and\ cards\ 2-10\ are\
queen\ or\ lower = 44\ cards\\
Kings\ and\ Aces\ are\ higher\ than\ queens\ = 8\ cards\\
Probability\ of\ success\ = 44/52 = 11/13\\
Probability\ of\ failure\ = 8/52 = 2/13\] \(Expected\ value = (\$\ 4 × 11/13) + (-\$16 × 2/13)
= \$44/13 - \$32/13 = \$12/13 ≈ \$0.92\) \(Answer: \$\ 0.92\)
# R code
p_success <- 44/52 # 11/13
# Probability of failure (kings and aces)
p_failure <- 8/52 # 2/13
# Step 1: Expected value calculation
expected_value <- 4 * p_success + (-16) * p_failure
cat("Step 1 Answer: $", round(expected_value, 2), "\n")
## Step 1 Answer: $ 0.92
Step 2: If the company enters into this deal 833 times, how much
would they expect to win or lose? Round your answer to two decimal
places. Losses must be expressed as negative values.
Expected value for 833 deals \[
Total\ expected\ value = 833 × $0.92 = $768.92//
Answer: $768.92
\]
# R code
total_expected_value <- 833 * expected_value
cat("Problem 2.9 Step 2 Answer: $", round(total_expected_value, 2), "\n")
## Problem 2.9 Step 2 Answer: $ 768.92
Problem 3
1.(Supply Chain Risk Assessment):
Let \(X1,X2,....,Xn\) represent the
lead times (in days) for the delivery of key components from \(n = 5\) different suppliers. Each lead time
is uniformly distributed across a range of \(1
to\ k = 20 days\), reflecting the uncertainty in delivery times.
Let \(Y\) denote the minimum delivery
time among all suppliers. Understanding the distribution of \(Y\) is crucial for assessing the earliest
possible time you can begin production. Determine the distribution of Y
to better manage your supply chain and minimize downtime.
For the minimum lead time \(Y\)
among \(n=5\) suppliers with uniformly
distributed lead times between 1 and 20 days: The cumulative
distribution function of \(Y\) is:
\[F_Y(y) = 1 - [(20-y)/19]^5 for 1 ≤ y ≤
20\] The probability density function is: \[f_Y(y) = 5[(20-y)/19]^4 × (1/19) for 1 ≤ y ≤
20\] The expected value of Y is approximately 2.83 days, with a
standard deviation of about 1.47 days. This means that with 5 suppliers,
you can expect the earliest available component to arrive in about 3
days, which helps with production planning.
# R code
# Given values
n <- 5
k <- 20
# Define the CDF and PDF of the minimum (properly vectorized)
min_cdf <- function(y) {
result <- numeric(length(y))
for (i in 1:length(y)) {
if (y[i] < 1) result[i] <- 0
else if (y[i] > k) result[i] <- 1
else result[i] <- 1 - ((k - y[i])/(k - 1))^n
}
return(result)
}
min_pdf <- function(y) {
result <- numeric(length(y))
for (i in 1:length(y)) {
if (y[i] < 1 || y[i] > k) result[i] <- 0
else result[i] <- n * ((k - y[i])/(k - 1))^(n-1) * (1/(k - 1))
}
return(result)
}
# For numerical integration
min_pdf_integrand <- function(y) {
result <- numeric(length(y))
for (i in 1:length(y)) {
if (y[i] < 1 || y[i] > k) result[i] <- 0
else result[i] <- n * ((k - y[i])/(k - 1))^(n-1) * (1/(k - 1))
}
return(result)
}
# Create vectors for plotting
y_values <- seq(1, k, 0.1)
cdf_values <- min_cdf(y_values)
pdf_values <- min_pdf(y_values)
# Plot the PDF and CDF
par(mfrow=c(2,1))
plot(y_values, pdf_values, type="l", main="PDF of Minimum Lead Time (Y)",
xlab="y (days)", ylab="f_Y(y)", col="blue", lwd=2)
plot(y_values, cdf_values, type="l", main="CDF of Minimum Lead Time (Y)",
xlab="y (days)", ylab="F_Y(y)", col="red", lwd=2)

# Calculate expected value and variance of Y using numerical integration
expected_y <- integrate(function(y) y * min_pdf_integrand(y), lower=1, upper=k)$value
var_y <- integrate(function(y) (y - expected_y)^2 * min_pdf_integrand(y), lower=1, upper=k)$value
sd_y <- sqrt(var_y)
cat("Expected minimum lead time:", round(expected_y, 4), "days\n")
## Expected minimum lead time: 4.1667 days
cat("Standard deviation of minimum lead time:", round(sd_y, 4), "days\n\n")
## Standard deviation of minimum lead time: 2.6763 days
# Alternative calculation using numerical methods
y_grid <- seq(1, k, 0.001)
pdf_grid <- min_pdf(y_grid) # Use the vectorized min_pdf function
expected_y_num <- sum(y_grid * pdf_grid * 0.001)
var_y_num <- sum((y_grid - expected_y_num)^2 * pdf_grid * 0.001)
sd_y_num <- sqrt(var_y_num)
cat("Using numerical approximation:\n")
## Using numerical approximation:
cat("Expected minimum lead time:", round(expected_y_num, 4), "days\n")
## Expected minimum lead time: 4.1668 days
cat("Standard deviation of minimum lead time:", round(sd_y_num, 4), "days\n\n")
## Standard deviation of minimum lead time: 2.6766 days
2.(Maintenance Planning for Critical Equipment):
Your organization owns a critical piece of equipment, such as a
high-capacity photocopier (for a law firm) or an MRI machine (for a
healthcare provider). The manufacturer estimates the expected lifetime
of this equipment to be 8 years, meaning that, on average, you expect
one failure every 8 years. It’s essential to understand the likelihood
of failure over time to plan for maintenance and replacements.
a. Geometric Model:
Calculate the probability that the machine will not fail for the
first 6 years. Also, provide the expected value and standard deviation.
This model assumes each year the machine either fails or does not,
independently of previous years.
Given the values for this exercise, an equipment with an average
lifetime of 8 years, \[
Probability\ of\ no\ failure\ in\ first\ 6\ years: 0.4503\\
Expected\ time\ to\ failure: 8\ years\\
Standard\ deviation: 7.4833\ years
\]
# Given parameters
lambda <- 1/8
years <- 6
# Geometric Model
p_failure <- lambda
p_no_failure_geo <- (1 - p_failure)^years
expected_geo <- 1/p_failure # Expected value for geometric distribution
sd_geo <- sqrt((1 - p_failure)/(p_failure^2)) # Standard deviation for geometric
cat("Geometric Model:\n")
## Geometric Model:
cat("Probability of no failure in first 6 years:", round(p_no_failure_geo, 4), "\n")
## Probability of no failure in first 6 years: 0.4488
cat("Expected time to failure:", round(expected_geo, 4), "years\n")
## Expected time to failure: 8 years
cat("Standard deviation:", round(sd_geo, 4), "years\n\n")
## Standard deviation: 7.4833 years
b. Exponential Model:
Calculate the probability that the machine will not fail for the
first 6 years. Provide the expected value and standard deviation,
modeling the time to failure as a continuous process. \[
Probability\ of\ no\ failure\ in\ first\ 6\ years: 0.4724\\
Expected\ time\ to\ failure: 8\ years\\
Standard\ deviation: 8\ years\\
\]
# Exponential Model
p_no_failure_exp <- exp(-lambda * years)
expected_exp <- 1/lambda
sd_exp <- 1/lambda
cat("Exponential Model:\n")
## Exponential Model:
cat("Probability of no failure in first 6 years:", round(p_no_failure_exp, 4), "\n")
## Probability of no failure in first 6 years: 0.4724
cat("Expected time to failure:", round(expected_exp, 4), "years\n")
## Expected time to failure: 8 years
cat("Standard deviation:", round(sd_exp, 4), "years\n\n")
## Standard deviation: 8 years
c. Binomial Model:
Calculate the probability that the machine will not fail during the
first 6 years, given that it is expected to fail once every 8 years.
Provide the expected value and standard deviation, assuming a fixed
number of trials (years) with a constant failure probability each year.
\[
Probability\ of\ no\ failure\ in\ first\ 6\ years: 0.4501\\
Expected\ number\ of\ failures\ in\ 6\ years: 0.75\\
Standard\ deviation: 0.8292
\]
# Binomial Model
n_years <- years
p_annual_failure <- 1/8
p_no_failure_bin <- dbinom(0, size=n_years, prob=p_annual_failure)
expected_bin <- n_years * p_annual_failure
sd_bin <- sqrt(n_years * p_annual_failure * (1 - p_annual_failure))
cat("Binomial Model:\n")
## Binomial Model:
cat("Probability of no failure in first 6 years:", round(p_no_failure_bin, 4), "\n")
## Probability of no failure in first 6 years: 0.4488
cat("Expected number of failures in 6 years:", round(expected_bin, 4), "\n")
## Expected number of failures in 6 years: 0.75
cat("Standard deviation:", round(sd_bin, 4), "\n\n")
## Standard deviation: 0.8101
d. Poisson Model:
Calculate the probability that the machine will not fail during the
first 6 years, modeling the failure events as a Poisson process. Provide
the expected value and standard deviation. \[
Probability\ of\ no\ failure\ in\ first\ 6\ years: 0.4724\\
Expected\ number\ of\ failures\ in\ 6\ years: 0.75\\
Standard\ deviation: 0.8660
\]
# Poisson Model
lambda_rate <- 1/8 # Failures per year
time_period <- years
p_no_failure_pois <- dpois(0, lambda=lambda_rate * time_period)
expected_pois <- lambda_rate * time_period # Expected value for Poisson
sd_pois <- sqrt(lambda_rate * time_period)
cat(" Poisson Model:\n")
## Poisson Model:
cat("Probability of no failure in first 6 years:", round(p_no_failure_pois, 4), "\n")
## Probability of no failure in first 6 years: 0.4724
cat("Expected number of failures in 6 years:", round(expected_pois, 4), "\n")
## Expected number of failures in 6 years: 0.75
cat("Standard deviation:", round(sd_pois, 4), "\n")
## Standard deviation: 0.866
Problem 4
1. Scenario:
You are managing two independent servers in a data center. The time
until the next failure for each server follows an exponential
distribution with different rates: Server A has a failure rate of \(λB = 0.5\) failures per hour. Server B has
a failure rate of \(λB = 0.3\) failures
per hour.
Question: What is the distribution of the total time until both
servers have failed at least once? Use the moment generating function
(MGF) to find the distribution of the sum of the times to failure.
To find the distribution of the total time until both servers have
failed, I’ll use moment generating functions. For an exponential
distribution with parameter \(λ\), the
MGF is: \[M(t) = λ/(λ-t), for\ t <
λ\] The MGF of the time until Server A fails is: \[M_A(t) = 0.5/(0.5-t)\] The MGF of the time
until Server B fails is: \[M_B(t) =
0.3/(0.3-t)\] Since the failures are independent, the MGF of
their sum is the product of their MGFs: \[M_Z(t) = M_A(t) × M_B(t) = [0.5/(0.5-t)] ×
[0.3/(0.3-t)]\] After partial fraction decomposition: \[M_Z(t) = 0.75[(1/(0.3-t)) - (1/(0.5-t))]\]
This is the MGF of a hypoexponential distribution with parameters \[λA = 0.5\ and\ λB = 0.3.\] The PDF of this
distribution is: \[f(t) = 0.75(e^(-0.3t) -
e^(-0.5t)) for t > 0\] The mean time until both servers fail
is: \[E[Z] = 1/λA + 1/λB = 1/0.5 + 1/0.3 = 2
+ 3.33 = 5.33 hours\] The standard deviation is approximately
\(3.89\ hours.\)
# Given Parameters
lambda_A <- 0.5
lambda_B <- 0.3
# Probability density function for the total time until both servers fail
hypoexp_pdf <- function(t, lambda1, lambda2) {
if (lambda1 == lambda2) {
return(lambda1^2 * t * exp(-lambda1 * t))
} else {
return(lambda1 * lambda2 / (lambda1 - lambda2) *
(exp(-lambda2 * t) - exp(-lambda1 * t)))
}
}
# CDF for the total time until both servers fail
hypoexp_cdf <- function(t, lambda1, lambda2) {
if (lambda1 == lambda2) {
return(1 - exp(-lambda1 * t) * (1 + lambda1 * t))
} else {
return(1 - lambda1/(lambda1-lambda2)*exp(-lambda2*t) +
lambda2/(lambda1-lambda2)*exp(-lambda1*t))
}
}
# Generate plot values
t_values <- seq(0, 15, 0.1)
pdf_values <- sapply(t_values, hypoexp_pdf, lambda1=lambda_A, lambda2=lambda_B)
cdf_values <- sapply(t_values, hypoexp_cdf, lambda1=lambda_A, lambda2=lambda_B)
# Calculate mean and variance
mean_hypoexp <- 1/lambda_A + 1/lambda_B
var_hypoexp <- 1/lambda_A^2 + 1/lambda_B^2
# Plot the PDF and CDF
par(mfrow=c(2,1))
plot(t_values, pdf_values, type="l", main="PDF of Time Until Both Servers Fail",
xlab="Time (hours)", ylab="Density", col="blue", lwd=2)
abline(v=mean_hypoexp, col="red", lty=2)
text(mean_hypoexp+1, max(pdf_values)*0.8, paste("Mean =", round(mean_hypoexp, 2)))
plot(t_values, cdf_values, type="l", main="CDF of Time Until Both Servers Fail",
xlab="Time (hours)", ylab="Probability", col="blue", lwd=2)

cat("Server Failure Analysis\n")
## Server Failure Analysis
cat("Mean time until both servers fail:", round(mean_hypoexp, 4), "hours\n")
## Mean time until both servers fail: 5.3333 hours
cat("Standard deviation:", round(sqrt(var_hypoexp), 4), "hours\n\n")
## Standard deviation: 3.8873 hours
2. Sum of Independent Normally Distributed Random Variables
Scenario: An investment firm is analyzing the returns of two
independent assets, Asset X and Asset Y. The returns on these assets are
normally distributed:
Asset X returns: \(X ~ N(μX = 5%, σX² =
4%)\) Asset Y returns: \(Y ~ N(μY = 7%,
σY² = 9%)\)
Question: Find the distribution of the combined return of the
portfolio consisting of these two assets using the moment generating
function (MGF).
For a normal distribution with parameters \(μ\ and\ σ²\), the MGF is: \[M(t) = exp(μt + σ²t²/2)\] For Asset \(X: M_X(t) = exp(5t + 2t²)\) For Asset \(Y: M_Y(t) = exp(7t + 4.5t²)\) The MGF of
the combined return Z = X + Y is: \[M_Z(t) =
M_X(t) × M_Y(t) = exp(5t + 2t²) × exp(7t + 4.5t²) = exp(12t +
6.5t²)\] This is the MGF of a normal distribution with: \[μZ = 12% and σZ² = 13%\] Therefore, the
combined return Z follows a normal distribution: \[Z ~ N(12%, 13%)\] with a standard
deviation of approximately \(3.61\%.\)
# Given Parameters
mu_X <- 5
sigma2_X <- 4
sigma_X <- sqrt(sigma2_X)
mu_Y <- 7
sigma2_Y <- 9
sigma_Y <- sqrt(sigma2_Y)
mu_sum <- mu_X + mu_Y
sigma2_sum <- sigma2_X + sigma2_Y
sigma_sum <- sqrt(sigma2_sum)
# Generate plot values for the individual distributions and their sum
x_values <- seq(-5, 20, 0.1)
pdf_X <- dnorm(x_values, mean=mu_X, sd=sigma_X)
pdf_Y <- dnorm(x_values, mean=mu_Y, sd=sigma_Y)
pdf_sum <- dnorm(x_values, mean=mu_sum, sd=sigma_sum)
# Plot the PDFs
par(mfrow=c(1,1))
plot(x_values, pdf_X, type="l", main="PDFs of Asset Returns",
xlab="Return (%)", ylab="Density", col="blue", lwd=2, ylim=c(0, max(pdf_X, pdf_Y, pdf_sum)))
lines(x_values, pdf_Y, col="red", lwd=2)
lines(x_values, pdf_sum, col="green", lwd=2)
legend("topright", legend=c("Asset X", "Asset Y", "Combined"),
col=c("blue", "red", "green"), lwd=2)

cat("Asset Return Analysis\n")
## Asset Return Analysis
cat("Combined return distribution: Normal\n")
## Combined return distribution: Normal
cat("Mean:", round(mu_sum, 2), "%\n")
## Mean: 12 %
cat("Variance:", round(sigma2_sum, 2), "%\n")
## Variance: 13 %
cat("Standard deviation:", round(sigma_sum, 2), "%\n\n")
## Standard deviation: 3.61 %
3.Scenario:
A call center receives calls independently from two different
regions. The number of calls received from Region A and Region B in an
hour follows a Poisson distribution:
Region A calls: \(XA ~ Poisson(λA =
3)\) Region B calls: \(XB ~ Poisson(λB
= 5)\)
Question: Determine the distribution of the total number of calls
received in an hour from both regions using the moment generating
function (MGF).
For a Poisson distribution with parameter \(λ\), the MGF is: \[M(t) = exp(λ(e^t - 1))\] For Region A:
\(M_A(t) = exp(3(e^t - 1))\) For Region
B: \(M_B(t) = exp(5(e^t - 1))\) The MGF
of the total calls Z = XA + XB is: \[M_Z(t) =
M_A(t) × M_B(t) = exp(3(e^t - 1)) × exp(5(e^t - 1)) = exp(8(e^t -
1))\] This is exactly the MGF of a Poisson distribution with
parameter \(λ = 8.\) Therefore, the
total number of calls follows a Poisson distribution: \(Z ~ Poisson(8)\) The mean and variance of
the total calls are both equal to \(8.\)
# Given Parameters
lambda_A_calls <- 3
lambda_B_calls <- 5
# For Poisson distributions
lambda_sum <- lambda_A_calls + lambda_B_calls
# Generate plot values for the PMFs
x_range <- 0:20
pmf_A <- dpois(x_range, lambda=lambda_A_calls)
pmf_B <- dpois(x_range, lambda=lambda_B_calls)
pmf_sum <- dpois(x_range, lambda=lambda_sum)
# Plot the PMFs
par(mfrow=c(1,1))
barplot(rbind(pmf_A, pmf_B, pmf_sum), beside=TRUE, col=c("blue", "red", "green"),
main="PMFs of Call Counts", xlab="Number of Calls", ylab="Probability",
names.arg=x_range, legend.text=c("Region A", "Region B", "Combined"))

cat("Problem 4.3: Call Center Analysis\n")
## Problem 4.3: Call Center Analysis
cat("Combined call distribution: Poisson\n")
## Combined call distribution: Poisson
cat("Parameter λ (mean):", lambda_sum, "calls per hour\n")
## Parameter λ (mean): 8 calls per hour
cat("Variance:", lambda_sum, "calls² per hour²\n")
## Variance: 8 calls² per hour²