1 Problem 1.

1.1 1. (Bayesian):

A new credit scoring system has been developed to predict the likelihood of loan defaults. The system has a 90% sensitivity, meaning that it correctly identifies 90% of those who will default on their loans. It also has a 95% specificity, meaning that it correctly identifies 95% of those who will not default. The default rate among borrowers is 2%. Given these prevalence, sensitivity, and specificity estimates, what is the probability that a borrower flagged by the system as likely to default will actually default? If the average loss per defaulted loan is $200,000 and the cost to run the credit scoring test on each borrower is $500, what is the total first-year cost for evaluating 10,000 borrowers? NOTE: There were many ways to think about this problem.

dformat=function(w,x) noquote(paste0(w,"=","$", format(x, big.mark=",",nsmall=2)))
pformat=function(x) noquote(paste0("Pr=",x))
myformat=function(w,x) noquote(paste0(w,"=",x))


pformat((.9*.02)/(.9*.02 +.05*.98))
## [1] Pr=0.26865671641791
dformat("TC",500*10000)
## [1] TC=$5,000,000.00
dformat("Net Cost",(10000*500-.9*200*200000)) # testing cost minus losses avoided on the 180 flagged defaulters (90% of 200)
## [1] Net Cost=$-31,000,000.00
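The posterior probability can also be read off expected counts for the 10,000 borrowers. The following is a minimal sketch of that bookkeeping; the variable names are illustrative and not part of the original solution.

```r
# Expected counts among 10,000 borrowers (names here are illustrative)
n_borrowers <- 10000
prevalence  <- 0.02
sensitivity <- 0.90
specificity <- 0.95

defaulters     <- n_borrowers * prevalence             # 200 borrowers who will default
non_defaulters <- n_borrowers - defaulters             # 9,800 who will not
true_pos       <- sensitivity * defaulters             # flagged and will default
false_pos      <- (1 - specificity) * non_defaulters   # flagged but will not default

# Share of flagged borrowers who actually default
ppv <- true_pos / (true_pos + false_pos)
round(ppv, 4)
```

Roughly 180 of the 670 flagged borrowers are true defaulters, which is why the posterior is so far below the 90% sensitivity.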

1.2 2. (Binomial):

The probability that a stock will pay a dividend in any given quarter is 0.7. What is the probability that the stock pays dividends exactly 6 times in 8 quarters? What is the probability that it pays dividends 6 or more times? What is the probability that it pays dividends fewer than 6 times? What is the expected number of dividend payments over 8 quarters? What is the standard deviation?

x=6; N=8; p=.7
pformat(dbinom(x,N,p))
## [1] Pr=0.29647548
pformat(sum(dbinom(x:N,N,p)))
## [1] Pr=0.55177381
pformat(pbinom(x-1,N,p))
## [1] Pr=0.44822619
myformat("EX",N*p)
## [1] EX=5.6
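The question also asks for the standard deviation, which the code above does not compute. A short sketch using the binomial variance formula N·p·(1−p):

```r
# Standard deviation of a Binomial(N, p) count: sqrt(N * p * (1 - p))
N <- 8
p <- 0.7
sx <- sqrt(N * p * (1 - p))
round(sx, 4)
```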

1.3 3. (Poisson):

A financial analyst notices that there are an average of 12 trading days each month when a certain stock’s price increases by more than 2%. What is the probability that exactly 4 such days occur in a given month? What is the probability that more than 12 such days occur in a given month? How many such days would you expect in a 6-month period? What is the standard deviation of the number of such days? If an investment strategy requires at least 70 days of such price increases in a year for profitability, what is the percent utilization and what are your recommendations?

lambda=12; t=6
pformat(dpois(4,lambda))
## [1] Pr=0.00530859947327557
myformat("EX", lambda*t)
## [1] EX=72
myformat("SX", sqrt(lambda*t))
## [1] SX=8.48528137423857
pformat(ppois(69,lambda*t, lower.tail=FALSE)) # P(X >= 70) with the 6-month mean of 72; a full year has mean lambda*12 = 144
## [1] Pr=0.608943690746807
myformat("Run with it","Cool")
## [1] Run with it=Cool
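Two parts of the question are worth computing directly: the probability of more than 12 such days in one month, and the profitability threshold, which is stated per year. The sketch below assumes a year is 12 months at the same rate, and it reads "percent utilization" as required days divided by expected days; both are interpretations, not something the problem states.

```r
lambda <- 12                     # mean qualifying days per month

# P(X > 12) in one month
p_more_than_12 <- ppois(12, lambda, lower.tail = FALSE)

# Assumption: a year is 12 such months, so the yearly count is Poisson(144)
lambda_year   <- lambda * 12
p_at_least_70 <- ppois(69, lambda_year, lower.tail = FALSE)  # P(X >= 70)
utilization   <- 70 / lambda_year                            # required days / expected days

round(c(p_more_than_12 = p_more_than_12,
        p_at_least_70  = p_at_least_70,
        utilization    = utilization), 4)
```

Under these assumptions the strategy needs just under half of the expected qualifying days, so it clears the threshold with probability very close to 1.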

1.4 4. (Hypergeometric):

A hedge fund has a portfolio of 25 stocks, with 15 categorized as high-risk and 10 as low-risk. The fund manager randomly selects 7 stocks to closely monitor. If the manager selected 5 high-risk stocks and 2 low-risk stocks, what is the probability of selecting exactly 5 high-risk stocks if the selection was random? How many high-risk and low-risk stocks would you expect to be selected?

H=15; L=10; k=7; h=5   # H high-risk, L low-risk, k drawn, h high-risk drawn
pformat(round(dhyper(h,H,L,k),4))   # C(15,5)*C(10,2)/C(25,7)
## [1] Pr=0.2811
myformat("E_high", H*k/(H+L))
## [1] E_high=4.2
myformat("E_low", L*k/(H+L))
## [1] E_low=2.8

1.5 5. (Geometric):

The probability that a bond defaults in any given year is 0.5%. A portfolio manager holds this bond for 10 years. What is the probability that the bond will default during this period? What is the probability that it will default in the next 15 years? What is the expected number of years before the bond defaults? If the bond has already survived 10 years, what is the probability that it will default in the next 2 years?

p=.995; n1=10; n2=15   # p is the annual survival probability, 1 - 0.005
pformat(1-p^n1)
## [1] Pr=0.0488898695342281
pformat(1-p^n2)
## [1] Pr=0.0724310311816722
myformat("EX", 1/(1-p))
## [1] EX=200
pformat(1-p^2) #memoryless
## [1] Pr=0.00997499999999996

1.6 6. (Poisson):

A high-frequency trading algorithm experiences a system failure about once every 1500 trading hours. What is the probability that the algorithm will experience more than two failures in 1500 hours? What is the expected number of failures?

lambda=1; x=2
pformat(ppois(2, 1, lower.tail=FALSE))
## [1] Pr=0.0803013970713942
myformat("EX", lambda)
## [1] EX=1

1.7 7. (Uniform Distribution):

An investor is trying to time the market and is monitoring a stock that they believe has an equal chance of reaching a target price between 20 and 60 days. What is the probability that the stock will reach the target price in more than 40 days? If it hasn’t reached the target price by day 40, what is the probability that it will reach it in the next 10 days? What is the expected time for the stock to reach the target price?

a=20; b=60; x=40
pformat((b-x)/(b-a)) # P(X > 40); for a continuous uniform, P(X > 40) = P(X >= 40)
## [1] Pr=0.5
# Conditional on no target price by day 40, X is uniform on (40, 60):
# P(X <= 50 | X > 40) = (50 - 40)/(60 - 40)
pformat((50-x)/(b-x))
## [1] Pr=0.5
myformat("EX", (b+a)/2)
## [1] EX=40

1.8 8. (Exponential Distribution):

A financial model estimates that the lifetime of a successful start-up before it either goes public or fails follows an exponential distribution with an expected value of 8 years. What is the expected time until the start-up either goes public or fails? What is the standard deviation? What is the probability that the start-up will go public or fail after 6 years? Given that the start-up has survived for 6 years, what is the probability that it will go public or fail in the next 2 years?

gamma=1/8
myformat("EX",1/gamma)
## [1] EX=8
myformat("SX",1/gamma)
## [1] SX=8
pformat(pexp(6,gamma, lower.tail=FALSE)) # P(X > 6): the event occurs after year 6
## [1] Pr=0.472366552741015
pformat(pexp(2,gamma)) #memoryless
## [1] Pr=0.221199216928595

2 Problem 2.

2.1 1. (Product Selection):

A company produces 5 different types of green pens and 7 different types of red pens. The marketing team needs to create a new promotional package that includes 5 pens. How many different ways can the package be created if it contains fewer than 2 green pens?

myformat("Ways", choose(5,0)*choose(7,5)+choose(5,1)*choose(7,4))
## [1] Ways=196

2.2 2. (Team Formation for a Project):

A project committee is being formed within a company that includes 14 senior managers and 13 junior managers. How many ways can a project team of 5 members be formed if at least 4 of the members must be junior managers?

myformat("Ways", choose(13,4)*choose(14,1)+choose(13,5)*choose(14,0))
## [1] Ways=11297

2.3 3. (Marketing Campaign Outcomes):

A marketing campaign involves three stages: first, a customer is sent 5 email offers; second, the customer is targeted with 2 different online ads; and third, the customer is presented with 3 personalized product recommendations. If the email offers, online ads, and product recommendations are selected randomly, how many different possible outcomes are there for the entire campaign?

myformat("Ways", 5*2*3) #multiplication rule
## [1] Ways=30

2.4 4. (Product Defect Probability):

A quality control team draws 3 products from a batch of size N without replacement. What is the probability that at least one of the products drawn is defective if the defect rate is known to be consistent?

\[ P(\text{at least one defective}) = \sum_{k=1}^{3} \frac{\binom{D}{k}\binom{N-D}{3-k}}{\binom{N}{3}} = 1 - \frac{\binom{N-D}{3}}{\binom{N}{3}}, \quad D = \text{number of defective items in the batch} \]

5. (Business Strategy Choices):

A business strategist is choosing potential projects to invest in, focusing on 17 high-risk, high-reward projects and 14 low-risk, steady-return projects.

2.4.1 o Step 1:

How many different combinations of 5 projects can the strategist select?

myformat("Ways", choose(31,5))
## [1] Ways=169911

2.4.2 o Step 2:

How many different combinations of 5 projects can the strategist select if they want at least one low-risk project?

myformat("Ways", choose(31,5)-choose(17,5)*choose(14,0))
## [1] Ways=163723

2.5 6. (Event Scheduling):

A business conference needs to schedule 9 different keynote sessions from three different industries: technology, finance, and healthcare. There are 4 potential technology sessions, 104 finance sessions, and 17 healthcare sessions to choose from. How many different schedules can be made? Express your answer in scientific notation rounding to the hundredths place.

options(scipen=0)
myformat("Ways", formatC(factorial(9)*choose(125,9), format="e", digits=2))
## [1] Ways=5.55e+18

2.6 7. (Book Selection for Corporate Training):

An HR manager needs to create a reading list for a corporate leadership training program, which includes 13 books in total. The books are categorized into 6 novels, 6 business case studies, 7 leadership theory books, and 5 strategy books.

2.6.1 o Step 1:

If the manager wants to include no more than 4 strategy books, how many different reading schedules are possible? Express your answer in scientific notation rounding to the hundredths place.

myformat("Ways", formatC(factorial(13)*(choose(24,13)-choose(5,5)*choose(19,8)), format="e", digits=2))
## [1] Ways=1.51e+16

2.6.2 o Step 2:

If the manager wants to include all 6 business case studies, how many different reading schedules are possible? Express your answer in scientific notation rounding to the hundredths place.

myformat("Ways", formatC(factorial(13)*choose(24-6, 13-6), format="e", digits=2))
## [1] Ways=1.98e+14

2.7 8. (Product Arrangement):

A retailer is arranging 10 products on a display shelf. There are 5 different electronic gadgets and 5 different accessories. What is the probability that all the gadgets are placed together and all the accessories are placed together on the shelf? Express your answer as a fraction or a decimal number rounded to four decimal places.

pformat(round(2*factorial(5)*factorial(5)/factorial(10),4)) # 2 block orders x 5! x 5! arrangements within blocks
## [1] Pr=0.0079

9. (Expected Value of a Business Deal):

A company is evaluating a deal where they either gain $4 for every successful contract or lose $16 for every unsuccessful contract. A “successful” contract is defined as drawing a queen or lower from a standard deck of cards. (Aces are considered the highest card in the deck.)

2.7.1 o Step 1:

Find the expected value of the deal. Round your answer to two decimal places. Losses must be expressed as negative values.

ans=4*44/52-16*8/52
myformat("EX", formatC(ans, digits=2))
## [1] EX=0.92

2.7.2 o Step 2:

If the company enters into this deal 833 times, how much would they expect to win or lose? Round your answer to two decimal places. Losses must be expressed as negative values.

myformat("E833X",formatC(ans*833, format="f",digits=2))
## [1] E833X=768.92

3 Problem 3.

3.1 1. (Supply Chain Risk Assessment):

Let X1,X2,…,Xn represent the lead times (in days) for the delivery of key components from n=5 different suppliers. Each lead time is uniformly distributed across a range of 1 to k=20 days, reflecting the uncertainty in delivery times. Let Y denote the minimum delivery time among all suppliers. Understanding the distribution of Y is crucial for assessing the earliest possible time you can begin production. Determine the distribution of Y to better manage your supply chain and minimize downtime.

Let \(Y=\min(X_1,\dots,X_n)\) be the minimum lead time, and assume the lead times are independent.

Equation 1 (CDF Formulation from Complement): \(P(Y \le y)=1-P(X_1>y, X_2>y, \dots, X_n>y)\). The minimum exceeds y only if every \(X_i\) exceeds y.

Because the \(X_i\) are independent with a common CDF \(F\), the probability that all \(X_i\) are greater than y is \((1-F(y))^n\). Substituting into Equation 1 gives Equation 2.

Equation 2 (CDF of Minimum): \(P(Y \le y)=1-(1-F(y))^n\).

The CDF of a discrete uniform on the integers a through b is \(F(x)=\frac{\lfloor{x}\rfloor-a+1}{b-a+1}\) for \(a \le x \le b\). Here, a is the minimum of the uniform (in our case, 1) and b is the maximum (in our case, 20), so \(\lfloor x \rfloor\) takes the integer values 1, 2, ..., 20. Substituting into Equation 2, we have Equation 3.

Equation 3 (CDF of Discrete Uniform Minimum): \(P(Y \le y)=1-\left(1-\frac{\lfloor{y}\rfloor-a+1}{b-a+1}\right)^n\)

With a=1 and b=20, we have Equation 4.

Equation 4 (CDF of Our Discrete Uniform Minimum): \(P(Y \le y)=1-\left(1-\frac{\lfloor{y}\rfloor}{20}\right)^n\)

Plugging in 1 through 20 for \(\lfloor y \rfloor\) with n=5 generates the distribution.

n <- 5
a <- 1
b <- 20
x_vals <- 1:20
F_Y <- 1 - (1 - (x_vals / (b - a + 1)))^n
data.frame(Day = x_vals, P_Y_leq_x = round(F_Y, 4))
##    Day P_Y_leq_x
## 1    1    0.2262
## 2    2    0.4095
## 3    3    0.5563
## 4    4    0.6723
## 5    5    0.7627
## 6    6    0.8319
## 7    7    0.8840
## 8    8    0.9222
## 9    9    0.9497
## 10  10    0.9688
## 11  11    0.9815
## 12  12    0.9898
## 13  13    0.9947
## 14  14    0.9976
## 15  15    0.9990
## 16  16    0.9997
## 17  17    0.9999
## 18  18    1.0000
## 19  19    1.0000
## 20  20    1.0000
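The table above is the CDF of Y. For planning purposes the probability of each individual day can be more useful, so the following sketch differences the survival function to get the PMF and checks it against Equation 4. The names `pmf_Y` and `cdf_Y` are illustrative, and independence of the suppliers is assumed.

```r
n <- 5
k <- 20
y <- 1:k

# P(Y = y) = P(Y >= y) - P(Y >= y + 1), where P(Y >= y) = ((k - y + 1) / k)^n
pmf_Y <- ((k - y + 1) / k)^n - ((k - y) / k)^n

# The CDF from Equation 4 should equal the running sum of the PMF
cdf_Y <- 1 - (1 - y / k)^n
all.equal(cumsum(pmf_Y), cdf_Y)

# Expected minimum lead time in days
sum(y * pmf_Y)
```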

3.2 2. (Maintenance Planning for Critical Equipment):

Your organization owns a critical piece of equipment, such as a high-capacity photocopier (for a law firm) or an MRI machine (for a healthcare provider). The manufacturer estimates the expected lifetime of this equipment to be 8 years, meaning that, on average, you expect one failure every 8 years. It’s essential to understand the likelihood of failure over time to plan for maintenance and replacements.

3.2.1 a. Geometric Model:

Calculate the probability that the machine will not fail for the first 6 years. Also, provide the expected value and standard deviation. This model assumes each year the machine either fails or does not, independently of previous years.

pformat((7/8)^6)
## [1] Pr=0.448795318603516
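The expected value and standard deviation requested for the geometric model are not computed above. A minimal sketch, assuming X counts the year in which the first failure occurs (support 1, 2, ...), so that E[X] = 1/p and SD[X] = sqrt(1−p)/p:

```r
p  <- 1/8                   # annual failure probability
ex <- 1 / p                 # expected year of first failure
sx <- sqrt(1 - p) / p       # standard deviation
round(c(EX = ex, SX = sx), 4)

# Cross-check of the survival probability with pgeom, which counts
# failure-free years before the first failure
pgeom(5, p, lower.tail = FALSE)   # P(at least 6 failure-free years) = (7/8)^6
```

If X is instead defined as the number of failure-free years before the first failure, the mean drops by one to 7 while the standard deviation is unchanged.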

3.2.2 b. Exponential Model:

Calculate the probability that the machine will not fail for the first 6 years. Provide the expected value and standard deviation, modeling the time to failure as a continuous process.

mean_life=8; rate=1/mean_life; x=6   # rate is failures per year

pformat(pexp(x,rate,lower.tail=FALSE))
## [1] Pr=0.472366552741015
myformat("EX", 1/rate)
## [1] EX=8
myformat("SX",1/rate)
## [1] SX=8

3.2.3 c. Binomial Model:

Calculate the probability that the machine will not fail during the first 6 years, given that it is expected to fail once every 8 years. Provide the expected value and standard deviation, assuming a fixed number of trials (years) with a constant failure probability each year.

pformat(dbinom(0,6,1/8))
## [1] Pr=0.448795318603516
myformat("EX", 6*1/8)
## [1] EX=0.75
myformat("SX",sqrt(6*1/8*7/8))
## [1] SX=0.810092587300983

3.2.4 d. Poisson Model:

Calculate the probability that the machine will not fail during the first 6 years, modeling the failure events as a Poisson process. Provide the expected value and standard deviation.

pformat(dpois(0,6/8))
## [1] Pr=0.472366552741015
myformat("EX", 6/8)
## [1] EX=0.75
myformat("SX",sqrt(6/8))
## [1] SX=0.866025403784439

4 Problem 4.

4.1 1. MGF

You are managing two independent servers in a data center. The time until the next failure for each server follows an exponential distribution with different rates: • Server A has a failure rate of \(\lambda_A = 0.5\) failures per hour. • Server B has a failure rate of \(\lambda_B = 0.3\) failures per hour.

What is the distribution of the total time until both servers have failed at least once? Use the moment generating function (MGF) to find the distribution of the sum of the times to failure.

\(M_X(t)=E(e^{tX})=\frac{\lambda}{\lambda-t}, \quad t<\lambda\)

By independence, the MGF of the sum \(T_A+T_B\) is the product of the two MGFs.

\(M_{T_A+T_B}(t)=M_{T_A}(t)\times M_{T_B}(t)=\frac{0.5}{0.5-t}\cdot\frac{0.3}{0.3-t}, \quad t<0.3\)

This distribution has a name: the hypoexponential!
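For two distinct rates, the hypoexponential density has the closed form f(t) = λ_A λ_B / (λ_A − λ_B) · (e^(−λ_B t) − e^(−λ_A t)). The sketch below checks numerically that this density integrates to 1 and has mean 1/λ_A + 1/λ_B; the function name `f_sum` is illustrative.

```r
lambda_A <- 0.5
lambda_B <- 0.3

# Density of T_A + T_B for two independent exponentials with distinct rates
f_sum <- function(t) {
  lambda_A * lambda_B / (lambda_A - lambda_B) *
    (exp(-lambda_B * t) - exp(-lambda_A * t))
}

# The density should integrate to 1, and the mean should be 1/0.5 + 1/0.3
total_prob <- integrate(f_sum, 0, Inf)$value
mean_time  <- integrate(function(t) t * f_sum(t), 0, Inf)$value
c(total_prob = total_prob, mean_time = mean_time)
```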

4.2 2. Sum of Independent Normally Distributed Random Variables

An investment firm is analyzing the returns of two independent assets, Asset X and Asset Y. The returns on these assets are normally distributed: \(X\sim \text{N}(\mu_X = 5\%, \sigma_X^2 = 4\%)\) \(Y\sim \text{N}(\mu_Y = 7\%, \sigma_Y^2 = 9\%)\)

Question: Find the distribution of the combined return of the portfolio consisting of these two assets using the moment generating function (MGF).

First, the MGF of the normal is \(M_N(t)=\exp(\mu t+ 0.5 \sigma^2t^2)\)

As above, \(M_X(t) \times M_Y(t)=\exp(0.05t+0.02t^2) \times \exp(0.07t+0.045t^2)\)

Simplifying… \(M_{X+Y}(t)=\exp(0.12t+0.065t^2)\)

Matching \(0.5\sigma^2=0.065\), this is the MGF of a Normal with a mean of 0.12 and a variance of \(0.13\), the sum of the two variances.
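As a numerical check, the product of the two normal MGFs can be compared with the MGF of a single normal whose mean and variance are the sums 0.05 + 0.07 and 0.04 + 0.09. This is a sketch only; `mgf_normal` and `t_grid` are illustrative names, and the percentages are read as decimals, as in the derivation.

```r
# MGF of a Normal(mu, sigma2) distribution
mgf_normal <- function(t, mu, sigma2) exp(mu * t + 0.5 * sigma2 * t^2)

t_grid <- seq(-2, 2, by = 0.5)

# Product of the individual MGFs (variances 0.04 and 0.09)
product_mgf <- mgf_normal(t_grid, 0.05, 0.04) * mgf_normal(t_grid, 0.07, 0.09)

# MGF of a single Normal with the means and variances added
sum_mgf <- mgf_normal(t_grid, 0.05 + 0.07, 0.04 + 0.09)

all.equal(product_mgf, sum_mgf)
```

The coefficient on t² in the combined MGF is half the variance, so a coefficient of 0.065 corresponds to a variance of 0.13.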

4.3 3. Poisson MGF

• Region A: \(X_A \sim \text{Poisson}(\lambda_A = 3)\)
• Region B: \(X_B \sim \text{Poisson}(\lambda_B = 5)\)

Question: Determine the distribution of the total number of calls received in an hour from both regions using the moment generating function (MGF).

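As above, independence lets us multiply the MGFs: \(M_{X_A+X_B}(t)=M_A(t) \times M_B(t)=\exp(3(e^t-1)) \times \exp(5(e^t-1))=\exp(8(e^t-1))\)

This is the MGF of a Poisson with rate 8, so the total number of calls per hour is Poisson(8).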
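The same result can be checked without MGFs by convolving the two probability mass functions directly. A small sketch, with `conv_pmf` as an illustrative name:

```r
k <- 0:30

# P(X_A + X_B = k) by direct convolution of the two Poisson pmfs
conv_pmf <- sapply(k, function(total) {
  j <- 0:total
  sum(dpois(j, 3) * dpois(total - j, 5))
})

# Should match the Poisson(8) pmf up to floating-point error
all.equal(conv_pmf, dpois(k, 8))
```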

5 Problem 5.

5.1 1. Customer Retention and Churn Analysis

A telecommunications company wants to model the behavior of its customers regarding their likelihood to stay with the company (retention) or leave for a competitor (churn). The company segments its customers into three states:
• State 1: Active customers who are satisfied and likely to stay (Retention state).
• State 2: Customers who are considering leaving (At-risk state).
• State 3: Customers who have left (Churn state).

The company has historical data showing the following monthly transition probabilities:
• From State 1 (Retention): 80% stay in State 1, 15% move to State 2, and 5% move to State 3.
• From State 2 (At-risk): 30% return to State 1, 50% stay in State 2, and 20% move to State 3.
• From State 3 (Churn): 100% stay in State 3.

Retention (R), At-risk (A), Churn (C)

5.2 (a)

The company wants to analyze the long-term behavior of its customer base. Question: (a) Construct the transition matrix for this Markov Chain. (b) If a customer starts as satisfied (State 1), what is the probability that they will eventually churn (move to State 3)? (c) Determine the steady-state distribution of this Markov Chain. What percentage of customers can the company expect to be in each state in the long run?

library(markovchain)
## Loading required package: Matrix
## Package:  markovchain
## Version:  0.10.0
## Date:     2024-11-14 00:00:02 UTC
## BugReport: https://github.com/spedygiorgio/markovchain/issues
states <- c("R", "A", "C")

transitionMatrix <- matrix(c(
  0.80, 0.15, 0.05,  # From R
  0.30, 0.50, 0.20,  # From A
  0.00, 0.00, 1.00   # From C (absorbing)
), 
nrow = 3, byrow = TRUE)

mc_customer <- new("markovchain", states = states, transitionMatrix = transitionMatrix, name = "Customer Behavior")
mc_customer
## Customer Behavior 
##  A  3 - dimensional discrete Markov Chain defined by the following states: 
##  R, A, C 
##  The transition matrix  (by rows)  is defined as follows: 
##     R    A    C
## R 0.8 0.15 0.05
## A 0.3 0.50 0.20
## C 0.0 0.00 1.00

5.3 (b)

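If a customer starts as satisfied (State 1), what is the probability that they will eventually churn (move to State 3)?

absorbingStates(mc_customer)
## [1] "C"

Churn is the only absorbing state and it is reachable from both R and A, so a customer who starts in State 1 eventually churns with probability 1.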

5.4 (c)

Determine the steady-state distribution of this Markov Chain. What percentage of customers can the company expect to be in each state in the long run?

steadyStates(mc_customer)
##      R A C
## [1,] 0 0 1

In the long run, 100% of customers are in the Churn state: 0% in R, 0% in A, and 100% in C.
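The absorption result can also be verified with base R using the fundamental matrix of an absorbing chain, N = (I − Q)^(−1), where Q holds the transitions among the transient states R and A. This is a sketch; `Q`, `R_mat`, and `N_mat` are illustrative names, and the approach is the standard textbook one rather than anything the `markovchain` output above relies on.

```r
# Transient states R and A; absorbing state C
Q <- matrix(c(0.80, 0.15,
              0.30, 0.50), nrow = 2, byrow = TRUE,
            dimnames = list(c("R", "A"), c("R", "A")))
R_mat <- matrix(c(0.05, 0.20), nrow = 2,
                dimnames = list(c("R", "A"), "C"))

# Fundamental matrix: expected number of months spent in each transient state
N_mat <- solve(diag(2) - Q)

absorb_prob  <- N_mat %*% R_mat     # probability of ending in C from R and from A
months_to_go <- rowSums(N_mat)      # expected months until churn

absorb_prob
round(months_to_go, 2)
```

Both absorption probabilities equal 1, and the row sums give the expected number of months before a customer churns from each starting state.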

5.5 2. Inventory Management in a Warehouse

A warehouse tracks the inventory levels of a particular product using a Markov Chain model. The inventory levels are categorized into three states:
• State 1: High inventory (More than 100 units in stock).
• State 2: Medium inventory (Between 50 and 100 units in stock).
• State 3: Low inventory (Less than 50 units in stock).

The warehouse has the following transition probabilities for inventory levels from one month to the next:
• From State 1 (High): 70% stay in State 1, 25% move to State 2, and 5% move to State 3.
• From State 2 (Medium): 20% move to State 1, 50% stay in State 2, and 30% move to State 3.
• From State 3 (Low): 10% move to State 1, 40% move to State 2, and 50% stay in State 3.

The warehouse management wants to optimize its restocking strategy by understanding the long-term distribution of inventory levels. Question: (a) Construct the transition matrix for this Markov Chain. (b) If the warehouse starts with a high inventory level (State 1), what is the probability that it will eventually end up in a low inventory level (State 3)? (c) Determine the steady-state distribution of this Markov Chain. What is the long-term expected proportion of time that the warehouse will spend in each inventory state?

States: High Inventory (H), Medium Inventory (M), Low Inventory (L)

(a) Construct the Transition Matrix
states <- c("H", "M", "L")

transitionMatrix <- matrix(c(
  0.70, 0.25, 0.05,   # From High
  0.20, 0.50, 0.30,   # From Medium
  0.10, 0.40, 0.50    # From Low
), byrow = TRUE, nrow = 3)

mc_inventory <- new("markovchain", transitionMatrix = transitionMatrix, states = states, name = "Inventory")
mc_inventory
## Inventory 
##  A  3 - dimensional discrete Markov Chain defined by the following states: 
##  H, M, L 
##  The transition matrix  (by rows)  is defined as follows: 
##     H    M    L
## H 0.7 0.25 0.05
## M 0.2 0.50 0.30
## L 0.1 0.40 0.50

5.6 (b)

If the warehouse starts in State 1 (High), what is the probability that it eventually ends up in Low (State 3)? This chain has no absorbing state and every state can be reached from every other, so, read literally, the warehouse visits State 3 at some point with probability 1. The more useful quantity is the long-run probability that the warehouse is in State 3, which is the steady-state probability for State 3 and does not depend on the starting state.

steady <- steadyStates(mc_inventory)
round(steady, 4)[3]
## [1] 0.2667

5.7 (c)

round(steady,4)
##           H      M      L
## [1,] 0.3467 0.3867 0.2667
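The steady-state vector can be cross-checked in base R by solving π P = π together with the constraint that the probabilities sum to 1. A minimal sketch, with `P`, `A`, and `pi_hat` as illustrative names:

```r
P <- matrix(c(0.70, 0.25, 0.05,
              0.20, 0.50, 0.30,
              0.10, 0.40, 0.50), nrow = 3, byrow = TRUE,
            dimnames = list(c("H", "M", "L"), c("H", "M", "L")))

# Take the balance equations (t(P) - I) pi = 0 and replace the last one
# with the normalization sum(pi) = 1, then solve the linear system
A <- t(P) - diag(3)
A[3, ] <- 1
rhs <- c(0, 0, 1)
pi_hat <- solve(A, rhs)
names(pi_hat) <- colnames(P)

round(pi_hat, 4)
```

The solution is the exact vector (26/75, 29/75, 20/75), which agrees with the rounded `steadyStates` output above.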