Problem 1

(Bayesian). A new test for multinucleoside-resistant (MNR) human immunodeficiency virus type 1 (HIV-1) variants was recently developed. The test maintains 96% sensitivity, meaning that, for those with the disease, it will correctly report “positive” for 96% of them. The test is also 98% specific, meaning that, for those without the disease, 98% will be correctly reported as “negative.” MNR HIV-1 is considered to be rare (albeit emerging), with about a .1% or .001 prevalence rate.

Given the prevalence rate, sensitivity, and specificity estimates, what is the probability that an individual who is reported as positive by the new test actually has the disease?

If the median cost (consider this the best point estimate) is about $100,000 per positive case total and the test itself costs $1000 per administration, what is the total first-year cost for treating 100,000 individuals?

Formula for Bayes’ Theorem

\(P(A|B)= \frac{P(B|A)\cdot P(A)}{P(B)}= \frac{P(B|A)\cdot P(A)}{P(B|A)\cdot P(A)+P(B|notA)+P(notA)}\)

where:

Probability of having the disease given a positive test result

Given the prevalence rate, sensitivity, and specificity estimates, what is the probability that an individual who is reported as positive by the new test actually has the disease?

The probability that an individual who is reported as positive by the new test actually has the disease is \(P(Disease|PositiveTest)= \frac{P(PositiveTest|Disease)\cdot P(Disease)}{P(PositiveTest)}= \frac{P(PositiveTest|Disease)\cdot P(Disease)}{P(PositiveTest|Disease)\cdot P(Disease)+P(PositiveTest|NoDisease)\cdot P(NoDisease))}\) where,

\(P(PositiveTest|Disease)=0.96\) (the sensitivity of the test)

\(P(Disease)=0.001\) (the prevalance rate)

\(P(NoDisease)=1-0.001=0.999\)

\(P(PositiveTest|NoDisease)=1-0.98=0.02\) (false positive rate = 1-specificity estimate)

\(P(Disease|PositiveTest)= \frac{P0.06\cdot 0.001}{(P0.06\cdot 0.001)+(0.02\cdot 0.999)}\approx 0.0459\)

Total first-year cost

If the median cost (consider this the best point estimate) is about $100,000 per positive case total and the test itself costs $1000 per administration, what is the total first-year cost for treating 100,000 individuals?

# Given
median_cost <- 100000
test_cost <- 1000
number_of_individuals <- 100000
prevanlance_rate <- 0.001

# Calculate the total cost for true postive cases
# We assume 100000 (number_of_individuals) is the number of true positive cases
total_cost_tpos <- number_of_individuals*prevanlance_rate*median_cost

# Calculate the total cost for administrating the test
total_cost <- number_of_individuals*test_cost

total_1st_yr_cost <- total_cost_tpos + total_cost 

paste("The total first-year cost for treating 100,000 individuals is $",format(total_1st_yr_cost, big.mark= ",", scientific = FALSE))
## [1] "The total first-year cost for treating 100,000 individuals is $ 110,000,000"

Problem 2

(Binomial). The probability of your organization receiving a Joint Commission inspection in any given month is .05.

Formula for Binomoal Distribution

\(P(x=k)=\binom{N}{k} p^k(1-p)^{n-k}=\frac{n!}{(n-k)!k!}\cdot p^k(1-p)^{n-k}\)

Part 1

What is the probability that, after 24 months, you received exactly 2 inspections?

\(P(x=2)=\binom{24}{2} 0.05^2(1-0.05)^{24-2}=\frac{24!}{(24-2)!2!}\cdot 0.05^2(1-0.05)^{24-2}\)

n = 24
k = 2 
p = 0.05

p_2 <- dbinom(k, n, p)

paste("The probability that, after 24 months, you received exactly 2 inspections is ",p_2)
## [1] "The probability that, after 24 months, you received exactly 2 inspections is  0.22323814603186"

Part 2

What is the probability that, after 24 months, you received 2 or more inspections?

\(P(x\geq2)=P(x=2)+P(x=3)+P(x=4)+\cdots +P(x=23)+P(x=24)\)

or

\(P(x\geq2) = 1-[P(x=1)+P(x=0)]\)

# Instead of doing this: dbinom(2, n, p)+dbinom(3, n, p)+...+dbinom(24, n, p), we can use the 'sum function'

p_2_or_more <- sum(dbinom(k:n, n, p))

# OPTION 2: You can also use the code below
#p_2_or_more <- 1 -(dbinom(1, n, p)+dbinom(0, n, p))

# OPTION 3: You can also use the code below
#p_2_or_more <- pbinom(1, n, p, lower.tail=FALSE)

#The function `pbinom` returns the area to the left of given value k in the binomial distribution. In this case, pbinom(1, n, p, lower.tail=FALSE) gives the same result as the probability that you recieve more than 1 inspections. We set `lower.tail=FALSE` so that the function will return the area to the RIGHT of given value 1 in the binomial distribution.


paste("The probability that, after 24 months, you received 2 or more inspections is ",p_2_or_more)
## [1] "The probability that, after 24 months, you received 2 or more inspections is  0.339182734391199"

Part 3

What is the probability that your received fewer than 2 inspections?

\(P(x<2)= 1-P(x\geq2)\)

or

\(P(x<2)= P(x=1)+P(x=0)\)

p_less_than_2 <- 1- p_2_or_more

#p_less_than_2 <- dbinom(1, n, p) + dbinom(0, n, p)


#p_less_than_2  <- pbinom(1, n, p)
# You can also use the code above.
# The function 'pbinom' returns the area to the left of given value k in the binomial distribution. In this case, pbinom(1, n, p) gives the same result as the probability that your fewer than 2 inspections. Notice we used `1` instead of `2`. 
#pbinom(2, n, p) returns the probability that your received 2 or fewer inspections.

paste("The probability that your received fewer than 2 inspections is $",p_less_than_2)
## [1] "The probability that your received fewer than 2 inspections is $ 0.660817265608801"

Part 4

What is the expected number of inspections you should have received?

\(E(X)=n \cdot p\)

e <- n*p

paste("The expected number of inspections you should have received is ",e, "inspections.")
## [1] "The expected number of inspections you should have received is  1.2 inspections."

Part 5

What is the standard deviation? \(\sigma = \sqrt{np(1-p)} =\sqrt{24\cdot 0.5(1-0.5)}\)

sigma = sqrt(n*p*(1-p))

paste("The standard deviation is ",sigma)
## [1] "The standard deviation is  1.06770782520313"

Problem 3

(Poisson). You are modeling the family practice clinic and notice that patients arrive at a rate of 10 per hour.

Formula for Poisson Distribution

\(P(X=k)=\frac{e^{-\lambda} \lambda ^k}{k!}\)

Part 1

What is the probability that exactly 3 arrive in one hour?

\(\lambda = 10\) (rate of patient arrival per hour)

\(P(X=3)=\frac{e^{-10}\cdot 10 ^3}{3!}\)

lambda <- 10
x <- 3

p_3 <- dpois(x, lambda)

paste("The probability that exactly 3 patients arrive in one hour is ",p_3)
## [1] "The probability that exactly 3 patients arrive in one hour is  0.00756665496041414"

Part 2

What is the probability that more than 10 arrive in one hour?

\(P(X>10)=1-P(X\leq10)\)

# OPTION 1: Instead of doing this: 1 - [dpois(0,10) + dpois(1,10) + dpois(2,10) + ... + dpois(9,10) + dpois(10,10)], we can use the 'sum function'

p_more_than_10 <- 1 - sum(dpois(0:10, lambda))

# OPTION 2:

#p_more_than_10 <- 1- ppois(10, lambda)
# ppois(q, lambda) finds the probability that a certain number of successes or less occur based on an average rate of success. In this case, `ppois(10, lambda)` finds the probability that 10 patients or less  arrive in one hour.

# OPTION 3: 

#p_more_than_10 <- ppois(10, lambda, lower.tail = FALSE)
# lower.tail    logical; if TRUE (default), probabilities are P[X≤x], otherwise, P[X>x]

paste("The probability that more than 10 patients arrive in one hour is ",round(p_more_than_10,5))
## [1] "The probability that more than 10 patients arrive in one hour is  0.41696"

Part 3

How many would you expect to arrive in 8 hours?

\(E(X)=\lambda=rt\)

t <- 8
e_p <- lambda*t

paste("The expected number of patients to arrive in 8 hours is ", e_p , "patients.")
## [1] "The expected number of patients to arrive in 8 hours is  80 patients."

Part 4

What is the standard deviation of the appropriate probability distribution?

\(\sigma = \sqrt{\lambda)} =\sqrt{10)}\)

sigma_p = sqrt(lambda)

paste("The standard deviation is ", round(sigma_p,4))
## [1] "The standard deviation is  3.1623"

Part 5

If there are three family practice providers that can see 24 templated patients each day, what is the percent utilization and what are your recommendations?

providers <- 3
patients_per_provider <- 24
Total_capacity <- providers*patients_per_provider

paste("The total capacity for", providers, "providers is", Total_capacity, "patients for each day.")
## [1] "The total capacity for 3 providers is 72 patients for each day."
# The expected number of patients to arrive in 8 hours is  80 patients. The average working hour per day is 8.5 hours. So we can expect about 80 patients per day. 

Percent_utilization <- e_p/Total_capacity*100

paste("The the percentage utilization per day is ", round(Percent_utilization,2), "%")
## [1] "The the percentage utilization per day is  111.11 %"

The family practice clinic’s percentage of utilization is over 100%. Below are some recommendations:

  • Hire more provider/staff

  • Adjust the appointment scheduling or reduce the number of appointments

Problem 4

(Hypergeometric). Your subordinate with 30 supervisors was recently accused of favoring nurses. 15 of the subordinate’s workers are nurses and 15 are other than nurses. As evidence of malfeasance, the accuser stated that there were 6 company-paid trips to Disney World for which everyone was eligible. The supervisor sent 5 nurses and 1 non-nurse.

Formula for Hypergeometric Distribution

\(P(X)=\frac{\binom{k}{x}\binom{N-k}{n-x}}{\binom{N}{n}}\) where,

\(N\) is the size of the population from which draws are taken

\(k\) is the number of “successful” objects in the population

\(N-k\) is the number of “defective” objects in the population

\(n\) is the size of the sample from which draws are taken

\(x\) is the number of successes that occur in the sample

Part 1

If your subordinate acted innocently, what was the probability he/she would have selected five nurses for the trips?

N = 30
n = 6
k = 15
x = 5

p_5 <- dhyper(x,k, N-k,n) 

paste("The probability he/she would have selected five nurses for the trips is ",p_5)
## [1] "The probability he/she would have selected five nurses for the trips is  0.0758620689655173"

Part 2

How many nurses would we have expected your subordinate to send?

Formula for Expected value of Hypergeometric Distribution(Mn).)

\(E(X)=n\cdot \frac{k}{N}\)(Mn).)

e_h1 <- n*(k/N)

paste("The expected number of nurses would be send is ",e_h1, "nurses.")
## [1] "The expected number of nurses would be send is  3 nurses."

Part 3

How many non-nurses would we have expected your subordinate to send?

Formula for Expected value of Hypergeometric Distribution(Mn).)

\(E(Y)=n\cdot \frac{N-k}{N}\)(Mn).)

e_h2 <- n*(N-k/N)

paste("The expected number of nurses would be send is ",e_h2, "nurses.")
## [1] "The expected number of nurses would be send is  177 nurses."

Problem 5

(Geometric). The probability of being seriously injured in a car crash in an unspecified location is about .1% per hour. A driver is required to traverse this area for 1200 hours in the course of a year.

Formula for Geometric Distribution

\(P(X=x)=p(1-p)^{x-1}\) \(P(X\leq x)=1-(1-p)^{x}\)

Part 1

What is the probability that the driver will be seriously injured during the course of the year?

\(P(X\leq 1)=1-(1-0.001)^{1200}\)

p = 0.001 # 0.1%=0.001
x = 1200

p_g <- 1-((1-p)^x)

# OPTION 2:
#pgeom(x, p)

paste("The probability that the driver will be seriously injured during the course of the year is ",p_g)
## [1] "The probability that the driver will be seriously injured during the course of the year is  0.698986570906601"

Part 2

In the course of 15 months?

num_hrs_in_15mon <- (15/12*1200)

p_15 <- pgeom(num_hrs_in_15mon, p)

paste("The probability that the driver will be seriously injured during the course of 15 months is ",p_15)
## [1] "The probability that the driver will be seriously injured during the course of 15 months is  0.7772601990608"

Part 3

What is the expected number of hours that a driver will drive before being seriously injured?

Formula for expected value of Geometric Distribution

\(E(x)=\frac{1}{p}\) \(E(x)=\frac{1}{0.001}=1000\)

e_g <- 1/p

paste("The expected number of hours that a driver will drive before being seriously injured is ",e_g, "hours.")
## [1] "The expected number of hours that a driver will drive before being seriously injured is  1000 hours."

Part 4

Given that a driver has driven 1200 hours, what is the probability that he or she will be injured in the next 100 hours?

\(P(X\leq 100)=1-(1-p)^{100}\)

t = 100
p_100 <- 1-((1-p)^t)

# OPTION 2:
#pgeom(t, p)

paste("Given that a driver has driven 1200 hours,the probability that the driver will be injured in the next 100 hours is ",p_100)
## [1] "Given that a driver has driven 1200 hours,the probability that the driver will be injured in the next 100 hours is  0.0952078528862911"

Problem 6

You are working in a hospital that is running off of a primary generator which fails about once in 1000 hours.

Part 1

What is the probability that the generator will fail more than twice in 1000 hours?

\(P(X=k)=\frac{e^{-\lambda} \lambda ^k}{k!}\) \(\lambda = 1\)

\(P(X>2) = 1 - P(X\leq 2)=1-[P(X=2)+P(X=1)+P(X=0)]\) \(P(X>2)=1-[\frac{e^{-\lambda} \lambda ^2}{2!}+\frac{e^{-\lambda} \lambda ^1}{1!}+\frac{e^{-\lambda} \lambda ^0}{0!}]\)

lambda6 <- 1
x <- 2

# OPTION 1: Instead of doing this: 1 - [dpois(0,lambda6) + dpois(1,lambda6) + dpois(2,lambda6) we can use the 'sum function'

p_more_than_2 <- 1 - sum(dpois(0:2, lambda6))

# OPTION 2:

#p_more_than_2 <- 1- ppois(2, lambda)
# ppois(q, lambda) finds the probability that a certain number of successes or less occur based on an average rate of success. In this case, `ppois(2, lambda)` finds the probability that the generator will fail twice or less in 1000 hours

# OPTION 3: 

#p_more_than_2 <- ppois(2, lambda6, lower.tail = FALSE)
# lower.tail    logical; if TRUE (default), probabilities are P[X≤x], otherwise, P[X>x]

paste("The probability that the generator will fail more than twice in 1000 hours is ",round(p_more_than_2,4))
## [1] "The probability that the generator will fail more than twice in 1000 hours is  0.0803"

Part 2

What is the expected value?

\(E(X)=\lambda=rt\)

e_p6 <- lambda6

paste("The expected number of generators that fail in 1000 hours is ", e_p6 )
## [1] "The expected number of generators that fail in 1000 hours is  1"

Problem 7

A surgical patient arrives for surgery precisely at a given time. Based on previous analysis (or a lack of knowledge assumption), you know that the waiting time is uniformly distributed from 0 to 30 minutes.

Part 1

What is the probability that this patient will wait more than 10 minutes?

The probability density function (pdf) for values uniformly distributed across [a,b] is \(f(x)=\frac{1}{b-a}\).

For this problem, the probability density function for values uniformly distributed across [0,30] is \(f(x)=\frac{1}{30-0}=\frac{1}{30}\)

\(X\) is the patient wait time \(P(X>10)=1-P(X\leq10)\) \(P(X>10)=\int_{30}^{10}\frac{1}{30}dx\) \(P(X>10)=\frac{30-10}{30}=\frac{20}{30}=\frac{2}{3}\)

p7 <- 1-punif(10, min = 0, max = 30)

paste("The probability that tthat surgical patient will wait more than 10 minute is ",p7)
## [1] "The probability that tthat surgical patient will wait more than 10 minute is  0.666666666666667"

Part 2

If the patient has already waited 10 minutes, what is the probability that he/she will wait at least another 5 minutes prior to being seen?

\(P(X \geq 15|P(X\geq10) = \frac{P(X \geq 15)andP(X\geq10)}{P(X\geq10)}=\frac{P(X \geq 15)}{P(X\geq10)}\)

\(\frac{P(X \geq 15)}{P(X\geq10)}=\frac{\frac{30-15}{30}}{\frac{30-10}{30}}=\frac{\frac{1}{2}}{\frac{2}{3}}=\frac{1}{2}\cdot\frac{3}{2}=\frac{3}{4}\)

p_wait_time2 <- 1-punif(15, min = 10, max = 30)


paste("The probability that tthat surgical patient will wait more than 10 minute is ",p_wait_time2)
## [1] "The probability that tthat surgical patient will wait more than 10 minute is  0.75"

Part 3

What is the expected waiting time?

\(E(X)=\frac{1}{2}(a+b)\) \(E(X)=\frac{1}{2}(0+30)=\frac{1}{2}\cdot 30=15\)

a = 0
b = 30
e7 <- 0.5*(a+b)

paste("The expected waiting time is",e7,"minutes." )
## [1] "The expected waiting time is 15 minutes."

Problem 8

Your hospital owns an old MRI, which has a manufacturer’s lifetime of about 10 years (expected value). Based on previous studies, we know that the failure of most MRIs obeys an exponential distribution.

\(P(X)=\lambda e ^{-\lambda x}\)

\(P(X>x)=e ^{-\lambda x}\)

Part 1

What is the expected failure time?

\(E(X)=\frac{1}{\lambda}=10\)

The expected failure time is 10 years.

Part 3

What is the probability that your MRI will fail after 8 years?

\(E(X)=\frac{1}{\lambda}=10\)

Solve for \(\lambda\). \(\lambda= \frac{1}{10}\)

\(P(X>8)=e ^{-\frac{1}{10} \cdot 8}\)

lambda8 <- 1/10
p8 <- 1-pexp(8,lambda8)
p8
## [1] 0.449329
paste("The probability that your MRI will fail after 8 years is", round(p8,4))
## [1] "The probability that your MRI will fail after 8 years is 0.4493"

Part 4

Now assume that you have owned the machine for 8 years. Given that you already owned the machine 8 years, what is the probability that it will fail in the next two years?

\(P(a\leq X \leq b =F(b)-F(a)\) \(P(8\leq X \leq10 =F(10)-F(8)\)

p_8_10 <- pexp((10-8), lambda8)
p_8_10
## [1] 0.1812692