1. (Bayesian).

A new test for multinucleoside-resistant (MNR) human immunodeficiency virus type 1 (HIV-1) variants was recently developed. The test maintains 96% sensitivity, meaning that, for those with the disease, it will correctly report “positive” for 96% of them. The test is also 98% specific, meaning that, for those without the disease, 98% will be correctly reported as “negative.” MNR HIV-1 is considered to be rare (albeit emerging), with about a 0.1% (0.001) prevalence rate. Given the prevalence rate, sensitivity, and specificity estimates, what is the probability that an individual who is reported as positive by the new test actually has the disease? If the median cost (consider this the best point estimate) is about $100,000 per positive case total and the test itself costs $1,000 per administration, what is the total first-year cost for treating 100,000 individuals?

First

Given the prevalence rate, sensitivity, and specificity estimates, what is the probability that an individual who is reported as positive by the new test actually has the disease?

If \(A\) is the event that an individual has the disease and \(B\) is the event that they test positive, then we are looking for \(P(A|B)\). We can calculate this conditional probability as \(P(A|B) = P(B|A)P(A) / P(B)\).

We know that \(P(A)\) (the prevalence) is equal to 0.001, that \(P(B'|A')\) (the specificity) is equal to 0.98, and that \(P(B|A)\) (the sensitivity) is 0.96. We don’t yet know the value of \(P(B)\), but we can expand it with the law of total probability, which gives the usual form of Bayes’ rule:

\[P(A|B) = \frac{P(B|A)P(A)}{P(B|A)P(A) + P(B|A')P(A')}\]

We can calculate \(P(B|A')\) as \(1 - P(B'|A')\) and \(P(A')\) as \(1 - P(A)\).

sensitivity = 0.96
specificity = 0.98
prevalance = 0.001

(sensitivity * prevalance) / (sensitivity * prevalance + (1 - specificity) * (1 - prevalance))
## [1] 0.04584527

The probability that a person has the disease given that the test is positive is 0.0458 (4.58%).
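
The same calculation can be wrapped in a small helper to see how strongly the answer depends on prevalence. This is only a sketch: the function name `ppv` and the two extra prevalence values (0.01 and 0.1) are illustrative assumptions, not part of the problem.

```r
# Hypothetical helper: positive predictive value P(A|B) via Bayes' rule
ppv <- function(sens, spec, prev) {
  (sens * prev) / (sens * prev + (1 - spec) * (1 - prev))
}

# Problem value first (0.001), then two made-up prevalences for comparison
ppv_values <- ppv(sens = 0.96, spec = 0.98, prev = c(0.001, 0.01, 0.1))
round(ppv_values, 4)
```

With sensitivity and specificity held fixed, the probability of disease given a positive test rises as the condition becomes more common, which is why a rare condition produces mostly false positives.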

I’ll also test this out with a more “empirical” approach and put together a contingency table. We’ll assume we’re working with a test population of 100,000. We fill in values using the probabilities we know (\(P(A)\), \(P(B|A)\), and \(P(B'|A')\)), then use the complements of these values to fill in the remainder of the table.

n = 100000

contingency <- table(
  factor(levels = c('sick','not sick','margin')), 
  factor(levels = c('positive','negative','margin'))
)

contingency['sick', 'margin'] <- prevalance * n
contingency['not sick', 'margin'] <- (1 - prevalance) * n

contingency['sick', 'positive'] <- sensitivity * contingency['sick', 'margin']
contingency['sick', 'negative'] <- (1 - sensitivity) * contingency['sick', 'margin']

contingency['not sick', 'negative'] <- specificity * contingency['not sick', 'margin']
contingency['not sick', 'positive'] <- (1 - specificity) * contingency['not sick', 'margin']

contingency['margin', 'positive'] <- sum(contingency[,'positive'])
contingency['margin', 'negative'] <- sum(contingency[,'negative'])

round(contingency)
##           
##            positive negative margin
##   sick           96        4    100
##   not sick     1998    97902  99900
##   margin       2094    97906      0

We can then convert these to probabilities (marginal probabilities along the margins, joint probabilities in the top-left 2×2 block).

contingency_freq <- contingency / n

contingency_freq
##           
##            positive negative  margin
##   sick      0.00096  0.00004 0.00100
##   not sick  0.01998  0.97902 0.99900
##   margin    0.02094  0.97906 0.00000

Finally, we divide \(P(AB)\) (i.e. the joint probability that one is sick and tests positive) by \(P(B)\) (i.e. the marginal probability of testing positive) to get the probability that one is truly sick given that they have tested positive (i.e. \(P(A|B)\)).

contingency_freq['sick','positive'] / contingency_freq['margin','positive']
## [1] 0.04584527

The result matches the Bayes’ rule calculation: approximately 0.0458 (4.58%).

Second

If the median cost (consider this the best point estimate) is about $100,000 per positive case total and the test itself costs $1,000 per administration, what is the total first-year cost for treating 100,000 individuals?

The cost of testing is simply the cost per test ($1,000) times the number of tests (100,000). To get the cost of treatment, there are two approaches. If we assume that anyone who tests positive will require full treatment, then we multiply the marginal probability of testing positive (\(P(B)\)) by the number of individuals tested (100,000) to get the number of positive results, and multiply that count by the cost per case ($100,000).

test_cost = 100000 * 1000

treatment_cost1 = contingency['margin', 'positive'] * 100000

total_cost = test_cost + treatment_cost1

cat(
  'Test cost: $', format(test_cost, big.mark = ',', scientific = FALSE),
  '\nTreatment cost: $', format(treatment_cost1, big.mark = ',', scientific = FALSE),
  '\nTotal cost: $', format(total_cost, big.mark = ',', scientific = FALSE)
)
## Test cost: $ 100,000,000 
## Treatment cost: $ 209,400,000 
## Total cost: $ 309,400,000

However, if we assume that only individuals who test positive and are truly sick require the full $100,000 treatment, then we multiply the joint probability of being sick and testing positive (\(P(AB)\)) by the number of individuals tested (100,000), and multiply that count by the cost per case.

treatment_cost2 = contingency['sick', 'positive'] * 100000

total_cost = test_cost + treatment_cost2

cat(
  'Test cost: $', format(test_cost, big.mark = ',', scientific = FALSE),
  '\nTreatment cost: $', format(treatment_cost2, big.mark = ',', scientific = FALSE),
  '\nTotal cost: $', format(total_cost, big.mark = ',', scientific = FALSE)
)
## Test cost: $ 100,000,000 
## Treatment cost: $ 9,600,000 
## Total cost: $ 109,600,000

2. (Binomial).

The probability of your organization receiving a Joint Commission inspection in any given month is 0.05. What is the probability that, after 24 months, you received exactly 2 inspections? What is the probability that, after 24 months, you received 2 or more inspections? What is the probability that you received fewer than 2 inspections? What is the expected number of inspections you should have received? What is the standard deviation?

First

What is the probability that, after 24 months, you received exactly 2 inspections?

\(P(X = 2)\)

dbinom(2,24,0.05)
## [1] 0.2232381

The probability of receiving exactly 2 inspections in 24 months is 22%.
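
As a cross-check, the binomial probability can be written out from its formula, \(P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}\). The variable names below are chosen for this sketch only.

```r
# Binomial pmf written out: choose(n, k) * p^k * (1 - p)^(n - k)
n_months <- 24
p_inspect <- 0.05
by_hand <- choose(n_months, 2) * p_inspect^2 * (1 - p_inspect)^(n_months - 2)

# Should agree with the built-in density function
all.equal(by_hand, dbinom(2, n_months, p_inspect))
```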

Second

What is the probability that, after 24 months, you received 2 or more inspections?

\(P(X \ge 2) = 1 - P(X < 2) = 1 - P(X \le 1)\)

1 - pbinom(1,24,0.05)
## [1] 0.3391827

The probability of having 2 or more inspections in 24 months is 34%.

Third

What is the probability that you received fewer than 2 inspections?

\(P(X < 2) = P(X \le 1)\)

pbinom(1,24,0.05)
## [1] 0.6608173

The probability of receiving fewer than 2 inspections in 24 months is 66%.

Fourth

What is the expected number of inspections you should have received?

\(E(X) = np\)

n = 24
p = 0.05

n*p
## [1] 1.2

The expected number of inspections is 1.2.

Fifth

What is the standard deviation?

\(\sigma^{2} = npq \Longrightarrow \sigma = \sqrt{npq}\)

q = 1 - p

sqrt(n*p*q)
## [1] 1.067708

The standard deviation is 1.07.
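
Both summary statistics can also be recovered directly from the probability mass function, without the \(np\) and \(\sqrt{npq}\) shortcuts. This is a sketch with variable names chosen here for illustration.

```r
n_months <- 24
p_inspect <- 0.05

# All possible inspection counts and their probabilities
x <- 0:n_months
pmf <- dbinom(x, n_months, p_inspect)

# Mean and standard deviation computed from their definitions
mean_x <- sum(x * pmf)
sd_x <- sqrt(sum((x - mean_x)^2 * pmf))
c(mean = mean_x, sd = sd_x)
```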

3. (Poisson).

You are modeling the family practice clinic and notice that patients arrive at a rate of 10 per hour. What is the probability that exactly 3 arrive in one hour? What is the probability that more than 10 arrive in one hour? How many would you expect to arrive in 8 hours? What is the standard deviation of the appropriate probability distribution? If there are three family practice providers that can see 24 templated patients each day, what is the percent utilization and what are your recommendations?

First

What is the probability that exactly 3 arrive in one hour?

\(P(X = 3)\)

a = 10
t = 1

lambda = a*t

dpois(3, lambda)
## [1] 0.007566655

The probability that exactly 3 patients arrive in 1 hour is 0.76%.

Second

What is the probability that more than 10 arrive in one hour?

\(P(X > 10) = 1 - P(X \le 10)\)

1 - ppois(10, lambda)
## [1] 0.4169602

The probability that more than 10 arrive in one hour is 41.7%.
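
The two Poisson results so far can be reproduced in slightly different ways: the exact probability from the formula \(P(X = k) = e^{-\lambda}\lambda^k / k!\), and the upper tail via the `lower.tail` argument instead of subtracting from 1. The variable names are assumptions made for this sketch.

```r
rate_per_hour <- 10

# pmf by hand: exp(-lambda) * lambda^k / k!
p3_by_hand <- exp(-rate_per_hour) * rate_per_hour^3 / factorial(3)

# Upper tail P(X > 10) without the explicit subtraction
p_more_than_10 <- ppois(10, rate_per_hour, lower.tail = FALSE)

c(exactly_3 = p3_by_hand, more_than_10 = p_more_than_10)
```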

Third

How many would you expect to arrive in 8 hours?

# a = rate per hour
a * 8
## [1] 80

You would expect 80 patients in 8 hours.

Fourth

What is the standard deviation of the appropriate probability distribution?

\(\sigma = \sqrt{\lambda}\)

sqrt(lambda)
## [1] 3.162278
t = 8
sqrt(a*t)
## [1] 8.944272

The standard deviation is 3.16 for the one-hour interval and 8.94 for the 8-hour interval.

Fifth

If there are three family practice providers that can see 24 templated patients each day, what is the percent utilization and what are your recommendations?

hrs_limit <- 24
t <- 8
providers <- 3

((a*t)/(hrs_limit*providers)) * 100
## [1] 111.1111

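The percent utilization is 111%: expected demand (80 patients per 8-hour day) exceeds the combined capacity of 72 patients. I would recommend adding providers so that demand can be met; a fourth provider would raise capacity to 96 patients and lower utilization to about 83%.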
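
To support the staffing recommendation, one could also ask how often daily arrivals exceed capacity, and how many providers are needed to cover expected demand. The sketch below assumes an 8-hour clinic day, Poisson arrivals at 10 per hour, and 24 patients per provider, as in the solution above; the variable names are illustrative.

```r
daily_rate <- 10 * 8     # expected arrivals in an 8-hour day
per_provider <- 24       # templated patients per provider per day

# Probability that arrivals exceed what three providers can see
p_overflow <- ppois(3 * per_provider, daily_rate, lower.tail = FALSE)

# Smallest number of providers whose capacity covers expected demand
providers_needed <- ceiling(daily_rate / per_provider)
utilization_needed <- daily_rate / (providers_needed * per_provider) * 100

c(p_overflow = p_overflow,
  providers_needed = providers_needed,
  utilization_pct = utilization_needed)
```

Covering only the expected demand still leaves some busy days over capacity, so the overflow probability is worth recomputing for whichever staffing level is chosen.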

4. (Hypergeometric)

Your subordinate, a supervisor with 30 workers, was recently accused of favoring nurses. 15 of the subordinate’s workers are nurses and 15 are other than nurses. As evidence of malfeasance, the accuser stated that there were 6 company-paid trips to Disney World for which everyone was eligible. The supervisor sent 5 nurses and 1 non-nurse. If your subordinate acted innocently, what was the probability he/she would have selected five nurses for the trips? How many nurses would we have expected your subordinate to send? How many non-nurses would we have expected your subordinate to send?

x = 5
m = 15
n = 15
k = 6

If your subordinate acted innocently, what was the probability he/she would have selected five nurses for the trips?

dhyper(x, m, n, k)
## [1] 0.07586207

How many nurses would we have expected your subordinate to send?

m*k / (m+n)
## [1] 3

How many non-nurses would we have expected your subordinate to send?

n*k / (m+n)
## [1] 3
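
The hypergeometric probability can be checked by counting combinations directly. Since an accusation of favoritism is arguably about outcomes at least this lopsided, the sketch also computes \(P(X \ge 5)\); that extension, and the variable names, are assumptions beyond the original question.

```r
nurses <- 15
others <- 15
trips <- 6

# pmf by hand: ways to pick 5 nurses and 1 non-nurse over all ways to pick 6
p5_by_hand <- choose(nurses, 5) * choose(others, 1) / choose(nurses + others, trips)

# P(X >= 5): five or six nurses selected
p_5_or_more <- phyper(4, nurses, others, trips, lower.tail = FALSE)

c(exactly_5 = p5_by_hand, five_or_more = p_5_or_more)
```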

5. (Geometric).

The probability of being seriously injured in a car crash in an unspecified location is about 0.1% per hour. A driver is required to traverse this area for 1200 hours in the course of a year. What is the probability that the driver will be seriously injured during the course of the year? In the course of 15 months? What is the expected number of hours that a driver will drive before being seriously injured? Given that a driver has driven 1200 hours, what is the probability that he or she will be injured in the next 100 hours?

First

What is the probability that the driver will be seriously injured during the course of the year? In the course of 15 months?

p_injured <- 0.001

hours_per_year <- 1200
hours_per_month <- hours_per_year / 12

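# pgeom(q, p) counts failures before the first success, so injury
# within h hours corresponds to at most h - 1 injury-free hours
pgeom(hours_per_year - 1, p_injured)
## [1] 0.6989866
pgeom(hours_per_month*15 - 1, p_injured)
## [1] 0.7770372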

The probability of the driver being seriously injured within a year is 69.9% and within 15 months is 77.7%.

Second

What is the expected number of hours that a driver will drive before being seriously injured?

1/p_injured
## [1] 1000

The expected number of hours a driver will drive before being seriously injured is 1,000 hours.

Third

Given that a driver has driven 1200 hours, what is the probability that he or she will be injured in the next 100 hours?

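# pgeom(q, p) counts injury-free hours before the first injury,
# so injury within 100 hours means at most 99 injury-free hours
pgeom(100 - 1, p_injured)
## [1] 0.09520785

The probability the driver will be injured in the next 100 hours given that they already drove 1200 hours is 9.52%. Because the geometric distribution is memoryless, the 1200 hours already driven do not change this probability.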
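
The memoryless property used here can be verified numerically. The sketch below treats each hour as an independent trial and uses the closed form \(1 - (1-p)^h\) for the probability of an injury within \(h\) hours, which is what `pgeom(h - 1, p)` returns in R. The helper name `p_within` is hypothetical.

```r
p_hour <- 0.001

# P(injury within h hours) = 1 - P(no injury in h consecutive hours)
p_within <- function(h) 1 - (1 - p_hour)^h

# P(injury in hours 1201..1300 | no injury in the first 1200 hours)
conditional <- (p_within(1300) - p_within(1200)) / (1 - p_within(1200))

# The conditional probability equals the unconditional 100-hour probability
all.equal(conditional, p_within(100))
```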

6. (Binomial)

You are working in a hospital that is running off of a primary generator which fails about once in 1000 hours.

  a) What is the probability that the generator will fail more than twice in 1000 hours?

\(P(X > 2) = 1 - P(X \le 2)\)

a = 1/1000
t = 1000

lambda = a*t

1 - ppois(2,lambda)
## [1] 0.0803014

The probability of the generator failing more than twice in 1,000 hours is 8.03%.
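
The problem is labeled binomial, while the calculation above uses a Poisson model. If each hour is treated as an independent trial with failure probability 1/1000 (an assumption about how to read the problem), the two models can be compared directly. With many trials and a small probability they should give nearly the same answer.

```r
hours <- 1000
p_fail <- 1 / 1000

# Exact binomial tail versus the Poisson approximation with lambda = n * p
exact_binom <- pbinom(2, hours, p_fail, lower.tail = FALSE)
approx_pois <- ppois(2, hours * p_fail, lower.tail = FALSE)

c(binomial = exact_binom, poisson = approx_pois)
```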

  b) What is the expected value?

\(E(X) = \lambda\)

lambda
## [1] 1

It is expected that the generator will fail once in 1,000 hours.

7. (Uniform)

A surgical patient arrives for surgery precisely at a given time. Based on previous analysis (or a lack of knowledge assumption), you know that the waiting time is uniformly distributed from 0 to 30 minutes. What is the probability that this patient will wait more than 10 minutes?

  1. What is the probability that this patient will wait more than 10 minutes?

\(P(X > 10) = 1 - P(X \le 10)\)

1 - punif(10, 0, 30)
## [1] 0.6666667

The probability a patient will wait more than 10 minutes is 66.7%.

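  2. If the patient has already waited 10 minutes, what is the probability that he/she will wait at least another 5 minutes prior to being seen?

Given that 10 minutes have already passed, the remaining wait is uniformly distributed between 10 and 30 minutes, so we need \(P(X > 15 | X > 10)\).

1 - punif(15, 10, 30)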
## [1] 0.75

The probability the patient will wait at least another 5 minutes given they already waited 10 minutes is 75%.
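
The same answer follows from the definition of conditional probability on the original 0 to 30 minute range, \(P(X > 15 | X > 10) = P(X > 15) / P(X > 10)\). The variable names below are chosen for this sketch.

```r
# Upper-tail probabilities on the original Uniform(0, 30) waiting time
p_gt_15 <- punif(15, 0, 30, lower.tail = FALSE)
p_gt_10 <- punif(10, 0, 30, lower.tail = FALSE)

# Conditional probability of waiting past 15 minutes given 10 already waited
p_gt_15 / p_gt_10
```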

  3. What is the expected waiting time?

\(E(X) = \mu = \frac{\alpha + \beta}{2}\)

(0 + 30) /2
## [1] 15

The expected waiting time is 15 minutes.

8. (Exponential)

Your hospital owns an old MRI, which has a manufacturer’s lifetime of about 10 years (expected value). Based on previous studies, we know that the failure of most MRIs obeys an exponential distribution.

  1. What is the expected failure time?

The expected failure time is 10 years.

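  2. What is the standard deviation?

For an exponential distribution the standard deviation equals the mean: \(\sigma = 1/\lambda = E(X)\).

E_X = 10

# sd of an exponential distribution is 1/rate, i.e. the same as its mean
E_X
## [1] 10

The standard deviation is 10 years.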

  3. What is the probability that your MRI will fail after 8 years?

\(P(X > 8) = 1 - P(X \le 8)\)

a = 1/10

1 - pexp(8, a)
## [1] 0.449329

The probability the machine will fail after 8 years is 44.9%.

  4. Now assume that you have owned the machine for 8 years. Given that you already owned the machine 8 years, what is the probability that it will fail in the next two years?

Because the exponential distribution is memoryless, \(P(X \le 10 | X > 8) = P(X \le 2)\).

pexp(2, a)
## [1] 0.1812692

The probability the machine will fail in the next 2 years given you already owned it for 8 years is 18.1%.
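
The memoryless shortcut can be confirmed by computing the conditional probability from its definition, \(P(8 < X \le 10) / P(X > 8)\). This is a sketch with variable names chosen here for illustration.

```r
rate <- 1 / 10   # failures per year, from the 10-year expected lifetime

# P(fail between year 8 and year 10 | survived 8 years)
conditional <- (pexp(10, rate) - pexp(8, rate)) / (1 - pexp(8, rate))

# Equals the unconditional probability of failing within 2 years
all.equal(conditional, pexp(2, rate))
```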