Assignment #5

library(MASS)

1. (Bayesian). A new test for multinucleoside-resistant (MNR) human immunodeficiency virus type 1 (HIV-1) variants was recently developed. The test maintains 96% sensitivity, meaning that, for those with the disease, it will correctly report “positive” for 96% of them. The test is also 98% specific, meaning that, for those without the disease, 98% will be correctly reported as “negative.” MNR HIV-1 is considered to be rare (albeit emerging), with about a .1% or .001 prevalence rate. Given the prevalence rate, sensitivity, and specificity estimates, what is the probability that an individual who is reported as positive by the new test actually has the disease? If the median cost (consider this the best point estimate) is about $100,000 per positive case total and the test itself costs $1000 per administration, what is the total first-year cost for treating 100,000 individuals?

Prior probability of having the disease

Answer 1:

Bayesian Formula:

\[P(A | B) = \frac{P(B | A) \cdot P(A)}{P(B)}\] where:

P(A | B) is the posterior probability of A given B, which represents the probability of hypothesis A being true given the observed evidence B.
P(B | A) is the likelihood of B given A, which represents the probability of observing evidence B assuming hypothesis A is true.
P(A) is the prior probability of A, which represents our belief in the probability of hypothesis A being true before observing the evidence.
P(B) is the marginal probability of B, which represents the probability of observing evidence B.

p_disease <- 0.001

Sensitivity and specificity of the test

sensitivity <- 0.96
specificity <- 0.98

False positive rate

false_pos_rate <- 1 - specificity
false_pos_rate

## [1] 0.02

Total probability of getting a positive test result

p_positive <- sensitivity * p_disease + false_pos_rate * (1 - p_disease)
p_positive

## [1] 0.02094

Probability of having the disease given a positive test result

p_disease_given_positive <- sensitivity * p_disease / p_positive
p_disease_given_positive

## [1] 0.04584527

Final Result

cat("Probability of having the disease given a positive test result:",
    round(p_disease_given_positive * 100, 2), "%\n")

## Probability of having the disease given a positive test result: 4.58 %
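
The same posterior can be cross-checked with natural frequencies. The sketch below assumes a hypothetical cohort of 100,000 people (the cohort size and variable names are illustrative, not part of the question) and counts true and false positives directly.

```r
# Natural-frequency cross-check of Bayes' theorem (illustrative sketch).
cohort_size  <- 100000   # hypothetical cohort, chosen for round numbers
prevalence   <- 0.001
sens         <- 0.96
spec         <- 0.98

diseased        <- cohort_size * prevalence                 # people with the disease
true_positives  <- diseased * sens                          # diseased and test positive
false_positives <- (cohort_size - diseased) * (1 - spec)    # healthy but test positive

# Share of positive results that are true positives
ppv <- true_positives / (true_positives + false_positives)
round(c(true_positives = true_positives,
        false_positives = false_positives,
        ppv = ppv), 4)
```

Most positives come from the much larger healthy group, which is why the posterior stays below 5% despite the high sensitivity and specificity.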

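Second part of question #1

From the previous calculations using Bayes’ theorem, the probability of having the disease given a positive test result is about 4.58%. That figure is not the share of people who test positive; the share who test positive is p_positive = 0.02094, so out of 100,000 individuals we can expect about 2,094 positive results. Assuming all 100,000 individuals are tested at $1,000 each and every positive result is treated at $100,000 per case, the total first-year cost is the testing cost plus the treatment cost.

n_tested <- 100000
expected_positives <- n_tested * p_positive
total_cost <- n_tested * 1000 + expected_positives * 100000
total_cost

## [1] 309400000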

2. (Binomial). The probability of your organization receiving a Joint Commission inspection in any given month is .05. What is the probability that, after 24 months, you received exactly 2 inspections? What is the probability that, after 24 months, you received 2 or more inspections? What is the probability that you received fewer than 2 inspections? What is the expected number of inspections you should have received? What is the standard deviation?

Probability of receiving exactly 2 inspections in 24 months

Answer 2:

Binomial Formula: \[P(X = k) = {n \choose k} p^k (1 - p)^{n - k}\] where:

P(X = k) is the probability of getting exactly k successes in n independent trials.
(n choose k) is the binomial coefficient, which represents the number of ways to choose k items from a set of n items, and is computed as: (n choose k) = n! / (k! (n-k)!)
p is the probability of success in each trial.
(1 - p) is the probability of failure in each trial.

The mean and variance of the binomial distribution are:

Mean = n * p
Variance = n * p * (1 - p)
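
As a sanity check, the binomial formula can be evaluated by hand and compared with R's built-in dbinom(). This is a minimal sketch; the variable names are illustrative.

```r
# Evaluate the binomial PMF by hand and compare with dbinom().
n_trials    <- 24     # months observed
p_success   <- 0.05   # monthly inspection probability
k_successes <- 2      # inspections of interest

manual_pmf  <- choose(n_trials, k_successes) *
  p_success^k_successes * (1 - p_success)^(n_trials - k_successes)
builtin_pmf <- dbinom(k_successes, size = n_trials, prob = p_success)

all.equal(manual_pmf, builtin_pmf)   # TRUE when the two agree

# Mean and variance from the closed-form expressions
c(mean = n_trials * p_success,
  variance = n_trials * p_success * (1 - p_success))
```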

months <- 24
inspection_prob <- 0.05
p_2 <- dbinom(2, months, inspection_prob)
p_2

## [1] 0.2232381

Probability of receiving 2 or more inspections in 24 months

p_2_or_more <- 1 - pbinom(1, 24, 0.05)
p_2_or_more

## [1] 0.3391827

Probability of receiving fewer than 2 inspections in 24 months

p_less_than_2 <- pbinom(1, 24, 0.05)
p_less_than_2

## [1] 0.6608173

Expected number of inspections in 24 months

expected <- 24 * 0.05
expected

## [1] 1.2

Standard deviation of inspections in 24 months

sd <- sqrt(24 * 0.05 * 0.95)
sd

## [1] 1.067708

3. (Poisson). You are modeling the family practice clinic and notice that patients arrive at a rate of 10 per hour. What is the probability that exactly 3 arrive in one hour?

What is the probability that more than 10 arrive in one hour? How many would you expect to arrive in 8 hours? What is the standard deviation of the appropriate probability distribution? If there are three family practice providers that can see 24 templated patients each day, what is the percent utilization and what are your recommendations?

Answer 3:

The Poisson distribution is the probability distribution of independent event occurrences in an interval. If λ is the mean number of occurrences per interval, then the probability of having k occurrences within a given interval is:

Poisson distribution formula:

\[P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}\]

lambda <- 10
K <- 3
prob_3 <- dpois(K, lambda)
prob_3

## [1] 0.007566655
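
The Poisson formula can also be evaluated directly and compared with dpois(). A small sketch with illustrative variable names:

```r
# Evaluate the Poisson PMF by hand and compare with dpois().
arrival_rate <- 10   # mean arrivals per hour
arrivals     <- 3    # count of interest

manual_pois <- arrival_rate^arrivals * exp(-arrival_rate) / factorial(arrivals)

all.equal(manual_pois, dpois(arrivals, arrival_rate))   # TRUE when the two agree
```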

What is the probability that more than 10 arrive in one hour?

ppois(10, lambda = 10)   # lower tail: P(X <= 10)

## [1] 0.5830398

ppois(10, lambda = 10, lower.tail = FALSE)   # upper tail: P(X > 10)

## [1] 0.4169602

How many would you expect to arrive in 8 hours?

t <- 8
expected_arrive <- lambda * t
expected_arrive

## [1] 80

What is the standard deviation of the appropriate probability distribution?

\[SD = \sqrt{\lambda}\]

sd_poisson <- sqrt(lambda)
sd_poisson

## [1] 3.162278

If there are three family practice providers that can see 24 templated patients each day, what is the percent utilization and what are your recommendations?

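Percent utilization of 3 providers each seeing 24 patients per day, assuming an 8-hour clinic day so that expected demand is the 80 arrivals computed above

capacity <- 3 * 24   # templated patients per day
utilization <- expected_arrive / capacity
utilization

## [1] 1.111111

cat("Percent utilization of 3 providers each seeing 24 patients per day:", round(utilization * 100, 1), "%\n")

## Percent utilization of 3 providers each seeing 24 patients per day: 111.1 %

Expected demand (80 patients) exceeds templated capacity (72 patients), so the clinic is over capacity on an average day. Reasonable recommendations are to add a fourth provider, which would bring utilization to 80 / 96, or about 83%, or to expand each provider's template.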
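
To support a staffing recommendation, it can help to look at how often daily demand would exceed capacity. The sketch below assumes an 8-hour clinic day (an assumption, since the question does not state clinic hours) so that daily arrivals are Poisson with mean 80; the variable names are illustrative.

```r
# How often does daily demand exceed templated capacity? (illustrative sketch)
arrivals_per_hour <- 10
clinic_hours      <- 8        # ASSUMPTION: not stated in the question
slots_per_provider <- 24

daily_mean <- arrivals_per_hour * clinic_hours

# Probability that arrivals exceed the 72 slots offered by 3 providers
p_over_capacity <- ppois(3 * slots_per_provider, lambda = daily_mean,
                         lower.tail = FALSE)
p_over_capacity

# Average utilization with 3 versus 4 providers
utilization_by_providers <- daily_mean / (c(3, 4) * slots_per_provider)
names(utilization_by_providers) <- c("3 providers", "4 providers")
utilization_by_providers
```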

4. (Hypergeometric). Your subordinate with 30 supervisors was recently accused of favoring nurses. 15 of the subordinate’s workers are nurses and 15 are other than nurses. As evidence of malfeasance, the accuser stated that there were 6 company-paid trips to Disney World for which everyone was eligible. The supervisor sent 5 nurses and 1 non-nurse. If your subordinate acted innocently, what was the probability he/she would have selected five nurses for the trips? How many nurses would we have expected your subordinate to send? How many non-nurses would we have expected your subordinate to send?

Answer 4:

The probability mass function (PMF) for the hypergeometric distribution is:

\[P(X = k) = \frac{{K \choose k} {N - K \choose n - k}}{{N \choose n}}\]

where:

X is the random variable representing the number of objects of interest in the sample.
k is a specific value of X.
N is the total population size, K is the number of objects of interest in the population, and n is the sample size.
(a choose b) represents the binomial coefficient, which is the number of ways to choose b objects from a set of a objects.

Variables

N <- 30   # Total population size
K <- 15   # Number of objects of interest
n <- 6    # Sample size
k <- 5    # Number of objects of interest in the sample
pmf <- choose(K, k) * choose(N-K, n-k) / choose(N, n)
pmf

## [1] 0.07586207

Calculate mean and variance

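expected_nurses <- n * (K / N)   # avoids masking base::mean
expected_nurses

## [1] 3

expected_non_nurses <- n - expected_nurses
expected_non_nurses

## [1] 3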

variance <- n * (K / N) * (1 - K / N) * ((N - n) / (N - 1))
variance

## [1] 1.241379
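
R also provides dhyper() and phyper() for this distribution, which gives a quick check on the hand-computed PMF. In this sketch, m is the number of nurses, n the number of non-nurses, and k the number of trips (these are dhyper()'s argument names; the result names are illustrative).

```r
# Cross-check the hypergeometric PMF with R's built-in functions.
p_five_nurses <- dhyper(5, m = 15, n = 15, k = 6)
p_five_nurses

# Probability of a result at least this lopsided (5 or 6 nurses)
p_five_or_more <- phyper(4, m = 15, n = 15, k = 6, lower.tail = FALSE)
p_five_or_more
```

The tail probability is often the more relevant figure when judging whether a selection looks unusual under random choice.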

5. (Geometric). The probability of being seriously injured in a car crash in an unspecified location is about .1% per hour. A driver is required to traverse this area for 1200 hours in the course of a year. What is the probability that the driver will be seriously injured during the course of the year? In the course of 15 months? What is the expected number of hours that a driver will drive before being seriously injured? Given that a driver has driven 1200 hours, what is the probability that he or she will be injured in the next 100 hours?

Answer 5:

Geometric distribution formula, where X is the hour in which the first injury occurs: \[P(X = k) = (1 - p)^{k-1} p\]

What is the probability that the driver will be seriously injured during the course of the year?

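prob_injury <- 0.001
hours <- 1200

# pgeom(q, p) counts failures before the first success, so
# P(injury within h hours) = 1 - (1 - p)^h = pgeom(h - 1, p)
crash_1200 <- pgeom(hours - 1, prob_injury)
crash_1200

## [1] 0.6989866

In the course of 15 months?

crash_15_mths <- pgeom(hours * (15/12) - 1, prob_injury)
crash_15_mths

## [1] 0.7770372

Given that a driver has driven 1200 hours, what is the probability that he or she will be injured in the next 100 hours?

The geometric distribution is memoryless, so the 1200 hours already driven do not change the answer.

crash_next_100 <- pgeom(100 - 1, prob_injury)
crash_next_100

## [1] 0.09520785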
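
The question also asks for the expected number of hours before a serious injury, which for a geometric distribution counted in trials is 1/p. The sketch below computes it and checks the memoryless property numerically; p_within() is a hypothetical helper defined here, not a base R function.

```r
# Expected hours to first injury and a numerical check of memorylessness.
p_injury <- 0.001

expected_hours <- 1 / p_injury   # mean of a geometric distribution counted in trials
expected_hours

# Hypothetical helper: P(first injury occurs within h hours)
p_within <- function(h) 1 - (1 - p_injury)^h

# P(injured in hours 1201..1300 | no injury in the first 1200 hours)
p_conditional   <- (p_within(1300) - p_within(1200)) / (1 - p_within(1200))
p_unconditional <- p_within(100)

all.equal(p_conditional, p_unconditional)   # TRUE when the two agree
```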

6. You are working in a hospital that is running off of a primary generator which fails about once in 1000 hours. What is the probability that the generator will fail more than twice in 1000 hours? What is the expected value?

Probability of more than two failures in 1000 hours

Answer 6:

p_more_than_two_failures <- 1 - ppois(2, 1/1000 * 1000)
cat("Probability of more than two failures in 1000 hours:", p_more_than_two_failures, "\n")

## Probability of more than two failures in 1000 hours: 0.0803014

Expected value

expected_value <- 1/1000 * 1000
cat("Expected value:", expected_value, "\n")

## Expected value: 1
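
The ppois() result can be verified by hand, assuming (as the code above does) that failures in 1000 hours follow a Poisson distribution with λ = 1:

\[P(X > 2) = 1 - \sum_{k=0}^{2} \frac{\lambda^k e^{-\lambda}}{k!} = 1 - e^{-1}\left(1 + 1 + \frac{1}{2}\right) = 1 - 2.5\,e^{-1} \approx 0.0803\]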

7. A surgical patient arrives for surgery precisely at a given time. Based on previous analysis (or a lack of knowledge assumption), you know that the waiting time is uniformly distributed from 0 to 30 minutes. What is the probability that this patient will wait more than 10 minutes? If the patient has already waited 10 minutes, what is the probability that he/she will wait at least another 5 minutes prior to being seen? What is the expected waiting time?

Calculate the probability that the patient will wait at least another 5 minutes given that he/she has already waited 10 minutes

Answer 7:

# Define the minimum and maximum waiting times
a <- 0
b <- 30
# P(W > 15 | W > 10) = P(W > 15) / P(W > 10), both from the same Uniform(a, b)
p_wait_5_given_10 <- (1 - punif(15, min = a, max = b)) / (1 - punif(10, min = a, max = b))
p_wait_5_given_10

## [1] 0.75

Calculate the probability that the patient will wait more than 10 minutes

p_wait_more_than_10 <- 1 - punif(10, min = a, max = b)
p_wait_more_than_10

## [1] 0.6666667

Calculate the expected waiting time

expected_waiting_time <- (a + b) / 2
expected_waiting_time

## [1] 15
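
A quick simulation offers an independent check on the three uniform-distribution answers. This is only a sketch: the seed and sample size are arbitrary choices, and the simulated values will be close to, not exactly equal to, the theoretical ones.

```r
# Simulate waiting times from Uniform(0, 30) and estimate each quantity.
set.seed(42)                       # arbitrary seed for reproducibility
waits <- runif(100000, min = 0, max = 30)

sim_mean      <- mean(waits)                      # theory: 15
sim_p_over_10 <- mean(waits > 10)                 # theory: 2/3
sim_p_cond    <- mean(waits[waits > 10] > 15)     # theory: P(W > 15 | W > 10) = 0.75

round(c(mean = sim_mean, p_over_10 = sim_p_over_10, p_cond = sim_p_cond), 3)
```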

Calculate the PDF of the uniform distribution

pdf_uniform <- dunif(0:30, min = a, max = b)
pdf_uniform

##  [1] 0.03333333 0.03333333 0.03333333 0.03333333 0.03333333 0.03333333
##  [7] 0.03333333 0.03333333 0.03333333 0.03333333 0.03333333 0.03333333
## [13] 0.03333333 0.03333333 0.03333333 0.03333333 0.03333333 0.03333333
## [19] 0.03333333 0.03333333 0.03333333 0.03333333 0.03333333 0.03333333
## [25] 0.03333333 0.03333333 0.03333333 0.03333333 0.03333333 0.03333333
## [31] 0.03333333

8. Your hospital owns an old MRI, which has a manufacturer’s lifetime of about 10 years (expected value). Based on previous studies, we know that the failure of most MRIs obeys an exponential distribution. What is the expected failure time? What is the standard deviation? What is the probability that your MRI will fail after 8 years? Now assume that you have owned the machine for 8 years. Given that you already owned the machine 8 years, what is the probability that it will fail in the next two years?

Answer 8:

We can use the properties of the exponential distribution. Specifically, if a random variable X follows an exponential distribution with rate parameter λ, then the expected value of X is 1/λ and the standard deviation is also 1/λ. Here the expected lifetime is 10 years, so λ = 1/10.

lambda <- 1/10

# Calculate the expected failure time
expected_failure_time <- 1/lambda
expected_failure_time

## [1] 10

Calculate the standard deviation

sd_failure_time <- 1/lambda
sd_failure_time

## [1] 10

Calculate the probability that the MRI will fail after 8 years

prob_fail_after_8_years <- exp(-lambda*8)
prob_fail_after_8_years

## [1] 0.449329

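Calculate the probability that the MRI will fail in the next two years given that it has already been owned for 8 years. Because the exponential distribution is memoryless, this equals the unconditional probability of failing within 2 years, 1 - exp(-lambda*2).

prob_fail_in_2_years_given_8_years <- 1 - exp(-lambda*2)
prob_fail_in_2_years_given_8_years

## [1] 0.1812692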
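
The memoryless property of the exponential distribution can be checked numerically with pexp(). In this sketch, failure_rate and the result names are illustrative; the conditional probability of failing between years 8 and 10, given survival to year 8, is compared with the plain probability of failing within 2 years.

```r
# Numerical check of memorylessness for an exponential lifetime.
failure_rate <- 1 / 10   # failures per year, from the 10-year expected lifetime

p_survive_8 <- pexp(8, rate = failure_rate, lower.tail = FALSE)   # P(X > 8)

# P(8 < X <= 10 | X > 8)
p_fail_cond <- (pexp(10, rate = failure_rate) - pexp(8, rate = failure_rate)) /
  p_survive_8

# P(X <= 2)
p_fail_2 <- pexp(2, rate = failure_rate)

all.equal(p_fail_cond, p_fail_2)   # TRUE when the two agree
```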
