Question 1

(Bayesian). A new test for multinucleoside-resistant (MNR) human immunodeficiency virus type 1 (HIV-1) variants was recently developed. The test maintains 96% sensitivity, meaning that, for those with the disease, it will correctly report “positive” for 96% of them. The test is also 98% specific, meaning that, for those without the disease, 98% will be correctly reported as “negative.” MNR HIV-1 is considered to be rare (albeit emerging), with about a .1% or .001 prevalence rate. Given the prevalence rate, sensitivity, and specificity estimates, what is the probability that an individual who is reported as positive by the new test actually has the disease? If the median cost (consider this the best point estimate) is about $100,000 per positive case total and the test itself costs $1000 per administration, what is the total first-year cost for treating 100,000 individuals?

total_pop <- 100000
sensitivity <- 0.96
prevalence_rate <- .001
positive_rate <- .02 # false positive rate (1 - specificity of .98)
A2 <- .999 # P(no disease) = 1 - prevalence_rate

# calculation
hiv_total <- total_pop * prevalence_rate
non_hiv_total <- total_pop - hiv_total
hiv_positive <- hiv_total * sensitivity
hiv_negative <- hiv_total - hiv_positive
non_hiv_positive <- non_hiv_total * positive_rate
non_hiv_negative <- non_hiv_total - non_hiv_positive
positive_total <- hiv_positive + non_hiv_positive
negative_total <- non_hiv_negative + hiv_negative

# create a matrix
mnr_matrix <- matrix(c(hiv_positive, non_hiv_positive, positive_total, hiv_negative, non_hiv_negative, negative_total, hiv_total, non_hiv_total, total_pop), nrow = 3, ncol = 3, byrow = TRUE)

mnr_matrix

##      [,1]  [,2]   [,3]
## [1,]   96  1998   2094
## [2,]    4 97902  97906
## [3,]  100 99900 100000

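# Bayes' theorem: P(disease | positive) =
#   P(positive | disease) * P(disease) /
#   (P(positive | disease) * P(disease) + P(positive | no disease) * P(no disease))
prob_new_rate <- (sensitivity * prevalence_rate) / ((sensitivity * prevalence_rate) + (positive_rate * A2))

# the probability that an individual who is reported as positive by the new test actually has the disease
prob_new_rate

## [1] 0.04584527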

cost_test <- 1000
cost_treatment <- 100000

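# everyone is tested once; treatment cost is applied to every individual who tests positive
# (true and false positives), which is one reading of "per positive case"
total_first_year_cost <- (total_pop * cost_test) + (positive_total * cost_treatment)

# the total first-year cost for treating 100,000 individuals
total_first_year_cost

## [1] 309400000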
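
As a cross-check on the Bayes calculation, the same posterior probability can be read off the counts in the table: true positives divided by all positives. The sketch below wraps the formula in a helper, `ppv()`, which is a hypothetical name introduced here for illustration and is not part of the original solution.

```r
# Hypothetical helper (not in the original solution): positive predictive value
# from sensitivity, specificity, and prevalence via Bayes' theorem.
ppv <- function(sens, spec, prev) {
  true_pos  <- sens * prev              # P(positive and disease)
  false_pos <- (1 - spec) * (1 - prev)  # P(positive and no disease)
  true_pos / (true_pos + false_pos)
}

ppv_formula <- ppv(sens = 0.96, spec = 0.98, prev = 0.001)

# Same quantity from the counts: 96 true positives out of 2094 positives
ppv_counts <- 96 / 2094

all.equal(ppv_formula, ppv_counts)
```

Both routes give roughly 4.6%: with a disease this rare, most positive results come from the much larger disease-free group.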

Question 2

(Binomial). The probability of your organization receiving a Joint Commission inspection in any given month is .05. What is the probability that, after 24 months, you received exactly 2 inspections? What is the probability that, after 24 months, you received 2 or more inspections? What is the probability that you received fewer than 2 inspections? What is the expected number of inspections you should have received? What is the standard deviation?

n <- 24
prob_n <- .05
n_inspect <- 2
prob_1 <- dbinom(n_inspect,n,prob_n)

# probability of getting exactly 2 inspections after 24 months
prob_1

## [1] 0.2232381

# probability of getting fewer than 2 inspections (0 or 1) after 24 months
prob_2 <- dbinom(1, n, prob_n) + dbinom(0, n, prob_n)

prob_2

## [1] 0.6608173

# standard deviation of a binomial: sqrt(n * p * (1 - p))
stand_dev <- sqrt(n * prob_n * (1 - prob_n))

stand_dev

## [1] 1.067708
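
The question also asks for the probability of 2 or more inspections and for the expected number of inspections. A minimal sketch of both, using the complement of the lower tail and the binomial mean n * p; the variable names are illustrative and chosen so they do not overwrite the ones above.

```r
n_months  <- 24    # number of months (trials)
p_inspect <- 0.05  # monthly inspection probability

# P(X >= 2) = 1 - P(X <= 1)
p_two_or_more <- 1 - pbinom(1, n_months, p_inspect)
p_two_or_more

# expected number of inspections: the mean of a binomial is n * p
expected_inspections <- n_months * p_inspect
expected_inspections
```

The first value is the complement of the "fewer than 2" probability, so the two should sum to 1.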

Question 3

(Poisson). You are modeling the family practice clinic and notice that patients arrive at a rate of 10 per hour. What is the probability that exactly 3 arrive in one hour? What is the probability that more than 10 arrive in one hour? How many would you expect to arrive in 8 hours? What is the standard deviation of the appropriate probability distribution? If there are three family practice providers that can see 24 templated patients each day, what is the percent utilization and what are your recommendations?

rate_clinic <- 10

# the probability that exactly 3 arrive in one hour
prob_3 <- dpois(3,rate_clinic)
prob_3

## [1] 0.007566655

# the probability that more than 10 arrive in one hour
prob_4 <- 1 - ppois(10, rate_clinic)
prob_4

## [1] 0.4169602

# the number of patients expected to arrive in 8 hours
prob_5 <- rate_clinic * 8

prob_5

## [1] 80

# the standard deviation of the hourly Poisson distribution (square root of the rate)
prob_6 <- sqrt(rate_clinic)

prob_6

## [1] 3.162278

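# the utilization rate: expected arrivals in 8 hours / templated daily capacity (3 providers * 24 patients)
# a value above 1 means expected demand exceeds templated capacity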
expected_people <- 3 * 24
util_rate <- round(prob_5/expected_people, 2)

util_rate

## [1] 1.11
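
To support a staffing recommendation, it may help to ask how often daily demand would exceed the 72 templated slots. The sketch below assumes an 8-hour clinic day, so daily arrivals are Poisson with mean 80. The 8-hour day is an assumption carried over from the calculation above rather than a fact stated in the question, and the variable names are illustrative.

```r
hourly_rate    <- 10      # arrivals per hour
hours_per_day  <- 8       # assumed clinic day length
daily_capacity <- 3 * 24  # three providers, 24 templated patients each

daily_rate <- hourly_rate * hours_per_day

# P(daily arrivals > capacity) under a Poisson(daily_rate) model
p_over_capacity <- ppois(daily_capacity, daily_rate, lower.tail = FALSE)
p_over_capacity

# standard deviation of daily arrivals: square root of the daily rate
sd_daily <- sqrt(daily_rate)
sd_daily
```

A high value here would point toward adding capacity (for example, more templated slots or provider hours) rather than relying on the current template.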

Question 4

(Hypergeometric). Your subordinate, a supervisor of 30 workers, was recently accused of favoring nurses. 15 of the subordinate workers are nurses and 15 are other than nurses. As evidence of malfeasance, the accuser stated that there were 6 company-paid trips to Disney World for which everyone was eligible. The supervisor sent 5 nurses and 1 non-nurse. If your subordinate acted innocently, what was the probability he/she would have selected five nurses for the trips? How many nurses would we have expected your subordinate to send? How many non-nurses would we have expected your subordinate to send?

nurse_n <- 15
other_nurse <- 15
comp_trip <- 6
supervisor_n <- 30 # total number of workers under the supervisor
ppl_send <- 6

# the probability the supervisor would have selected five nurses for the trips
prob_7 <- dhyper(5,nurse_n,other_nurse,comp_trip,log=FALSE)

prob_7

## [1] 0.07586207

# expected number of nurses to send
prob_8 <- (ppl_send * nurse_n)/supervisor_n

prob_8

## [1] 3

# expected number of non-nurse to send
prob_9 <- ppl_send - prob_8

prob_9

## [1] 3
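
The hypergeometric probability can be verified by direct counting: the number of ways to choose 5 of 15 nurses and 1 of 15 non-nurses, divided by the number of ways to choose any 6 of 30 workers. The sketch also computes the probability of 5 or more nurses, which may be the more relevant figure when judging the accusation. That tail calculation is an addition here and is not asked for in the question.

```r
nurses     <- 15
non_nurses <- 15
trips      <- 6

# direct counting: C(15, 5) * C(15, 1) / C(30, 6)
p_five_counting <- choose(nurses, 5) * choose(non_nurses, 1) /
  choose(nurses + non_nurses, trips)

# same value from the built-in density
p_five_dhyper <- dhyper(5, nurses, non_nurses, trips)

all.equal(p_five_counting, p_five_dhyper)

# P(5 or more nurses) = P(5) + P(6)
p_five_or_more <- phyper(4, nurses, non_nurses, trips, lower.tail = FALSE)
p_five_or_more
```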

Question 5

(Geometric). The probability of being seriously injured in a car crash in an unspecified location is about .1% per hour. A driver is required to traverse this area for 1200 hours in the course of a year. What is the probability that the driver will be seriously injured during the course of the year? In the course of 15 months? What is the expected number of hours that a driver will drive before being seriously injured? Given that a driver has driven 1200 hours, what is the probability that he or she will be injured in the next 100 hours?

hour_n <- 1200
prob_car_crash <- .001

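# the probability that the driver will be seriously injured during the course of the year
# pgeom(q, p) is P(at most q injury-free hours before the first injury),
# so injury within hour_n hours is pgeom(hour_n - 1, p) = 1 - (1 - p)^hour_n
prob_10 <- pgeom(hour_n - 1, prob_car_crash)

prob_10

## [1] 0.6989866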

# the probability that the driver will be seriously injured in 15 months
# 1200 hours per year = 100 hours per month, so 15 months = 1500 hours
prob_11 <- pgeom(1500 - 1, prob_car_crash)

prob_11

## [1] 0.7770372

# the expected number of hours that a driver will drive before being seriously injured
prob_12 <- 1/prob_car_crash

prob_12

## [1] 1000

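# the probability that he or she will be injured in the next 100 hours, given 1200 injury-free hours
# conditional probability: P(injured in hours 1201 to 1300) / P(no injury in the first 1200 hours)
# by the memoryless property this equals pgeom(99, prob_car_crash)
prob_13 <- (pgeom(hour_n + 99, prob_car_crash) - pgeom(hour_n - 1, prob_car_crash)) / (1 - pgeom(hour_n - 1, prob_car_crash))

prob_13

## [1] 0.09520785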
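
R's `pgeom(q, p)` counts failures before the first success, so the probability of at least one injury within n hours is `pgeom(n - 1, p)`, which matches the closed form 1 - (1 - p)^n. A short sketch checking that identity and the memoryless property; the variable names are illustrative.

```r
p_injury <- 0.001  # hourly probability of serious injury
hours    <- 1200

# closed form: P(injured within `hours` hours) = 1 - (1 - p)^hours
p_closed <- 1 - (1 - p_injury)^hours
p_pgeom  <- pgeom(hours - 1, p_injury)
all.equal(p_closed, p_pgeom)

# memoryless property: P(injured in next 100 hours | 1200 injury-free hours)
# equals the unconditional P(injured within 100 hours)
p_conditional <- (pgeom(hours + 99, p_injury) - pgeom(hours - 1, p_injury)) /
  (1 - pgeom(hours - 1, p_injury))
p_fresh <- pgeom(99, p_injury)
all.equal(p_conditional, p_fresh)
```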

Question 6

You are working in a hospital that is running off of a primary generator which fails about once in 1000 hours. What is the probability that the generator will fail more than twice in 1000 hours? What is the expected value?

# the probability that the generator will fail more than twice in 1000 hours
prob_14 <- 1 - ppois(2,1)

prob_14

## [1] 0.0803014

The expected value is 1 failure per 1000 hours, which is the rate of the Poisson distribution used above.
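
The tail probability can also be written out from the Poisson mass function as P(X > 2) = 1 - exp(-lambda) * (1 + lambda + lambda^2 / 2) with lambda = 1. A brief sketch confirming that this matches `ppois`; the variable names are illustrative.

```r
lambda <- 1  # expected failures per 1000 hours

# P(X > 2) = 1 - [P(0) + P(1) + P(2)]
p_manual <- 1 - exp(-lambda) * (1 + lambda + lambda^2 / 2)
p_ppois  <- ppois(2, lambda, lower.tail = FALSE)

all.equal(p_manual, p_ppois)
```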

Question 7

A surgical patient arrives for surgery precisely at a given time. Based on previous analysis (or a lack of knowledge assumption), you know that the waiting time is uniformly distributed from 0 to 30 minutes. What is the probability that this patient will wait more than 10 minutes? If the patient has already waited 10 minutes, what is the probability that he/she will wait at least another 5 minutes prior to being seen? What is the expected waiting time?

waiting_time <- 30
patient_wait <- 10
# the probability that this patient will wait more than 10 minutes
prob_15 <- 1 - punif(patient_wait, 0, waiting_time)

prob_15

## [1] 0.6666667

# the probability that he/she will wait at least another 5 minutes prior to being seen
# given 10 minutes already waited, the total wait is uniform on 10 to 30 minutes
prob_16 <- 1 - punif(15, patient_wait, waiting_time)
prob_17 <- 1 - punif(10, patient_wait, waiting_time)
prob_18 <- prob_16/prob_17

prob_18

## [1] 0.75

# the expected waiting time
expected_waiting <- 0.5 * (0 + waiting_time)

expected_waiting

## [1] 15
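
The conditional probability can equivalently be computed from the original uniform distribution on 0 to 30 minutes as P(wait > 15) / P(wait > 10). A minimal sketch, with illustrative variable names:

```r
max_wait <- 30

p_over_15 <- punif(15, 0, max_wait, lower.tail = FALSE)  # P(wait > 15)
p_over_10 <- punif(10, 0, max_wait, lower.tail = FALSE)  # P(wait > 10)

# P(wait > 15 | wait > 10)
p_another_5 <- p_over_15 / p_over_10
p_another_5
```

Unlike the exponential case, the uniform distribution is not memoryless: the conditional probability of waiting 5 more minutes (0.75) differs from the unconditional probability of waiting more than 5 minutes (about 0.83).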

Question 8

Your hospital owns an old MRI, which has a manufacturer's lifetime of about 10 years (expected value). Based on previous studies, we know that the failure of most MRIs obeys an exponential distribution. What is the expected failure time? What is the standard deviation? What is the probability that your MRI will fail after 8 years? Now assume that you have owned the machine for 8 years. Given that you already owned the machine 8 years, what is the probability that it will fail in the next two years?

mri_lifetime <- 10

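# the failure rate of the exponential distribution (1 / mean lifetime)
failure_rate <- 1/mri_lifetime

# the expected failure time (the mean of the exponential distribution, in years)
expected_failure <- 1/failure_rate

expected_failure

## [1] 10

# the standard deviation (equal to the mean for an exponential distribution)
stand_dev_2 <- sqrt(1/failure_rate^2)

stand_dev_2

## [1] 10

# the probability that your MRI will fail after 8 years
prob_19 <- 1 - pexp(8, failure_rate)

prob_19

## [1] 0.449329

# the probability that it will fail in the next two years, given that it has survived 8 years
# conditional probability: P(8 < T <= 10) / P(T > 8)
# by the memoryless property this equals pexp(2, failure_rate)
prob_20 <- (pexp(mri_lifetime, failure_rate) - pexp(8, failure_rate)) / prob_19

prob_20

## [1] 0.1812692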
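
Because the exponential distribution is memoryless, the probability of failure in the next two years given eight years of survival equals the unconditional probability of failure within two years. A short sketch checking this, assuming a rate of 1/10 per year; the variable names are illustrative.

```r
rate <- 1 / 10  # failures per year (mean lifetime of 10 years)

# conditional form: P(8 < T <= 10) / P(T > 8)
p_cond <- (pexp(10, rate) - pexp(8, rate)) / pexp(8, rate, lower.tail = FALSE)

# memoryless shortcut: P(T <= 2)
p_short <- pexp(2, rate)

all.equal(p_cond, p_short)
```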

DATA 605 - Homework 5

Eddie Xu

2024-02-25