Introduction

This work was assigned for Data 605, assignment 5.

Question 1

(Bayesian). A new test for multinucleoside-resistant (MNR) human immunodeficiency virus type 1 (HIV-1) variants was recently developed. The test maintains 96% sensitivity, meaning that, for those with the disease, it will correctly report “positive” for 96% of them. The test is also 98% specific, meaning that, for those without the disease, 98% will be correctly reported as “negative.” MNR HIV-1 is considered to be rare (albeit emerging), with about a .1% or .001 prevalence rate. Given the prevalence rate, sensitivity, and specificity estimates, what is the probability that an individual who is reported as positive by the new test actually has the disease?

$ P(A|B) = \frac{P(B|A) \, P(A)}{P(B)} $

P(A) = probability of actually having the disease (prev = 0.001)

P(B) = probability of testing positive for the disease

P(A|B) = probability of actually having the disease, given a positive test

P(B|A) = probability of testing positive, given the disease (sensitivity = 0.96)

sensitivity <- 0.96
specificity <- 0.98
prev <- 0.001

# probability that someone tests positive
# P(B) = P(positive | disease) * P(disease) + P(positive | no disease) * P(no disease)
p_b <- sensitivity*prev + (1-specificity)*(1-prev)

# Bayes' theorem
p_a_b <- sensitivity*prev/p_b

print(paste0("The probability that an individual who is reported as positive by the new test actually has the disease is: ", round(p_a_b*100,2), "%"))
## [1] "The probability that an individual who is reported as positive by the new test actually has the disease is: 4.58%"
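
As a cross-check on the Bayes calculation, the same answer can be reached by counting people instead of multiplying probabilities. The sketch below assumes a hypothetical cohort of 100,000 people (the cohort size and all variable names are illustrative and only need to be large enough to keep the counts readable).

```r
# Hypothetical cohort, used only to turn probabilities into head counts
cohort_size <- 100000
cohort_prev <- 0.001
cohort_sens <- 0.96
cohort_spec <- 0.98

diseased <- cohort_size * cohort_prev              # people who have the disease
healthy <- cohort_size - diseased                  # people who do not
true_positives <- diseased * cohort_sens           # diseased and reported positive
false_positives <- healthy * (1 - cohort_spec)     # healthy but reported positive

# share of all positive results that belong to people with the disease
ppv_by_counts <- true_positives / (true_positives + false_positives)
print(round(ppv_by_counts, 4))
```

Because the disease is rare, the false positives among the healthy majority far outnumber the true positives, which is why the probability is so low despite the high sensitivity and specificity.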

If the median cost (consider this the best point estimate) is about $100,000 per positive case total and the test itself costs $1000 per administration, what is the total first-year cost for treating 100,000 individuals?

To treat 100,000 individuals, we add the cost of testing all of them to the cost of treating everyone who tests positive, whether the result is a true or a false positive.

cost_per_pos_case <- 100000
test_cost <- 1000
test_patients <- 100000

# expected number of patients who test positive (true and false positives alike)
total_pos_patients <- test_patients*p_b

total_cost <- test_patients*test_cost + total_pos_patients*cost_per_pos_case
print(paste0("The total first year cost is: $", total_cost))
## [1] "The total first year cost is: $309400000"

Question 2

(Binomial). The probability of your organization receiving a Joint Commission inspection in any given month is .05. What is the probability that, after 24 months, you received exactly 2 inspections? What is the probability that, after 24 months, you received 2 or more inspections? What is the probability that you received fewer than 2 inspections? What is the expected number of inspections you should have received? What is the standard deviation?

p_jc_inspection <- 0.05
months <- 24

# probability after 24 months, you received exactly 2 inspections
## https://www.r-tutor.com/elementary-statistics/probability-distributions/binomial-distribution
p_2 <- dbinom(2, size = months, prob = p_jc_inspection)
print(paste0("The probability after 24 months, you received exactly 2 inspections: ", round(p_2,4)))
## [1] "The probability after 24 months, you received exactly 2 inspections: 0.2232"
# probability after 24 months, 2 or more inspections
p_2_plus <- 1 - pbinom(1, size=months, prob=p_jc_inspection) 
print(paste0("The probability after 24 months, 2 or more inspections: ",  round(p_2_plus,4)))
## [1] "The probability after 24 months, 2 or more inspections: 0.3392"
# probability received fewer than 2 inspections
p_less_2 <- pbinom(1, size=months, prob=p_jc_inspection) 
print(paste0("The probability received fewer than 2 inspections: ",  round(p_less_2,4)))
## [1] "The probability received fewer than 2 inspections: 0.6608"
# expected number of inspections
## https://math.oxford.emory.edu/site/math117/expectedValueVarianceOfBinomial/
expected <- p_jc_inspection*months
print(paste0("The expected number of inspections: ", expected))
## [1] "The expected number of inspections: 1.2"
# standard deviation
stdev <- (p_jc_inspection*months*(1-p_jc_inspection))^(1/2)
print(paste0("The standard deviation: ",  round(stdev,4)))
## [1] "The standard deviation: 1.0677"
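
The built-in functions can be checked against the binomial formula written out by hand. This is a minimal sketch; the names `n_months`, `p_month` and the `check_` variables are introduced here only for the comparison.

```r
n_months <- 24
p_month <- 0.05

# P(X = 2) = C(n, 2) * p^2 * (1 - p)^(n - 2)
check_exactly_2 <- choose(n_months, 2) * p_month^2 * (1 - p_month)^(n_months - 2)

# P(X < 2) = P(X = 0) + P(X = 1)
check_fewer_2 <- (1 - p_month)^n_months +
  n_months * p_month * (1 - p_month)^(n_months - 1)

# P(X >= 2) is the complement of P(X < 2)
check_2_or_more <- 1 - check_fewer_2

print(all.equal(check_exactly_2, dbinom(2, size = n_months, prob = p_month)))
print(all.equal(check_fewer_2, pbinom(1, size = n_months, prob = p_month)))
```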

Question 3

(Poisson). You are modeling the family practice clinic and notice that patients arrive at a rate of 10 per hour. What is the probability that exactly 3 arrive in one hour? What is the probability that more than 10 arrive in one hour? How many would you expect to arrive in 8 hours? What is the standard deviation of the appropriate probability distribution? If there are three family practice providers that can see 24 templated patients each day, what is the percent utilization and what are your recommendations?

arrival_rate_per_hour <- 10

# probability that exactly 3 arrive in one hour
## https://www.statology.org/dpois-ppois-qpois-rpois-r/
p_3 <- dpois(3,lambda = arrival_rate_per_hour)
print(paste0("The probability that exactly 3 arrive in one hour: ", round(p_3,6)))
## [1] "The probability that exactly 3 arrive in one hour: 0.007567"
# probability that more than 10 arrive in one hour
# P(X > 10) = 1 - P(X <= 10), so the cumulative probability has to include 10 itself
p_10_plus <- 1 - ppois(10,lambda = arrival_rate_per_hour)
print(paste0("The probability that more than 10 arrive in one hour: ", round(p_10_plus,6)))
## [1] "The probability that more than 10 arrive in one hour: 0.41696"
# same answer ppois(10,lambda = arrival_rate_per_hour, lower.tail=FALSE)

# How many would you expect to arrive in 8 hours
expected_eight <- 8*arrival_rate_per_hour
print(paste0("In eight hours you would expect: ", expected_eight, " to arrive"))
## [1] "In eight hours you would expect: 80 to arrive"
# What is the standard deviation of the appropriate probability distribution
# for the one-hour Poisson model the standard deviation is sqrt(lambda)
stdev <- sqrt(arrival_rate_per_hour)
print(paste0("The standard deviation is: ", round(stdev,4)))
## [1] "The standard deviation is: 3.1623"
# three family practice providers can see 24 templated patients each day
# what is the percent utilization and what are your recommendations
patients_each_day <- 24
providers <- 3

total_slots <- patients_each_day*providers

utilization <- expected_eight/total_slots

print(paste0("There are ", total_slots, " slots available to see patients, and ", expected_eight, " patients are expected to arrive"))
## [1] "There are 72 slots available to see patients, and 80 patients are expected to arrive"
print(paste0("The utilization is: ", round(utilization*100,2), "%"))
## [1] "The utilization is: 111.11%"
print("My recommendation is to either hire support staff to increase the practitioners' availability or to hire more practitioners, possibly part-time during the busier hours")
## [1] "My recommendation is to either hire support staff to increase the practitioners' availability or to hire more practitioners, possibly part-time during the busier hours"
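
To put a number behind the recommendation, the daily demand can be modeled directly. The sketch below assumes that arrivals over an 8-hour day are Poisson with rate 8 * 10 = 80 (hourly arrivals independent of each other); that assumption and the variable names are mine, not part of the question.

```r
lambda_day <- 8 * 10                 # assumed expected arrivals in an 8-hour day
slots_per_provider <- 24
current_providers <- 3
capacity <- slots_per_provider * current_providers

# standard deviation of the daily arrival count under the Poisson assumption
sd_day <- sqrt(lambda_day)

# probability that more patients arrive than there are slots
p_over_capacity <- ppois(capacity, lambda = lambda_day, lower.tail = FALSE)

# smallest number of providers that keeps expected utilization at or below 100%
providers_needed <- ceiling(lambda_day / slots_per_provider)

print(paste0("Probability that daily arrivals exceed ", capacity, " slots: ", round(p_over_capacity, 4)))
print(paste0("Providers needed to cover expected demand: ", providers_needed))
```

Covering only the expected demand still leaves the clinic over capacity on busier-than-average days, so some buffer beyond that minimum is worth considering.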

Question 4

(Hypergeometric). Your subordinate with 30 supervisors was recently accused of favoring nurses. 15 of the subordinate’s workers are nurses and 15 are other than nurses. As evidence of malfeasance, the accuser stated that there were 6 company-paid trips to Disney World for which everyone was eligible. The supervisor sent 5 nurses and 1 non-nurse. If your subordinate acted innocently, what was the probability he/she would have selected five nurses for the trips? How many nurses would we have expected your subordinate to send? How many non-nurses would we have expected your subordinate to send?

nurses <- 15
other <- 15
# available slots 
trips <- 6
# supervisor sent 5 nurses and 1 non-nurse
sent_nurse <- 5
sent_other <- 1

# what was the probability he/she would have selected five nurses for the trips
p_5_nurses <- dhyper(sent_nurse, nurses, other, trips)
print(paste0("The probability he/she would have selected five nurses for the trips: ", round(p_5_nurses,4)))
## [1] "The probability he/she would have selected five nurses for the trips: 0.0759"
# How many nurses would we have expected your subordinate to send
expected_nurses <- (nurses/(nurses+other))*trips
print(paste0("The number of nurses we would have expected your subordinate to send: ", expected_nurses))
## [1] "The number of nurses we would have expected your subordinate to send: 3"
# How many non-nurses would we have expected your subordinate to send
expected_others <- (other/(nurses+other))*trips
print(paste0("The number of non-nurses we would have expected your subordinate to send: ", expected_others))
## [1] "The number of non-nurses we would have expected your subordinate to send: 3"
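
The hypergeometric probability can also be written as a ratio of counts, which makes it easier to see what `dhyper` is doing. The sketch below uses illustrative names (`n_nurses`, `n_other`, `n_trips`). It also computes the tail probability of sending five or more nurses, which I am assuming is the more relevant figure when weighing the accusation.

```r
n_nurses <- 15
n_other <- 15
n_trips <- 6

# ways to pick 5 of 15 nurses and 1 of 15 others, over all ways to pick 6 of 30
p_5_by_counting <- choose(n_nurses, 5) * choose(n_other, 1) /
  choose(n_nurses + n_other, n_trips)
print(all.equal(p_5_by_counting, dhyper(5, n_nurses, n_other, n_trips)))

# probability of an outcome at least this lopsided: P(X >= 5) = 1 - P(X <= 4)
p_5_or_more <- phyper(4, n_nurses, n_other, n_trips, lower.tail = FALSE)
print(round(p_5_or_more, 4))
```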

Question 5

(Geometric). The probability of being seriously injured in a car crash in an unspecified location is about .1% per hour. A driver is required to traverse this area for 1200 hours in the course of a year. What is the probability that the driver will be seriously injured during the course of the year? In the course of 15 months? What is the expected number of hours that a driver will drive before being seriously injured? Given that a driver has driven 1200 hours, what is the probability that he or she will be injured in the next 100 hours?

# dangerous location
serious_injury_per_hour <- 0.001
# driver is required to traverse this area for 1200 hours in the course of a year
hours_per_year <- 1200

# probability that the driver will be seriously injured during the course of the year
# pgeom(q, p) counts the failures before the first success,
# so an injury within n hours is pgeom(n - 1, p) = 1 - (1 - p)^n
p_i_in_1200 <- pgeom(hours_per_year - 1, serious_injury_per_hour)
print(paste0("The probability that the driver will be seriously injured during the course of the year: ", round(p_i_in_1200,4)))
## [1] "The probability that the driver will be seriously injured during the course of the year: 0.699"
# In the course of 15 months
months <- 15
hours <- hours_per_year * (months)/12
p_i_in_1500 <- pgeom(hours - 1, serious_injury_per_hour)
print(paste0("The probability that the driver will be seriously injured during the course of 15 months: ", round(p_i_in_1500,4)))
## [1] "The probability that the driver will be seriously injured during the course of 15 months: 0.777"
# What is the expected number of hours that a driver will drive before being seriously injured
expected_hours <- 1/serious_injury_per_hour
print(paste0("The expected number of hours that a driver will drive before being seriously injured: ", expected_hours))
## [1] "The expected number of hours that a driver will drive before being seriously injured: 1000"
# a driver has driven 1200 hours -> probability of injury in the next 100 hours
# The geometric distribution is memoryless, so the hours already driven do not change the probability
hours <- 100
p_i_in_100 <- pgeom(hours - 1, serious_injury_per_hour)
print(paste0("The probability that the driver will be seriously injured during the next 100 hours: ", round(p_i_in_100,4)))
## [1] "The probability that the driver will be seriously injured during the next 100 hours: 0.0952"
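
The claim that the first 1200 hours do not matter can be checked from the definition of conditional probability. This is a sketch under the assumption that T is the hour in which the injury occurs (T = 1, 2, ...) and that hours are independent; the helper `p_injured_within` is a name made up for this check.

```r
p_hour <- 0.001

# P(T <= n): probability of at least one injury in the first n hours
p_injured_within <- function(n) 1 - (1 - p_hour)^n

# P(1200 < T <= 1300 | T > 1200), computed from the definition
p_conditional <- (p_injured_within(1300) - p_injured_within(1200)) /
  (1 - p_injured_within(1200))

# P(T <= 100) for a driver with no history at all
p_fresh <- p_injured_within(100)

print(all.equal(p_conditional, p_fresh))
```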

Question 6

You are working in a hospital that is running off of a primary generator which fails about once in 1000 hours. What is the probability that the generator will fail more than twice in 1000 hours? What is the expected value?

# the generator fails about once in 1000 hours, so the Poisson rate for a 1000-hour window is 1
failures_per_1000_hours <- 1

# What is the probability that the generator will fail more than twice in 1000 hours
# P(X > 2) = 1 - P(X <= 2)
p_more_than_2 <- 1 - ppois(2,lambda = failures_per_1000_hours)
print(paste0("The probability that the generator will fail more than twice in 1000 hours: ", round(p_more_than_2,6)))
## [1] "The probability that the generator will fail more than twice in 1000 hours: 0.080301"
# What is the expected value
print(paste0("The expected value is: ", failures_per_1000_hours))
## [1] "The expected value is: 1"
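
The question does not name a distribution, so as a sanity check the Poisson answer can be compared with a binomial model. The sketch below assumes 1000 independent hourly trials, each with failure probability 1/1000. That framing is my assumption, not something stated in the question.

```r
n_hours <- 1000
p_fail_hour <- 1 / 1000

# Poisson model: rate of n * p = 1 failure per 1000 hours
p_poisson <- ppois(2, lambda = n_hours * p_fail_hour, lower.tail = FALSE)

# Binomial model: at most one failure per hour, 1000 independent hours
p_binomial <- pbinom(2, size = n_hours, prob = p_fail_hour, lower.tail = FALSE)

print(round(c(poisson = p_poisson, binomial = p_binomial), 6))
```

With n this large and p this small, the two models should agree closely, which supports using the Poisson distribution here.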

Question 7

A surgical patient arrives for surgery precisely at a given time. Based on previous analysis (or a lack of knowledge assumption), you know that the waiting time is uniformly distributed from 0 to 30 minutes. What is the probability that this patient will wait more than 10 minutes? If the patient has already waited 10 minutes, what is the probability that he/she will wait at least another 5 minutes prior to being seen? What is the expected waiting time?

# waiting time is uniformly distributed from 0 to 30 minutes assuming person arrived on time

# What is the probability that this patient will wait more than 10 minutes?
t_min <- 0
t_max <- 30
# P(W > 10) = 1 - P(W <= 10)
p_plus_10 <- 1 - punif(10,t_min,t_max)
print(paste0("The probability that this patient will wait more than 10 minutes: " , round(p_plus_10,4)))
## [1] "The probability that this patient will wait more than 10 minutes: 0.6667"
# If the patient has already waited 10 minutes
# what is the probability that he/she will wait at least another 5 minutes prior to being seen
# P(W > 15 | W > 10) = P(W > 15) / P(W > 10)
p_15 <- 1-punif(15, t_min, t_max)
p_10 <- 1-punif(10, t_min, t_max)
p <- p_15/p_10
print(paste0("The probability that if the patient has already waited 10 minutes that he/she will wait at least another 5 minutes prior to being seen: " , round(p,4)))
## [1] "The probability that if the patient has already waited 10 minutes that he/she will wait at least another 5 minutes prior to being seen: 0.75"
# What is the expected waiting time?
# the mean of a uniform distribution is the midpoint of its range
expected <- (t_min+t_max)/2
print(paste0("The expected waiting time (minutes): " , expected))
## [1] "The expected waiting time (minutes): 15"
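
The three uniform answers can be checked with a quick simulation. This is only a rough sketch: the seed and the sample size of 100,000 are arbitrary choices, and the simulated values will match the exact ones only approximately.

```r
set.seed(605)  # arbitrary seed so the run is repeatable
sim_waits <- runif(100000, min = 0, max = 30)

# share of simulated patients who wait more than 10 minutes
sim_p_more_10 <- mean(sim_waits > 10)

# among those who waited more than 10 minutes, share who wait past 15
sim_p_cond <- mean(sim_waits[sim_waits > 10] > 15)

# average simulated waiting time in minutes
sim_mean <- mean(sim_waits)

print(round(c(sim_p_more_10, sim_p_cond, sim_mean), 3))
```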

Question 8

Your hospital owns an old MRI, which has a manufacturer’s lifetime of about 10 years (expected value). Based on previous studies, we know that the failure of most MRIs obeys an exponential distribution. What is the expected failure time? What is the standard deviation? What is the probability that your MRI will fail after 8 years? Now assume that you have owned the machine for 8 years. Given that you already owned the machine 8 years, what is the probability that it will fail in the next two years?

expected_value <- 10 # manufacturer’s lifetime of about 10 years (expected value)

# failure of most MRIs obeys an exponential distribution

# What is the expected failure time?
print(paste0("The expected failure time is (years): ", expected_value))
## [1] "The expected failure time is (years): 10"
# What is the standard deviation?
stdev <- expected_value
print(paste0("The standard deviation is (years): ", stdev))
## [1] "The standard deviation is (years): 10"
# What is the probability that your MRI will fail after 8 years?
p_8 <- 1 - pexp(8, 1/expected_value)
print(paste0("The probability that your MRI will fail after 8 years: ", round(p_8,4)))
## [1] "The probability that your MRI will fail after 8 years: 0.4493"
# Given that you already owned the machine 8 years, what is the probability that it will fail in the next two years?
# the exponential distribution is memoryless, so P(T <= 10 | T > 8) = P(T <= 2)
p_8_2 <- pexp(2, 1/expected_value)
print(paste0("Given that you already owned the machine 8 years, the probability that it will fail in the next two years is: ", round(p_8_2,4)))
## [1] "Given that you already owned the machine 8 years, the probability that it will fail in the next two years is: 0.1813"
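
The last answer relies on the memoryless property of the exponential distribution, which can be confirmed from the definition of conditional probability. In this minimal sketch, `mri_rate` and the other names are introduced only for the check.

```r
mri_rate <- 1 / 10   # failures per year, from the 10-year expected lifetime

# P(8 < T <= 10 | T > 8) = (F(10) - F(8)) / (1 - F(8))
p_fail_given_8 <- (pexp(10, mri_rate) - pexp(8, mri_rate)) /
  (1 - pexp(8, mri_rate))

# P(T <= 2) for a brand-new machine
p_fail_new <- pexp(2, mri_rate)

print(all.equal(p_fail_given_8, p_fail_new))
```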