##Problem Set 1:
(Bayesian). A new test for multi-nucleoside-resistant (MNR) human immunodeficiency virus type 1 (HIV-1) variants was recently developed. The test maintains 96% sensitivity, meaning that, for those with the disease, it will correctly report “positive” for 96% of them. The test is also 98% specific, meaning that, for those without the disease, 98% will be correctly reported as “negative.” MNR HIV-1 is considered to be rare (albeit emerging), with about a .1% or .001 prevalence rate. Given the prevalence rate, sensitivity, and specificity estimates, what is the probability that an individual who is reported as positive by the new test actually has the disease? If the median cost (consider this the best point estimate) is about $100,000 per positive case total and the test itself costs $1000 per administration, what is the total first-year cost for treating 100,000 individuals?
##Solution:
The events and conditional probabilities are defined below. Let A1 = has MNR HIV-1, A2 = does not have MNR HIV-1, and B = tests positive.

P(A1) = P(MNR HIV-1) = 0.001 #Prior: disease prevalence in the wider population, i.e. the actual positive rate.

P(A2) = P(not MNR HIV-1) = 0.999

P(B|A1) = P(+|MNR HIV-1) = 0.96 #Sensitivity of the test, i.e. it correctly reports “positive” given the patient has the disease.

P(-|MNR HIV-1) = 0.04 #False-negative rate (1 - sensitivity).

P(-|not MNR HIV-1) = 0.98 #Specificity of the test, i.e. it correctly reports “negative” given the patient doesn’t have the disease.

P(B|A2) = P(+|not MNR HIV-1) = 0.02 #False-positive rate (1 - specificity).

P(A1|B) is the quantity to be determined.

According to Bayes' theorem: P(A|B) = (P(B|A).P(A)) / P(B)

Translating to this problem: P(A1|B) = (P(B|A1).P(A1)) / ((P(B|A1).P(A1)) + (P(B|A2).P(A2)))
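Substituting the values from the problem statement into Bayes' theorem gives the worked computation below. Here A2 denotes not having the disease, so P(B|A2) is the false-positive rate, 1 - 0.98 = 0.02.

\[
P(A_1 \mid B) = \frac{0.96 \times 0.001}{0.96 \times 0.001 + 0.02 \times 0.999} = \frac{0.00096}{0.02094} \approx 0.0458
\]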
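Median cost per positive case = $100,000 (treatment) + $1,000 (test) = $101,000. This assumes the test cost is counted only for individuals who test positive. Now we need to calculate the true-positive and false-positive counts implied by the test results.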
For 100,000 individuals, the true prevalence of 0.001 implies 100 patients with the MNR HIV-1 virus. Because the test has 96% sensitivity, it will correctly identify only 96 of these patients; the other 4 will be misdiagnosed as not having the virus when in fact they do (false negatives). Similarly, the number of individuals who do not have the disease is 100,000 - 100 = 99,900. The test is only 98% specific, so it will correctly report 98% of these 99,900 people as negative (0.98 * 99,900 = 97,902 true negatives). The remaining 1,998 (99,900 - 97,902) people will show a false-positive result.
Hence the test will report 96 + 1,998 = 2,094 positive cases. The probability of MNR HIV-1 given a positive test has increased from 0.001 to about 0.046 (96/2,094). While this is a roughly 46-fold increase, the probability that the person has MNR HIV-1 is still small. Stated another way, among the positive results, 95.4% are false positives and only 4.6% are true MNR HIV-1 cases.
The total first-year cost for treating these individuals is 2,094 * $101,000 = $211,494,000.
prob <- (0.96 * 0.001) / ((0.96 * 0.001) + (0.02 * 0.999)) * 100 #Bayes' theorem with the false-positive rate 0.02 and P(no disease) = 0.999, expressed as a percentage.
sprintf("Probability (in percent) that an individual who is reported as positive by the new test actually has the disease is: %f", prob)
## [1] "Probability (in percent) that an individual who is reported as positive by the new test actually has the disease is: 4.584527"
cost <- 2094 * 101000
sprintf("The total first-year cost for treating these individuals is: $%d", cost)
## [1] "The total first-year cost for treating these individuals is: $211494000"
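The counting argument above can be sketched in R as follows. This is a minimal illustration, not part of the original solution: all variable names are introduced here, and the second cost figure rests on an alternative assumption (every one of the 100,000 individuals is charged for the test, and only positives incur the $100,000 treatment cost) that the solution above does not use.

```r
# Inputs taken from the problem statement
population     <- 100000
prevalence     <- 0.001
sensitivity    <- 0.96
specificity    <- 0.98
treatment_cost <- 100000  # per positive case
test_cost      <- 1000    # per test administration

# Expected counts in each cell of the confusion matrix
diseased  <- population * prevalence
healthy   <- population - diseased
true_pos  <- sensitivity * diseased
false_neg <- diseased - true_pos
true_neg  <- specificity * healthy
false_pos <- healthy - true_neg

# Probability of disease given a positive test (same quantity as Bayes' theorem)
positives <- true_pos + false_pos
ppv <- true_pos / positives

# Reading 1 (used in the solution above): test cost counted only for positives
cost_positives_only <- positives * (treatment_cost + test_cost)

# Reading 2 (alternative assumption, not from the original solution):
# everyone screened pays for the test, only positives incur treatment cost
cost_all_tested <- population * test_cost + positives * treatment_cost

sprintf("Positives: %.0f, P(disease | positive): %.4f", positives, ppv)
sprintf("Cost, reading 1: $%.0f; cost, reading 2: $%.0f",
        cost_positives_only, cost_all_tested)
```

Which reading applies depends on how "per administration" is interpreted; the two differ only in how many test administrations are billed.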
##Problem Set 2:
(Binomial). The probability of your organization receiving a Joint Commission inspection in any given month is .05. What is the probability that, after 24 months, you received exactly 2 inspections? What is the probability that, after 24 months, you received 2 or more inspections? What is the probability that you received fewer than 2 inspections? What is the expected number of inspections you should have received? What is the standard deviation?
The number of inspections in 24 months is modeled as a binomial random variable with n = 24 trials and success probability p = 0.05 (q = 1 - p = 0.95). Because n * p = 1.2 is small, a normal approximation would be poor here, so the probabilities below are exact binomial probabilities. The mean is n * p and the standard deviation is sqrt(n * p * q).
## [1] "The probability of exactly 2 inspections is: 0.223238"
## [1] "The probability of 2 or more inspections is: 0.339183"
## [1] "The probability of fewer than 2 inspections is: 0.660817"
## [1] "The expected number of inspections we should have received is: 1.200000"
## [1] "The standard deviation is: 1.067708"
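The code that produced the results above is not shown. A minimal sketch that reproduces them with base R's binomial functions might look like this (the variable names are assumptions introduced here, not the author's):

```r
# Binomial model: n = 24 monthly trials, p = 0.05 chance of an inspection each month
n <- 24
p <- 0.05

p_exactly_2    <- dbinom(2, n, p)                      # P(X = 2)
p_2_or_more    <- pbinom(1, n, p, lower.tail = FALSE)  # P(X >= 2) = P(X > 1)
p_fewer_than_2 <- pbinom(1, n, p)                      # P(X < 2) = P(X <= 1)

expected_inspections <- n * p                 # mean of a binomial
sd_inspections <- sqrt(n * p * (1 - p))       # standard deviation of a binomial

sprintf("The probability of exactly 2 inspections is: %f", p_exactly_2)
sprintf("The probability of 2 or more inspections is: %f", p_2_or_more)
sprintf("The probability of fewer than 2 inspections is: %f", p_fewer_than_2)
sprintf("The expected number of inspections we should have received is: %f", expected_inspections)
sprintf("The standard deviation is: %f", sd_inspections)
```

Using `lower.tail = FALSE` avoids the subtraction `1 - pbinom(...)`, which is numerically equivalent here but can lose precision for very small tail probabilities.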
##Problem Set 3:
(Poisson). You are modeling the family practice clinic and notice that patients arrive at a rate of 10 per hour. What is the probability that exactly 3 arrive in one hour? What is the probability that more than 10 arrive in one hour? How many would you expect to arrive in 8 hours? What is the standard deviation of the appropriate probability distribution? If there are three family practice providers that can see 24 templated patients each day, what is the percent utilization and what are your recommendations?
#Lambda is the average rate, i.e. the average number of arrivals per hour. Here lambda = 10.
#What is the probability that exactly 3 arrive in one hour?
sprintf("The probability of exactly 3 patients arriving in one hour is: %f",dpois(3,10))
## [1] "The probability of exactly 3 patients arriving in one hour is: 0.007567"
#What is the probability that more than 10 arrive in one hour?
sprintf("The probability of more than 10 patients arriving in one hour is: %f",1-ppois(10,10))
## [1] "The probability of more than 10 patients arriving in one hour is: 0.416960"
#How many would you expect to arrive in 8 hours?
per_hour_arrival_rate <- 10
per_day_arrival_rate <- per_hour_arrival_rate * 8
sprintf("The expected number of patients arriving in 8 hours is: %f", per_day_arrival_rate)
## [1] "The expected number of patients arriving in 8 hours is: 80.000000"
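#What is the standard deviation of the appropriate probability distribution? For a Poisson distribution the standard deviation is sqrt(lambda); the hourly rate lambda = 10 is used here.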
sprintf("The standard deviation of the appropriate probability distribution is: %f", sqrt(10))
## [1] "The standard deviation of the appropriate probability distribution is: 3.162278"
#If there are three family practice providers that can see 24 templated patients each day, what is the percent utilization and what are your recommendations?
patients_seen_per_day <- 24 * 3 #Assuming each provider can see 24 patients/day.
per_day_arrival_rate <- 80 #Total number of patients arriving each day.
utilization_rate <- (per_day_arrival_rate / patients_seen_per_day) * 100
sprintf("The percent utilization rate is: %f", utilization_rate)
## [1] "The percent utilization rate is: 111.111111"
#The recommendations would be to either add more staff or increase the number of working hours for each doctor. Here, the patients seen per day is akin to the average service rate. Since the average service rate is less than the arrival rate, the queue has the potential to build up during the day, hence additional resources are needed.
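To put a number on the staffing recommendation, one could estimate how often daily arrivals exceed the templated capacity. The sketch below assumes an 8-hour clinic day with Poisson arrivals at 10 per hour (so a daily mean of 80) and uses a 95% coverage target; both the target and the variable names are illustrative assumptions, not part of the original problem.

```r
# Assumption: 8-hour day, Poisson arrivals at 10 per hour
lambda_day   <- 10 * 8
per_provider <- 24                 # templated patients per provider per day
capacity     <- per_provider * 3   # three providers

# Probability that more patients arrive in a day than can be seen
p_over_capacity <- ppois(capacity, lambda_day, lower.tail = FALSE)

# Smallest daily capacity that covers arrivals on at least 95% of days
# (the 95% target is an illustrative choice)
needed_capacity  <- qpois(0.95, lambda_day)
providers_needed <- ceiling(needed_capacity / per_provider)

sprintf("P(daily arrivals exceed %d): %f", capacity, p_over_capacity)
sprintf("Capacity needed for 95%% coverage: %.0f patients (%.0f providers)",
        needed_capacity, providers_needed)
```

A higher or lower coverage target would change `needed_capacity`, so the provider count should be read as a rough planning figure.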
##Problem Set 4:
(Hypergeometric). Your subordinate with 30 subordinates was recently accused of favoring nurses. 15 of the subordinate’s workers are nurses and 15 are other than nurses. As evidence of malfeasance, the accuser stated that there were 6 company-paid trips to Disney World for which everyone was eligible. The supervisor sent 5 nurses and 1 non-nurse. If your subordinate acted innocently, what was the probability he/she would have selected five nurses for the trips? How many nurses would we have expected your subordinate to send? How many non-nurses would we have expected your subordinate to send?
#If your subordinate acted innocently, what was the probability he/she would have selected five nurses for the trips?
x <- 5
m <- 15
n <- 15
k <- 6
calc1 <- dhyper(x,m,n,k,log=FALSE)
sprintf("The probability he/she would have selected five nurses for the trips is: %f", calc1)
## [1] "The probability he/she would have selected five nurses for the trips is: 0.075862"
#How many nurses would we have expected your subordinate to send?
calc2 <- (m * k) / (m + n)
sprintf("The number of nurses we would have expected the subordinate to send is: %d", calc2)
## [1] "The number of nurses we would have expected the subordinate to send is: 3"
#How many non-nurses would we have expected your subordinate to send?
sprintf("The number of non-nurses we would have expected the subordinate to send is: %d", k - calc2) #The remaining trips; this equals the expected number of nurses because the two groups are the same size.
## [1] "The number of non-nurses we would have expected the subordinate to send is: 3"
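When weighing the accusation, the probability of a result at least as extreme as the one observed is often more informative than the probability of exactly five nurses. A minimal sketch using base R's `phyper` is shown below; the variable names are introduced here, and the two-sided figure assumes that favoring either group would count as evidence.

```r
nurses     <- 15
non_nurses <- 15
trips      <- 6

# P(X >= 5) = P(X > 4): five or six nurses chosen by chance alone
p_at_least_5 <- phyper(4, nurses, non_nurses, trips, lower.tail = FALSE)

# Two-sided version: at least 5 nurses, or at most 1 nurse (at least 5 non-nurses)
p_either_extreme <- p_at_least_5 + phyper(1, nurses, non_nurses, trips)

sprintf("P(at least 5 nurses): %f", p_at_least_5)
sprintf("P(at least 5 from either group): %f", p_either_extreme)
```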
##Problem Set 5:
(Geometric). The probability of being seriously injured in a car crash in an unspecified location is about .1% per hour. A driver is required to traverse this area for 1200 hours in the course of a year. What is the probability that the driver will be seriously injured during the course of the year? In the course of 15 months? What is the expected number of hours that a driver will drive before being seriously injured? Given that a driver has driven 1200 hours, what is the probability that he or she will be injured in the next 100 hours?
#What is the probability that the driver will be seriously injured during the course of the year?
calc3 <- (1- (1 - 0.001)^1200)
sprintf("The probability that the driver will be seriously injured during the course of the year is: %f", calc3)
## [1] "The probability that the driver will be seriously injured during the course of the year is: 0.698987"
#What is the probability that the driver will be seriously injured in the course of 15 months?
calc4 <- (1- (1 - 0.001)^1500)
sprintf("The probability that the driver will be seriously injured in the course of 15 months is: %f", calc4)
## [1] "The probability that the driver will be seriously injured in the course of 15 months is: 0.777037"
#What is the expected number of hours that a driver will drive before being seriously injured?
calc5 <- 1/0.001
sprintf("The expected number of hours that a driver will drive before being seriously injured is: %d", calc5)
## [1] "The expected number of hours that a driver will drive before being seriously injured is: 1000"
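#Given that a driver has driven 1200 hours, what is the probability that he or she will be injured in the next 100 hours?
#The geometric distribution is memoryless, so the 1200 hours already driven do not change the answer. pgeom(q, p) counts the failures before the first success, so an injury within the next 100 hours corresponds to q = 99, i.e. 1 - (1 - 0.001)^100.
calc6 <- pgeom(99, 0.001)
sprintf("The probability that the driver will be injured in the next 100 hours is: %f", calc6)
## [1] "The probability that the driver will be injured in the next 100 hours is: 0.095208"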
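The last question relies on the memoryless property of the geometric distribution. The sketch below checks it numerically: the conditional probability of an injury in hours 1201 to 1300, given no injury in the first 1200 hours, should match the unconditional probability of an injury within 100 hours. The helper `p_injured_within` is a hypothetical name introduced for illustration.

```r
p <- 0.001

# P(injury within n hours) under the geometric model: 1 - (1 - p)^n
p_injured_within <- function(n_hours, p) 1 - (1 - p)^n_hours

# P(injured by hour 1300 | not injured in the first 1200 hours)
conditional <- (p_injured_within(1300, p) - p_injured_within(1200, p)) /
  (1 - p_injured_within(1200, p))

# P(injured within 100 hours), starting fresh
unconditional <- p_injured_within(100, p)

# Both routes agree, and both match pgeom with q = 99 failures before the injury
all.equal(conditional, unconditional)
all.equal(unconditional, pgeom(99, p))
```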
##Problem Set 6:
You are working in a hospital that is running off of a primary generator which fails about once in 1000 hours. What is the probability that the generator will fail more than twice in 1000 hours? What is the expected value?
The expected value is one failure per 1000 hours: \[ \lambda = 1 \]
#What is the probability that the generator will fail more than twice in 1000 hours?
calc7 <- 1 - ppois(2,1)
sprintf("The probability that the generator will fail more than twice in 1000 hours is: %f", calc7)
## [1] "The probability that the generator will fail more than twice in 1000 hours is: 0.080301"
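As a check on the `ppois` result, the same probability can be written out from the Poisson probability mass function with \(\lambda = 1\), where X denotes the number of failures in 1000 hours:

\[
P(X > 2) = 1 - e^{-1}\left(\frac{1^0}{0!} + \frac{1^1}{1!} + \frac{1^2}{2!}\right) = 1 - 2.5\,e^{-1} \approx 0.0803
\]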
##Problem Set 7:
A surgical patient arrives for surgery precisely at a given time. Based on previous analysis (or a lack of knowledge assumption), you know that the waiting time is uniformly distributed from 0 to 30 minutes. What is the probability that this patient will wait more than 10 minutes? If the patient has already waited 10 minutes, what is the probability that he/she will wait at least another 5 minutes prior to being seen? What is the expected waiting time?
#What is the probability that this patient will wait more than 10 minutes?
q <- 10
lower <- 0 #Named lower/upper to avoid masking base R's min() and max().
upper <- 30
calc8 <- 1 - punif(q, lower, upper)
sprintf("The probability that the patient will wait more than 10 minutes is: %f", calc8)
## [1] "The probability that the patient will wait more than 10 minutes is: 0.666667"
#If the patient has already waited 10 minutes, what is the probability that he/she will wait at least another 5 minutes prior to being seen?
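PA <- 1-punif(15,0,30) #P(wait > 15): the 10 minutes already waited plus at least 5 more.
PB <- 1-punif(10,0,30) #P(wait > 10).
calc9 <- PA/PB #Conditional probability P(wait > 15 | wait > 10).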
sprintf("The probability that the patient will wait at least another 5 minutes prior to being seen is: %f", calc9)
## [1] "The probability that the patient will wait at least another 5 minutes prior to being seen is: 0.750000"
#What is the expected waiting time?
calc10 <- 1/2*(0+30)
sprintf("The expected waiting time in minutes is: %d", calc10)
## [1] "The expected waiting time in minutes is: 15"
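The conditional question can be answered in two equivalent ways, sketched below with illustrative variable names. The first applies the definition of conditional probability to the original uniform distribution on 0 to 30 minutes; the second uses the fact that, given the patient has already waited 10 minutes, the total wait is uniform on 10 to 30 minutes.

```r
lower  <- 0
upper  <- 30
waited <- 10   # minutes already waited
extra  <- 5    # additional minutes of interest

# Route 1: P(X > 15) / P(X > 10) on the original U(0, 30)
route1 <- (1 - punif(waited + extra, lower, upper)) /
  (1 - punif(waited, lower, upper))

# Route 2: given X > 10, the total wait is uniform on (10, 30)
route2 <- 1 - punif(waited + extra, waited, upper)

# Expected additional wait once 10 minutes have passed
expected_remaining <- (upper - waited) / 2

all.equal(route1, route2)
```

Unlike the exponential and geometric distributions, the uniform distribution is not memoryless: the expected additional wait after 10 minutes is 10 minutes, not the original 15.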
##Problem Set 8:
Your hospital owns an old MRI, which has a manufacturer’s lifetime of about 10 years (expected value). Based on previous studies, we know that the failure of most MRI’s obeys an exponential distribution. What is the expected failure time? What is the standard deviation? What is the probability that your MRI will fail after 8 years? Now assume that you have owned the machine for 8 years. Given that you already owned the machine 8 years, what is the probability that it will fail in the next two years?
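What is the expected failure time? For an exponential distribution with rate lambda, the expected failure time is 1/lambda = 10 years, so lambda = 1/10 per year. What is the standard deviation? The standard deviation of an exponential distribution equals its mean, so it is also 10 years.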
#What is the probability that your MRI will fail after 8 years?
lambda <- 1/10
calc11 <- 1-(pexp(8, lambda))
sprintf("The probability that the MRI will fail after 8 years is: %f", calc11)
## [1] "The probability that the MRI will fail after 8 years is: 0.449329"
#Given that you already owned the machine 8 years, what is the probability that it will fail in the next two years?
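calc12 <- pexp(2,lambda) #The exponential distribution is memoryless, so the 8 years already owned do not change the answer.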
sprintf("The probability that the machine will fail in the next 2 years is: %f", calc12)
## [1] "The probability that the machine will fail in the next 2 years is: 0.181269"
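The use of `pexp(2, lambda)` for the conditional question follows from the memoryless property of the exponential distribution. Writing T for the failure time in years (a symbol introduced here) with \(\lambda = 1/10\), the conditional probability works out as:

\[
P(T \le 10 \mid T > 8) = \frac{P(8 < T \le 10)}{P(T > 8)} = \frac{e^{-0.8} - e^{-1.0}}{e^{-0.8}} = 1 - e^{-0.2} \approx 0.1813
\]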