Data 605 HW 5

Question 1

(Bayesian). A new test for multinucleoside-resistant (MNR) human immunodeficiency virus type 1 (HIV-1) variants was recently developed. The test maintains 96% sensitivity, meaning that, for those with the disease, it will correctly report “positive” for 96% of them. The test is also 98% specific, meaning that, for those without the disease, 98% will be correctly reported as “negative.” MNR HIV-1 is considered to be rare (albeit emerging), with about a .1% or .001 prevalence rate.

Given the prevalence rate, sensitivity, and specificity estimates, what is the probability that an individual who is reported as positive by the new test actually has the disease?

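Using Bayes' theorem: $P(A|B) = \frac{P(B|A)P(A)}{P(B)}$
A = the individual has the disease and B = the test reports “positive”
So P(B|A) = .96 (the sensitivity) and P(A) = .001 (the prevalence)
P(B) is the overall probability of a reported “positive”, which combines two cases: a true positive, i.e. reported “positive” given the disease (.96) times having the disease (.001), and a false positive, i.e. reported “positive” given no disease (1-.98) times not having the disease (1-.001).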

ans <- (.96*.001)/((.96*.001)+(.02*.999))
cat("The probability that an individual who is reported as positive by the new test actually has the disease:", ans*100,"%")

## The probability that an individual who is reported as positive by the new test actually has the disease: 4.584527 %
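
As a sanity check on the Bayes calculation, the same posterior can be reached by counting people in a hypothetical cohort. This is only a sketch: the cohort size of 1,000,000 is an arbitrary assumption, and the variable names are introduced here for illustration.

```r
# Hypothetical cohort; the size is an arbitrary assumption for illustration.
cohort      <- 1e6
prevalence  <- 0.001
sensitivity <- 0.96
specificity <- 0.98

diseased <- cohort * prevalence            # people who have the disease
healthy  <- cohort - diseased              # people who do not

true_pos  <- diseased * sensitivity        # diseased and reported positive
false_pos <- healthy * (1 - specificity)   # healthy but reported positive

# Share of reported positives who actually have the disease
ppv <- true_pos / (true_pos + false_pos)
cat("Posterior from cohort counts:", ppv * 100, "%\n")
```

Counting this way makes it visible that false positives (19,980) far outnumber true positives (960), which is why the posterior stays below 5% despite the accurate test.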

If the median cost (consider this the best point estimate) is about $100,000 per positive case total and the test itself costs $1000 per administration, what is the total first-year cost for treating 100,000 individuals?

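Each test costs $1000, so testing everyone costs 1000*100,000.
To this we add the $100,000 cost per positive case times the number of positive cases among the 100,000 individuals.
Assuming every individual who is reported “positive” is treated, the number of positive cases is 100,000 times the probability of a reported “positive”, P(B) = (.96*.001)+(.02*.999) = .02094, or about 2,094 cases. The ~4.58% found above is the share of those positives who actually have the disease, not the share of individuals who test positive.

p_positive <- (.96*.001)+(.02*.999) #probability of a reported "positive"
cost <- (1000*100000)+(100000*(100000*p_positive))
cat("The total first-year cost for treating 100,000 individuals: $", cost)

## The total first-year cost for treating 100,000 individuals: $ 309400000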

Question 2

(Binomial). The probability of your organization receiving a Joint Commission inspection in any given month is .05. What is the probability that, after 24 months, you received exactly 2 inspections?

Using the Binomial formula: $P(k) = \binom{n}{k} p^{k} q^{n-k}$
n = 24 months
k = # of inspections
p = probability of an inspection or .05
q = probability of no inspection or 1-.05

For exactly 2 inspections or k=2:

inspections_2 <- dbinom(2, size = 24, prob = .05)
cat("The probability that, after 24 months, you received exactly 2 inspections: ", inspections_2*100, "%")

## The probability that, after 24 months, you received exactly 2 inspections:  22.32381 %

What is the probability that, after 24 months, you received 2 or more inspections?

To find P(k >= 2), it is easier to find the probability of the outcomes outside this range and subtract it from 1 (calculating for k = 0 and 1 rather than for k = 2 through 24).

inspections_2more <- 1 - (dbinom(0, size = 24, prob = .05)+dbinom(1, size = 24, prob = .05))
cat("The probability that, after 24 months, you received 2 or more inspections: ", inspections_2more*100, "%")

## The probability that, after 24 months, you received 2 or more inspections:  33.91827 %

What is the probability that you received fewer than 2 inspections?
Since k has to be less than 2, we only need two probabilities: k = 0 and k = 1.

inspections_2less <- (dbinom(0, size = 24, prob = .05)+dbinom(1, size = 24, prob = .05))
cat("The probability that, after 24 months, you received fewer than 2 inspections: ", inspections_2less*100, "%")

## The probability that, after 24 months, you received fewer than 2 inspections:  66.08173 %
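
The two cumulative answers above can also be obtained from R's cumulative binomial function instead of summing individual `dbinom` terms. A minimal sketch, assuming the same n = 24 and p = .05; the variable names are illustrative only.

```r
n <- 24     # months
p <- 0.05   # monthly inspection probability

# pbinom(q, ...) returns P(X <= q); lower.tail = FALSE returns P(X > q)
fewer_than_2 <- pbinom(1, size = n, prob = p)                      # P(X <= 1)
at_least_2   <- pbinom(1, size = n, prob = p, lower.tail = FALSE)  # P(X > 1) = P(X >= 2)

cat("Fewer than 2 inspections:", fewer_than_2 * 100, "%\n")
cat("2 or more inspections:   ", at_least_2 * 100, "%\n")
```

The two values are complements, so they sum to 1.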

What is the expected number of inspections you should have received?

inspections_expected <- 24 * .05 #months * probability
cat("The expected number of inspections you should have received: ", inspections_expected)

## The expected number of inspections you should have received:  1.2

What is the standard deviation?
$SD=\sqrt{np(1-p)}$

inspections_sd <- sqrt(24*.05*(1-.05))
cat("The standard deviation: ", inspections_sd)

## The standard deviation:  1.067708

Question 3

(Poisson). You are modeling the family practice clinic and notice that patients arrive at a rate of 10 per hour. What is the probability that exactly 3 arrive in one hour?

Using the Poisson formula: $P(x) = \frac{\lambda^x e^{-\lambda}}{x!}$
λ = 10, the average arrival rate per hour
x = 3, the number of arrivals we want

patients_3 <- (10^3*exp(-10))/(factorial(3))
cat("The probability that exactly 3 arrive in one hour: ", patients_3*100, "%")

## The probability that exactly 3 arrive in one hour:  0.7566655 %

What is the probability that more than 10 arrive in one hour?
We can sum the probabilities for x = 0 through 10 and subtract the total from 1.

patients_010 <- 0
for (i in 0:10){ patients_010 = patients_010 + (10^i*exp(-10))/(factorial(i)) }
patients_10more <- 1- patients_010
cat("The probability that more than 10 arrive in one hour: ", patients_10more*100, "%")

## The probability that more than 10 arrive in one hour:  41.69602 %
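
The hand-written Poisson sums above can be cross-checked against R's built-in Poisson functions. A short sketch, assuming the same rate of 10 per hour; the variable names are illustrative.

```r
lambda <- 10  # average arrivals per hour

# dpois gives P(X = x); ppois with lower.tail = FALSE gives P(X > q)
exactly_3    <- dpois(3, lambda)
more_than_10 <- ppois(10, lambda, lower.tail = FALSE)

cat("Exactly 3 arrivals:   ", exactly_3 * 100, "%\n")
cat("More than 10 arrivals:", more_than_10 * 100, "%\n")
```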

How many would you expect to arrive in 8 hours?

patients_expected <- 8 * 10 #hours * 10 per hour was the given rate 
cat("The expected number to arrive in 8 hours: ", patients_expected)

## The expected number to arrive in 8 hours:  80

What is the standard deviation of the appropriate probability distribution?
$SD=\sqrt{\lambda}$, using the hourly arrival distribution with λ = 10

patients_sd <- sqrt(10)
cat("The standard deviation: ", patients_sd)

## The standard deviation:  3.162278

If there are three family practice providers that can see 24 templated patients each day, what is the percent utilization and what are your recommendations?

Assuming a day is 8 hours,

patients_percent <- (10*8)/(24*3) #(patients per hour * hours) / (patients per provider per day * 3 providers)
cat("The percent utilization: ", patients_percent*100, "%")

## The percent utilization:  111.1111 %

Expected demand is 80 patients per day against a templated capacity of 72, so the three providers are over capacity. The clinic will need more providers, or each provider will need to see more patients per day, to bring utilization below 100%.
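
To compare staffing options, utilization can be recomputed for different provider counts. This sketch keeps the 8-hour day assumed above; the function name `utilization` and the four-provider scenario are hypothetical and not part of the question.

```r
# Utilization = expected daily demand / templated daily capacity
utilization <- function(providers, rate_per_hour = 10, hours = 8, slots_per_provider = 24) {
  (rate_per_hour * hours) / (slots_per_provider * providers)
}

for (providers in 3:4) {
  cat(providers, "providers:", round(utilization(providers) * 100, 2), "% utilization\n")
}
```

Under these assumptions, a fourth provider would bring utilization to roughly 83%, leaving some slack for the hour-to-hour variability in arrivals.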

Question 4

(Hypergeometric). Your subordinate with 30 supervisors was recently accused of favoring nurses. 15 of the subordinate’s workers are nurses and 15 are other than nurses. As evidence of malfeasance, the accuser stated that there were 6 company-paid trips to Disney World for which everyone was eligible. The supervisor sent 5 nurses and 1 non-nurse. If your subordinate acted innocently, what was the probability he/she would have selected five nurses for the trips?

For the hypergeometric distribution, R provides the *hyper family of functions.
dhyper(x,m,n,k)
x = 5 nurses chosen
m = 15 nurses total
n = 15 non-nurses total
k = 6 workers chosen in total

nurses_5 <- dhyper(5,15,15,6)
cat("The probability he/she would have selected five nurses for the trips: ", nurses_5*100, "%")

## The probability he/she would have selected five nurses for the trips:  7.586207 %

How many nurses would we have expected your subordinate to send? How many non-nurses would we have expected your subordinate to send?
Since there are 15 nurses and 15 non-nurses, the selections should follow the 1:1 ratio on average (50% nurses, 50% non-nurses). So for the 6 slots, we would expect 3 nurses and 3 non-nurses.
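
The same expectation follows from the hypergeometric mean, k*m/(m+n), and can be confirmed by weighting each possible count by its probability. A small sketch using the parameters defined above; the variable names are illustrative.

```r
m <- 15  # nurses
n <- 15  # non-nurses
k <- 6   # trips awarded

# Closed-form mean of the hypergeometric distribution
expected_nurses <- k * m / (m + n)

# Same value from the probability mass function: sum of x * P(X = x)
x <- 0:k
expected_from_pmf <- sum(x * dhyper(x, m, n, k))

cat("Expected nurses:    ", expected_nurses, "\n")
cat("Expected non-nurses:", k - expected_nurses, "\n")
```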

Question 5

(Geometric). The probability of being seriously injured in a car crash in an unspecified location is about .1% per hour. A driver is required to traverse this area for 1200 hours in the course of a year. What is the probability that the driver will be seriously injured during the course of the year?

For the geometric distribution, R provides the *geom family of functions.
pgeom(q,prob) gives the probability of at most q failures before the first success, so an injury within x hours corresponds to at most x-1 injury-free hours.
x = 1200 hours, so q = 1199
prob = .1% or .001

injured_1200 <- pgeom(1199,.001)
cat("The probability that the driver will be seriously injured during the course of the year: ", injured_1200*100, "%")

## The probability that the driver will be seriously injured during the course of the year:  69.89866 %

In the course of 15 months?
1200 hours per year = 100 hours per month, so 15 months = 1500 hours and q = 1499

injured_15 <- pgeom(1499,.001)
cat("The probability that the driver will be seriously injured during the course of 15 months: ", injured_15*100, "%")

## The probability that the driver will be seriously injured during the course of 15 months:  77.70372 %

What is the expected number of hours that a driver will drive before being seriously injured?

injured_expected <- 1/.001
cat("The expected number of hours that a driver will drive before being seriously injured: ", injured_expected, "hours")

## The expected number of hours that a driver will drive before being seriously injured:  1000 hours
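
The 1/p shortcut can be checked by summing the series for the expected value directly. This is a sketch only: the truncation at 50,000 hours is an assumption made here, chosen because the probability mass beyond that point is negligible for p = .001.

```r
p <- 0.001        # hourly injury probability from the question
hours <- 1:50000  # truncation point is an assumption; the tail beyond it is negligible

# dgeom(k - 1, p) is the probability that the first injury occurs in hour k
expected_hours <- sum(hours * dgeom(hours - 1, p))
cat("Expected hours from the truncated series:", expected_hours, "\n")
```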

Given that a driver has driven 1200 hours, what is the probability that he or she will be injured in the next 100 hours?

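The geometric distribution is memoryless, so the 1200 hours already driven do not change the outlook. The probability of an injury in the next 100 hours is the same as for any 100 hours: 1 - (1-.001)^100, or pgeom(99,.001).

injured_next100 <- pgeom(99,.001)
cat("The probability that he or she will be injured in the next 100 hours: ", injured_next100*100, "%")

## The probability that he or she will be injured in the next 100 hours:  9.520785 %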

Question 6

You are working in a hospital that is running off of a primary generator which fails about once in 1000 hours. What is the probability that the generator will fail more than twice in 1000 hours?

Using the Binomial formula, I can calculate the probability of the generator failing 0, 1, and 2 times; 1 minus their sum gives the probability of more than two failures.

$P(k) = \binom{n}{k} p^{k} q^{n-k}$
n = 1000 hours
k = # of generator failures or 0, 1 & 2
p = probability of failure in any given hour or 1 in 1000 -> .001
q = probability of no failure or 1-.001 -> .999

generator_fail <- 1 - (dbinom(0, size = 1000, prob = .001)+dbinom(1, size = 1000, prob = .001)+dbinom(2, size = 1000, prob = .001))
cat("The probability that the generator will fail more than twice in 1000 hours: ", generator_fail*100, "%")

## The probability that the generator will fail more than twice in 1000 hours:  8.020934 %

What is the expected value?

generator_expected <- .001 * 1000 #the probability of failure * hours
cat("The expected value: ", generator_expected)

## The expected value:  1
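
With n this large and p this small, the binomial model is very close to a Poisson distribution with λ = np = 1. The sketch below compares the two for small failure counts; the choice to compare k = 0 through 3 is arbitrary and made here only for illustration.

```r
n <- 1000
p <- 0.001
lambda <- n * p  # expected failures in 1000 hours

k <- 0:3
comparison <- data.frame(
  failures = k,
  binomial = dbinom(k, size = n, prob = p),
  poisson  = dpois(k, lambda)
)
print(comparison)

# Largest absolute gap between the two models over these counts
max_gap <- max(abs(comparison$binomial - comparison$poisson))
cat("Largest difference:", max_gap, "\n")
```

The probabilities agree to about three decimal places, so either model would be a reasonable choice for this question.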

Question 7

A surgical patient arrives for surgery precisely at a given time. Based on previous analysis (or a lack of knowledge assumption), you know that the waiting time is uniformly distributed from 0 to 30 minutes. What is the probability that this patient will wait more than 10 minutes?

The waiting time is uniform on 0 to 30 minutes, so the probability of any interval is its length divided by 30. In R, punif(q,min,max) gives P(X <= q), and 1 minus that gives the probability of waiting longer.
$P(X > x) = \frac{30-x}{30}$
min = 0 and max = 30, the two extreme waiting times
x = 10

waiting_10more <- 1 - punif(10, min = 0, max = 30)
cat("The probability that this patient will wait more than 10 minutes: ", waiting_10more*100, "%")

## The probability that this patient will wait more than 10 minutes:  66.66667 %

If the patient has already waited 10 minutes, what is the probability that he/she will wait at least another 5 minutes prior to being seen?

Waiting at least another 5 minutes means waiting more than 15 minutes in total, so we condition on the 10 minutes already waited: P(X > 15 | X > 10) = P(X > 15) / P(X > 10).

waiting_15more <- (1 - punif(15, min = 0, max = 30)) / (1 - punif(10, min = 0, max = 30))
cat("The probability that this patient will wait at least another 5 minutes prior to being seen, given 10 minutes already waited: ", waiting_15more*100, "%")

## The probability that this patient will wait at least another 5 minutes prior to being seen, given 10 minutes already waited:  75 %

What is the expected waiting time?

waiting_expected <- (0 + 30)/2 #the mean of our two extreme waiting times
cat("The expected waiting time: ", waiting_expected)

## The expected waiting time:  15

Question 8

Your hospital owns an old MRI, which has a manufacturer’s lifetime of about 10 years (expected value). Based on previous studies, we know that the failure of most MRIs obeys an exponential distribution. What is the expected failure time?

The expected failure time is simply the manufacturer’s lifetime of 10 years, since that figure is given as the expected value.
For an exponential distribution with rate λ, the expected value is 1/λ.
λ = 1/10 failures per year

failure_expected <- 1/(1/10)
cat("The expected failure time: ", failure_expected, "years")

## The expected failure time:  10 years
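
R's exponential functions are parameterised by a rate, which is the reciprocal of the mean lifetime. As a rough check of that relationship, the mean can be recovered by numerically integrating t times the density; the variable names below are illustrative assumptions.

```r
mean_life <- 10          # manufacturer's expected lifetime in years
rate <- 1 / mean_life    # rate parameter expected by dexp, pexp, qexp and rexp

# E[T] is the integral of t * f(t) over 0 to infinity
mean_numeric <- integrate(function(t) t * dexp(t, rate = rate), lower = 0, upper = Inf)$value
cat("Mean lifetime from numerical integration:", mean_numeric, "years\n")
```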

What is the standard deviation? For an exponential distribution, $SD=\frac{1}{\lambda}$, the same as the expected value.

mri_sd <- 1/(1/10)
cat("The standard deviation: ", mri_sd)

## The standard deviation:  10

What is the probability that your MRI will fail after 8 years?

Using the exponential distribution, the probability of lasting beyond x years is 1 minus the cumulative probability of failing by year x. In R, pexp(q,rate) gives P(X <= q).
$P(X > x) = e^{-\lambda x}$
λ = 1/10 per year
x = 8 years

fail_8more <- 1 - pexp(8, rate = 1/10)
cat("The probability that your MRI will fail after 8 years: ", fail_8more*100, "%")

## The probability that your MRI will fail after 8 years:  44.9329 %

Now assume that you have owned the machine for 8 years. Given that you already owned the machine 8 years, what is the probability that it will fail in the next two years?

The exponential distribution is memoryless, so the 8 years already owned do not change the outlook. The probability of failure in the next two years is the same as for any two years: $P(X \le 2) = 1 - e^{-2\lambda}$

fail_8more2 <- pexp(2, rate = 1/10)
cat("The probability that your MRI will fail in the next two years, given 8 years of ownership: ", fail_8more2*100, "%")

## The probability that your MRI will fail in the next two years, given 8 years of ownership:  18.12692 %