(Bayesian). A new test for multinucleoside-resistant (MNR) human immunodeficiency virus type 1 (HIV-1) variants was recently developed. The test maintains 96% sensitivity, meaning that, for those with the disease, it will correctly report “positive” for 96% of them. The test is also 98% specific, meaning that, for those without the disease, 98% will be correctly reported as “negative.” MNR HIV-1 is considered to be rare (albeit emerging), with about a 0.1% or .001 prevalence rate.
Given the prevalence rate, sensitivity, and specificity estimates, what is the probability that an individual who is reported as positive by the new test actually has the disease?
Setting up the Bayesian probability function:
\[P(D|Pos) = \frac{P(Pos|D)\times P(D)}{P(Pos|D)\times P(D) + P(Pos|NoD)\times P(NoD)}\]
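\[P(D|Pos) = \frac{0.96\times 0.001}{0.96\times 0.001 + 0.02\times 0.999} = \frac{0.00096}{0.00096 + 0.01998} \approx 0.046\]
That is, only about 4.6% of individuals who test positive actually have the disease, because false positives from the large disease-free population far outnumber true positives.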
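The same calculation can be wrapped in a small R function, which makes it easy to see how the answer moves with prevalence. This is a minimal sketch; the function name bayes_ppv and its argument names are illustrative and not part of the original problem.

```r
# Posterior probability of disease given a positive test (Bayes' rule).
# bayes_ppv and its argument names are hypothetical, chosen for illustration.
bayes_ppv <- function(prevalence, sensitivity, specificity) {
  true_pos  <- sensitivity * prevalence              # P(Pos|D) * P(D)
  false_pos <- (1 - specificity) * (1 - prevalence)  # P(Pos|NoD) * P(NoD)
  true_pos / (true_pos + false_pos)
}

ppv <- bayes_ppv(prevalence = 0.001, sensitivity = 0.96, specificity = 0.98)
print(round(ppv, 3))  # about 0.046
```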
If the median cost (consider this the best point estimate) is about $100,000 per positive case total and the test itself costs $1000 per administration, what is the total first-year cost for treating 100,000 individuals?
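One way to approach the cost question is sketched below. It rests on assumptions that are not stated in the problem: all 100,000 individuals are tested exactly once, every positive test result (true or false) incurs the $100,000 treatment cost, and expected counts are used in place of random outcomes. All variable names are illustrative.

```r
# Expected first-year cost under the assumptions named in the lead-in.
n_people      <- 100000
cost_per_test <- 1000
cost_per_case <- 100000

# Probability that a randomly chosen individual tests positive
p_positive <- 0.96 * 0.001 + 0.02 * 0.999        # true + false positives

expected_positives <- n_people * p_positive      # about 2094 people
testing_cost       <- n_people * cost_per_test   # $100 million
treatment_cost     <- expected_positives * cost_per_case
total_cost         <- testing_cost + treatment_cost

print(total_cost)  # about 309.4 million dollars
```

If only true positives were treated instead (an equally plausible reading), expected_positives would be 100000 * 0.001 * 0.96 = 96 and the treatment cost would shrink accordingly.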
(Binomial). The probability of your organization receiving a Joint Commission inspection in any given month is .05.
What is the probability that, after 24 months, you received exactly 2 inspections?
Setting up the Binomial probability function,
\[P(X=k) = \binom{n}{k}p^k(1-p)^{n-k}\]
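Where \(n = 24\) months, \(k = 2\) inspections, and \(p = 0.05\) (the monthly probability given in the problem):
\[P(X=2) = \frac{24!}{2! \times 22!} \times 0.05^2(1-0.05)^{24-2}\] \[ = \frac{24 \times 23}{2} \times 0.0025 \times 0.95^{22}\]
Which comes to approximately 0.2232, or about 22.3%.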
binomial_fn <- function(p,n,k){
b_coef = factorial(n)/(factorial(k) * factorial(n-k))
return(b_coef * p^k * ((1-p)^(n-k)))
}
What is the probability that, after 24 months, you received 2 or more inspections?
About 33.9%. Solved by subtracting the sum of P(0) and P(1) from 1 (the total probability of all possible outcomes): \(1 - (0.2920 + 0.3688) \approx 0.339\), using \(p = 0.05\).
What is the probability that you received fewer than 2 inspections?
About 66.1%. Solved by summing P(0) and P(1) with \(p = 0.05\): \(0.2920 + 0.3688 \approx 0.661\).
What is the expected number of inspections you should have received?
\(E(X) = n \times p = 24 \times 0.05 = 1.2\)
What is the standard deviation?
\(\sigma = \sqrt{np(1-p)} = \sqrt{24 \times 0.05 \times 0.95} = \sqrt{1.14} \approx 1.07\)
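As a cross-check, the binomial answers can be recomputed with R's built-in dbinom and pbinom, using the monthly probability of .05 given in the problem statement. This is a sketch; the variable names are illustrative.

```r
n <- 24     # months observed
p <- 0.05   # monthly inspection probability, as stated in the problem

p_exactly_2    <- dbinom(2, size = n, prob = p)   # P(X = 2)
p_fewer_than_2 <- pbinom(1, size = n, prob = p)   # P(X <= 1) = P(0) + P(1)
p_2_or_more    <- 1 - p_fewer_than_2              # P(X >= 2)
expected       <- n * p                           # mean of the distribution
std_dev        <- sqrt(n * p * (1 - p))           # standard deviation

print(round(c(p_exactly_2, p_2_or_more, p_fewer_than_2, expected, std_dev), 4))
```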
(Poisson). You are modeling the family practice clinic and notice that patients arrive at a rate of 10 per hour.
Setting up the Poisson probability function,
\[P(X=k) = \frac{\lambda^k e^{-\lambda}}{k!}\]
# Poisson probability mass function: l is the rate (lambda), k is the count
poisson_fn <- function(l,k){
return(l^k * exp(-l) / factorial(k))
}
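Where \(\lambda = 10\) is the mean number of arrivals per hour (l in the function above) and \(k\) is the number of arrivals.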
What is the probability that exactly 3 arrive in one hour?
About 0.76%.
\[P(3) = \frac{10^3 e^{-10}}{3!} = \frac{1000 \times 4.54 \times 10^{-5}}{6} \approx 0.00757\]
What is the probability that more than 10 arrive in one hour?
Solve by summing the probabilities P(0) through P(10), then subtracting the sum from 1 (the total probability of all possible outcomes).
p = 0
for (k in seq(0,10)){
p = p + poisson_fn(10,k)
}
1 - p
\(P(X>10) \approx\) 0.4169602 or 41.7%
How many would you expect to arrive in 8 hours?
Our hourly expected value (lambda) is 10, so we would expect 80 visitors in 8 hours.
What is the standard deviation of the appropriate probability distribution?
The standard deviation of a Poisson distribution is the square root of \(\lambda\): \(\sqrt{10} \approx 3.16\) arrivals per hour.
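The Poisson answers above can also be verified with R's built-in dpois and ppois rather than a hand-written function. A minimal sketch, with illustrative variable names:

```r
lambda <- 10   # mean arrivals per hour

p_exactly_3    <- dpois(3, lambda)                        # P(X = 3)
p_more_than_10 <- ppois(10, lambda, lower.tail = FALSE)   # P(X > 10)
expected_8h    <- 8 * lambda                              # expected arrivals in 8 hours
sd_hourly      <- sqrt(lambda)                            # standard deviation per hour

print(round(c(p_exactly_3, p_more_than_10, expected_8h, sd_hourly), 4))
```

Setting lower.tail = FALSE returns the upper tail \(P(X > 10)\) directly, which avoids the explicit subtraction from 1.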
If there are three family practice providers that can see 24 templated patients each day, what is the percent utilization and what are your recommendations?
Between our three practitioners, we have the capacity to see \(3 \times 24 = 72\) patients per day. With 80 expected arrivals in an 8-hour day, utilization is \(80/72 \approx 111\%\): we are about 8 patients (11%) over capacity and may need to consider extending hours or adding medical staff.
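The utilization arithmetic can be sketched in R as follows. The last line goes one step further and estimates how often daily demand would exceed capacity; that step assumes, beyond what the problem states, that total arrivals over an 8-hour day follow a Poisson distribution with mean 80. Variable names are illustrative.

```r
providers          <- 3
slots_per_provider <- 24   # templated patients per provider per day
hours_open         <- 8    # length of the clinic day used in the text
arrival_rate       <- 10   # patients per hour

capacity    <- providers * slots_per_provider   # 72 patients per day
demand      <- arrival_rate * hours_open        # 80 expected patients per day
utilization <- demand / capacity                # ratio of demand to capacity

# Assumption: daily arrivals ~ Poisson(80); probability demand exceeds capacity
p_overflow <- ppois(capacity, lambda = demand, lower.tail = FALSE)

print(round(100 * utilization, 1))  # 111.1 (percent)
```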
(Hypergeometric). Your subordinate, who supervises 30 workers, was recently accused of favoring nurses. 15 of the subordinate’s workers are nurses and 15 are not. As evidence of malfeasance, the accuser stated that there were 6 company-paid trips to Disney World for which everyone was eligible. The subordinate sent 5 nurses and 1 non-nurse.
Setting up the Hypergeometric probability function,
\[P(X=x) = \frac{\binom{K}{x}\binom{N-K}{n-x}}{\binom{N}{n}}\]
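# Note the argument names differ from the formula: n = population size (N),
# s = sample size (n), k = successes in population (K), x = successes in sample
hgeom_fn <- function(n,s,k,x){
return((factorial(k) * factorial(s) * factorial(n-k) * factorial(n-s)) / (factorial(x) * factorial(n) * factorial(k-x) * factorial(s-x) * factorial(n-k-s+x)))
}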
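Where \(N\) is the population size, \(K\) is the number of successes in the population, \(n\) is the sample size, and \(x\) is the number of successes in the sample.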
If your subordinate acted innocently, what was the probability he/she would have selected five nurses for the trips?
In this example our population size is \(N=30\), the number of successes (nurses) in the population is \(K=15\), the sample size is \(n=6\), and the target value is \(x=5\). The probability of obtaining this result is 0.0758621 or \(\approx\) 7.6%. Unlikely, but not impossible.
How many nurses would we have expected your subordinate to send? How many non-nurses would we have expected your subordinate to send?
The Mean or Expected Value of this example is \(\mu = \frac{nK}{N} = \frac{6 \times 15}{30} = 3\); we would have expected three nurses and three non-nurses to have been sent.
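The hypergeometric result can be reproduced with choose and cross-checked against R's built-in dhyper, which avoids large factorials. This sketch also computes the probability of sending at least five nurses, an extension of the question asked rather than part of it. Variable names are illustrative.

```r
pop_size    <- 30   # N: workers supervised
n_nurses    <- 15   # K: nurses in the population
n_trips     <- 6    # n: trips awarded
nurses_sent <- 5    # x: nurses selected

# Direct use of the formula with binomial coefficients
p_manual <- choose(n_nurses, nurses_sent) *
  choose(pop_size - n_nurses, n_trips - nurses_sent) /
  choose(pop_size, n_trips)

# dhyper(x, m, n, k): m successes and n failures in the population, k draws
p_builtin <- dhyper(nurses_sent, m = n_nurses, n = pop_size - n_nurses, k = n_trips)

# P(X >= 5), i.e. five or six nurses (extension beyond the original question)
p_at_least_5 <- phyper(nurses_sent - 1, m = n_nurses, n = pop_size - n_nurses,
                       k = n_trips, lower.tail = FALSE)

expected_nurses <- n_trips * n_nurses / pop_size

print(round(c(p_manual, p_builtin, p_at_least_5, expected_nurses), 4))
```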
(Geometric). The probability of being seriously injured in a car crash in an unspecified location is about 0.1% per hour. A driver is required to traverse this area for 1200 hours in the course of a year.
What is the probability that the driver will be seriously injured during the course of the year?
Setting up the Geometric probability function,
\[P(X=n) = (1-p)^{n-1}p\]
geom_fn <- function(n,p){
return((1-p)^(n-1) * p)
}
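Where \(p = 0.001\) is the hourly probability of serious injury and \(n\) is the hour in which the first injury occurs.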
The Geometric distribution models the probability of observing the first success on the \(n\)th trial. To estimate the probability of success at any point in the first 1200 trials, we add each individual probability:
r = 0
for (n in seq(1,1200)){
r = r + geom_fn(n,.001)
}
The probability of serious injury is 0.6989866 or approx. 69.9%.
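The loop above is a finite geometric series, so the same figure can be checked in closed form (here \(N\) denotes the number of hours driven, a symbol introduced for this check):
\[\sum_{n=1}^{N}(1-p)^{n-1}p = p \times \frac{1-(1-p)^{N}}{1-(1-p)} = 1-(1-p)^{N}\]
With \(p = 0.001\) and \(N = 1200\), this gives \(1 - 0.999^{1200} \approx 0.699\), which is the complement of driving all 1200 hours without injury.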
In the course of 15 months?
r = 0
for (n in seq(1,1500)){
r = r + geom_fn(n,.001)
}
Assuming our driver travels this route 100 hours per month (1200 hours per year), 15 months corresponds to 1500 trials, and the probability of serious injury is 0.7770372 or approx. 77.7%.
What is the expected number of hours that a driver will drive before being seriously injured?
The expected value is \(\mu = \frac{1}{p} = \frac{1}{0.001}\), or 1000 hours.
Given that a driver has driven 1200 hours, what is the probability that he or she will be injured in the next 100 hours?
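Because the geometric distribution is memoryless (each hour is an independent trial), the 1200 injury-free hours already driven do not change the risk going forward, so the probability of injury over the next 100 hours is calculated as if starting fresh: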
r = 0
for (n in seq(1,100)){
r = r + geom_fn(n,.001)
}
The probability of serious injury in the next 100 hours is 0.0952079 or approx. 9.5%.
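The three cumulative answers, and the memoryless property behind the last one, can be checked with R's built-in pgeom. Note that pgeom counts failures before the first success, so the probability of an injury within \(n\) hours is pgeom(n - 1, p). This is a sketch; the helper p_within is hypothetical and not part of the original solution.

```r
p <- 0.001   # hourly probability of serious injury

# pgeom(q, prob) = P(at most q failures before the first success),
# so injury within `hours` trials is pgeom(hours - 1, prob) = 1 - (1 - p)^hours
p_within <- function(hours, prob) pgeom(hours - 1, prob = prob)

p_year    <- p_within(1200, p)   # one year (1200 hours)
p_15mo    <- p_within(1500, p)   # fifteen months (1500 hours)
p_next100 <- p_within(100, p)    # any fresh 100-hour window

# Conditional check: P(injury in hours 1201..1300 | no injury in first 1200)
p_conditional <- (p_within(1300, p) - p_within(1200, p)) / (1 - p_within(1200, p))

print(round(c(p_year, p_15mo, p_next100, p_conditional), 4))
```

The conditional probability equals the unconditional 100-hour probability, which is what memorylessness means in practice.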