605 HW5-Probability Distributions

Problem 1.

(Bayesian). A new test for multinucleoside-resistant (MNR) human immunodeficiency virus type 1 (HIV-1) variants was recently developed. The test maintains 96% sensitivity, meaning that, for those with the disease, it will correctly report “positive” for 96% of them. The test is also 98% specific, meaning that, for those without the disease, 98% will be correctly reported as “negative.” MNR HIV-1 is considered to be rare (albeit emerging), with about a .1% or .001 prevalence rate. Given the prevalence rate, sensitivity, and specificity estimates, what is the probability that an individual who is reported as positive by the new test actually has the disease? If the median cost (consider this the best point estimate) is about $100,000 per positive case total and the test itself costs $1000 per administration, what is the total first-year cost for treating 100,000 individuals?

According to Bayes' rule for diagnostic testing, the probability that an individual has the disease given a positive test result, $P(D \mid P)$, is \[ P(D \mid P)=\frac{P(P \mid D) \cdot P(D)}{P(P)} \] where \[ \begin{aligned} & P(P \mid D)=\text { Sensitivity of the test }=96 \% \text { or } 0.96, \\ & P(D)=\text { Prevalence rate }=0.1 \% \text { or } 0.001, \\ & P(P)=\text { Probability of a positive test result. } \end{aligned} \] To find $P(P)$, apply the law of total probability: \[ P(P)=P(P \mid D) \cdot P(D)+P(P \mid NoD) \cdot P(NoD) \] Here $P(P \mid NoD)$ is the probability of a positive test result given no disease (the false-positive rate) and $P(NoD)$ is the probability of not having the disease. Since the specificity of the test is $98 \%$ or 0.98, \[ P(P \mid NoD)=1-0.98=0.02, \quad P(NoD)=1-0.001=0.999 \] \[ P(P)=(0.96 \cdot 0.001)+(0.02 \cdot 0.999)=0.02094 \] Therefore, \[ P(D \mid P)=\frac{0.96 \cdot 0.001}{0.02094} \approx 0.04585 \] The probability that an individual who is reported as positive by the new test actually has the disease, $P(D \mid P)$, is about $4.58 \%$.
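As a cross-check on the Bayes' rule arithmetic, the R sketch below recomputes the posterior from the three stated inputs. The helper name `posterior_given_positive` is a hypothetical name introduced here for illustration; it is not part of the original solution.

```r
# Hypothetical helper: P(disease | positive test) via Bayes' rule
posterior_given_positive <- function(sensitivity, specificity, prevalence) {
  true_pos  <- sensitivity * prevalence              # P(P | D) * P(D)
  false_pos <- (1 - specificity) * (1 - prevalence)  # P(P | NoD) * P(NoD)
  true_pos / (true_pos + false_pos)                  # P(D | P)
}

# Inputs from the problem statement
ppv <- posterior_given_positive(sensitivity = 0.96,
                                specificity = 0.98,
                                prevalence  = 0.001)
cat("P(D | P) is approximately:", round(ppv, 4), "\n")
```

Because the disease is rare, the false-positive term (0.02 × 0.999) is much larger than the true-positive term (0.96 × 0.001), which is why the posterior stays below 5% despite the high sensitivity and specificity.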

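Next, to find the total first-year cost for testing and treating 100,000 individuals: each positive case costs $\$ 100,000$ and each test administration costs $\$ 1000$. All 100,000 individuals are tested, and the expected number of positive cases is the number tested multiplied by the probability of testing positive, $P(P)$.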

# Given values
sensitivity <- 0.96               # P(positive | disease)
specificity <- 0.98               # P(negative | no disease)
prevalence <- 0.001               # P(disease)
total_individuals <- 100000       # Total number of individuals tested
cost_per_positive_case <- 100000  # Cost per positive case
cost_per_test <- 1000             # Cost per test

# Probability of testing positive: true positives plus false positives
probability_positive <- sensitivity * prevalence + (1 - specificity) * (1 - prevalence)

# Calculate the expected number of positive cases
num_positive_cases <- probability_positive * total_individuals

# Calculate the total cost
total_cost <- (num_positive_cases * cost_per_positive_case) + (total_individuals * cost_per_test)

# Print the result
cat("The total first-year cost for treating 100,000 individuals is: $", total_cost, "\n")

## The total first-year cost for treating 100,000 individuals is: $ 309400000

Problem 2

(Binomial). The probability of your organization receiving a Joint Commission inspection in any given month is .05. What is the probability that, after 24 months, you received exactly 2 inspections? What is the probability that, after 24 months, you received 2 or more inspections? What is the probability that you received fewer than 2 inspections? What is the expected number of inspections you should have received? What is the standard deviation?

The binomial probability formula is \[ P(k)=\left(\begin{array}{l} n \\ k \end{array}\right) \cdot p^k \cdot(1-p)^{n-k} \] where $P(k)$ is the probability of exactly $k$ inspections, $n$ is the number of months, $k$ is the number of inspections, $p$ is the probability of receiving an inspection in any given month, and $\left(\begin{array}{l}n \\ k\end{array}\right)$ is the binomial coefficient, the number of ways to choose $k$ inspections out of $n$ trials. It is calculated as $\left(\begin{array}{l}n \\ k\end{array}\right)=\frac{n !}{k !(n-k) !}$, where $n!$ is the factorial of $n$.

The probability of receiving an inspection in any given month is $p=0.05$ and the number of months is $n=24$. The probability of receiving exactly 2 inspections after 24 months is $P(k=2)$:

# Probability of receiving an inspection in any given month
p <- 0.05
# Number of months
n <- 24
# Number of inspections 
k <- 2
# Calculate P(k = 2)
probability <- dbinom(k, size = n, prob = p)
cat("The probability of receiving exactly 2 inspections after 24 months is:", probability, "\n")

## The probability of receiving exactly 2 inspections after 24 months is: 0.2232381

The probability of receiving 2 or more inspections after 24 months is calculated as: \[ P(k \geq 2)=1-P(k<2)=1-(P(k=0)+P(k=1)) \] The probability of receiving fewer than 2 inspections after 24 months is: \[ P(k<2)=P(k=0)+P(k=1) \] The expected number of inspections $(\mu)$ is: \[ \mu=n \cdot p \] The standard deviation of the number of inspections $(\sigma)$ is: \[ \sigma=\sqrt{n \cdot p \cdot(1-p)} \]

p <- 0.05
n <- 24
# Calculate P(k = 0)
k0_probability <- dbinom(0, size = n, prob = p)
# Calculate P(k = 1)
k1_probability <- dbinom(1, size = n, prob = p)
# Calculate P(k >= 2) using the complement rule
k_geq_2_probability <- 1 - (k0_probability + k1_probability)
# Calculate P(k < 2) directly as P(k = 0) + P(k = 1)
k_less_2_probability <- (k0_probability + k1_probability)
# Print the result
cat("The probability of receiving 2 or more inspections after 24 months is:", k_geq_2_probability, "\n")

## The probability of receiving 2 or more inspections after 24 months is: 0.3391827

cat("The probability of receiving fewer than 2 inspections after 24 months is:", k_less_2_probability, "\n")

## The probability of receiving fewer than 2 inspections after 24 months is: 0.6608173

# Calculate the expected number of inspections (μ)
expected_inspections <- n * p
cat("The expected number of inspections over 24 months is:", expected_inspections, "\n")

## The expected number of inspections over 24 months is: 1.2

# Calculate the standard deviation (σ)
standard_deviation <- sqrt(n * p * (1 - p))
cat("The standard deviation of the number of inspections is approximately:", standard_deviation, "inspections", "\n")

## The standard deviation of the number of inspections is approximately: 1.067708 inspections
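The same tail probabilities can also be read from the cumulative distribution function instead of summing individual `dbinom` terms. This is a minimal sketch under the same assumptions ($n=24$, $p=0.05$); the variable names `fewer_than_2` and `two_or_more` are illustrative only.

```r
p <- 0.05
n <- 24

# P(k <= 1), i.e. fewer than 2 inspections
fewer_than_2 <- pbinom(1, size = n, prob = p)

# P(k > 1), i.e. 2 or more inspections, taken from the upper tail
two_or_more <- pbinom(1, size = n, prob = p, lower.tail = FALSE)

cat("P(k < 2) from pbinom:", fewer_than_2, "\n")
cat("P(k >= 2) from pbinom:", two_or_more, "\n")
```

Using `lower.tail = FALSE` avoids the explicit subtraction from 1, which matters mainly when the tail probability is very small.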

Problem 3

(Poisson). You are modeling the family practice clinic and notice that patients arrive at a rate of 10 per hour. What is the probability that exactly 3 arrive in one hour? What is the probability that more than 10 arrive in one hour? How many would you expect to arrive in 8 hours? What is the standard deviation of the appropriate probability distribution? If there are three family practice providers that can see 24 templated patients each day, what is the percent utilization and what are your recommendations?

The formula for the Poisson probability mass function is given by: \[ f(x)=\frac{e^{-\lambda} \lambda^x}{x !} \] where $e$ is the base of the natural logarithm, $x$ is the value of the Poisson random variable (here, the number of arrivals), and $\lambda$ is the average rate of arrivals per interval.

The standard deviation of the Poisson distribution is the square root of the average rate: \[ \sigma=\sqrt{\lambda} \] If three family practice providers can each see 24 templated patients per day, the total number of slots available per day is $3 \cdot 24=72$. \[ \text { Percent Utilization }=\frac{\text { Number of patients arriving in } 8 \text { hrs }}{\text { Total slots available per day }} \cdot 100 \% \]

# Average rate of patients arriving per hour
lambda <- 10
# Number of patients (k)
k <- 3

# The probability of exactly 3 patients arriving in one hour
P3patients <- dpois(k, lambda)
cat("The probability that exactly 3 patients arrive in one hour is:", P3patients, "\n")

## The probability that exactly 3 patients arrive in one hour is: 0.007566655

# Cumulative probability of up to 10 patients
CP10_patients <- ppois(10, lambda)
# Complement: probability of more than 10 patients
P_more_10_patients <- 1 - CP10_patients
cat("The probability that more than 10 patients arrive in one hour is:", P_more_10_patients, "\n")

## The probability that more than 10 patients arrive in one hour is: 0.4169602

# Expected number of patients in 8 hours
hours <- 8
ExP_8_hours <- lambda * hours
cat("The expected number of patients arriving in 8 hours is:", ExP_8_hours, "\n")

## The expected number of patients arriving in 8 hours is: 80

# The standard deviation of the hourly arrival count
std_deviation <- sqrt(lambda)
cat("The standard deviation of the hourly Poisson distribution is:", std_deviation, "\n")

## The standard deviation of the hourly Poisson distribution is: 3.162278

# Calculate percent utilization
slots <- 3 * 24
patients <- 10 * 8
PU <- (patients / slots) * 100
cat("The percent utilization is:", PU, "%\n")

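## The percent utilization is: 111.1111 %

Utilization above 100% means that the expected demand of 80 patients per 8-hour day exceeds the 72 templated slots. Covering the expected demand would take at least 4 providers at 24 patients each ($80 / 24 \approx 3.33$), or more templated slots per provider.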
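The utilization figure compares only the expected demand with capacity. As a hedged extension, the sketch below estimates how often a day's arrivals would exceed the 72 templated slots. It assumes that arrivals over an 8-hour day are Poisson with mean $10 \cdot 8=80$, which follows from the hourly rate but is not stated in the problem; the names `lambda_day`, `p_over_capacity`, and `sd_day` are illustrative.

```r
lambda_day <- 10 * 8   # assumed daily mean: 10 patients per hour over 8 hours
slots <- 3 * 24        # templated slots per day across three providers

# P(daily arrivals > slots), taken from the upper tail of the Poisson CDF
p_over_capacity <- ppois(slots, lambda_day, lower.tail = FALSE)

# Standard deviation of daily arrivals under the same assumption
sd_day <- sqrt(lambda_day)

cat("Probability that daily arrivals exceed capacity:", p_over_capacity, "\n")
cat("Standard deviation of daily arrivals:", sd_day, "\n")
```

Since the mean of 80 is above the 72 available slots, this probability is greater than one half, which supports adding capacity rather than relying on low-demand days.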

Problem 4

(Hypergeometric). Your subordinate with 30 supervisors was recently accused of favoring nurses. 15 of the subordinate’s workers are nurses and 15 are other than nurses. As evidence of malfeasance, the accuser stated that there were 6 company-paid trips to Disney World for which everyone was eligible. The supervisor sent 5 nurses and 1 non-nurse. If your subordinate acted innocently, what was the probability he/she would have selected five nurses for the trips? How many nurses would we have expected your subordinate to send? How many non-nurses would we have expected your subordinate to send?

The PMF (probability mass function) for the hypergeometric distribution is given by: \[ P(X=k)=\frac{\left(\begin{array}{l} K \\ k \end{array}\right) \cdot\left(\begin{array}{l} N-K \\ n-k \end{array}\right)}{\left(\begin{array}{l} N \\ n \end{array}\right)} \] Where:
- $X$ is the random variable representing the number of successes in the sample.
- $k$ is the specific number of successes you want to calculate the probability for.
- $N$ is the total population size.
- $K$ is the number of successes in the population.
- $n$ is the sample size.

# Total number of subordinates (N)
N <- 30
# Number of nurses (K)
K <- 15
# Number of non-nurses (X)
X <- 15
# Number of selected Nurses for the trips
k <- 5
# Number of subordinates chosen for the Disney World trips (n)
n <- 6
# Probability mass function (pmf) of selecting exactly 5 nurses for the trips
pmf <- choose(K, k) * choose(N-K, n-k) / choose(N, n)
cat("The probability of selecting exactly 5 nurses for the trips is:", pmf, "\n")

## The probability of selecting exactly 5 nurses for the trips is: 0.07586207

# The expected number of nurses sent
nurses <- n * (K / N)
# The expected number of non-nurses sent
non_nurses <- n * (X / N)
# Print the results
cat("The expected number of nurses sent is:", nurses, "\n")

## The expected number of nurses sent is: 3

cat("The expected number of non-nurses sent is:", non_nurses, "\n")

## The expected number of non-nurses sent is: 3
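The manual `choose()` calculation can be cross-checked against R's built-in hypergeometric functions. This is a sketch, not the original method: `dhyper(x, m, n, k)` takes the number of successes drawn (`x`), the successes and failures in the population (`m`, `n`), and the number of draws (`k`). The upper-tail value `p_at_least_5` goes one step beyond the question and is included only as an illustration.

```r
N <- 30  # total workers
K <- 15  # nurses in the population
n <- 6   # trips awarded
k <- 5   # nurses selected

# P(X = 5) from the built-in hypergeometric PMF
p_exact <- dhyper(k, m = K, n = N - K, k = n)

# P(X >= 5): five or six nurses selected
p_at_least_5 <- phyper(k - 1, m = K, n = N - K, k = n, lower.tail = FALSE)

cat("P(exactly 5 nurses) via dhyper:", p_exact, "\n")
cat("P(at least 5 nurses) via phyper:", p_at_least_5, "\n")
```

The tail probability is often the more relevant quantity when judging whether an outcome at least this extreme could plausibly arise by chance.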

Problem 5

(Geometric). The probability of being seriously injured in a car crash in an unspecified location is about .1% per hour. A driver is required to traverse this area for 1200 hours in the course of a year. What is the probability that the driver will be seriously injured during the course of the year? In the course of 15 months? What is the expected number of hours that a driver will drive before being seriously injured? Given that a driver has driven 1200 hours, what is the probability that he or she will be injured in the next 100 hours?

The probability of being seriously injured in a car crash in any given hour is $0.1 \%$: \[ p=0.001 \] The probability of not being injured in a given hour is the complement: \[ 1-p=0.999 \] The probability of not being injured in 1200 consecutive hours is \[ (0.999)^{1200} \] The geometric CDF gives the probability of being injured within $x$ hours: \[ P(X \leq x)=1-(1-p)^x \] Therefore, the probability of being seriously injured during the year (1200 hours) is \[ P(\text { Injured in a year })=1-(0.999)^{1200} \] At 1200 hours per year the driver covers 100 hours per month, so 15 months corresponds to 1500 hours: \[ P(\text { Injured in 15 months })=1-(0.999)^{1500} \] The expected number of hours before an injury occurs in a geometric distribution is \[ E(X)=\frac{1}{p}=\frac{1}{0.001}=1000 \text { hours } \] Because the geometric distribution is memoryless, the probability of being injured in the next 100 hours, given that the driver has already driven 1200 hours without injury, is the cumulative probability $P(X \leq 100)=1-(0.999)^{100}$, where $X$ is the number of additional hours until injury.

# Probability of being injured in a year
p <- 0.001
hours <- 1200
P_injured_per_year <- 1 - (1 - p)^hours
cat("The probability of being injured in a year is:", P_injured_per_year, "\n")

## The probability of being injured in a year is: 0.6989866

# Total hours driven in 15 months: 1200 hours per year is 100 hours per month
tot_hours <- 1500
fifteen_months <- 1 - (1 - p)^tot_hours
cat("The probability of being injured in 15 months is:", fifteen_months, "\n")

## The probability of being injured in 15 months is: 0.7770372

# Additional hours 
add_hours <- 100
# The cumulative probability of being injured in the next 100 hours.
# pgeom() counts injury-free hours before the first injury, so an injury
# within 100 hours corresponds to at most add_hours - 1 injury-free hours.
cond_probability <- pgeom(add_hours - 1, prob = p)
cat("The probability of being injured in the next 100 hours, given 1200 hr, is:", cond_probability, "\n")

## The probability of being injured in the next 100 hours, given 1200 hr, is: 0.09520785
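The expected number of hours and the "given 1200 hours" question can both be checked numerically. The sketch below uses the geometric CDF $P(X \leq x)=1-(1-p)^x$ from the text; the helper `cdf` and the other variable names are hypothetical names chosen for illustration.

```r
p <- 0.001

# Expected number of hours until a serious injury, E(X) = 1 / p
expected_hours <- 1 / p

# Geometric CDF over hourly trials: P(X <= x) = 1 - (1 - p)^x
cdf <- function(x) 1 - (1 - p)^x

# P(injured during hours 1201..1300 | no injury in the first 1200 hours)
conditional <- (cdf(1300) - cdf(1200)) / (1 - cdf(1200))

# Memorylessness: this should match the unconditional P(X <= 100)
unconditional <- cdf(100)

cat("Expected hours before injury:", expected_hours, "\n")
cat("Conditional probability:", conditional, "\n")
cat("Unconditional P(X <= 100):", unconditional, "\n")
```

The two probabilities agree up to floating-point error, which is the memoryless property: the 1200 hours already driven do not change the risk over the next 100 hours.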

Problem 6

You are working in a hospital that is running off of a primary generator which fails about once in 1000 hours. What is the probability that the generator will fail more than twice in 1000 hours? What is the expected value?

The Poisson probability mass function (PMF) is appropriate when events occur randomly over time or space. The generator fails about once in 1000 hours, so the failure rate is 0.001 per hour and the average number of failures over a 1000-hour window is $\lambda=0.001 \cdot 1000=1$. This is also the expected value: $E(X)=\lambda=1$ failure per 1000 hours. The probability of more than two failures in 1000 hours is the complement of the probability of two or fewer failures: \[ P(\text { more than two failures })=1-[P(\text { no failures })+P(\text { one failure })+P(\text { two failures })] \] Using the Poisson distribution formula: \[ P(X=k)=\frac{e^{-\lambda} \cdot \lambda^k}{k !} \] where $X$ is the random number of failures in the window, $k$ is a specific number of failures, and $\lambda$ is the average number of failures in the window.

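# Failure rate: one failure per 1000 hours
rate_per_hour <- 1/1000
# Length of the window in hours
hours <- 1000
# Expected number of failures in the window
lambda <- rate_per_hour * hours

# The probability of 0 failures
No_failures <- dpois(0, lambda)

# The probability of 1 failure
One_failure <- dpois(1, lambda)

# The probability of 2 failures
Two_failures <- dpois(2, lambda)

# The probability of more than 2 failures
More_than_Two_failures <- 1 - (No_failures + One_failure + Two_failures)
cat("The probability of more than 2 failures in 1000 hours is:", More_than_Two_failures, "\n")

## The probability of more than 2 failures in 1000 hours is: 0.0803014

cat("The expected number of failures in 1000 hours is:", lambda, "\n")

## The expected number of failures in 1000 hours is: 1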
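For reference, here is a worked evaluation of the complement formula by hand. It assumes the hourly rate is scaled to the full window, so that $\lambda=0.001$ failures per hour $\cdot 1000$ hours $=1$: \[ P(X>2)=1-e^{-1}\left(\frac{1^0}{0 !}+\frac{1^1}{1 !}+\frac{1^2}{2 !}\right)=1-2.5 e^{-1} \approx 0.0803 \] Under this assumption the expected number of failures in the window is $E(X)=\lambda=1$.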

Problem 7

A surgical patient arrives for surgery precisely at a given time. Based on previous analysis (or a lack of knowledge assumption), you know that the waiting time is uniformly distributed from 0 to 30 minutes. What is the probability that this patient will wait more than 10 minutes? If the patient has already waited 10 minutes, what is the probability that he/she will wait at least another 5 minutes prior to being seen? What is the expected waiting time?

The uniform PDF (probability density function) is constant over the interval $[0,30]$ minutes: \[ f_X(x)=\left\{\begin{array}{cl} \frac{1}{b-a}=\frac{1}{30}, & 0 \leq x \leq 30 \\ 0, & \text { elsewhere } \end{array}\right. \] The probability that the patient will wait more than 10 minutes is \[ P(X>10)=\int_{10}^{30} \frac{1}{30} d x=\frac{20}{30} \approx 0.6667 \] The expected waiting time is the midpoint of the interval, $\frac{a+b}{2}=15$ minutes.

a <- 0  # min
b <- 30  # max

# Probability of waiting more than 10 minutes
Wait_more_than_10_min <- (1 - punif(10, min = a, max = b))
cat("The probability that the patient will wait more than 10 minutes is:", Wait_more_than_10_min, "\n")

## The probability that the patient will wait more than 10 minutes is: 0.6666667

# The probability that the patient will wait at least another 5 minutes after 10 minutes of waiting
Wait_time_15_min <- (1 - punif(15, min = 10, max = 30))
cat("The probability that the patient will wait at least another 5 minutes after 10 minutes of waiting is:", Wait_time_15_min, "\n")

## The probability that the patient will wait at least another 5 minutes after 10 minutes of waiting is: 0.75

cat("The expected waiting time is:", (a + b)/2, "minutes", "\n")

## The expected waiting time is: 15 minutes
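The conditional probability computed with `punif(15, min = 10, max = 30)` can be justified with a short derivation. Once 10 minutes have passed, the remaining waiting time is uniform on $[10,30]$, so \[ P(X \geq 15 \mid X>10)=\frac{P(X \geq 15)}{P(X>10)}=\frac{15 / 30}{20 / 30}=\frac{3}{4}=0.75 \] Unlike the geometric and exponential distributions, the uniform distribution is not memoryless: the conditional probability of waiting another 5 minutes (0.75) differs from the unconditional probability of waiting more than 5 minutes ($25 / 30 \approx 0.833$).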

Problem 8

Your hospital owns an old MRI, which has a manufacturer’s lifetime of about 10 years (expected value). Based on previous studies, we know that the failure of most MRIs obeys an exponential distribution. What is the expected failure time? What is the standard deviation? What is the probability that your MRI will fail after 8 years? Now assume that you have owned the machine for 8 years. Given that you already owned the machine 8 years, what is the probability that it will fail in the next two years?

The MRI failure time follows the exponential distribution, whose probability density function is \[ f_X(x \mid \lambda)=\left\{\begin{array}{cc} \lambda e^{-\lambda x} & \text { for } x>0 \\ 0 & \text { for } x \leq 0 \end{array}\right. \] where $\lambda=\frac{1}{\mu}$ is the rate of the distribution and $\mu$ is the expected lifetime. With the manufacturer’s expected lifetime of about 10 years, $\lambda=\frac{1}{10}$ and \[ \begin{gathered} E(\text { Failure Time })=\frac{1}{\lambda} \\ E(\text { Failure Time })=\frac{1}{\frac{1}{10}}=10 \text { years } \end{gathered} \] The standard deviation $(\sigma)$ is also $\sigma=\frac{1}{\lambda}=10$ years. The probability that the MRI will fail after 8 years is obtained by integrating the probability density function $f(t)=\lambda e^{-\lambda t}$ beyond 8 years: \[ P(\text { Failure After } 8 \text { Years })=\int_8^{\infty} \frac{1}{10} e^{-\frac{1}{10} t} d t=e^{-\frac{8}{10}} \approx 0.4493 \] The probability of failure in the next 2 years, given that the machine has already survived 8 years, is a conditional probability: \[ P(\text { Failure in Next } 2 \text { Years } \mid \text { Survived } 8 \text { Years })=\frac{\int_8^{10} \frac{1}{10} e^{-\frac{1}{10} t} d t}{e^{-\frac{8}{10}}}=1-e^{-\frac{2}{10}} \approx 0.1813 \]

lambda <- 1/10
t <- 8 
# The probability of failure after 8 years
failure_after_8_years <- 1 - pexp(t, rate = lambda)
cat("The probability that the MRI will fail after 8 years is:", failure_after_8_years, "\n")

## The probability that the MRI will fail after 8 years is: 0.449329

# Calculate the conditional probability of failure in the next 2 years, given survival to 8 years:
# P(8 < T <= 10) divided by P(T > 8)
failure_next_two_years <- (pexp(10, rate = lambda) - pexp(8, rate = lambda)) / (1 - pexp(8, rate = lambda))
cat("The probability that the MRI will fail in the next 2 years, given 8 years, is:", failure_next_two_years, "\n")

## The probability that the MRI will fail in the next 2 years, given 8 years, is: 0.1812692
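One way to read "given that you already owned the machine 8 years" is as a conditional probability. Under that reading, the memoryless property of the exponential distribution means the answer depends only on the 2-year window, not on the 8 years already elapsed. The sketch below checks this numerically; the variable names are illustrative and not part of the original solution.

```r
lambda <- 1/10

# P(T > 8): the machine survives the first 8 years
survive_8 <- 1 - pexp(8, rate = lambda)

# P(8 < T <= 10): unconditional probability of failing in years 8 to 10
fail_8_to_10 <- pexp(10, rate = lambda) - pexp(8, rate = lambda)

# P(T <= 10 | T > 8): conditional probability of failing in the next 2 years
conditional <- fail_8_to_10 / survive_8

# Memorylessness: the same value as P(T <= 2) for a brand-new machine
memoryless <- pexp(2, rate = lambda)

cat("Conditional probability of failure in the next 2 years:", conditional, "\n")
cat("P(T <= 2) for a new machine:", memoryless, "\n")
```

The unconditional quantity `fail_8_to_10` answers a different question: the chance, assessed at purchase, that the machine fails specifically between years 8 and 10.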