DATA 605 Assignment 5

Question 1:

$P(A_k|B) = \frac {P(B|A_k)*P(A_k)}{\sum_{i = 1}^{n}{P(B|A_i)*P(A_i)}}$

In this case, $A_{pos}$ is the event that a person has the disease, $A_{neg}$ is the event that they do not, and $B$ is a positive test result. $P(B|A_{pos}) = 0.96$, $P(A_{pos}) = 0.001$, $P(B|A_{neg}) = 0.02$, $P(A_{neg}) = 0.999$

prob <- ((.96)*(.001))/((.96)*(.001) + (.02)*(.999))
round(prob,3)

## [1] 0.046
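The same posterior can be sketched with natural frequencies, which makes it easier to see why it is so small. The cohort size and all variable names below are illustrative assumptions; the rates are the ones used in the calculation above.

```r
# Natural-frequency check of the posterior (illustrative names, assumed cohort size).
n_people    <- 100000  # hypothetical cohort
prevalence  <- 0.001   # P(A_pos)
sensitivity <- 0.96    # P(B | A_pos)
false_pos   <- 0.02    # P(B | A_neg)

# Expected counts of positive tests among the sick and among the healthy
true_positives  <- n_people * prevalence * sensitivity
false_positives <- n_people * (1 - prevalence) * false_pos

# Share of positive tests that belong to people who actually have the disease
posterior <- true_positives / (true_positives + false_positives)
round(posterior, 3)
```

Most positive tests come from the much larger healthy group, which is why the posterior stays below 5%.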

Cost:

cost <- 100000*(0.001*100000) + 1000*100000
round(cost,3)

## [1] 1.1e+08

Ignoring test results and considering the actual prevalence of the disease, the expected cost would be $110,000,000 for the first year of treatment and testing for 100,000 individuals.
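The total breaks down as follows, assuming the unit costs implied by the code above (100,000 dollars per treated case and 1,000 dollars per test):

$$
\begin{aligned}
\text{expected cases} &= 100{,}000 \times 0.001 = 100 \\
\text{treatment cost} &= 100 \times 100{,}000 = 10{,}000{,}000 \\
\text{testing cost} &= 100{,}000 \times 1{,}000 = 100{,}000{,}000 \\
\text{total cost} &= 10{,}000{,}000 + 100{,}000{,}000 = 110{,}000{,}000
\end{aligned}
$$

Testing accounts for roughly 91% of the total.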

Question 2:

The probability of receiving exactly 2 inspections can be represented by $P(exactly two) = \binom{24}{2} * p^2*(1 - p)^{22}$

p <- .05
round((choose(24,2)*(p^2)*((1-p)^22)), 6)

## [1] 0.223238

P(exactly 2) = 0.223238

Two or more

$P(\geq 2) = 1 - P(one) - P(none)$

p <- .05
p1 <- (choose(24,1)*(p^1)*((1-p)^23))
p0 <- (choose(24,0)*(p^0)*((1-p)^24))
round(1 - (p1 + p0), 6)

## [1] 0.339183

P(Two or more) = 0.339183

Fewer than 2

$P(< 2) = P(one) + P(none)$

This is also the complement of having two or more inspections.

p <- .05
p1 <- (choose(24,1)*(p^1)*((1-p)^23))
p0 <- (choose(24,0)*(p^0)*((1-p)^24))
round((p1 + p0), 6)

## [1] 0.660817

P(Fewer than two) = 0.660817
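As a cross-check, the three probabilities can be reproduced with base R's `dbinom` and `pbinom`. This is only a sketch; the variable names are illustrative.

```r
# Cross-check of the hand-computed binomial probabilities.
n <- 24    # number of trials, as in the calculations above
p <- 0.05  # probability of an inspection in one trial

exactly_two    <- dbinom(2, size = n, prob = p)                      # P(X = 2)
two_or_more    <- pbinom(1, size = n, prob = p, lower.tail = FALSE)  # P(X > 1)
fewer_than_two <- pbinom(1, size = n, prob = p)                      # P(X <= 1)

round(c(exactly_two, two_or_more, fewer_than_two), 6)
```

With `lower.tail = FALSE`, `pbinom(1, ...)` returns $P(X > 1)$, which is the "two or more" case.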

Expectation

$E = 24*p$

round(24*p,4)

## [1] 1.2

The expected number of inspections is 1.2, or about 1 when rounded to a whole inspection.

Standard Deviation

$\sigma = \sqrt{np(1 - p)}$

round((24*.05*(.95))^(1/2), 3)

## [1] 1.068

Standard Deviation = 1.068
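The closed-form mean and standard deviation can also be checked directly from the PMF. This is a sketch with illustrative names (`probs`, `mu`, `sigma`), not part of the original solution.

```r
# Mean and standard deviation computed from the binomial PMF itself.
n <- 24
p <- 0.05
x <- 0:n                               # every possible number of inspections
probs <- dbinom(x, size = n, prob = p) # P(X = x) for each x

mu    <- sum(x * probs)                # E[X]
sigma <- sqrt(sum((x - mu)^2 * probs)) # sqrt(Var[X])

round(c(mean = mu, sd = sigma), 3)
```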

Question 3:

The PMF for a Poisson distribution is $P(X = x) = \frac {e^{-\lambda} \lambda^x}{x!}$.

lambda <- 10
x <- 3
pmf <- (exp(-lambda)*lambda^x)/(factorial(x))
round(pmf,5)

## [1] 0.00757

Probability of Exactly 3: 0.00757

More than 10:

The probability of more than 10 arrivals can be represented by $1 - \sum_{x = 0}^{10} P(X = x)$

prob <- 1
lambda <- 10

pmf <- function(lambda,x){
  out <- (exp(-lambda)*(lambda^x))/(factorial(x))
  return(out)
}

for(i in 0:10){
  prob <- prob - pmf(10, i)
}

prob

## [1] 0.4169602

Probability of More than 10: 0.41696
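Both Poisson results can be cross-checked against base R's `dpois` and `ppois`. The variable names in this sketch are illustrative.

```r
# Cross-check of the Poisson calculations with lambda = 10.
lambda <- 10

exactly_three <- dpois(3, lambda)                      # P(X = 3)
more_than_ten <- ppois(10, lambda, lower.tail = FALSE) # P(X > 10)

round(c(exactly_three, more_than_ten), 5)
```

Using `ppois` with `lower.tail = FALSE` replaces the explicit loop over 0 through 10.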

8 Hour Expectation:

The expectation in one hour is $\lambda = 10$, so the expectation for 8 hours is simply $\lambda \times 8 = 80$

Standard Deviation:

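The standard deviation is the square root of the variance. For a Poisson distribution, the variance is $\lambda$. So, for the one-hour count, Standard Deviation = $\sqrt {10} = 3.16228$. For the 8-hour count, with $\lambda = 80$, it would be $\sqrt{80} = 8.94427$.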

Percent Utilization:

The percent utilization is given by dividing the expected number of patients by the number of patients the practice has capacity to see.

use <- 80/(24*3)
use

## [1] 1.111111

These doctors are at 111% of capacity. The practice should add another full-time or part-time doctor to cover the extra patients and meet demand.
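The staffing arithmetic can be sketched as follows, assuming (as the `24*3` in the code above suggests) three providers who each see 24 patients per day. All names are illustrative.

```r
# Utilization now, and with the smallest staff that covers expected demand.
expected_patients <- 80  # 8 hours at 10 arrivals per hour
per_provider      <- 24  # assumed daily capacity of one provider
providers         <- 3

utilization      <- expected_patients / (per_provider * providers)
providers_needed <- ceiling(expected_patients / per_provider)
utilization_new  <- expected_patients / (per_provider * providers_needed)

round(c(current = utilization, with_added_staff = utilization_new), 3)
```

Under these assumptions, one additional provider is enough to bring expected utilization below 100%.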

Question 4:

At a glance, this does appear to be favoritism.

The PMF for a hypergeometric distribution is $\frac {\binom{m}{x} \binom{N - m}{n - x}} {\binom{N}{n}}$. In this case, $N = 30$ is the total population, $m = 15$ is the number of nurses, and $n = 6$ is the number of trips. $x = 5$ is the number of nurses selected.

phyper <- (choose(15, 5)*choose(30 - 15, 6 - 5))/choose(30,6)
phyper

## [1] 0.07586207

Probability that it occurred by chance: 0.07586
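The result can be cross-checked with base R's `dhyper`, whose arguments are the number of "successes" in the population (`m`), the number of "failures" (`n`), and the number drawn (`k`). The sketch below also adds the probability of five or more nurses, which is an extra not computed in the original solution. All names are illustrative.

```r
# Cross-check of the hypergeometric probability.
nurses     <- 15
non_nurses <- 15
trips      <- 6

# P(exactly 5 nurses among the 6 selected)
exactly_five <- stats::dhyper(5, m = nurses, n = non_nurses, k = trips)

# P(5 or 6 nurses): an outcome at least as extreme as the one observed
five_or_more <- sum(stats::dhyper(5:6, m = nurses, n = non_nurses, k = trips))

round(c(exactly_five, five_or_more), 5)
```

The `stats::` prefix is used only for clarity, since `phyper` is reused as a variable name above.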

The mean, or expectation, for a hypergeometric distribution is given by $n \times \frac m N$, where $n$ is the number of trips, $m$ is the number of nurses, and $N$ is the total population.

Thus, the expected number of nurses is $n \times \frac m N = 6 \times \frac {15}{30} = 3$. The same can be done for non-nurses by replacing $m$ with $N - m$. Expectation of non-nurses: $n \times \frac {N - m}{N} = 6 \times \frac {15}{30} = 3$

Question 5:

The PMF for a geometric distribution is $(1-p)^{x - 1}p$, where $x$ is the trial on which the event first occurs and $p$ is the probability of the event happening in one trial. The probability of at least one serious injury in the first year can be found by subtracting the probability that there is no serious injury from 1.

trials <- 1200
p <- .001
pgeom <- 1 - ((1 - p)^(trials))
pgeom

## [1] 0.6989866

A serious injury within a year is more likely than not. Probability of injury within a year: 0.69899

15 Months

trials <- 1200*(15/12)
p <- .001
pgeom <- 1 - ((1 - p)^(trials))
pgeom

## [1] 0.7770372

Probability of injury within 15 months: 0.77704

Expectation:

With $X$ counting the hour in which the first serious injury occurs, $E[X] = \frac 1p = 1000$. The expected number of injury-free hours driven before that hour is $\frac 1p - 1$.

expect <- 1/p - 1
expect

## [1] 999

Expectation: 999 hours before serious injury (the injury would occur in the 1000th hour)
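The geometric results above can be cross-checked with base R's `pgeom`. R parameterizes the geometric distribution by the number of failures before the first success, so "injury within h hours" corresponds to at most `h - 1` failures. The names below are illustrative, and the `stats::` prefix is used only because `pgeom` is reused as a variable name above.

```r
# Cross-check of the geometric calculations with p = 0.001 per hour.
p <- 0.001

within_year      <- stats::pgeom(1200 - 1, prob = p)  # P(injury within 1200 hours)
within_15_months <- stats::pgeom(1500 - 1, prob = p)  # P(injury within 1500 hours)

# Mean number of failures (injury-free hours) before the first success
mean_failures <- (1 - p) / p

round(c(within_year, within_15_months, mean_failures), 5)
```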

Conditional Probability

Probability of injury occurring between 1200 and 1300 hours:

The important idea here is the memoryless property of the geometric distribution: the probability of an injury in any future stretch of hours does not depend on how many injury-free hours have already passed. We are given that a driver has driven 1200 hours. The probability that they will be injured in the next 100 hours is simply the probability of the injury occurring in 100 hours, regardless of how many hours they have already driven.

trials <- 100
p <- .001
pgeom <- 1 - ((1 - p)^(trials))
pgeom

## [1] 0.09520785

Thus, the probability of an injury occurring in the next 100 hours is 0.09521.
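The memoryless shortcut can be verified numerically by computing the conditional probability from its definition and comparing it with the 100-hour probability. This is a minimal sketch; `cdf` is a helper defined here for illustration.

```r
# Check that P(X <= 1300 | X > 1200) equals P(X <= 100) for the geometric case.
p <- 0.001
cdf <- function(hours) 1 - (1 - p)^hours  # P(injury within `hours` hours)

# Definition: P(1200 < X <= 1300) / P(X > 1200)
conditional   <- (cdf(1300) - cdf(1200)) / (1 - cdf(1200))
unconditional <- cdf(100)

c(conditional = conditional, unconditional = unconditional)
```

The two values agree because the factor $(1-p)^{1200}$ cancels between the numerator and denominator.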

Question 6:

This appears to be a Poisson distribution.

Failing more than twice:

prob <- 1
lambda <- 1

pmf <- function(lambda,x){
  out <- (exp(-lambda)*(lambda^x))/(factorial(x))
  return(out)
}

for(i in 0:2){
  prob <- prob - pmf(lambda, i)
}

prob

## [1] 0.0803014

Probability of more than two failures: 0.0803

Expectation:

Because this is likely a Poisson random variable, the expectation equals $\lambda = 1$.

Question 7:

Under a continuous uniform distribution, the PDF is $\frac 1{b-a}$, where $a$ and $b$ are the boundaries, in this case 0 and 30. The probability that the patient waits more than 10 minutes is found via simple integration. The area under a uniform density forms a rectangle, and the area of that rectangle over an interval is the probability of that interval. For a wait of more than ten minutes, we are looking for the area from 10 to 30 minutes.

len <- 30 - 10
hgt <- 1/30
p <- len*hgt
p

## [1] 0.6666667

Thus, the probability of waiting more than 10 minutes is 0.667 or $\frac23$

Additional 5 minutes

Probability of waiting at least 15 minutes in total, given that the patient has already waited 10 minutes:

$P(A|B) = \frac{P(A\cap B)}{P(B)}$

This equation applies directly, $P(X \geq 15 | X \geq 10) = \frac{15/30}{20/30} = 0.75$, but the same result follows from rescaling the distribution. Given that the patient has already waited 10 minutes, only 20 minutes remain in the interval, so the remaining wait is uniform on that shorter interval.

The base of the rectangle shrinks to 20 minutes because we are given that the patient already waited 10 minutes, so its height rises from $\frac{1}{30}$ to $\frac{1}{20}$.

hgt <- 1/20
p15 <- (hgt)*(20 - 5)

p15

## [1] 0.75

Probability of waiting at least 5 more minutes: 0.75
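Both uniform results can be cross-checked with base R's `punif`, this time applying the conditional probability formula directly. The names in this sketch are illustrative.

```r
# Cross-check of the uniform(0, 30) waiting-time probabilities.
a <- 0
b <- 30

more_than_10 <- punif(10, min = a, max = b, lower.tail = FALSE)  # P(X > 10)
at_least_15  <- punif(15, min = a, max = b, lower.tail = FALSE)  # P(X > 15)

# P(X > 15 | X > 10) = P(X > 15) / P(X > 10), since {X > 15} is inside {X > 10}
conditional <- at_least_15 / more_than_10

c(more_than_10 = more_than_10, conditional = conditional)
```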

Expectation:

The expectation of a uniform distribution is $\frac{a + b}{2}$, where $a$ and $b$ are the boundaries of the interval. Here that is $\frac{0 + 30}{2} = 15$.

Expectation: 15 minutes

Question 8:

Expectation:

The expectation of an exponential distribution is $\theta$, where $\theta = \frac 1{\lambda}$. In this case, there is no rate parameter but the expectation is given to be 10 years. Expected failure time: 10 years

Standard Deviation

The variance of an exponential distribution is $\theta^2$, so the standard deviation is $\sqrt{\theta^2} = \theta$. Standard Deviation = 10 years

Probability of failure after 8 years:

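We need to evaluate the survival function at x = 8 years, $P(X > 8) = e^{-\frac{8}{\theta}}$. This is not the PDF, which is $\frac{1}{\theta} e^{-\frac{x}{\theta}}$ and does not give a probability on its own.

theta <- 10
x <- 8
survexp <- (exp(-(x/theta)))
survexp

## [1] 0.449329

Probability of failure after 8 years: 0.449329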
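The quantity computed above, $e^{-x/\theta}$, is the upper-tail probability $P(X > 8)$ rather than a density value. The sketch below shows both using base R's `pexp` and `dexp`, which take a rate parameter, assumed here to be $1/\theta$. Names are illustrative.

```r
# Tail probability versus density for an exponential with mean theta = 10.
theta <- 10
rate  <- 1 / theta

survival_8 <- stats::pexp(8, rate = rate, lower.tail = FALSE)  # P(X > 8)
density_8  <- stats::dexp(8, rate = rate)                      # f(8), not a probability

round(c(survival = survival_8, density = density_8), 6)
```

The density at 8 is the survival probability divided by $\theta$, so the two differ by a factor of 10 here.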

Probability of failure between years 8 and 10:

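Here, we need to evaluate the CDF at 8 and 10 years and take the difference. CDF: $F(x) = 1 - e^{- \frac {x}{\theta}}$.

The calculation below gives the unconditional probability $P(8 < X \leq 10) = F(10) - F(8)$, that is, the probability of failure between years 8 and 10 measured from time zero. If survival through the first 8 years is taken as given, the memoryless property of the exponential distribution gives a different quantity: $P(X \leq 10 | X > 8) = P(X \leq 2) = 1 - e^{-\frac{2}{\theta}} = 0.18127$.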

cdfexp <- function(theta, x){
  out <- 1 - exp(-x/theta)
  return(out)
}

theta <- 10

up <- cdfexp(theta,10)
low <- cdfexp(theta,8)

pexp <- up - low
pexp

## [1] 0.08144952

Probability of failure between 8 and 10 years: 0.0814495
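The unconditional interval probability and its conditional counterpart can be compared with base R's `pexp`. This is a sketch with illustrative names; the `stats::` prefix is used only because `pexp` is reused as a variable name above.

```r
# Unconditional versus conditional failure probability for years 8 to 10.
theta <- 10
rate  <- 1 / theta

# P(8 < X <= 10): failure between years 8 and 10, measured from time zero
unconditional <- stats::pexp(10, rate) - stats::pexp(8, rate)

# P(X <= 10 | X > 8): the same interval, given survival through year 8
conditional <- unconditional / stats::pexp(8, rate, lower.tail = FALSE)

# Memorylessness says the conditional value equals P(X <= 2)
memoryless <- stats::pexp(2, rate)

round(c(unconditional, conditional, memoryless), 5)
```

Which of the two quantities answers the question depends on whether survival to year 8 is given.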

Shane Hylton

2/25/2022
