library(MASS)
Sensitivity = True Positive for disease = 96% Specificity = True Negative for disease = 98% Prevalence rate = rare instance rate = 0.1% = .001 What is the probability that an individual who is reported as positive by the new test actually has the disease?
Median cost per positive case total = $100,000 Test administration cost = $1000 What is the total first-year cost for treating 100,000 individuals?
Bayesian Formula:
The Bayes Theorem finds the chance that an event will happen based on our prior knowledge of similar events.
\[P(A|B)=((P(B|A)⋅P(A))/P(B)\]
where
P(A|B) - conditional probability that event A will happen if event B has already happened.
P(B|A) - conditional probability of event B given that event A has already occurred.
P(A) - the earlier probability that event A will take place.
P(B) - the probability that event B will occur.
# Sensitivity = True Positive for disease = 96%
sensitivity <- 0.96
# Specificity = True Negative for disease = 98%
specificity <- 0.98
# Prevalence rate = rare instance rate
prevalence_rate <- 0.001
# False positive rate
false_positive_rate <- (1 - specificity)
false_positive_rate <- (1 - 0.98)
false_positive_rate
## [1] 0.02
Total Probability of getting the true positive result
total_prob_positive <- sensitivity * prevalence_rate + false_positive_rate * (1 - prevalence_rate)
total_prob_positive <- .96 * .001 + .02 * (1 - .001)
total_prob_positive
## [1] 0.02094
Probability of Positive given the positive test result
prob_disease_given_positiveTest <- sensitivity * prevalence_rate / total_prob_positive
prob_disease_given_positiveTest <- .96 * .001 / .02094
prob_disease_given_positiveTest
## [1] 0.04584527
Rounding the result and calculating percentage of
cat(round(prob_disease_given_positiveTest * 100, 2), "%\n")
## 4.58 %
The probability of actually having the positive for disease by the new test is 4.58 %
Test administration cost = $1000
cost_per_head <- 4.58 * 1000
cost_per_head
## [1] 4580
total_cost <- 100000 * 4580
total_cost
## [1] 4.58e+08
\[ P(x)= \left( \begin{bmatrix} n \\ x \end{bmatrix}\right) p^x(1-p)^{n-x}\]
The binomial distribution model is used for finding the probability of success of an event which has only two possible outcomes in a series of experiments.
R has four in-built functions to generate binomial distribution.
dbinom(x, size, prob) - the probability density distribution at each point. pbinom(x, size, prob) - the cumulative probability of an event. It is a single value representing the probability. qbinom(p, size, prob) - takes the probability value and gives a number whose cumulative value matches the probability value. rbinom(n, size, prob) - generates required number of random values of given probability from a given sample.
x is a vector of numbers.
p is a vector of probabilities.
n is number of observations.
size is the number of trials.
prob is the probability of success of each trial.
Mean = n * p Variance = n * p * (1 - p)
prob_inspection = 0.05
months = 24
prob_2months <- dbinom(2, 24, 0.05)
prob_2months
## [1] 0.2232381
prob_2month_or_more <- 1 - pbinom(1, 24, 0.05)
prob_2month_or_more
## [1] 0.3391827
prob_2month_or_less <- pbinom(1, 24, 0.05)
prob_2month_or_less
## [1] 0.6608173
expected_inspections <- 24 * 0.05
expected_inspections
## [1] 1.2
standard_deviation <- sqrt(months * prob_inspection * (1-prob_inspection))
standard_deviation <- sqrt(24 * 0.05 * 0.95)
standard_deviation
## [1] 1.067708
\[ P(x)= \lambda^{x}e^{-\lambda}/x^{!}\]
where
x=0,1,2,3…..
𝑒 is the euler’s number
𝜆 is an average rate of expected value and λ=variance,λ>0
dpois() function calculates the probability mass function (PMF) for a given Poisson distribution. It takes two arguments: the value at which to evaluate the PMF and the mean of the Poisson distribution.
lambda<- 10
x<- 3
prob_3 <- dpois(x, lambda)
prob_3 <- dpois(3, 10)
prob_3
## [1] 0.007566655
ppois() function calculates the cumulative distribution function (CDF) for a Poisson distribution.
It takes two arguments: the value at which to evaluate the CDF and the mean of the Poisson distribution.
prob <- ppois(10, lambda=10) # lower tail
prob
## [1] 0.5830398
ppois(9, lambda=10, lower=FALSE) # upper tail
## [1] 0.5420703
lambda<- 10
time <- 8
expected_arrive <- lambda * time
expected_arrive
## [1] 80
\[SD=sqrt(λ)\]
lambda<- 10
std_dev_poisson <- sqrt(lambda)
std_dev_poisson
## [1] 3.162278
utilization <- 30 * 24 / (3 * 24)
utilization
## [1] 10
cat(utilization * 100, "%\n")
## 1000 %
The probability mass function (PMF) for the hypergeometric distribution is:
P(X = k) = [K choose k] * [(N - K) choose (n - k)] / [N choose n]
where: X is the random variable representing the number of objects of interest in the sample k is a specific value of X [a choose b] represents the binomial coefficient, which is the number of ways to choose b objects from a set of a objects
dhyper(x, m, n, k, log = FALSE) phyper(q, m, n, k, lower.tail = TRUE, log.p = FALSE) qhyper(p, m, n, k, lower.tail = TRUE, log.p = FALSE) rhyper(nn, m, n, k)
p - probability, it must be between 0 and 1.
x: represents the data set of values m: size of the population n: number of samples drawn k: number of items in the population
prob_5_nurses <- dhyper(5,15,15,6)
prob_5_nurses
## [1] 0.07586207
expected_nurses = 6 * (15/30)
expected_nurses
## [1] 3
expected_non_nurses = 6 * (15/30)
expected_non_nurses
## [1] 3
prob_injury = .001
hours = 1200
crash_1200hrs = pgeom(hours,prob_injury)
crash_1200hrs
## [1] 0.6992876
crash_15_months = pgeom(hours * (15/12),prob_injury)
crash_15_months
## [1] 0.7772602
Expected_hours <-1/prob_injury
Expected_hours
## [1] 1000
prob_100hours <- pgeom(100,prob_injury)
prob_100hours
## [1] 0.09611265
fail_more_than_twice = 1 - ppois(2,1)
fail_more_than_twice
## [1] 0.0803014
expected_value <- 1/1000 * 1000
cat(expected_value, "\n")
## 1
wait = 10
min = 0
max = 30
wait_more_than_10 = 1- punif(wait,min,max)
wait_more_than_10 = 1- punif(10,0,30)
wait_more_than_10
## [1] 0.6666667
wait_2 = 15
new_min = 10
wait_more_than_15 = 1 - punif(wait_2,new_min,max)
wait_more_than_15
## [1] 0.75
expected_wait <- .5*(0+30)
expected_wait
## [1] 15
years = 8
fail_rate = 1/10
fail_after_8yrs = 1- pexp(years,fail_rate)
fail_after_8yrs = 1- pexp(8,1/10)
fail_after_8yrs
## [1] 0.449329
next_years = 2
fail_after_8_wi_2 = pexp(next_years,fail_rate)
fail_after_8_wi_2 = pexp(2,1/10)
fail_after_8_wi_2
## [1] 0.1812692