library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

Question 1

Let \(X_1, X_2, \ldots, X_n\) be \(n\) mutually independent random variables, each of which is uniformly distributed on the integers from 1 to \(k\). Let \(Y\) denote the minimum of the \(X_i\)’s. Find the distribution of \(Y\).

In order to make sense of this problem, my inclination is to concretize with an example; even if that can’t prove anything in the abstract, it may help point toward a more general solution.

So let’s say that \(X_1\) to \(X_n\) are \(n = 5\) rolls of a 20-sided die (whose outcomes are uniformly distributed from 1 to 20, so \(k = 20\)). I can run that experiment 10,000 times and examine the distribution of the minimum:

# Simulate 10,000 experiments; each takes the minimum of 5 rolls of a 20-sided die
results <- replicate(10000, min(sample(1:20, size = 5, replace = TRUE)))

results %>%
  hist()

Admittedly, I’m having trouble conceptualizing this problem in the abstract. But the distribution is clearly biased toward its lower values and has a long tail to the right.
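One way to move from the simulation toward a general formula, offered as a sketch rather than a rigorous proof: the minimum is at least \(y\) exactly when every \(X_i\) is at least \(y\), and by independence those probabilities multiply.

\[ P(Y \geq y) = \left(\frac{k-y+1}{k}\right)^n, \qquad P(Y = y) = P(Y \geq y) - P(Y \geq y+1) = \frac{(k-y+1)^n - (k-y)^n}{k^n}, \quad y = 1, \ldots, k \]

The code below compares that formula against a fresh simulation for \(n = 5\), \(k = 20\). The helper name `min_pmf` and the seed are my own choices for illustration.

```r
# Hypothetical helper: PMF of the minimum of n independent draws,
# each uniform on the integers 1..k
min_pmf <- function(y, n, k) {
  ((k - y + 1)^n - (k - y)^n) / k^n
}

pmf <- min_pmf(1:20, n = 5, k = 20)
sum(pmf)  # the terms telescope, so this should equal 1

# Re-run the simulation with a fixed seed so the comparison is reproducible
set.seed(1)
sims <- replicate(10000, min(sample(1:20, size = 5, replace = TRUE)))
empirical <- as.numeric(table(factor(sims, levels = 1:20))) / length(sims)

# Side-by-side comparison of analytic and simulated probabilities
round(cbind(y = 1:20, analytic = pmf, empirical = empirical), 3)
```

If the formula is right, the two columns should agree to within simulation noise, and the largest probability should sit at \(y = 1\), which is the right-skewed shape seen in the histogram.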

Question 2

Your organization owns a copier (future lawyers, etc.) or MRI (future doctors). This machine has a manufacturer’s expected lifetime of 10 years. This means that we expect one failure every ten years. (Include the probability statements and R code for each part.)

Part a.

What is the probability that the machine will fail after 8 years? Provide also the expected value and standard deviation. Model as a geometric. (Hint: the probability is equivalent to not failing during the first 8 years.)

The geometric distribution is given as follows:

\[ P(X=k) = p(1-p)^{k-1} \]

Here, \(p\) is the probability that the machine will fail in any given independent trial period (1 year, the same unit as our \(k\) number of trials). Since the expected life of the machine is 10 years, we can assume the probability of failure in any given year is \(1/10\) or \(0.1\). If we evaluate this formula for every value of \(k\) from 1 to 8 and sum the results, we get the probability of failure within the first 8 years; subtracting that sum from 1 gives the probability of failure after 8 years.

# Accumulate P(X <= 8), the probability of failing within the first 8 years
P <- 0

for (k in 1:8) {
  P_k <- (0.1)*(1 - 0.1)^(k-1)
  P <- P + P_k
}

1 - P
## [1] 0.4304672

This works out to around 43%.
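As a cross-check on the loop, the same tail probability can be read off R's built-in geometric functions. One caveat I am assuming matters here: `pgeom()` counts the number of event-free trials before the first event (that is, \(k - 1\) in the formula above), so "first failure after year 8" corresponds to more than 7 event-free years. The variable name `p_after_8` is just for illustration.

```r
# P(first machine failure comes after year 8) under the geometric model.
# pgeom() is parameterised by the count of event-free trials before the
# first event, so K > 8 translates to (K - 1) > 7.
p_after_8 <- pgeom(7, prob = 0.1, lower.tail = FALSE)
p_after_8  # 0.4304672

# Closed form: survive 8 independent years, each with probability 0.9
all.equal(p_after_8, (1 - 0.1)^8)
```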

In a geometric distribution, the mean or expected value is given by \(1/p\), or \(10\) (ironically, it was given in the problem and we worked backwards to derive \(p\)). The standard deviation is given by:

\[ \sqrt{\frac{1-p}{p^{2}}} \]

sqrt((1-0.1)/(0.1)^2)
## [1] 9.486833

So around 9.5.

Part b

What is the probability that the machine will fail after 8 years? Provide also the expected value and standard deviation. Model as an exponential.

Using the exponential distribution function, we can find the probability of a success (machine failure) occurring within the first \(k\) years when we know the mean time \(\mu\) until a success (in our case, 10 years), given by the formula:

\[ P(X \leq k) = 1-e^{-k/\mu} \]

Therefore, the probability of a failure in the first 8 years is given by:

\[ P(X \leq 8) = 1-e^{-8/10} \]

1 - exp(-8/10)
## [1] 0.550671

The probability of the machine failure coming after 8 years must be one minus that probability:

1 - (1 - exp(-8/10))
## [1] 0.449329

So using the exponential distribution, the probability of the machine failing after 8 years is roughly 44.9%.
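The same number can be obtained from R's built-in exponential CDF, which I include as a sanity check on the hand calculation. Note that `pexp()` takes a rate rather than a mean, so the rate is \(1/\mu\); the names `mu` and `p_after_8_exp` are illustrative.

```r
mu <- 10  # mean lifetime in years, as given in the problem

# Upper tail P(X > 8) of an exponential with rate 1 / mu
p_after_8_exp <- pexp(8, rate = 1 / mu, lower.tail = FALSE)
p_after_8_exp  # 0.449329

# Agrees with the closed form e^(-8/10)
all.equal(p_after_8_exp, exp(-8 / mu))
```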

The mean \(\mu\) was a requirement for this formula, representing the average number of years required for a success (in this case, 10). In an exponential distribution, the standard deviation is equivalent to the mean, so it is also 10.

Part c

What is the probability that the machine will fail after 8 years? Provide also the expected value and standard deviation. Model as a binomial. (Hint: 0 success in 8 years)

The binomial distribution models the probability of a certain amount of successes of a binary outcome (either success or failure) occurring over a determined number of independent trials. That distribution is given as follows:

\[ P(X = k) = \binom{n}{k} p^k (1-p)^{n-k} \]

Here \(n\) is the number of trials (years), \(p\) is the probability of an individual success (0.1, the probability of the machine failing in any given year), and \(k\) is the number of successes for which we want to know the probability. In essence, we are looking for the probability of achieving \(k=0\) successes in \(n=8\) trials. Plugging in those values, we get:

\[ P(X = 0) = \binom{8}{0} (0.1)^0 (1-0.1)^{8-0} \]

choose(8,0) * (0.1)^0 * (1 - 0.1)^8
## [1] 0.4304672

Using a binomial distribution, the probability of the machine failing after 8 years (that is, not failing once in 8 years) is around 43%.
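For comparison, R's binomial density function should return the same value without writing out the formula by hand. This is only a cross-check; the variable name `p_zero_failures` is my own.

```r
# P(exactly 0 failures in 8 yearly trials, each with failure probability 0.1)
p_zero_failures <- dbinom(0, size = 8, prob = 0.1)
p_zero_failures  # 0.4304672

# Same as the explicit formula choose(n, k) * p^k * (1 - p)^(n - k)
all.equal(p_zero_failures, choose(8, 0) * 0.1^0 * (1 - 0.1)^8)
```

With \(k = 0\) the binomial coefficient and the \(p^k\) term are both 1, which is why this result coincides with the geometric answer from Part a.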

In a binomial distribution, the mean is given by \(n * p\), or the number of trials times the probability of success for each trial. In this case that’s equal to \(8*(0.1)=0.8\). The standard deviation is \(\sqrt{np(1-p)}\), or \(\sqrt{(8)(0.1)(1-0.1)}\). Shown below, that works out to around 0.85.

sqrt((8)*(0.1)*(1-0.1))
## [1] 0.8485281

Part d

In a Poisson distribution, we model the likelihood of a success occurring \(k\) times over a given time period, when we know the average rate \(\lambda\) of that success occurring. The formula is given as follows:

\[ P(X = k) = \frac{e^{-\lambda}\lambda^k}{k!} \]

In our case, \(k\) would be equal to \(0\), because we want to know the probability of the machine not failing a single time in an 8-year span (ergo, failing only some time after 8 years). In this case, we must express the rate \(\lambda\) as the number of failures expected per 8 years. If we expect 0.1 failures in 1 year, we can expect 0.8 failures per 8 years.

\[ P(X = 0) = \frac{e^{-0.8}0.8^0}{0!} \]

\(0.8^0\) and \(0!\) both work out to 1, so we are left with \(e^{-0.8}\).

exp(-0.8)
## [1] 0.449329

That leaves us a probability of about 44.9%.
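As a check, the built-in Poisson density gives the same value, and it also matches the exponential survival probability for the same 10-year mean. That agreement is expected, since "zero events in 8 years" and "waiting more than 8 years for the first event" describe the same outcome. The variable names below are illustrative.

```r
lambda_8yr <- 0.1 * 8  # expected number of failures over 8 years

# P(0 failures in 8 years) under the Poisson model
p_zero_pois <- dpois(0, lambda = lambda_8yr)
p_zero_pois  # 0.449329

# Matches the exponential survival probability P(X > 8) with mean 10
all.equal(p_zero_pois, pexp(8, rate = 1 / 10, lower.tail = FALSE))
```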

In a Poisson distribution, the mean is equal to our rate \(\lambda\), or \(0.8\). The standard deviation is the square root of the mean, so \(\sqrt{0.8} \approx 0.8944\).

sqrt(0.8)
## [1] 0.8944272