Grando 7 Homework

1. Let X1, X2, . . . , Xn be n mutually independent random variables, each of which is uniformly distributed on the integers from 1 to k. Let Y denote the minimum of the Xi’s. Find the distribution of Y .

Answer:

The denominator of the distribution is pretty straightforward since it is all the choices for each evnt \({k}^{n}\).

For the numerator, let’s start with an example. Suppose we have \(k=3\) and \(n=3\), then we can easily compute the number of occurances with \(Y = 1\), which is a total of seven.

\[S = \left\{ 111, 112, 121, 122, 211, 212, 221 \right\}\]

How do I get these numbers? I’m just cycling through the options like so:

\[\sum{S} = 1\times k \times k + k \times 1 \times k + k \times + k \times k \times 1\]

But that’s not actually right, I’m removing repeats as I go; however I can express that by just removing one item from each set I iterated over:

\[\sum{S} = 1\times k \times k + \left(k-1\right) \times 1 \times k + \left(k-1\right) \times \left(k-1\right) \times 1 \]

If we look at the pattern, then we can see that I can express this operation more concisely:

\[\sum{S} = \sum _{ i = 0 }^{ n-1 }{ {k}^{n-1-i} \times {\left( k-1 \right)}^{i}}\]

But that is just for the value of 1 where we have \(k\) options. If the minimum were 2 then that would mean we would only have \(\left( k-1 \right)\) options to start with since none of the choices could contain 1. We can add that in by a simple substitution of \(\left(k-j \right)\) for \(k\):

\[\sum{S\left(Y\right)} = \sum _{ i = 0 }^{ n-1 }{ {\left(k-j \right)}^{n-1-i} \times {\left( k-j-1 \right)}^{i}} \\ where \quad j = \left( Y-1 \right)\]

\[\sum{S\left(Y\right)} = \sum _{ i = 0 }^{ n-1 }{ {\left(k-Y+1 \right)}^{n-1-i} \times {\left( k-Y \right)}^{i}}\]

So, let’s write a function that does this:

minimum_sum_set <- function(k, n, y) {
    x <- 0
    for (i in 0:(n - 1)) {
        x <- x + (k - y + 1)^(n - 1 - i) * (k - y)^i
    }
    return(x)
}

So, if we run the formula we just made for all \(Y\) in a set determined by \(k, n\), we should get the exact number of combinations \({k}^{n}\). Let’s check:

k <- sample(2:20, 1)
n <- sample(2:10, 1)

sum_all_minimum_sets <- function(k, n) {
    x <- 0
    for (y in 1:k) {
        x <- x + minimum_sum_set(k, n, y)
    }
    return(x)
}

all.equal(k^n, sum_all_minimum_sets(k, n))
## [1] TRUE

And now we can write a function that gets the probability distribution of \(Y\):

minimum_probability <- function(k, n) {
    y_v <- c(rep(0, k))
    p_v <- c(rep(0, k))
    for (i in 1:k) {
        y_v[i] <- i
        p_v[i] <- minimum_sum_set(k, n, i)/k^n
    }
    return(data.frame(y_v, p_v))
}

Now plot:

library(ggplot2)
library(knitr)
library(kableExtra)
prob_df <- minimum_probability(k, n)
colnames(prob_df) <- c("Y values", "Probabilities")
ggplot(prob_df, aes(x = `Y values`, y = Probabilities)) + geom_bar(stat = "identity") + 
    ggtitle(paste("Probability Function for K =", k, "n =", n, sep = " ")) + theme(plot.title = element_text(hjust = 0.5))

kable(prob_df, "html") %>% kable_styling(c("striped", "bordered")) %>% add_header_above(c(`Probabilty Density Function Results` = 2))
Probabilty Density Function Results
Y values Probabilities
1 0.2739750
2 0.2134029
3 0.1624943
4 0.1204089
5 0.0863065
6 0.0593467
7 0.0386891
8 0.0234936
9 0.0129197
10 0.0061272
11 0.0022758
12 0.0005252
13 0.0000350
# Sum of all K probabilities
sum(prob_df$Probabilities)
## [1] 1

To summarize, our final formula is as follows:

\[P\left(Y\right) = \frac{\sum _{ i = 0 }^{ n-1 }{ {\left(k-Y+1 \right)}^{n-1-i} \times {\left( k-Y \right)}^{i}} }{{k}^{n}}\]

2. Your organization owns a copier (future lawyers, etc.) or MRI (future doctors). This machine has a manufacturer’s expected lifetime of 10 years. This means that we expect one failure every ten years. (Include the probability statements and R Code for each part.).

a. What is the probability that the machine will fail after 8 years?. Provide also the expected value and standard deviation. Model as a geometric. (Hint: the probability is equivalent to not failing during the first 8 years..)

Answer:

A geometric distribution is described as the following:

\[P\left( X = x \right) = {q}^{x-1}p\]

\[E\left( X \right) = \frac{1}{p}\]

\[Var\left(X \right) = \frac{q}{{p}^{2}}\]

So we can use the expected value of ten years to find p:

\[E\left( X \right) = 10 = \frac{1}{p}\]

\[p = 0.1\]

To compute \(P\left( X > 8 \right)\) we can either add up all the probabilities of the discrete events up to 8 and subtract from 1:

\[P\left( X>8 \right) =1-\sum _{ x = 1 }^{ 9 }{ { q }^{ x-1 }p } \]

(round(1 - (0.9^(1 - 1) * 0.1 + 0.9^(2 - 1) * 0.1 + 0.9^(3 - 1) * 0.1 + 0.9^(4 - 
    1) * 0.1 + 0.9^(5 - 1) * 0.1 + 0.9^(6 - 1) * 0.1 + 0.9^(7 - 1) * 0.1 + +0.9^(8 - 
    1) * 0.1 + +0.9^(9 - 1) * 0.1), 3))
## [1] 0.387

Or we can just use the builtin function:

(round(pgeom(8, prob = 0.1, lower.tail = FALSE), 3))
## [1] 0.387

Or, we could have just used the cumulative distribution function:

(round((1 - (1 - (1 - 0.1)^9)), 3))
## [1] 0.387

As noted above, we determined the expected value to be ten years

The standard deviation can be computed as such:

\[sd(X)=\sqrt { Var\left( X \right) } =\sqrt { \frac { q }{ { p }^{ 2 } } } = \sqrt { \frac { 0.9 }{ { 0.1 }^{ 2 } } } \]

(round(sqrt(0.9/0.1^2), 3))
## [1] 9.487

b. What is the probability that the machine will fail after 8 years?. Provide also the expected value and standard deviation. Model as an exponential.

Answer:

A exponential cumulative distribution is described as the following:

\[P\left( X \le x \right) =1-{ e }^{ -\lambda x }\]

\[E\left( X \right) = \frac{1}{\lambda}\]

\[Var\left(X \right) = \frac{1}{{\lambda}^{2}}\]

As previously noted, the expected value is 10 years, which we use to compute \(\lambda\):

\[E\left( X \right) = 10 = \frac{1}{\lambda}\]

\[\lambda = 0.1\]

To compute \(P\left(X>8\right)\), we can use the cumulative distributin function:

(round(1 - (1 - exp(-0.1 * 8)), 3))
## [1] 0.449

builtin check:

pexp(8, 0.1, lower.tail = FALSE)
## [1] 0.449329

standard deviation:

(round(sqrt(1/0.1^2), 3))
## [1] 10

c. What is the probability that the machine will fail after 8 years?. Provide also the expected value and standard deviation. Model as a binomial. (Hint: 0 success in 8 years)

Answer:

Note, I realy struggled with this one, and i’m unsure if it is correct since I didn’t incorporate the hint into my answer, but here’s my best shot….

A binomial cumulative distribution is described as the following:

\[P\left( X\le x \right) =\sum _{ i=0 }^{ x }{ \left( \begin{matrix} n \\ i \end{matrix} \right) { p }^{ i }{ \left( 1-p \right) }^{ n-i } } \]

\[E\left( X \right) = np\]

\[Var\left(X \right) = np\left(1-p\right)\]

As previously noted, the expected value is 10 years, which we use to compute p:

\[E\left( X \right) = 10 = np\]

However, we have some trouble here, because n (the number of years we observe the machine status), is used to determine the expected value. So, let’s just pick 20 years and see what we get (\(p = 0.5\)).

To compute \(P\left(X>8\right)\), we can use the cumulative distributin function:

n = 20
p = 10/n
round(1 - (choose(n, 0) * p^0 * (1 - p)^(n - 0) + choose(n, 1) * p^1 * (1 - p)^(n - 
    1) + choose(n, 2) * p^2 * (1 - p)^(n - 2) + choose(n, 3) * p^3 * (1 - p)^(n - 
    3) + choose(n, 4) * p^4 * (1 - p)^(n - 4) + choose(n, 5) * p^5 * (1 - p)^(n - 
    5) + choose(n, 6) * p^6 * (1 - p)^(n - 6) + choose(n, 7) * p^7 * (1 - p)^(n - 
    7) + choose(n, 8) * p^8 * (1 - p)^(n - 8)), 3)
## [1] 0.748

Well, that’s okay but since there isnt’ a definite maximum number of years, why don’t we just pick something really big to encompass all options.

n = 1e+07

binom_func <- function(n) {
    p = 10/n
    round(1 - (choose(n, 0) * p^0 * (1 - p)^(n - 0) + choose(n, 1) * p^1 * (1 - p)^(n - 
        1) + choose(n, 2) * p^2 * (1 - p)^(n - 2) + choose(n, 3) * p^3 * (1 - p)^(n - 
        3) + choose(n, 4) * p^4 * (1 - p)^(n - 4) + choose(n, 5) * p^5 * (1 - p)^(n - 
        5) + choose(n, 6) * p^6 * (1 - p)^(n - 6) + choose(n, 7) * p^7 * (1 - p)^(n - 
        7) + choose(n, 8) * p^8 * (1 - p)^(n - 8)), 3)
}

binom_func(n)
## [1] 0.667

builtin check:

round(pbinom(8, prob = 10/n, size = n, lower.tail = FALSE), 3)
## [1] 0.667

standard deviation:

n * (10/n) * (1 - (10/n))
## [1] 9.99999

well, that is an odd way of solving this problem so I thought I would also plot \(P\left(X>8\right)\) for given choices of n:

binom_df <- data.frame(val = seq(10, 10000))
binom_df$prob <- binom_func(binom_df$val)
ggplot(binom_df, aes(x = val, y = prob)) + geom_line() + coord_cartesian(xlim = c(10, 
    100)) + labs(x = "Values of n", y = "Probability")

d. What is the probability that the machine will fail after 8 years?. Provide also the expected value and standard deviation. Model as a Poisson.

Answer:

The cumulative poisson distribution function is:

\[P\left( X\le x \right) ={e}^{-\lambda}\sum _{ i=0 }^{ x }{ \frac{{\lambda}^{i}}{i!} } \]

Where the expected value and variance is:

\[\lambda = E\left(X\right) = Var\left(X\right) = 10\]

\[sd\left(X\right)=\sqrt{10}\] So, the probability using this distribution is:

round(1 - exp(-10) * (10^0/factorial(0) + 10^1/factorial(1) + 10^2/factorial(2) + 
    10^3/factorial(3) + 10^4/factorial(4) + 10^5/factorial(5) + 10^6/factorial(6) + 
    10^7/factorial(7) + 10^8/factorial(8)), 3)
## [1] 0.667

builtin check:

(round(ppois(8, lambda = 10, lower.tail = FALSE), 3))
## [1] 0.667