DATA 605 FALL 2019 HW7

Alexander Ng

10/13/2019

Problem 1

Statement

Let \(X_1, X_2, \dots, X_n\) be \(n\) mutually independent random variables, each of which is uniformly distributed on the integers from 1 to \(k\). Let \(Y = \min(X_1, X_2, \dots, X_n)\). Find the distribution of \(Y\).

Solution

We seek to calculate \[ F(s) = P[Y \leq s] \text{ where } s \in \{1,\dots,k\}\].

Clearly \[P[Y \leq s] = 1 - P[Y > s] = 1 - P[X_1 > s, X_2>s, \dots, X_n > s]\]

By the independence and identical distribution of the \(X_i\), and since \(P[X_i > s] = \frac{k-s}{k}\) for each \(i\), the RHS equals:

\[1-P[X_1>s]P[X_2>s]\dots P[X_n>s] = 1 - \prod_{i=1}^{n}P[X_i>s]= 1 - \left(1 - \frac{s}{k}\right)^n \] Therefore the cumulative distribution function of \(Y\) is:

\[F(s) = 1 - \left( 1 - \frac{s}{k} \right)^n\]
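Because \(Y\) takes integer values, the probability mass function can be recovered from the CDF by differencing. A short derivation, assuming the convention \(F(0) = 0\):

\[P[Y = s] = F(s) - F(s-1) = \left(1 - \frac{s-1}{k}\right)^n - \left(1 - \frac{s}{k}\right)^n = \frac{(k-s+1)^n - (k-s)^n}{k^n}, \quad s \in \{1,\dots,k\}\]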

For fixed \(k\) and large \(n\), the minimum \(Y\) should attain the value 1 with high probability because the chance of no variables attaining the value of 1 decreases to zero as \(n\) grows large. We illustrate with an example using \(n=10\) and \(k=6\):

n = 10
k = 6
(s = c(1:k))
## [1] 1 2 3 4 5 6
(F =  1- (1- s/k)**n )
## [1] 0.8384944 0.9826585 0.9990234 0.9999831 1.0000000 1.0000000
(df = data.frame( s = s, F = F))
##   s         F
## 1 1 0.8384944
## 2 2 0.9826585
## 3 3 0.9990234
## 4 4 0.9999831
## 5 5 1.0000000
## 6 6 1.0000000
library(ggplot2)  # provides ggplot(), aes(), geom_bar()
ggplot( data=df, aes(x=s, y = F) ) + geom_bar( stat="identity") +
  ggtitle("CDF of Y for n=10 and k=6") + xlab("value s") + ylab("Cumulative probability")
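
As a sanity check on the closed-form CDF, the minimum can also be simulated directly. This is only a sketch: the seed, the number of simulated experiments `trials`, and the variable names are arbitrary choices made for illustration.

```r
set.seed(605)        # arbitrary seed, only for reproducibility
n <- 10
k <- 6
trials <- 100000     # assumed number of simulated experiments

# Each row holds one experiment: n independent draws from 1..k
draws <- matrix(sample.int(k, n * trials, replace = TRUE), nrow = trials)
y <- apply(draws, 1, min)   # the minimum of each experiment

s <- 1:k
empirical   <- sapply(s, function(v) mean(y <= v))  # simulated P[Y <= s]
theoretical <- 1 - (1 - s / k)^n                    # closed-form F(s)

round(cbind(s, empirical, theoretical), 4)
```

The two columns should agree to within simulation noise, which shrinks as `trials` grows.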

Problem 2A

Statement

What is the probability the machine will fail after 8 years? Provide the expected value and standard deviation. Model as a geometric.

Solution

Let \(Y\) be the year in which the machine first fails. The geometric distribution states that

\[P(Y = j) = q^{j-1}p \text{ where } p \text{ is the probability of failure in any given year and } q = 1 - p \] We seek \(P[Y > 8]\), which equals \(1- P[Y \leq 8]\). This gives us:

\[ P[Y > 8] = 1 - (p + pq + pq^2 + \dots + pq^7)\]

(p = 0.1)
## [1] 0.1
(q = 1 - p )
## [1] 0.9
(ProbFailAfter8 = 1 - p * ( 1 + q + q^2 + q^3 + q^4 + q^5 + q^6 + q^7) )
## [1] 0.4304672

The probability that the copier first fails after 8 years is 43.05%.
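
The finite sum above telescopes to \(q^8\), since the machine fails after year 8 exactly when it survives each of the first 8 years. A minimal check, using R's built-in `pgeom`; note that `pgeom` counts the number of failure-free trials before the first failure, so \(Y > 8\) corresponds to more than 7 such trials. The variable names are illustrative.

```r
p <- 0.1
q <- 1 - p

# Closed form: survive 8 consecutive years
tail_closed <- q^8

# Built-in: pgeom models X = Y - 1, so P[Y > 8] = P[X > 7]
tail_builtin <- pgeom(7, prob = p, lower.tail = FALSE)

c(closed_form = tail_closed, builtin = tail_builtin)
```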

We take the mean and variance of the geometric distribution from Wikipedia: [https://en.wikipedia.org/wiki/Geometric_distribution]

\[E[Y] = \frac{1}{p} \text{ and } Var(Y) = \frac{ 1- p}{p^2} \text{ where p = 0.1 }\]

(expected_value = 1/p)
## [1] 10
(variance = (1-p)/(p^2))
## [1] 90
(standard_dev = sqrt(variance))
## [1] 9.486833

The expected value is \(E[Y] = 10\) years and the standard deviation is 9.4868 years.
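
The quoted formulas can be checked numerically by summing the probability mass function directly. This sketch truncates the infinite sum at an arbitrarily chosen `j_max = 2000`, beyond which the remaining mass is negligible for \(p = 0.1\).

```r
p <- 0.1
q <- 1 - p
j_max <- 2000            # assumed truncation point for the infinite sum
j <- 1:j_max
pmf <- q^(j - 1) * p     # P(Y = j)

mean_y <- sum(j * pmf)                 # approximates E[Y]
var_y  <- sum(j^2 * pmf) - mean_y^2    # approximates Var(Y)
sd_y   <- sqrt(var_y)

c(mean = mean_y, variance = var_y, sd = sd_y)
```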

Problem 2B

Statement

What is the probability the machine will fail after 8 years? Provide the expected value and standard deviation. Model as an exponential.

Solution

Let \(Y\) be the arrival time of the failure of the copier where we assume \(Y\) has an exponential distribution. We know the cumulative distribution function is given by \[ F(x, \lambda) = \begin{cases} 1 - e^{-\lambda x } & x \geq 0 \\ 0 & x < 0 \end{cases} \]

We seek to calculate
\[P[ Y > 8 ] = 1 - P[ Y \leq 8] = 1 - ( 1- e^{-8\lambda} ) = e^{-8\lambda} \]

We know that \(\lambda = \frac{1}{10}\) per year is the relevant rate parameter for the copier. Using the above formula, we calculate the probability of first failure exceeding 8 years:

lambda = 0.1
time = 8
(prob_exp = 1 - (1 - exp(-lambda *time) ) )
## [1] 0.449329

We conclude that the probability that the first failure arrives after 8 years is 44.93%.
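
The same tail probability is available from R's built-in exponential CDF, `pexp`, with `lower.tail = FALSE`. A small sketch with illustrative variable names:

```r
lambda <- 0.1   # failure rate per year
t_years <- 8

tail_manual  <- exp(-lambda * t_years)                        # e^(-8 * lambda)
tail_builtin <- pexp(t_years, rate = lambda, lower.tail = FALSE)

c(manual = tail_manual, builtin = tail_builtin)
```

Both values should match the 0.449329 computed above.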

From Wikipedia’s page on the exponential distribution we have:

\[E[Y] = \frac{1}{\lambda} \text{ and } \sigma(Y) = \frac{1}{\lambda}\]

[https://en.wikipedia.org/wiki/Exponential_distribution]

(Expected_value = 1/lambda )
## [1] 10

This implies the expectation \(E[Y] = 10\) years and standard deviation also equals 10 years.
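
The code above computes only the mean. As a rough numerical check of both moments, the defining integrals can be evaluated with base R's `integrate`; this is a sketch and the results are accurate only up to the integrator's tolerance.

```r
lambda <- 0.1

# First and second moments of an exponential with rate lambda
m1 <- integrate(function(x) x   * dexp(x, rate = lambda), lower = 0, upper = Inf)$value
m2 <- integrate(function(x) x^2 * dexp(x, rate = lambda), lower = 0, upper = Inf)$value

mean_exp <- m1
sd_exp   <- sqrt(m2 - m1^2)

c(mean = mean_exp, sd = sd_exp, formula = 1 / lambda)
```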

Problem 2C

Statement

What is the probability the machine will fail after 8 years? Provide the expected value and standard deviation. Model as a binomial.

Solution

A binomial distribution with parameters \(n\) and \(p\) is the distribution of the random variable which counts the number of heads which occur when a coin is tossed \(n\) times, assuming that on any one toss the probability of a head is \(p\). Here we will treat a “head” as a machine failure in a given year. The probability of a failure is \(p =0.1\) and \(n = 8\). The machine fails after 8 years exactly when there are \(k = 0\) failures in the first 8 years.

The probability of exactly \(k\) failures in \(n\) trials is \[b(n,p,k) = {n\choose k} p^kq^{n-k}\] and we evaluate it at \(n = 8\) and \(k = 0\):

\[b(8,p,0) = {8\choose 0}p^0q^8 = q^8\]

p = 0.1
q = 1 - p

(binomial_8_k = q^8)
## [1] 0.4304672

The probability of the copier failure occurring after year 8 is 43.05%.
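
The same value can be obtained from R's built-in binomial mass function, `dbinom`, evaluated at zero failures. A short sketch with illustrative variable names:

```r
n <- 8
p <- 0.1

prob_manual  <- (1 - p)^n                    # q^8
prob_builtin <- dbinom(0, size = n, prob = p) # P(0 failures in 8 trials)

c(manual = prob_manual, builtin = prob_builtin)
```

This agrees with the geometric model, which is expected: both describe zero failures in 8 independent yearly trials.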

If we model the failures as binomial coin tosses with \(n=8\) trials, there is a positive probability of no failures.

According to Wikipedia, the mean and standard deviation of a binomial distribution are:

\[ E[Y] = np \text{ and } \sigma(Y) = \sqrt{ np(1-p)} \]

n = 8
p = 0.1
( Expected = n *p )
## [1] 0.8
(StandardDev = sqrt( n * p * (1-p) ) )
## [1] 0.8485281

We conclude that over \(n=8\) trials (each trial corresponding to one year of use) with failure probability \(p=0.1\) per trial, the expected number of failures is 0.8 and the standard deviation of the number of failures is 0.8485.

Problem 2D

Statement

What is the probability the machine will fail after 8 years? Provide the expected value and standard deviation. Model as a Poisson.

Solution

In a Poisson process, we model the number of times \(Y\) an event occurs in an interval of time. In our case, the interval of time is 1 year. The event corresponds to failure of the copier in the time interval.

\(\lambda\) is the average number of events per interval, so we know \[\lambda = \frac{1}{10}\].

We are interested in 8 consecutive non-overlapping intervals that contain no Poisson events. Under a Poisson process, the failure counts in the 8 intervals are independent and identically distributed. Thus, we can write:

\[ P[ Y = 0 \mid \text{year} = 1] = P[ Y = 0 \mid \text{year} = 2] = \dots = P[ Y = 0 \mid \text{year} = 8]\]

The probability of \(k\) failures in a single year is

\[P[ Y = k \text{ failures in year 1}] = \frac{\lambda^k \exp(-\lambda)}{k!}\]

Therefore, \[P[Y = 0 \text{ in year } j] = \frac{\lambda^0 \exp(-\lambda)}{0!} = \exp(-\lambda) = \exp\left(-\frac{1}{10}\right)\]

It follows by independence and identical distribution that

\[P[Y = 0 \text{ in years 1..8}] = \left[ \exp\left(-\frac{1}{10}\right) \right]^8 = \exp\left( -\frac{8}{10} \right)\]

lambda = 0.1

(prob_no_failure_in_8 = exp( -lambda )**8 )
## [1] 0.449329

We conclude the probability of no arrival of a copier failure in the first 8 years is 44.93%.
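
Equivalently, the failure count over the whole 8-year window is Poisson with mean \(8\lambda\), so the result can be read off R's built-in `dpois` in one step. A brief sketch with illustrative variable names:

```r
lambda <- 0.1   # expected failures per year
years  <- 8

prob_product <- exp(-lambda)^years               # product of 8 yearly probabilities
prob_window  <- dpois(0, lambda = lambda * years) # zero events in one 8-year window

c(product = prob_product, window = prob_window)
```

The value also coincides with the exponential tail \(e^{-8\lambda}\), since no Poisson event in 8 years means the first arrival time exceeds 8 years.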

The expected number of failures in one year is \(E[Y] = \lambda\), and the variance of this Poisson count is \(Var[Y] = \lambda\). Thus, the standard deviation of the count is \(Std(Y)=\sqrt{\lambda}\).

(stdev_poisson = sqrt(lambda))
## [1] 0.3162278

We conclude that the average number of copier failures in a year is 0.1, and the standard deviation of the number of failures in one year is 0.3162.

This means that a copier failure takes 10 years on average to arrive. Over a 10-year window, the failure count has mean \(10\lambda = 1\) and standard deviation \(\sqrt{10\lambda} = 1\) failure.
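
To make the scaling with window length concrete, the mean and standard deviation of the failure count can be tabulated for a few windows. The choice of window lengths below is an illustrative assumption.

```r
lambda <- 0.1             # expected failures per year
years  <- c(1, 8, 10)     # assumed window lengths, for illustration

# A Poisson count over t years has mean lambda * t and sd sqrt(lambda * t)
poisson_summary <- data.frame(
  years      = years,
  mean_count = lambda * years,
  sd_count   = sqrt(lambda * years)
)
poisson_summary
```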