Let X1, X2, ..., Xn be n mutually independent random variables, each of which is uniformly distributed on the integers from 1 to k. Let Y denote the minimum of the Xi’s. Find the distribution of Y.
### Attempt at an analytic solution ###
# Let's assume k = 11 and solve this one at a time for n = 1, 2, 3, and then see if we can spot a pattern.
# if n = 1 then the minimum, Y, can be any deviate, uniformly distributed, from 1 to 11.
# If n = 2 then there are two deviates between 1 and 11. Let's say the minimum, Y, is 5. Then one of the deviates is 5 and the other is between 5 and 11, which I take to have probability (11-5)/(11-1). In general, if one of the deviates is the minimum, Y, then the other is between Y and 11 with probability (11-Y)/(11-1). Since either deviate can be the lowest, we multiply the probability by 2. Generalizing further by replacing 11 with k, the probability is 2*(k-Y)/(k-1).
# If n = 3 then there are three deviates between 1 and k. If one of the deviates is the minimum, Y, then the other two are between Y and k, with probability (k-Y)/(k-1). Since there are three deviates that could be the minimum, we have to multiply the probability by 3!
# Extending the pattern gives n!*(k-Y)/(k-1)
# However, this doesn't pass the sniff test of looking like a distribution. Even for small n the expression exceeds 1 (n = 2 and Y = 1 already gives 2), so it is not a probability.
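For comparison, here is a minimal sketch of a standard counting argument, offered as an alternative and not as a repair of the attempt above. Every draw is at least y with probability ((k-y+1)/k)^n, so P(Y = y) = ((k-y+1)^n - (k-y)^n)/k^n for y = 1, ..., k. The helper name `min_pmf` is hypothetical.

```r
# Exact pmf of Y = min(X1, ..., Xn) for Xi uniform on the integers 1..k.
# min_pmf is a hypothetical helper name, not part of any package.
min_pmf <- function(k, n) {
  y <- 1:k
  # P(Y >= y) = ((k - y + 1) / k)^n; differencing consecutive values gives P(Y = y)
  ((k - y + 1)^n - (k - y)^n) / k^n
}

pmf <- min_pmf(k = 11, n = 5)
round(pmf, 4)
sum(pmf)            # the terms telescope, so the total is 1
sum((1:11) * pmf)   # exact expected minimum for the integer-valued problem
```

Unlike n!*(k-Y)/(k-1), these values stay between 0 and 1 and sum to 1 for any n and k.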
### Attempt at a simulated solution ###
# If I were to attempt a second analytic solution, I would count the number of ways one value equals Y while the other values are at least Y, divided by the total number of possible outcomes. That only works for a discrete set of values, not for a continuous one. The simulation below, written in R, instead draws from the continuous uniform distribution on [1, k] as an approximation.
# I'm modeling where k = 11 and n = 5
set.seed(seed=99)
Simulations <- 10
k <- 11
n <- 5
SumOfMins <- 0
for (s in 1:Simulations) {
  # runif() draws continuous values, so each draw is rescaled to the interval [1, k]
  SumOfMins <- SumOfMins + min(runif(n)*(k-1)+1)
}
sprintf("Average minimum was: %f", SumOfMins/Simulations)
## [1] "Average minimum was: 2.733364"
# With this as a kernel, we could increase the number of simulations and, instead of taking the average, build a histogram approximating the distribution of Y for multiple values of k and n. That would make a slick Shiny app!
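The histogram idea can be sketched as follows, assuming we draw integers with `sample.int()` so the simulation matches the problem as stated (integers from 1 to k). The 10,000 replicates and the variable names are illustrative choices, not values from the run above.

```r
set.seed(99)
k <- 11
n <- 5
simulations <- 10000   # illustrative choice; more replicates give smoother frequencies

# One simulated minimum per replicate, drawing n integers from 1..k with replacement
mins <- replicate(simulations, min(sample.int(k, n, replace = TRUE)))

# Relative frequency of each possible minimum, keeping values that never occurred
empirical <- as.numeric(table(factor(mins, levels = 1:k))) / simulations

# Tabulate the approximate distribution of Y
data.frame(y = 1:k, frequency = empirical)
```

Passing `empirical` to `barplot()` would give the picture of the distribution for one choice of k and n.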
Your organization owns a copier (future lawyers, etc.) or MRI (future doctors). This machine has a manufacturer’s expected lifetime of 10 years. This means that we expect one failure every ten years. (Include the probability statements and R code for each part.)
# From homework 5, the exponential distribution is the best fit for this problem. Poisson seems the least suitable, since its probability is well outside the range of the other three. The binomial gives an odd result: its standard deviation is 1/10th of the geometric distribution's. If the standard deviation really were that low, then 8 years would sit about two standard deviations below the mean, and the probability of failing after 8 years should be much higher than 43%. Rather than calculating the binomial standard deviation differently, it makes more sense to say this isn't a suitable problem for a binomial distribution.
Distribution <- c("Geometric", "Exponential", "Binomial", "Poisson")
Mean <- c(10, 10, 10, 10)
STD <- c(9.4868, 10, 0.9487, 3.1623)
Probability <- c("43.05%", "44.93%", "43.05%", "66.72%")
df <- data.frame(Distribution, Mean, STD, Probability)
library(knitr)
kable(df, caption = "The same probability modeled with four distributions")
| Distribution | Mean | STD | Probability |
|---|---|---|---|
| Geometric | 10 | 9.4868 | 43.05% |
| Exponential | 10 | 10.0000 | 44.93% |
| Binomial | 10 | 0.9487 | 43.05% |
| Poisson | 10 | 3.1623 | 66.72% |
What is the probability that the machine will fail after 8 years? Also provide the expected value and standard deviation. Model as a geometric. (Hint: the probability is equivalent to not failing during the first 8 years.)
# The mean of a geometric random variable is 1/p
# They tell us the mean is 10 years so p is .1
10
## [1] 10
# The standard deviation of a geometric random variable is sqrt((1-p)/p^2)
sqrt((1-.1)/.1^2)
## [1] 9.486833
# The probability that the machine will fail after 8 years
# is the same as 1 - the probability that it fails in the first 8 years
# we can calculate that from first principles using the geometric distribution
prob = 0
for (x in 1:8)
{
prob = prob + .9^(x-1)*.1
}
1-prob
## [1] 0.4304672
# as a check, the same calculation with R's built-in function:
# (Note, the first argument sent to pgeom() is the number of failures before the first success)
1-pgeom(7,.1)
## [1] 0.4304672
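As a cross-check, the same geometric tail probability can be reached three ways. This is a minimal sketch; the variable names are illustrative, and `p = 0.1` comes from the mean of 10 years above.

```r
p <- 0.1

# Closed form: failing after year 8 means 8 consecutive years without a failure
closed_form <- (1 - p)^8

# From the pmf: dgeom(x, p) is the probability of x non-failure years
# before the first failure, so sum over x = 0..7 and take the complement
from_pmf <- 1 - sum(dgeom(0:7, prob = p))

# Upper tail directly, avoiding the explicit subtraction from 1
upper_tail <- pgeom(7, prob = p, lower.tail = FALSE)

all.equal(closed_form, from_pmf)
all.equal(closed_form, upper_tail)
```

Using `lower.tail = FALSE` is the usual way to ask R for P(X > x) without writing `1 - pgeom(...)`.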
What is the probability that the machine will fail after 8 years? Also provide the expected value and standard deviation. Model as an exponential.
# They tell us the mean is 10 years
10
## [1] 10
# The standard deviation of an exponential random variable is the same as the mean.
10
## [1] 10
# For an exponential random variable, the decay parameter, m, is 1/mean
# the pdf is m*e^(-mx) where x is the time unit elapsed
# and the cdf is 1-e^(-mx)
# We're trying to find 1 minus the probability that there was a failure in the first 8 years
1-(1-exp(-.1*8))
## [1] 0.449329
# which is calculated with R's built-in function as:
1-pexp(8,.1)
## [1] 0.449329
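The link between the pdf and the tail probability can also be checked numerically. The sketch below integrates the pdf from 8 to infinity with `integrate()` and compares it with the upper tail from `pexp()`. The names `rate` and `pdf_exp` are illustrative.

```r
rate <- 1 / 10   # decay parameter m = 1 / mean

# pdf of the exponential distribution: m * e^(-m * x)
pdf_exp <- function(x) rate * exp(-rate * x)

# P(T > 8) as the area under the pdf from 8 to infinity (numerical approximation)
tail_by_integration <- integrate(pdf_exp, lower = 8, upper = Inf)$value

# The same tail from the built-in cdf
tail_by_pexp <- pexp(8, rate = rate, lower.tail = FALSE)

c(integration = tail_by_integration, pexp = tail_by_pexp)
```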
What is the probability that the machine will fail after 8 years? Also provide the expected value and standard deviation. Model as a binomial. (Hint: 0 success in 8 years)
# They tell us the mean is 10 years
10
## [1] 10
# The standard deviation of a binomial distribution is sqrt(np(1-p)), but we aren't given n or p directly.
# We'll treat each year as one trial, so n is the number of years.
# One failure is expected every 10 years, so with n = 10 the expected count np = 1 gives a probability of failure in any one year of p = 0.1.
# Now we can calculate the standard deviation of the number of failures in 10 years:
sqrt(10*.1*(1-.1))
## [1] 0.9486833
# The probability that the machine will fail after 8 years is the probability that it doesn't fail for 8 years:
.9^8
## [1] 0.4304672
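The hint (0 successes in 8 years) maps directly onto `dbinom()`. The sketch below assumes the window of interest is the 8 years in the question, not the 10 used for the standard deviation above, so its mean and standard deviation describe the failure count over those 8 years. Variable names are illustrative.

```r
p <- 0.1
n_years <- 8   # assumption: count failures over the 8-year window in the question

# Probability of exactly 0 failures in 8 yearly trials
prob_no_failure <- dbinom(0, size = n_years, prob = p)

# Mean and standard deviation of the number of failures in that window
mean_failures <- n_years * p
sd_failures <- sqrt(n_years * p * (1 - p))

c(prob = prob_no_failure, mean = mean_failures, sd = sd_failures)
```

These are counts of failures, not years until failure, which is one reason the binomial standard deviation is hard to compare with the geometric and exponential ones.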
What is the probability that the machine will fail after 8 years? Also provide the expected value and standard deviation. Model as a Poisson.
# They tell us the mean, or lambda in a Poisson distribution, is 10 years
10
## [1] 10
# The standard deviation of a Poisson random variable is the square root of the mean
sqrt(10)
## [1] 3.162278
# The probability that the machine will fail after 8 years
# is computed here as 1 - P(X <= 8) for X ~ Poisson(10)
# we can calculate that from first principles using the Poisson distribution
# (the sum has to start at x = 0, because the Poisson pmf puts mass on 0)
prob = 0
for (x in 0:8)
{
prob = prob + exp(-10)*10^x/factorial(x)
}
1-prob
## [1] 0.6671803
# as a check, the same calculation with R's built-in function:
1-ppois(8,10)
## [1] 0.6671803
# Note, the two results agree once the sum includes the x = 0 term. Starting the loop at x = 1 drops exp(-10), about 0.0000454, and gives 0.6672257 instead. That gap is a missing term, not rounding.
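One alternative reading, sketched here as an assumption and not as the calculation above, treats lambda as the expected number of failures in the window. With a rate of 0.1 failures per year over 8 years, lambda is 0.8, and "fail after 8 years" becomes "0 failures in 8 years". Variable names are illustrative.

```r
rate_per_year <- 1 / 10   # one failure expected every 10 years
years <- 8

# Assumption: lambda is the expected failure count over the 8-year window
lambda <- rate_per_year * years

# Probability of observing 0 failures in 8 years
prob_no_failure <- dpois(0, lambda)

# Standard deviation of the failure count under this reading
sd_count <- sqrt(lambda)

# Under this reading the Poisson answer coincides with the exponential tail
all.equal(prob_no_failure, pexp(years, rate = rate_per_year, lower.tail = FALSE))
```

Under this reading the Poisson and exponential models give the same probability, since both describe the same constant-rate failure process.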