Let X1, X2, . . . , Xn be n mutually independent random variables, each of which is uniformly distributed on the integers from 1 to k. Let Y denote the minimum of the Xi’s. Find the distribution of Y .
Answer
k <- 30
Y <- list()
i <- 1
for(i in 1:100) {
x <- sample(k, 1, replace = TRUE)
Y[length(Y) + 1] <- list(min(x))
i <- i + 1
}
hist(as.numeric(Y))
When the sample size is 100, the distribution of Y is being shown in the graph, however, it does not show much pattern of any distribution. Lets see the effect of increasing the size of number from 100 to 10000.
k <- 30
Y <- list()
i <- 1
for(i in 1:100000) {
x <- sample(k, 1, replace = TRUE)
Y[length(Y) + 1] <- list(min(x))
i <- i + 1
}
hist(as.numeric(Y))
The distribution appears to be discrete uniform distribution too.
Number of possible combinations of \(X_i\)’s is \(k^n\) (choosing \(n\) values out of \(k\) options with replacement).
When considering the number of combinations with at least one \(1\), it is equal to all combinations (\(k^n\)) minus all combinations with values between \(2\) and \(k\) (\((k-1)^n\)). So \(P(Y=1) = \frac{k^n-(k-1)^n}{k^n}\).
When considering the number of combinations with at least one \(2\) and no \(1\), it is euqal to all combinations (\(k^n\)) minus all combinations with at least one \(1\) (see above: \(k^n-(k-1)^n\)) and minus all combinations with values between \(3\) and \(k\) (\((k-2)^n\)). So \(P(Y=2) = \frac{k^n-(k^n-(k-1)^n)-(k-2)^n}{k^n}= \frac{k^n-k^n+(k-1)^n-(k-2)^n}{k^n}= \frac{(k-1)^n-(k-2)^n}{k^n}\).
Similarly considering combinations without \(1\) or \(2\) and with at least one \(3\),
\[\begin{split} P(Y=3) &=\frac{k^n - (k^n-(k-1)^n)-((k-1)^n-(k-2)^n)-(k-3)^n}{k^n}\\ &=\frac{k^n - k^n+(k-1)^n-(k-1)^n+(k-2)^n-(k-3)^n}{k^n}\\ &= \frac{(k-2)^n-(k-3)^n}{k^n} \end{split}\]
Generally, \(P(Y=a) = \frac{(k-a+1)^n-(k-a)^n}{k^n}\).
Setting up a function to run simulated trials.
Q1sim <- function(k,n,trials=100000) {
Y<-rep(0,trials)
for (i in 1:trials) {
x<-sample.int(k,size=n,replace=TRUE)
Y[i]<-min(x)
}
return(Y)
}
Plot distribution of simulated trials and theoretical probability distribution for several values of \(k\) and \(n\).
# Run 1
par(mfrow=c(2,2))
k<-100
n<-20
hist(Q1sim(k,n),breaks=60,
main=paste("Simulation with k=",k," and n=",n,sep=""),
xlab="Y",xlim=c(1,k))
pY<-((k-1:k+1)^n-(k-1:k)^n)/k^n
barplot(pY,main=paste("Theoretical with k=",k," and n=",n,sep=""),
xlab="Y",xlim=c(1,k))
# Run 2
k<-100
n<-5
hist(Q1sim(k,n),breaks=60,
main=paste("Simulation with k=",k," and n=",n,sep=""),
xlab="Y",xlim=c(1,k))
pY<-((k-1:k+1)^n-(k-1:k)^n)/k^n
barplot(pY,main=paste("Theoretical with k=",k," and n=",n,sep=""),
xlab="Y",xlim=c(1,k))
# Run 3
par(mfrow=c(2,2))
k<-20
n<-5
hist(Q1sim(k,n),breaks=60,
main=paste("Simulation with k=",k," and n=",n,sep=""),
xlab="Y",xlim=c(1,k))
pY<-((k-1:k+1)^n-(k-1:k)^n)/k^n
barplot(pY,main=paste("Theoretical with k=",k," and n=",n,sep=""),
xlab="Y",xlim=c(1,k))
# Run 4
k<-20
n<-100
hist(Q1sim(k,n),breaks=60,
main=paste("Simulation with k=",k," and n=",n,sep=""),
xlab="Y",xlim=c(1,k))
pY<-((k-1:k+1)^n-(k-1:k)^n)/k^n
barplot(pY,main=paste("Theoretical with k=",k," and n=",n,sep=""),
xlab="Y",xlim=c(1,k))
Your organization owns a copier (future lawyers, etc.) or MRI (future doctors). This machine has a manufacturer’s expected lifetime of 10 years. This means that we expect one failure every ten years. (Include the probability statements and R Code for each part.).
Provide also the expected value and standard deviation. Model as a geometric. (Hint: the probability is equivalent to not failing during the first 8 years..)
Answer
\[ Expected = 10; Prob = \frac{1}{Expected}; Probability = (1-Prob)^{n-1}Prob\]
Geometric Distribution
n <- 8 #number of trials
k <- 0 #number of failures
p <- 1/10 #probability of successes
E <- 1/p
paste("Expected Value:", E)
## [1] "Expected Value: 10"
sd <- sqrt((1-p)/(p^2))
paste("Standard Deviation:", sd)
## [1] "Standard Deviation: 9.48683298050514"
# Calculating P(X>8) using geometric distribution
paste("Possibility that machine might fail after 8 years via Geometric#1:", pgeom(8, 0.1, lower.tail=FALSE))
## [1] "Possibility that machine might fail after 8 years via Geometric#1: 0.387420489"
paste("Possibility that machine might fail after 8 years via Geometric#2:", 1-sum(dgeom(0:8, p)))
## [1] "Possibility that machine might fail after 8 years via Geometric#2: 0.387420489"
Provide also the expected value and standard deviation. Model as an exponential.
Answer
Exponential Distribution
n <- 8 #number of trials
k <- 0 #number of failures
p <- 1/10 #probability of successes
E <- 1/p
paste("Expected Value:", E)
## [1] "Expected Value: 10"
sd <- sqrt(n*p*(1-p))
paste0("Standard Deviation is ", sqrt(1/p^2))
## [1] "Standard Deviation is 10"
# Calculating P(X>8) using exponential distribution
paste("Possibility that machine might fail after 8 years via Exponential#1:", pexp(8, 0.1, lower.tail=FALSE))
## [1] "Possibility that machine might fail after 8 years via Exponential#1: 0.449328964117222"
paste("Possibility that machine might fail after 8 years via Exponential#2:", dexp(n, p))
## [1] "Possibility that machine might fail after 8 years via Exponential#2: 0.0449328964117222"
Provide also the expected value and standard deviation. Model as a binomial. (Hint: 0 success in 8 years)
Answer
n <- 8 #number of trials
k <- 0 #number of failures
p <- 1/10 #probability of successes
k <- 0
E <- n*p
paste("Expected Value:", E)
## [1] "Expected Value: 0.8"
sd <- sqrt(n*p*(1-p))
paste("Standard Deviation:", sd)
## [1] "Standard Deviation: 0.848528137423857"
# Calculating P(X=0) for n=8 using binomial distribution
paste("Possibility that machine might fail after 8 years via Binomial#1:", pbinom(0,8,0.1,lower.tail=TRUE))
## [1] "Possibility that machine might fail after 8 years via Binomial#1: 0.43046721"
paste("Possibility that machine might fail after 8 years via Binomial#2:", dbinom(k, n, p))
## [1] "Possibility that machine might fail after 8 years via Binomial#2: 0.43046721"
Expected value and standard deviation will depend on number of years/trials tracked. Consider first 8 years.
Provide also the expected value and standard deviation. Model as a Poisson.
Answer
For a Poisson distribution, lambda = rate/unit time = 1 machine breakdown per 10 years = expected value
On average we observe \(\lambda=0.1\) machine failures per year. Probability of k = 0 breakdowns in years 1 through 8
n <- 8 #number of trials
k <- 0 #number of failures
p <- 1/10 #probability of successes
E <- n*p
paste("Expected Value:", E)
## [1] "Expected Value: 0.8"
sd <- sqrt(E)
paste("Standard Deviation:", sd)
## [1] "Standard Deviation: 0.894427190999916"
# Calculating P(X=0) for 8 intervals using Poisson distribution
paste("Possibility that machine might fail after 8 years via Poisson#1:", ppois(0,0.1,lower.tail=TRUE)^8)
## [1] "Possibility that machine might fail after 8 years via Poisson#1: 0.449328964117221"
paste("Possibility that machine might fail after 8 years via Poisson#2:", dpois(0, E))
## [1] "Possibility that machine might fail after 8 years via Poisson#2: 0.449328964117222"