ASSIGNMENT 7

Question 1

Let X1, X2, . . . , Xn be n mutually independent random variables, each of which is uniformly distributed on the integers from 1 to k. Let Y denote the minimum of the Xi’s. Find the distribution of Y .

Answer

k <- 30
Y <- list()
i <- 1

for(i in 1:100) {
  x <- sample(k, 1, replace = TRUE)
  Y[length(Y) + 1] <- list(min(x))
  i <- i + 1
}

hist(as.numeric(Y))

When the sample size is 100, the distribution of Y is being shown in the graph, however, it does not show much pattern of any distribution. Lets see the effect of increasing the size of number from 100 to 10000.

k <- 30
Y <- list()
i <- 1

for(i in 1:100000) {
  x <- sample(k, 1, replace = TRUE)
  Y[length(Y) + 1] <- list(min(x))
  i <- i + 1
}

hist(as.numeric(Y))

The distribution appears to be discrete uniform distribution too.

Number of possible combinations of \(X_i\)’s is \(k^n\) (choosing \(n\) values out of \(k\) options with replacement).

When considering the number of combinations with at least one \(1\), it is equal to all combinations (\(k^n\)) minus all combinations with values between \(2\) and \(k\) (\((k-1)^n\)). So \(P(Y=1) = \frac{k^n-(k-1)^n}{k^n}\).

When considering the number of combinations with at least one \(2\) and no \(1\), it is euqal to all combinations (\(k^n\)) minus all combinations with at least one \(1\) (see above: \(k^n-(k-1)^n\)) and minus all combinations with values between \(3\) and \(k\) (\((k-2)^n\)). So \(P(Y=2) = \frac{k^n-(k^n-(k-1)^n)-(k-2)^n}{k^n}= \frac{k^n-k^n+(k-1)^n-(k-2)^n}{k^n}= \frac{(k-1)^n-(k-2)^n}{k^n}\).

Similarly considering combinations without \(1\) or \(2\) and with at least one \(3\),

\[\begin{split} P(Y=3) &=\frac{k^n - (k^n-(k-1)^n)-((k-1)^n-(k-2)^n)-(k-3)^n}{k^n}\\ &=\frac{k^n - k^n+(k-1)^n-(k-1)^n+(k-2)^n-(k-3)^n}{k^n}\\ &= \frac{(k-2)^n-(k-3)^n}{k^n} \end{split}\]

Generally, \(P(Y=a) = \frac{(k-a+1)^n-(k-a)^n}{k^n}\).

SIMULATION

Setting up a function to run simulated trials.

Q1sim <- function(k,n,trials=100000) {
  Y<-rep(0,trials)
  for (i in 1:trials) {
    x<-sample.int(k,size=n,replace=TRUE)
    Y[i]<-min(x)
  }
  return(Y)
}

Plot distribution of simulated trials and theoretical probability distribution for several values of \(k\) and \(n\).

# Run 1
par(mfrow=c(2,2))
k<-100
n<-20
hist(Q1sim(k,n),breaks=60,
     main=paste("Simulation with k=",k," and n=",n,sep=""),
     xlab="Y",xlim=c(1,k))
pY<-((k-1:k+1)^n-(k-1:k)^n)/k^n
barplot(pY,main=paste("Theoretical with k=",k," and n=",n,sep=""),
        xlab="Y",xlim=c(1,k))

# Run 2
k<-100
n<-5
hist(Q1sim(k,n),breaks=60,
     main=paste("Simulation with k=",k," and n=",n,sep=""),
     xlab="Y",xlim=c(1,k))
pY<-((k-1:k+1)^n-(k-1:k)^n)/k^n
barplot(pY,main=paste("Theoretical with k=",k," and n=",n,sep=""),
        xlab="Y",xlim=c(1,k))

# Run 3
par(mfrow=c(2,2))
k<-20
n<-5
hist(Q1sim(k,n),breaks=60,
     main=paste("Simulation with k=",k," and n=",n,sep=""),
     xlab="Y",xlim=c(1,k))
pY<-((k-1:k+1)^n-(k-1:k)^n)/k^n
barplot(pY,main=paste("Theoretical with k=",k," and n=",n,sep=""),
        xlab="Y",xlim=c(1,k))

# Run 4
k<-20
n<-100
hist(Q1sim(k,n),breaks=60,
     main=paste("Simulation with k=",k," and n=",n,sep=""),
     xlab="Y",xlim=c(1,k))
pY<-((k-1:k+1)^n-(k-1:k)^n)/k^n
barplot(pY,main=paste("Theoretical with k=",k," and n=",n,sep=""),
        xlab="Y",xlim=c(1,k))

Question 2

Your organization owns a copier (future lawyers, etc.) or MRI (future doctors). This machine has a manufacturer’s expected lifetime of 10 years. This means that we expect one failure every ten years. (Include the probability statements and R Code for each part.).

(a) What is the probability that the machine will fail after 8 years?

Provide also the expected value and standard deviation. Model as a geometric. (Hint: the probability is equivalent to not failing during the first 8 years..)

Answer

\[ Expected = 10; Prob = \frac{1}{Expected}; Probability = (1-Prob)^{n-1}Prob\]

Geometric Distribution

n <- 8 #number of trials
k <- 0 #number of failures 
p <- 1/10 #probability of successes

E <- 1/p
paste("Expected Value:", E)

## [1] "Expected Value: 10"

sd <- sqrt((1-p)/(p^2))
paste("Standard Deviation:", sd)

## [1] "Standard Deviation: 9.48683298050514"

# Calculating P(X>8) using geometric distribution
paste("Possibility that machine might fail after 8 years via Geometric#1:", pgeom(8, 0.1, lower.tail=FALSE))

## [1] "Possibility that machine might fail after 8 years via Geometric#1: 0.387420489"

paste("Possibility that machine might fail after 8 years via Geometric#2:", 1-sum(dgeom(0:8, p)))

## [1] "Possibility that machine might fail after 8 years via Geometric#2: 0.387420489"

Expected number of years before the first machine failure is \(E(X) = 1/p = 1/0.1 = 10\).
Standard deviation \(\sigma^2 = \sqrt{q/p^2} = \sqrt{0.9/0.1^2} \approx 9.4868\).
For geometric distribution, CDF \(F_X(k)=P(X\le k)=1-q^{k+1}\), where \(k\) is the number of failures before the first success (which is how R defines geometric distribution). Alternatively, \(P(X>k) = 1-P(X \le k) = 1 - (1-q^{k+1}) = q^{k+1}\). So for \(k=8\), \(P(X>8) = 0.9^9 \approx 0.3874\).

(b) What is the probability that the machine will fail after 8 years?

Provide also the expected value and standard deviation. Model as an exponential.

Answer

Exponential Distribution

n <- 8 #number of trials
k <- 0 #number of failures 
p <- 1/10 #probability of successes

E <- 1/p
paste("Expected Value:", E)

## [1] "Expected Value: 10"

sd <- sqrt(n*p*(1-p))
paste0("Standard Deviation is ", sqrt(1/p^2))

## [1] "Standard Deviation is 10"

# Calculating P(X>8) using exponential distribution
paste("Possibility that machine might fail after 8 years via Exponential#1:", pexp(8, 0.1, lower.tail=FALSE))

## [1] "Possibility that machine might fail after 8 years via Exponential#1: 0.449328964117222"

paste("Possibility that machine might fail after 8 years via Exponential#2:", dexp(n, p))

## [1] "Possibility that machine might fail after 8 years via Exponential#2: 0.0449328964117222"

Expected value is \(E(X) = 1/\lambda = 1/0.1 = 10\).
Standard deviation \(\sigma^2 = \sqrt{1/\lambda^2} = 1/\lambda = 10\).
For Exponential distribution, CDF \(F_X(k) = P(X \le k) = 1-e^{-\lambda k}\), where \(\lambda\) is the rate parameter. For this example, \(\lambda=0.1\). \(P(X>k) = 1-P(X \le k) = 1-(1-e^{-\lambda k}) = e^{-\lambda k}\). So for \(k=8\), \(P(X>8) = e^{-0.8} \approx 0.4493\).

(c) What is the probability that the machine will fail after 8 years?

Provide also the expected value and standard deviation. Model as a binomial. (Hint: 0 success in 8 years)

Answer

n <- 8 #number of trials
k <- 0 #number of failures 
p <- 1/10 #probability of successes
k <- 0

E <- n*p
paste("Expected Value:", E)

## [1] "Expected Value: 0.8"

sd <- sqrt(n*p*(1-p))
paste("Standard Deviation:", sd)

## [1] "Standard Deviation: 0.848528137423857"

# Calculating P(X=0) for n=8 using binomial distribution
paste("Possibility that machine might fail after 8 years via Binomial#1:", pbinom(0,8,0.1,lower.tail=TRUE))

## [1] "Possibility that machine might fail after 8 years via Binomial#1: 0.43046721"

paste("Possibility that machine might fail after 8 years via Binomial#2:", dbinom(k, n, p))

## [1] "Possibility that machine might fail after 8 years via Binomial#2: 0.43046721"

Expected value and standard deviation will depend on number of years/trials tracked. Consider first 8 years.

Expected value \(E(X)=np= 8 \times 0.1 = 0.8\).
Standard deviation \(\sigma^2 = \sqrt{npq} = \sqrt{8 \times 0.1 \times 0.9} \approx 0.8485\).
For Binomial distribution, \(P(X=k)= {n \choose k}p^k q^{n-k}\). Probability of a machine failure after 8 years is the same as probability of 0 successes after 8 trials. So for \(k=0\) and \(n=8\), \(P(X=0) = {8\choose0} 0.1^0 \times 0.9^{8-0} = 1 \times 1 \times 0.9^8 \approx 0.4305\).

(d) What is the probability that the machine will fail after 8 years?

Provide also the expected value and standard deviation. Model as a Poisson.

Answer

For a Poisson distribution, lambda = rate/unit time = 1 machine breakdown per 10 years = expected value

On average we observe \(\lambda=0.1\) machine failures per year. Probability of k = 0 breakdowns in years 1 through 8

n <- 8 #number of trials
k <- 0 #number of failures 
p <- 1/10 #probability of successes

E <- n*p
paste("Expected Value:", E)

## [1] "Expected Value: 0.8"

sd <- sqrt(E)
paste("Standard Deviation:", sd)

## [1] "Standard Deviation: 0.894427190999916"

# Calculating P(X=0) for 8 intervals using Poisson distribution
paste("Possibility that machine might fail after 8 years via Poisson#1:", ppois(0,0.1,lower.tail=TRUE)^8)

## [1] "Possibility that machine might fail after 8 years via Poisson#1: 0.449328964117221"

paste("Possibility that machine might fail after 8 years via Poisson#2:", dpois(0, E))

## [1] "Possibility that machine might fail after 8 years via Poisson#2: 0.449328964117222"

Expected value \(E(X) = \sigma^2 = \lambda = 0.8\).
For Poisson distribution, \(P(X=k)=\frac{e^{-\lambda}\lambda^k}{k!}\). Probabilty of a machine failure after 8 years is the same as probability of 0 successess after 8 intervals (similarly to the binomial distribution). \(P(no\ failures\ in\ 8\ years)=P(X=0)^8 = (\frac{e^{-0.1} \times 0.1^0}{0!})^8 = (e^{-0.1})^8 \approx 0.4493\)

DATA 605 : Homework #7

Debabrata Kabiraj

March 9, 2020

ASSIGNMENT 7

Question 1

SIMULATION

Question 2

(a) What is the probability that the machine will fail after 8 years?

(b) What is the probability that the machine will fail after 8 years?

(c) What is the probability that the machine will fail after 8 years?

(d) What is the probability that the machine will fail after 8 years?