##Exponential distribution is simulated in R with rexp(n, lambda) where lambda λ is the rate parameter. ##Average or Mean of exponential distribution is 1/λ and the standard deviation is also 1/λ. ##For this simulation,we set λ = 0.2. In this simulation, we investigate the distribution of averages of 40 numbers sampled fromexponential distribution with λ = 0.2. ##Now let’s do a thousand simulated averages of 40 exponentials.

##Slide with R Code and Output

lambda <- 0.2
num_sim <- 1000
sample_size <- 40
sim <- matrix(rexp(num_sim*sample_size, rate=lambda), num_sim, sample_size)
row_means <- rowMeans(sim)

##Slide with Plot

##The distribution of the sample means is below: ##Plot#1 ##Plot the Histogram of Averages

hist(row_means, breaks=50, prob=TRUE,
     main="Distribution of averages of samples,
     drawn from exponential distribution with lambda=0.2",
   xlab="")
lines(density(row_means))
abline(v=1/lambda, col="hotpink1")
xfit <- seq(min(row_means), max(row_means), length=100)
yfit <- dnorm(xfit, mean=1/lambda, sd=(1/lambda/sqrt(sample_size)))
lines(xfit, yfit, pch=22, col="peachpuff2", lty=2)
legend('topright', c("simulation", "theoretical"), lty=c(1,2), col=c("violet", "peachpuff2"))

##Distribution of sample means is centered at 4.9866 and the center of the distribution is λ−1 = 5. The variance of sample means is 0.6258 where the theoretical variance of the distribution is:σ2/n = 1/(λ2n) =1/(0.04×40)= 0.625.

##Due to the central limit theorem (CLT), the averages of samples follow normal distribution. The figure abovealso shows the density computed using the histogram and the normal density plotted with theoretical mean and variance values. Also, the q-q plot below suggests the normality.

##Plot#2

qqnorm(row_means); qqline(row_means)

##Finally, let’s evaluate the coverage of the confidence interval for:1/λ = X¯ ± 1.96 s√n

##Plot 3

lambda_vals <- seq(4, 6, by=0.01)
coverage <- sapply(lambda_vals, function(lamb) {
  mu_hats <- rowMeans(matrix(rexp(sample_size*num_sim, rate=0.2),
                             num_sim, sample_size))
  ll <- mu_hats - qnorm(0.975) * sqrt(1/lambda**2/sample_size)
  ul <- mu_hats + qnorm(0.975) * sqrt(1/lambda**2/sample_size)
  mean(ll < lamb & ul > lamb)
})
library(ggplot2)
qplot(lambda_vals, coverage) + geom_hline(yintercept=0.95)

##95% confidence intervals for the rate parameter (λ) to be estimated (λˆ)are λˆlow = λˆ(1−1.96√n) agnd λˆupp = λˆ(1 + .96√n).

##As can be seen from the plot above, for selection of λˆ around 5, the average of the sample mean falls within the confidence interval at least 95% of the time. Note that the true rate, λ is 5.