The exponential distribution can be simulated in R with rexp(n, lambda), where lambda (\( \lambda \)) is the rate parameter.
The mean of the exponential distribution is \( 1/\lambda \) and its standard deviation is also \( 1/\lambda \). For this simulation, we set \( \lambda=0.2 \) and investigate the distribution of averages of 40 exponential(0.2) random variables.
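As a quick sanity check on the two facts above, the following minimal sketch draws a large exponential sample and compares its mean and standard deviation with \( 1/\lambda \). The seed, the sample size of 100,000 and the name check are illustrative assumptions and are not part of the original analysis.

```r
# Illustrative check only: seed and sample size are arbitrary assumptions
set.seed(1)
lambda <- 0.2

# Draw a large sample from the exponential distribution with rate lambda
draws <- rexp(100000, rate = lambda)

# Empirical mean and sd next to the theoretical value 1/lambda
check <- c(mean = mean(draws), sd = sd(draws), theory = 1 / lambda)
print(round(check, 3))
```

Both empirical values should land near 5, which is the value of \( 1/\lambda \) used throughout.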
First, we simulate 1,000 samples of 40 exponentials each.
# Set seed
set.seed(1000)
lambda <- 0.2
# Simulate 1000 samples of 40 exponentials each (one sample per row)
sampleSize <- 40
numsim <- 1000
sim_exp <- matrix(rexp(numsim * sampleSize, rate = lambda), numsim, sampleSize)
Next, we compute the 1,000 sample means, one for each sample of 40 exponentials.
# Averages of 40 exponentials
row_means <- rowMeans(sim_exp)
1. Comparison of the mean of the simulated distribution with the theoretical mean.
sim_mean <- mean(row_means)
true_mean <- 1/lambda
The mean of our distribution of averages of 40 simulated exponentials is 4.987, while the theoretical mean is \( 1/\lambda= \) 5.
The distribution of sample means is shown below:
To show that the simulated distribution approximates the theoretical distribution well, the figure above shows the histogram of the simulated sample means with the theoretical curve overlaid for comparison.
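The plotting code is not shown in this report. A figure of this kind could be produced roughly as follows. This is a sketch in base R graphics (the original may have used ggplot2), and the number of breaks, the title and the line styles are assumptions made for illustration.

```r
# Re-create the simulated sample means so the block is self-contained
set.seed(1000)
lambda <- 0.2
sampleSize <- 40
numsim <- 1000
sim_exp <- matrix(rexp(numsim * sampleSize, rate = lambda), numsim, sampleSize)
row_means <- rowMeans(sim_exp)

# Histogram on the density scale so a density curve can be overlaid
h <- hist(row_means, breaks = 30, freq = FALSE,
          main = "Averages of 40 exponentials (rate 0.2)",
          xlab = "Sample mean")

# Normal density with the theoretical mean and standard deviation
curve(dnorm(x, mean = 1 / lambda, sd = (1 / lambda) / sqrt(sampleSize)),
      add = TRUE, lwd = 2)

# Dashed line: simulated mean; dotted line: theoretical mean
abline(v = mean(row_means), lty = 2)
abline(v = 1 / lambda, lty = 3)
```

Using freq = FALSE matters here: with counts on the vertical axis the normal density would be drawn on a different scale from the bars.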
2. Comparison of the standard deviation of the simulated distribution with the theoretical standard deviation.
The standard deviation of the distribution of averages of 40 exponentials is:
sim_sd <- sd(row_means)
sim_sd
## [1] 0.8113908
while the standard deviation for the analytical expression is:
true_sd <- (1/lambda)/sqrt(sampleSize)
true_sd
## [1] 0.7905694
We can see that the standard deviation of the simulated distribution, 0.811, is close to the theoretical standard deviation, 0.791 (a difference of about 0.02).
Also, the variance of the sample mean is:
sim_var <- var(row_means)
sim_var
## [1] 0.6583551
while the variance from the analytical expression is:
true_var <- 1/((lambda^2) * sampleSize)
true_var
## [1] 0.625
We can see that the variance of the distribution of averages of 40 exponentials is close to the theoretical variance.
The variance of the sample means is 0.658, whereas the theoretical variance of the distribution is
\( \sigma^2 / n = 1/(\lambda^2 n) = 1/(0.2 \times 0.2 \times 40) \) = 0.625.
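The theoretical values used in this comparison follow from a short derivation, written out here for reference. It assumes the 40 exponentials in each sample are independent and identically distributed with variance \( \sigma^2 = 1/\lambda^2 \).

```latex
\[
\operatorname{Var}(\overline{X})
  = \operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
  = \frac{1}{n^2}\sum_{i=1}^{n}\operatorname{Var}(X_i)
  = \frac{\sigma^2}{n}
  = \frac{1}{\lambda^2 n}
  = \frac{1}{0.2^2 \times 40} = 0.625,
\qquad
\operatorname{sd}(\overline{X}) = \frac{1}{\lambda\sqrt{n}} = \sqrt{0.625} \approx 0.7906 .
\]
```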
3. Confirmation that the distribution is approximately normal.
By the central limit theorem, the averages of samples approximately follow a normal distribution. The figure above also shows the density estimated from the simulated sample means together with the normal density plotted using the theoretical mean and variance. In addition, the Q-Q plot suggests that the distribution of averages of 40 exponentials is very close to a normal distribution.
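The Q-Q plot itself is not reproduced in this report. A minimal sketch of how such a plot could be drawn with base R is given below. The plot title and line style are assumptions, and the simulation is repeated so the block runs on its own.

```r
# Re-create the simulated sample means so the block is self-contained
set.seed(1000)
lambda <- 0.2
sampleSize <- 40
numsim <- 1000
sim_exp <- matrix(rexp(numsim * sampleSize, rate = lambda), numsim, sampleSize)
row_means <- rowMeans(sim_exp)

# Normal Q-Q plot: sample quantiles of the means against normal quantiles
qq <- qqnorm(row_means, main = "Normal Q-Q plot of averages of 40 exponentials")

# Reference line through the first and third quartiles
qqline(row_means, lty = 2)
```

Points lying close to the reference line indicate approximate normality. Some curvature in the tails is to be expected, because the underlying exponential distribution is right-skewed and n = 40 is only moderately large.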
4. Establishing the coverage of the confidence interval for \( 1/\lambda \): \( \overline{X} \pm 1.96 \frac{S}{\sqrt{n}} \).
The 95% confidence limits for the rate parameter \( \lambda \), based on its estimate \( \hat{\lambda} \), are
\( \hat{\lambda}_{l} = \hat{\lambda}(1 - \frac{1.96}{\sqrt{n}})\text{ and }\hat{\lambda}_{u} = \hat{\lambda}(1 + \frac{1.96}{\sqrt{n}}) \).
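Since we consider the distribution of averages of exponentials, the standard deviation of this distribution already incorporates the standard error factor \( 1/\sqrt{n} \), so it is not divided by \( \sqrt{n} \) again.
The resulting interval for the mean, computed as the simulated mean plus or minus 1.96 simulated standard deviations (4.987 ± 1.96 × 0.811), is (3.397, 6.577).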
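The code behind this step is not shown, so the following is only a sketch of one way to obtain an interval of this kind and to estimate coverage. It assumes that "coverage" means the fraction of the 1,000 per-sample intervals \( \overline{X} \pm 1.96\, S/\sqrt{n} \) that contain the true mean \( 1/\lambda \). The names interval, row_sds, lower, upper and coverage are introduced here for illustration.

```r
# Re-create the simulation so the block is self-contained
set.seed(1000)
lambda <- 0.2
sampleSize <- 40
numsim <- 1000
sim_exp <- matrix(rexp(numsim * sampleSize, rate = lambda), numsim, sampleSize)
row_means <- rowMeans(sim_exp)

# Interval built from the distribution of sample means:
# sd(row_means) already reflects the 1/sqrt(n) factor
interval <- mean(row_means) + c(-1, 1) * 1.96 * sd(row_means)
print(round(interval, 3))

# Assumed reading of "coverage": one interval per sample of 40,
# using that sample's own standard deviation S
row_sds <- apply(sim_exp, 1, sd)
lower <- row_means - 1.96 * row_sds / sqrt(sampleSize)
upper <- row_means + 1.96 * row_sds / sqrt(sampleSize)

# Fraction of intervals that contain the true mean 1/lambda
coverage <- mean(lower <= 1 / lambda & 1 / lambda <= upper)
print(coverage)
```

For skewed data and n = 40, the estimated coverage of the per-sample intervals would typically fall somewhat below the nominal 95%.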