In this simulation exercise, we investigate the distribution of averages of 40 exponentials, with lambda = 0.2 for all simulations, to answer the following questions. This requires about a thousand simulated averages of 40 exponentials.

Question 1

Show where the distribution is centered and compare it to the theoretical center of the distribution.

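Lambda = 0.2     # rate of the exponential distribution
n = 40           # number of exponentials in each average
nSims = 1:1000   # index over the 1000 simulations
set.seed(820)    # make the simulation reproducible
# Each simulation draws n exponentials and stores their mean
Means <- data.frame(x = sapply(nSims, function(x) {
    mean(rexp(n, Lambda))
}))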

head(Means)
##       x
## 1 5.750
## 2 3.808
## 3 4.058
## 4 3.999
## 5 4.313
## 6 4.418
m <- mean(Means$x)
m
## [1] 4.999
# Theoretical (expected) mean
1 / Lambda
## [1] 5
sd(Means$x)
## [1] 0.7909
# Theoretical (expected) standard deviation of the mean of n exponentials
(1/Lambda)/sqrt(n)
## [1] 0.7906
hist(Means$x, xlab = "mean", main = "Exponential Function Simulations")
abline(v = m, col = "red")

[Figure: histogram of the 1000 simulated means, with a red vertical line at the sample mean]

The distribution of the sample means is approximately normal, centered at 4.999 with a standard deviation of 0.7909. This is expected: the theoretical mean of the exponential distribution is 1/lambda = 5, and the theoretical standard deviation of the mean of 40 exponentials is (1/lambda)/sqrt(40) = 0.7906.
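
The theoretical values quoted above follow from standard properties of the exponential distribution. As a short worked derivation, assuming the n = 40 draws in each average are independent with rate λ = 0.2 (the symbols X_i for a single draw and X̄ for the average are notation introduced here, not in the original):

```latex
\begin{aligned}
E[X_i] &= \frac{1}{\lambda} = 5, \qquad
\operatorname{Var}(X_i) = \frac{1}{\lambda^2} = 25,\\
E[\bar{X}] &= \frac{1}{\lambda} = 5,\\
\operatorname{Var}(\bar{X}) &= \frac{1}{\lambda^2 n} = \frac{25}{40} = 0.625,\\
\operatorname{sd}(\bar{X}) &= \frac{1}{\lambda\sqrt{n}} = \frac{5}{\sqrt{40}} \approx 0.7906.
\end{aligned}
```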

Question 2

Show how variable the distribution is and compare it to the theoretical variance of the distribution.

var(Means$x)
## [1] 0.6256
# Theoretical (expected) variance of the mean of n exponentials
((1/Lambda)/sqrt(n))^2
## [1] 0.625

The variance of the simulated means is 0.6256, which is close to the theoretical variance of 0.625, as expected.
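
To see how this variance depends on the number of exponentials being averaged, the comparison can be repeated for several sample sizes. The following is a minimal sketch, not part of the original analysis: the helper `sim_mean_var()` and the sample sizes 10, 40 and 160 are assumptions chosen for illustration.

```r
# Sketch: how the variance of the sample mean shrinks as n grows.
# sim_mean_var() is a hypothetical helper, not from the original analysis.
Lambda <- 0.2
set.seed(820)

sim_mean_var <- function(n, n_sims = 1000, rate = Lambda) {
    # One row per simulation, n exponentials per row
    draws <- matrix(rexp(n * n_sims, rate), nrow = n_sims)
    # Variance of the n_sims simulated averages
    var(rowMeans(draws))
}

sizes <- c(10, 40, 160)   # illustrative sample sizes (assumption)
comparison <- data.frame(
    n           = sizes,
    simulated   = sapply(sizes, sim_mean_var),
    theoretical = (1 / Lambda)^2 / sizes   # 1 / (lambda^2 * n)
)
print(comparison)
```

The simulated column should track the theoretical one, which falls by a factor of four each time n is quadrupled.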

Question 3

Show that the distribution is approximately normal.

library(ggplot2)
# Density-scaled histogram of the means with a normal curve overlaid
ggplot(data = Means, aes(x = x)) + geom_histogram(aes(y = after_stat(density)),
       fill = "blue", binwidth = 0.2, color = "black") +
       stat_function(fun = dnorm, args = list(mean = m, sd = sd(Means$x)))

[Figure: density histogram of the simulated means with a normal density curve overlaid]

The histogram plot depicts a distribution that is approximately normal (mean = 4.999, sd = 0.7909).

qqnorm(Means$x)
qqline(Means$x, col = "green")

[Figure: normal q-q plot of the simulated means with a green reference line]

The histogram above overlays a normal density, drawn with the sample mean and standard deviation of the simulated means, on the empirical density; these values are nearly identical to the theoretical ones. The q-q plot likewise suggests that the distribution of averages of 40 exponentials is very close to a normal distribution.
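
The visual impression from the plots can be backed by a simple numeric check. The sketch below is an illustration rather than the original author's method: it standardizes freshly simulated means with the theoretical mean and standard deviation and compares the share falling within 1 and 2 standard deviations to the corresponding normal probabilities. The variable names `z` and `coverage` are introduced here.

```r
# Sketch: numeric check of approximate normality for simulated means.
Lambda <- 0.2
n <- 40
set.seed(820)
means <- replicate(1000, mean(rexp(n, Lambda)))

# Standardize with the theoretical mean and standard deviation of the mean
z <- (means - 1 / Lambda) / ((1 / Lambda) / sqrt(n))

# Share of standardized means inside the central intervals
coverage <- c(
    within_1sd = mean(abs(z) < 1),
    within_2sd = mean(abs(z) < 2)
)
print(coverage)

# Probabilities a standard normal assigns to the same intervals
print(c(pnorm(1) - pnorm(-1), pnorm(2) - pnorm(-2)))
```

If the averages are close to normal, the two printed vectors should be similar, with small differences from sampling noise and the slight right skew that remains at n = 40.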

Above, we simulated the distribution of a large collection (1000) of averages of 40 exponentials.

Now we look at a large collection (say, 1000000) of individual random exponentials.

mean(rexp(1000000, Lambda))
## [1] 4.996

This mean is very close to 1 / Lambda, or 5 in this case.
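
Although both collections are centered near 5, their shapes differ: an exponential distribution has standard deviation 1 / Lambda and skewness 2, whereas the averages of 40 are much less spread out and closer to symmetric. A minimal sketch of that contrast follows; the `skewness()` helper and the `summary_table` name are hypothetical and defined here only for illustration.

```r
# Sketch: compare individual exponentials with averages of 40 exponentials.
Lambda <- 0.2
set.seed(820)

raw   <- rexp(1000000, Lambda)   # individual exponentials
means <- rowMeans(matrix(rexp(40 * 1000, Lambda), nrow = 1000))  # 1000 averages of 40

# skewness() is a small hypothetical helper, not a base R function
skewness <- function(x) mean((x - mean(x))^3) / sd(x)^3

summary_table <- data.frame(
    sample   = c("raw exponentials", "averages of 40"),
    mean     = c(mean(raw), mean(means)),
    sd       = c(sd(raw), sd(means)),
    skewness = c(skewness(raw), skewness(means))
)
print(summary_table)
```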