In this simulation exercise, we investigate the distribution of averages of 40 exponentials (with lambda = 0.2 for all of the simulations) to answer the following questions. We simulate 1000 averages of 40 exponentials.
Question 1
Show where the distribution is centered and compare it to the theoretical center of the distribution.
Lambda = 0.2
n = 40
nSims = 1:1000
set.seed(820)
Means <- data.frame(x = sapply(nSims, function(i) {
  mean(rexp(n, Lambda))
}))
head(Means)
## x
## 1 5.750
## 2 3.808
## 3 4.058
## 4 3.999
## 5 4.313
## 6 4.418
m <- mean(Means$x)
m
## [1] 4.999
# Theoretical (expected) mean
1 / Lambda
## [1] 5
sd(Means$x)
## [1] 0.7909
# Theoretical (expected) standard deviation of the sample mean
(1/Lambda)/sqrt(n)
## [1] 0.7906
hist(Means$x, xlab = "mean", main = "Exponential Function Simulations")
abline(v = m, col = "red")
The distribution of the sample means is approximately normal, centered at 4.999 with a standard deviation of 0.7909. This is expected: the theoretical mean of this exponential distribution is 5, and the theoretical standard deviation of the mean of 40 exponentials is 0.7906.
Question 2
Show how variable the distribution is and compare it to the theoretical variance of the distribution.
var(Means$x)
## [1] 0.6256
# Theoretical (expected) variance of the sample mean
((1/Lambda)/sqrt(n))^2
## [1] 0.625
The variance of the simulated means is 0.6256, which is close to the theoretical variance of 0.625, as expected.
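The theoretical values used in Questions 1 and 2 follow from standard properties of the mean of independent draws. As a short worked derivation (the symbols $X_i$ for the individual exponentials and $\bar{X}$ for their average are notation introduced here for illustration, with $\lambda = 0.2$ and $n = 40$):

$$
E[\bar{X}] = E[X_i] = \frac{1}{\lambda} = 5,
\qquad
\mathrm{Var}(\bar{X}) = \frac{\mathrm{Var}(X_i)}{n} = \frac{1/\lambda^2}{n} = \frac{25}{40} = 0.625,
\qquad
\mathrm{SD}(\bar{X}) = \frac{1/\lambda}{\sqrt{n}} = \frac{5}{\sqrt{40}} \approx 0.7906
$$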
Question 3
Show that the distribution is approximately normal.
library(ggplot2)
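ggplot(data = Means, aes(x = x)) +
  # after_stat(density) replaces the deprecated ..density.. notation (ggplot2 >= 3.3.0)
  geom_histogram(aes(y = after_stat(density)), fill = "blue", binwidth = 0.2, color = "black") +
  # the argument is named `args`, not `arg`
  stat_function(fun = dnorm, args = list(mean = m, sd = sd(Means$x)))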
The histogram plot depicts a distribution that is approximately normal (mean = 4.999, sd = 0.7909).
qqnorm(Means$x)
qqline(Means$x, col = "green")
The histogram figure overlays a normal density curve, plotted with the sample mean and standard deviation, on the density-scaled histogram. The Q-Q plot also suggests that the distribution of averages of 40 exponentials is very close to a normal distribution.
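As a rough numeric complement to the plots, one could check how many standardized sample means fall within 1.96 standard deviations of the center; for a normal distribution this share is about 95%. The sketch below is an illustration only, not part of the original analysis: it re-runs the simulation with the same settings, and the names `sim_means`, `z`, and `coverage` are hypothetical.

```r
# Self-contained sketch: re-create the simulated means with the same settings
Lambda <- 0.2
n <- 40
set.seed(820)
sim_means <- replicate(1000, mean(rexp(n, Lambda)))

# Standardize with the sample mean and sample standard deviation
z <- (sim_means - mean(sim_means)) / sd(sim_means)

# Share of standardized means inside +/- 1.96;
# roughly 0.95 would be consistent with approximate normality
coverage <- mean(abs(z) <= 1.96)
coverage
```

A share near 0.95 is consistent with the visual impression, though it is only a coarse check and does not replace the Q-Q plot.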
Above, we examined the distribution of a large collection (1000) of averages of 40 exponentials.
Now we look at a large collection (say, 1000000) of random exponentials.
mean(rexp(1000000, Lambda))
## [1] 4.996
The mean of this large collection is very close to 1 / Lambda, or 5 in this case.
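The contrast with the averages can be made a little more concrete. The sketch below, under the same Lambda, also computes the standard deviation and median of the individual exponentials; the names `draws`, `raw_mean`, `raw_sd`, and `raw_median` are hypothetical, and the seed is reused only so the example is reproducible.

```r
# Self-contained sketch: summary statistics of individual exponentials
Lambda <- 0.2
set.seed(820)
draws <- rexp(1000000, Lambda)

raw_mean <- mean(draws)      # theoretical value: 1 / Lambda = 5
raw_sd <- sd(draws)          # theoretical value: 1 / Lambda = 5
raw_median <- median(draws)  # theoretical value: log(2) / Lambda, about 3.47

c(mean = raw_mean, sd = raw_sd, median = raw_median)
```

The individual exponentials share the mean of the averages, but their standard deviation stays near 5 rather than 0.79, and the median sits well below the mean, which reflects the right skew of the exponential distribution. Only the averages of 40 take on the approximately normal shape.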