We will generate 40 (n) random observations from the exponential distribution with lambda equal to 0.2 (l) and calculate a mean. We will repeat the experiment 10000 times (trials):
n=40
l=0.2
trials=10000
out=vector("numeric",0)
for (i in 1:trials){
out=c(out, mean(rexp(n, l)))
}
Let’s plot a histogram of the generated 10000 averages:
hist(out, 30, freq=F)
Recalling the central limit theorem, we would expect that for a large sample (n=40 in our case) the averages are distributed normally. The mean of exponential distribution is equal to 1/lambda and standard deviation is also 1/lambda, hence the theoretical mean of our sample averages is:
m_=1/l
m_
## [1] 5
and the theoretical standard deviation
s_=1/l/sqrt(n)
s_
## [1] 0.7905694
The values obtained in the simulation are respectively:
m=mean(out)
m
## [1] 5.005876
and
s=sd(out)
s
## [1] 0.7944534
To further analyse how close the averages’ distribution is to normal, let’s overlay the normal curve with the parameters equal to the theoretical values over the histogram:
range=seq(1,10,0.1)
normal=dnorm(range, m_, s_)
hist(out, 30, freq=F)
lines(range, normal)
Generally, the center of the simulated distribution as well as the variable are close to the theoretical values and the shape of the distribution is close to normal.
For comparison, let’s repeat the simulation, this time generating just one observation in each trial, i.e. we will analyse the distribution of individual observations, rather than the sample means:
out2=vector("numeric",0)
for (i in 1:trials){
out2=c(out2, mean(rexp(1, l)))
}
hist(out2, 30, freq=F)
m2=mean(out2)
m2
## [1] 4.956509
s2=sd(out2)
s2
## [1] 4.885354
As could be expected, the observations follow the exponential distribution. The mean and standard deviation are both close to 1/lambda.