The exponential distribution can be simulated in R using rexp(n, λ), where λ is the rate parameter. Both the mean and the standard deviation of the exponential distribution are 1/λ.
This analysis uses simulations to study the distribution of the means of samples drawn from an exponential distribution. The analysed characteristics are the sample mean, the sample variance, and the shape of the distribution of the means, each compared with its theoretical counterpart.
The investigation runs 1000 simulations, each drawing 40 exponential values, and analyses the distribution of their means. The rate parameter λ is set to 0.2 for all the simulations.
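As a quick sanity check of the two properties quoted above, the following sketch draws one large sample and compares its mean and standard deviation with 1/λ. The sample size of 100000, the seed, and the ASCII name `lambda` (used instead of λ for portability) are illustrative assumptions and not part of the analysis itself.

```r
# Sanity check (illustrative): mean and sd of an exponential sample vs 1/lambda
set.seed(42)                      # arbitrary seed, an assumption of this sketch
lambda <- 0.2                     # same rate as in the analysis
big.sample <- rexp(n = 100000, rate = lambda)

# all three values should be close to 5
check <- c(theoretical = 1 / lambda,
           sample.mean = mean(big.sample),
           sample.sd   = sd(big.sample))
print(check)
```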
library(ggplot2)
# for reproducibility
set.seed(1)
# generation of the sample means
simulations <- 1000
samples <- 40
λ <- 0.2
# preallocate the vector that will hold the sample means
samples.means <- numeric(simulations)
for (i in 1 : simulations) samples.means[i] <- mean(rexp(n = samples, rate = λ))
df <- data.frame(x= 1:simulations, y = samples.means)
# plot
ggplot(df, aes(y)) +
  geom_histogram(binwidth = 0.1, fill = "#3591d1", color = "black") +
  labs(title = "Asymptotic distribution of the means") +
  labs(x = "sample means", y = "frequency")
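The loop above can also be written with replicate(), which collects the results of a repeated expression into a vector. This is a sketch of an alternative, not the code used for the figures; the names `lambda` and `means.replicate` are introduced here for illustration. Because it consumes random numbers in the same order as the loop, it should give the same values when started from the same seed.

```r
# Alternative to the explicit for loop (illustrative sketch)
set.seed(1)
simulations <- 1000
samples <- 40
lambda <- 0.2                     # ASCII name used here instead of λ

# replicate() evaluates the expression `simulations` times and
# returns the resulting sample means as a numeric vector
means.replicate <- replicate(simulations, mean(rexp(n = samples, rate = lambda)))

length(means.replicate)           # number of sample means
summary(means.replicate)
```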
The theoretical mean of an exponential distribution of rate λ is μ = 1 / λ. In our case:
μ <- 1 / λ
μ
## [1] 5
The mean μ_x of the 1000 sample means, each computed from 40 randomly drawn exponential values, is:
μ_x <- mean(samples.means)
c(μ_x, μ-μ_x)
## [1] 4.990025201 0.009974799
This is very close to the theoretical mean.
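To put "very close" on a scale: the average of 1000 independent sample means has its own standard error, (1/λ)/√(40·1000). The sketch below compares the observed gap with that value. The gap is copied from the output above, and the variable names are introduced here for illustration.

```r
# How large is the observed gap relative to its expected sampling error?
lambda <- 0.2
samples <- 40
simulations <- 1000

# standard error of the average of all simulated means
se.grand.mean <- (1 / lambda) / sqrt(samples * simulations)

observed.gap <- 0.009974799       # value of μ - μ_x reported above
c(se = se.grand.mean, gap.in.se = observed.gap / se.grand.mean)
```

A gap of well under one standard error is what random sampling alone would be expected to produce.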
The theoretical standard deviation of the mean of n exponentials of rate λ is σ = 1 / λ / √n, that is, the standard deviation of a single exponential, 1 / λ, divided by √n. The corresponding theoretical variance is var = σ^2.
In our case:
σ <- 1 / λ / sqrt(samples)
σ
## [1] 0.7905694
And the associated variance is:
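# note: this numeric variable shares its name with the function var();
# R still finds the function when var(...) is called further down
var <- σ^2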
var
## [1] 0.625
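The factor 1/√n comes from the variance of an average of independent draws. A short derivation, assuming X_1, ..., X_n are iid exponential with rate λ and writing X̄ for their mean (notation introduced here):

```latex
\begin{aligned}
\operatorname{Var}(\bar{X})
  &= \operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
   = \frac{1}{n^{2}}\sum_{i=1}^{n}\operatorname{Var}(X_i)
   = \frac{1}{n^{2}}\cdot n\cdot\frac{1}{\lambda^{2}}
   = \frac{1}{n\lambda^{2}}, \\
\sigma &= \sqrt{\operatorname{Var}(\bar{X})} = \frac{1}{\lambda\sqrt{n}} .
\end{aligned}
```

With λ = 0.2 and n = 40 this gives a variance of 25/40 = 0.625 and σ ≈ 0.7906, matching the values above.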
The standard deviation σ_x of the 1000 sample means is:
σ_x <- sd(samples.means)
c(σ_x, σ-σ_x)
## [1] 0.78173939 0.00883003
And the associated variance is:
var_x <- var(samples.means)
c(var_x, var-var_x)
## [1] 0.61111647 0.01388353
The sample standard deviation is very close to the theoretical standard deviation. The sample variance also stays close to the theoretical variance; the difference is slightly larger because squaring scales the gap: var - var_x = (σ - σ_x)(σ + σ_x), and σ + σ_x ≈ 1.57.
The following plot visually compares the empirical density of the sample means with the density of a normal distribution centered at μ=5 with standard deviation σ=0.7905694.
cols <- c("Sample means"="#f04546","Normal"="#3591d1")
ggplot(data = df, aes(x = y)) +
  # after_stat(density) replaces the deprecated ..density.. notation (ggplot2 >= 3.3.0)
  geom_histogram(aes(y = after_stat(density)), binwidth = 0.1, color = "darkgrey", fill = "white") +
  geom_density(aes(color = "Sample means")) +
  stat_function(fun = dnorm, args = list(mean = μ, sd = σ), aes(color = "Normal")) +
  xlab("sample means") +
  scale_colour_manual(name = "Density", values = cols)
The probability density function of the distribution of the sample means is approximately that of a normal distribution N(μ=5, σ^2=0.625).
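A normal Q-Q plot is another common way to judge this kind of agreement. The sketch below is an optional complement and not part of the original analysis: it regenerates the sample means under the same seed, standardizes them with the theoretical μ and σ, and plots them against normal quantiles. The names `sample.means`, `mu`, `sigma`, `z` and `qq` are introduced here for illustration.

```r
# Normal Q-Q check of the sample means (illustrative sketch)
set.seed(1)
lambda <- 0.2
sample.means <- replicate(1000, mean(rexp(n = 40, rate = lambda)))

# standardize with the theoretical mean and standard deviation
mu <- 1 / lambda
sigma <- 1 / lambda / sqrt(40)
z <- (sample.means - mu) / sigma

# points close to the reference line indicate approximate normality
qq <- qqnorm(z, main = "Normal Q-Q plot of standardized sample means")
qqline(z, col = "#3591d1")

# correlation between theoretical and observed quantiles (1 = perfect line)
cor(qq$x, qq$y)
```

Some curvature in the upper tail is expected, since the mean of 40 exponentials is still slightly right-skewed.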
The Central Limit Theorem states that the distribution of the averages of properly normalized iid variables becomes that of a standard normal as the sample size increases.
The simulation above, applied to exponential distributions of rate λ = 0.2, illustrates this behavior experimentally.
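The role of the sample size can be made visible by repeating the experiment for several sizes and tracking the skewness of the sample means, which is 0 for a normal distribution. The sketch below is an illustration under assumptions of its own: the sizes 5, 40 and 500, the 10000 runs per size, and the helper `skewness` are all introduced here and are not part of the analysis above.

```r
# Skewness of the sample means for increasing sample sizes (illustrative sketch)
set.seed(1)
lambda <- 0.2
runs <- 10000                     # more runs than above, to steady the estimate

# sample skewness: third central moment over the cubed standard deviation
skewness <- function(x) mean((x - mean(x))^3) / sd(x)^3

sizes <- c(5, 40, 500)
skew <- sapply(sizes, function(n) {
  means <- replicate(runs, mean(rexp(n = n, rate = lambda)))
  skewness(means)
})

# a normal distribution has skewness 0; the mean of n exponentials has 2 / sqrt(n)
data.frame(n = sizes, simulated = skew, theoretical = 2 / sqrt(sizes))
```

The simulated skewness should shrink toward 0 as n grows, which is the convergence the theorem describes.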