This is a study of the exponential distribution in which I compare the sample mean and variance to their theoretical values. I also verify the Central Limit Theorem, which states that averages of independent, identically distributed samples are approximately normally distributed.

library(ggplot2)

Simulation

Perform 1000 simulations, each of 40 random exponentials with rate lambda = 0.2. Each row of the resulting matrix holds one simulation.

nosim <- 1000
n <- 40
lambda <- 0.2

data <- matrix(rexp(nosim * n, lambda), nosim)
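
A quick sanity check of the matrix layout may help here. This is only a sketch: the seed is an arbitrary assumption added for reproducibility and is not part of the original analysis. Note that `matrix()` fills column by column, but because all draws are independent and identically distributed, each row is still a valid sample of 40 exponentials.

```r
# Sketch: inspect the layout of the simulation matrix.
# set.seed(1) is an assumption added here for reproducibility only.
set.seed(1)
nosim <- 1000
n <- 40
lambda <- 0.2

data <- matrix(rexp(nosim * n, lambda), nosim)

# One row per simulation, one column per draw within a simulation
dims <- dim(data)
print(dims)

# Exponential draws should all be positive
all_positive <- all(data > 0)
print(all_positive)
```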

Sample Mean

#Theoretical mean
t_mean <- 1/lambda
#Simulation means
s_mean <- data.frame(apply(data, 1, mean))
names(s_mean) <- c("mean")

(ggplot(s_mean, aes(x = mean)) 
  + geom_histogram(colour="black", fill="white", binwidth=0.1)
  + geom_vline(aes(xintercept = t_mean), size = 1))

From the plot we can see that the distribution of the sample means of the exponential distribution is centered around the theoretical mean, i.e. 1/lambda = 5.
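
The visual impression can be backed by a number. The sketch below regenerates the data (with an arbitrary seed, which is an assumption of this example) and compares the average of the 1000 sample means with 1/lambda, together with a rough Monte Carlo standard error for that average.

```r
# Sketch: numeric comparison of simulated and theoretical mean.
# The seed is an assumption for reproducibility, not part of the original run.
set.seed(1)
nosim <- 1000
n <- 40
lambda <- 0.2
data <- matrix(rexp(nosim * n, lambda), nosim)

t_mean <- 1 / lambda
s_means <- rowMeans(data)   # same values as apply(data, 1, mean)

centre <- mean(s_means)                  # centre of the histogram
mc_error <- sd(s_means) / sqrt(nosim)    # Monte Carlo standard error of the centre

print(c(theoretical = t_mean, simulated = centre, mc_std_error = mc_error))
```

If the simulated centre lies within a few Monte Carlo standard errors of 5, the histogram is consistent with the theory.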

Sample Variance

#Theoretical variance
t_var <- (1/lambda)^2
#Simulation variances
s_var <- data.frame(apply(data, 1, var))
names(s_var) <- c("var")

(ggplot(s_var, aes(x = var)) 
  + geom_histogram(colour="black", fill="white", binwidth=1)
  + geom_vline(aes(xintercept = t_var), size = 1))

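From the plot we can see that the distribution of the sample variances of the exponential distribution is more or less centered around the theoretical variance, i.e. \(1/\lambda^2 = 25\). The sample variance computed by var is an unbiased estimator, so this is not a bias in the mean; rather, with only \(n=40\) observations per simulation the distribution of sample variances is wide and right-skewed, so the bulk of the histogram tends to sit somewhat below 25 while a long right tail pulls the average back up.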
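
To see where the histogram is centered, it can help to look at both the mean and the median of the simulated variances. The following is a minimal sketch under the same setup, with an arbitrary seed assumed for reproducibility. For a right-skewed distribution the median is expected to fall below the mean.

```r
# Sketch: compare mean and median of the simulated sample variances
# with the theoretical variance. The seed is an assumption.
set.seed(1)
nosim <- 1000
n <- 40
lambda <- 0.2
data <- matrix(rexp(nosim * n, lambda), nosim)

t_var <- (1 / lambda)^2
s_vars <- apply(data, 1, var)   # var() uses the n - 1 denominator

var_mean <- mean(s_vars)
var_median <- median(s_vars)

print(c(theoretical = t_var, mean = var_mean, median = var_median))
```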

Central Limit Theorem
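
For reference, the statement being checked can be written out for this setting. For independent \(X_1, \dots, X_n\) drawn from an exponential distribution with rate \(\lambda\), both the mean and the standard deviation equal \(1/\lambda\), so the standardized sample mean is

\[
Z_n = \frac{\bar{X}_n - 1/\lambda}{(1/\lambda)/\sqrt{n}} = \sqrt{n}\,\left(\lambda \bar{X}_n - 1\right),
\]

and the Central Limit Theorem says that \(Z_n\) is approximately \(N(0, 1)\) for large \(n\). With \(\lambda = 0.2\) and \(n = 40\), the standard error in the denominator is \(5/\sqrt{40} \approx 0.79\).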

#Theoretical mean and standard error
t_mean <- 1/lambda
t_se <- (1/lambda)/sqrt(n)

#Simulated normalized means
s_norm_mean <- data.frame((apply(data, 1, mean) - t_mean) / t_se)
names(s_norm_mean) <- c("mean")

(ggplot(s_norm_mean, aes(x=mean)) 
    + geom_histogram(aes(y=..density..), colour="black", fill="white", binwidth=0.1) 
    + stat_function(fun = dnorm, size = 2))

From the plot we can see that the distribution of the normalized means of the exponential distribution looks very similar to the standard normal distribution.
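
The similarity can also be checked numerically. The sketch below (arbitrary seed assumed) compares a few quantiles of the normalized means with the corresponding standard normal quantiles, and computes the share of normalized means within ±1.96, which would be about 95% under exact normality. Some right skew is expected to remain at n = 40, since the skewness of a mean of n exponentials is 2/sqrt(n).

```r
# Sketch: numeric comparison of normalized means with N(0, 1).
# The seed is an assumption for reproducibility.
set.seed(1)
nosim <- 1000
n <- 40
lambda <- 0.2
data <- matrix(rexp(nosim * n, lambda), nosim)

t_mean <- 1 / lambda
t_se <- (1 / lambda) / sqrt(n)
z <- (rowMeans(data) - t_mean) / t_se

# Side-by-side quantiles: simulated vs. standard normal
probs <- c(0.025, 0.25, 0.5, 0.75, 0.975)
print(rbind(simulated = quantile(z, probs), normal = qnorm(probs)))

# Share of normalized means inside the central 95% normal interval
coverage <- mean(abs(z) < 1.96)
print(coverage)
```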