The exponential distribution has a mean and a standard deviation that are both equal to the inverse of its rate, 1/lambda.
The central limit theorem (CLT) states that if a population has mean mu and standard deviation sigma, and sufficiently large random samples are drawn from it with replacement, then the distribution of the sample means is approximately normal with mean mu and standard deviation sigma/sqrt(n), where n is the number of observations in each sample.
Here the Central Limit Theorem is tested against an exponential distribution, using 1000 simulated samples of size 40 each.
library(knitr)
library(ggplot2)
The aim is to obtain 1000 sample means, each computed from 40 random draws from an exponential distribution with a rate of 0.2. That requires 1000 * 40 = 40,000 random exponential draws. At this rate the exponential distribution has a mean and standard deviation of 1/0.2 = 5.
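The arithmetic above can be checked directly in R. This is only a minimal sketch; the names rate, size, theo_mean, theo_sd, theo_se and theo_var are illustrative and not part of the original analysis.

```r
# Theoretical values for an exponential distribution with rate 0.2
# (all variable names here are illustrative assumptions)
rate <- 0.2   # rate parameter
size <- 40    # observations per sample

theo_mean <- 1 / rate               # population mean
theo_sd   <- 1 / rate               # population standard deviation
theo_se   <- theo_sd / sqrt(size)   # standard deviation of the sample mean
theo_var  <- theo_sd^2 / size       # variance of the sample mean

c(mean = theo_mean, sd = theo_sd, se = theo_se, var = theo_var)
```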
set.seed(1994)
nosim <- 1000 # number of simulated averages
n <- 40 # sample size for each mean
lambda <- 0.2 # rate parameter
expsim <- data.frame( x = rexp(nosim * n, lambda)) # data frame with simulated values
# data frame with the mean of each simulated sample (one sample of size n per row)
expmns <- data.frame(x = apply(matrix(expsim$x, nrow = nosim), 1, mean))
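The row-wise means could also be obtained with base R's rowMeans(), which should agree with apply(..., 1, mean). A small sketch on a tiny illustrative matrix, where the names m, by_apply and by_rowmeans are hypothetical:

```r
set.seed(1)
# 5 illustrative samples of size 4, one sample per row
m <- matrix(rexp(5 * 4, rate = 0.2), nrow = 5)

by_apply    <- apply(m, 1, mean)  # the approach used in this report
by_rowmeans <- rowMeans(m)        # base R shortcut for the same computation

all.equal(by_apply, by_rowmeans)
```

Because the draws are independent and identically distributed, it does not matter that matrix() fills column by column; each row is still a random sample of the intended size.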
# after_stat() and linewidth assume ggplot2 >= 3.4.0; older versions use ..density.. and size
ggplot(data = expsim, aes(x)) +
  geom_histogram(binwidth = .8, color = "black", fill = "red", aes(y = after_stat(density))) +
  stat_function(fun = function(x){dexp(x = x, rate = lambda)}, linewidth = 2) +
  labs(title = "Exponential Distribution (Rate = 0.2)")
mean(expmns$x)
## [1] 5.011837
The mean of the sample means is 5.012, which is very close to the population mean of 1/lambda = 1/0.2 = 5.
var(expmns$x)
## [1] 0.6076351
The variance of the sample means is 0.608, which is close to the theoretical variance of the sample mean, sigma^2/n = 1/(lambda^2 * n) = 1/(0.2^2 * 40) = 0.625.
According to the CLT, the sample means will be approximately normally distributed with a mean of 5 and a standard deviation of 5/sqrt(40), which is about 0.79.
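The comparison between the empirical and theoretical variance can be made explicit. The sketch below regenerates the simulation so that it runs on its own; the names means, emp_var, theo_var and rel_diff are illustrative, and the exact figures may depend on the R version's random number generator.

```r
set.seed(1994)
nosim  <- 1000   # number of simulated averages
n      <- 40     # sample size for each mean
lambda <- 0.2    # rate parameter

# one sample of size n per row, then the mean of each row
means <- rowMeans(matrix(rexp(nosim * n, lambda), nrow = nosim))

emp_var  <- var(means)            # empirical variance of the sample means
theo_var <- 1 / (lambda^2 * n)    # theoretical variance, sigma^2 / n
rel_diff <- abs(emp_var - theo_var) / theo_var  # relative difference

c(empirical = emp_var, theoretical = theo_var, relative_difference = rel_diff)
```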
mu <- 1/lambda
sigma <- 1/lambda
# after_stat() and linewidth assume ggplot2 >= 3.4.0; older versions use ..density.. and size
ggplot(expmns, aes(x = x)) +
  geom_histogram(binwidth = 0.1, colour = "black", fill = "blue", aes(y = after_stat(density))) +
  stat_function(fun = dnorm, args = list(mean = mu, sd = sigma/sqrt(n)), linewidth = 2) +
  labs(title = "Distribution of Sample Means", x = "Mean", y = "Density")
It can be concluded that the sample means are approximately normally distributed with mean mu and standard deviation sigma/sqrt(n), as stated by the Central Limit Theorem.
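As a rough numerical companion to the visual comparison, one could check how many sample means fall within one and within 1.96 standard errors of mu. Under a normal distribution these shares are about 0.683 and 0.95. This is a sketch under the same simulation settings, not a formal normality test, and the names means, se, within_1 and within_2 are illustrative.

```r
set.seed(1994)
nosim  <- 1000   # number of simulated averages
n      <- 40     # sample size for each mean
lambda <- 0.2    # rate parameter

means <- rowMeans(matrix(rexp(nosim * n, lambda), nrow = nosim))

mu <- 1 / lambda                 # theoretical mean of the sample means
se <- (1 / lambda) / sqrt(n)     # theoretical standard deviation, sigma / sqrt(n)

within_1 <- mean(abs(means - mu) <= se)         # normal reference: about 0.683
within_2 <- mean(abs(means - mu) <= 1.96 * se)  # normal reference: about 0.95

c(within_1 = within_1, within_2 = within_2)
```

Shares close to the normal reference values are consistent with the conclusion above, although with n = 40 some right skew from the exponential distribution can still be expected.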