In this project I investigate the exponential distribution in R and compare it with the Central Limit Theorem. The mean of the exponential distribution is 1/lambda, and its standard deviation is also 1/lambda. I set lambda = 0.2 for all simulations and study the distribution of averages of 40 exponentials over 1,000 simulations.
n <- 1000 # count of simulations
mns = NULL
for (i in 1:n) mns = c(mns, mean(rexp(n = 40, rate = 0.2)))
means = cumsum(mns) / (1:n) # running average of the sample means (each sample mean is from 40 exponentials)
library(ggplot2)
g <- ggplot(data.frame(x = 1:n, y = means), aes(x = x, y = y))
g <- g + geom_hline(yintercept = 1 / 0.2) + geom_line(size = 2) # reference line at the theoretical mean 1/lambda
g <- g + labs(x = "number of obs", y = "cumulative mean")
g
mean_samples <- mean(mns)
print(mean_samples)
## [1] 4.988696
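The loop above grows `mns` one element at a time and sets no seed, so the printed value changes from run to run. The same simulation can be sketched in vectorized form. This is only a sketch: the seed value and the names `sample_size`, `n_sims`, `draws`, `sample_means` and `running_mean` are illustrative assumptions and do not appear in the original code.

```r
set.seed(20240101)   # arbitrary seed, added here only for reproducibility (assumption)

lambda      <- 0.2   # rate used throughout the report
sample_size <- 40    # exponentials per sample
n_sims      <- 1000  # number of simulated samples

# One row per simulation, one column per exponential draw
draws <- matrix(rexp(n_sims * sample_size, rate = lambda),
                nrow = n_sims, ncol = sample_size)

# Row means give all 1000 sample averages in a single call
sample_means <- rowMeans(draws)

# Running average of the sample means, the quantity plotted above
running_mean <- cumsum(sample_means) / seq_len(n_sims)

print(mean(sample_means))
```

With a fixed seed the result is repeatable, which makes the comparison with the theoretical mean easier to check.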
The theoretical mean of the exponential distribution is 1/lambda:
lambda <- 0.2
mean_theo <- 1/lambda
print(mean_theo)
## [1] 5
So the simulated mean and the theoretical mean are almost the same (4.99 versus 5).
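One way to put a number on "almost the same" is to express the gap in standard errors. The average of 1,000 sample means, each based on 40 draws, has standard error (1/lambda) / sqrt(40 * 1000). The sketch below plugs in the value printed above; the names `sample_size`, `n_sims`, `se` and `z` are illustrative and not part of the original code.

```r
lambda      <- 0.2
sample_size <- 40
n_sims      <- 1000

mean_samples <- 4.988696   # simulated value printed in the report
mean_theo    <- 1 / lambda # theoretical mean

# Standard error of the average of n_sims sample means
se <- (1 / lambda) / sqrt(sample_size * n_sims)

# Distance between simulated and theoretical mean, in standard errors
z <- (mean_samples - mean_theo) / se

print(se)
print(z)
```

The simulated mean lies about 0.45 standard errors below the theoretical one, which is well within ordinary sampling noise.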
The variance is the square of the standard deviation, so I compare standard deviations instead of variances.
n <- 1000 # count of simulations
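sample_sds = NULL # separate vector, so that mns keeps the sample means
for (i in 1:n) sample_sds = c(sample_sds, sd(rexp(n = 40, rate = 0.2))) # standard deviation of each sample of 40
sds = cumsum(sample_sds) / (1:n) # running average of the sample standard deviations
g <- ggplot(data.frame(x = 1:n, y = sds), aes(x = x, y = y))
g <- g + geom_hline(yintercept = 1 / 0.2) + geom_line(size = 2) # reference line at the theoretical value 1/lambda
g <- g + labs(x = "number of obs", y = "cumulative mean of sample standard deviations")
g
sd_samples <- mean(sample_sds)
print(sd_samples)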
## [1] 4.851743
The theoretical standard deviation of the exponential distribution is also 1/lambda:
lambda <- 0.2
sd_theo <- 1/lambda
print(sd_theo)
## [1] 5
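So the simulated standard deviation and the theoretical standard deviation are close (4.85 versus 5). The simulated value falls slightly below 5 because the sample standard deviation tends to underestimate the true value in small samples.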
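The comparison above averages the standard deviations within each sample, which estimates the population value 1/lambda. The Central Limit Theorem also makes a statement about a different quantity, the spread of the sample averages themselves, whose variance should be (1/lambda)^2 / 40 = 0.625. A minimal sketch of that check follows; the seed and the names `sample_means`, `var_means_sim` and `var_means_theo` are assumptions introduced for illustration.

```r
set.seed(20240101)   # arbitrary seed (assumption)

lambda      <- 0.2
sample_size <- 40
n_sims      <- 1000

# 1000 averages, each over 40 exponential draws
sample_means <- replicate(n_sims, mean(rexp(sample_size, rate = lambda)))

var_means_sim  <- var(sample_means)             # simulated variance of the averages
var_means_theo <- (1 / lambda)^2 / sample_size  # sigma^2 / n = 0.625

print(c(simulated = var_means_sim, theoretical = var_means_theo))
```

If the simulation behaves as expected, the two numbers should agree to within a few hundredths.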
The histograms of the random exponential variables and of the sample averages are shown below:
par(mfrow = c(1, 2))
hist(rexp(n = 1000, rate = 0.2), breaks = 100)
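sample_avgs <- replicate(1000, mean(rexp(n = 40, rate = 0.2))) # averages of 40 exponentials, recomputed here so the plot does not depend on earlier variables
hist(sample_avgs, breaks = 100)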
The left histogram is strongly right-skewed, with most values near zero. The right histogram looks approximately normal, as the Central Limit Theorem predicts.
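The visual impression can be backed by a number and an overlay. An exponential distribution has skewness 2, while the average of 40 independent exponentials has skewness 2 / sqrt(40), about 0.32, so the averages should be far more symmetric. The sketch below computes both and draws the normal density suggested by the Central Limit Theorem over the histogram of averages. The seed, the `skewness` helper and the variable names are hypothetical additions, not part of the original analysis.

```r
set.seed(20240101)   # arbitrary seed (assumption)

lambda      <- 0.2
sample_size <- 40
n_sims      <- 1000

raw_draws    <- rexp(n_sims, rate = lambda)
sample_means <- replicate(n_sims, mean(rexp(sample_size, rate = lambda)))

# Moment-based sample skewness (hypothetical helper, not from the report)
skewness <- function(x) mean((x - mean(x))^3) / mean((x - mean(x))^2)^1.5

skew_raw   <- skewness(raw_draws)     # theory: 2 for any exponential
skew_means <- skewness(sample_means)  # theory: 2 / sqrt(40), about 0.32

print(c(raw = skew_raw, means = skew_means))

# Density-scaled histogram of the averages with the CLT normal curve on top
hist(sample_means, breaks = 30, freq = FALSE,
     main = "Averages of 40 exponentials", xlab = "sample mean")
curve(dnorm(x, mean = 1 / lambda, sd = (1 / lambda) / sqrt(sample_size)),
      add = TRUE, lwd = 2)
```

A small residual right skew in the averages is expected at a sample size of 40; it shrinks as the sample size grows.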