We will investigate the exponential distribution in R and will apply the Central Limit Theorem (CLT) to the distribution of averages of 40 exponentials with \(\lambda\) = .2
This is how the PDF for the exponential distribution looks like:
hist(rexp(n=10000, rate = .2), breaks=30, main="Probability Density Function",
xlab="rate=0.2, n=10.000")
We will use base and ggplot2 libraries for the graphical representations (see appendix)
We generate first the data of our simulations (see appendix)
We calculate the mean of our previous simulations:
meansim<-rowMeans(fmat(1000)) #See appendix for details on fmat function
mean(meansim)
## [1] 5.014378
And in the otherside, the theoretical mean of the exponential distribution is \(\mu\) = 1/\(\lambda\), as \(\lambda\) = .2:
\(\mu\) = 5
Here with a graphical view:
hist(meansim, breaks=30,main="Distribution of sample means", xlab="rate=.2, n=1000")
abline(v=5, col="purple") #Line of the theoretical mean
axis(3, at=5, labels = 5, pos=3, col.axis="purple")
We calculate the Variance of our simulations:
meansim<-rowMeans(fmat(1000))
var(meansim)
## [1] 0.6463562
The theoretical variance of the exponential distribution is \(\sigma^2\) = 1/(\(\lambda^2 * n\)), as \(\lambda\) = .2 and n = 40:
\(\sigma^2\) = .625
The Central Limit Theorem (CLT) tell us that \(\bar X_n\) is approximately \(N(\mu, \sigma^2 / n)\)
The exponential distribution has \(\mu\) = 1/\(\lambda\)
The exponential distribution has \(\sigma\) = 1/\(\lambda\)
As by our analysis n = 40
g <- ggplot(dat, aes(x = x, fill = simulations)) +
geom_histogram(binwidth=.3, colour = "black", aes(y = ..density..))
g <- g + stat_function(fun = dnorm, args=list(mean=mean_expd, sd=sd_expd/sqrt(n_exp)), size = 2)
g + facet_grid(. ~ simulations)
library(ggplot2)
n_exp <- 40 #number of samples
lambda <- .2
mean_expd <- 1/lambda #mean of an exponential distribution
sd_expd <- 1/lambda #standard deviation of an exponential distribution
set.seed(111) #set seed
#Funtion that creates a matrix with a predefined simulations of a number
#of 40 (default) exponentials
fmat<-function(simulations, n_exp=40){
matrix(rexp(n=simulations * n_exp, rate = .2), simulations)
}
sim_a <- 10 #simulations
sim_b <- 100
sim_c <- 1000
sim_d <- 10000
#Data frame with different amount of simulations
dat <- data.frame(
x = c(apply(fmat(sim_a), 1, mean), apply(fmat(sim_b), 1, mean),
apply(fmat(sim_c), 1, mean), apply(fmat(sim_d), 1, mean)),
simulations = factor(c(rep(sim_a, each=sim_a), rep(sim_b, each=sim_b),
rep(sim_c, each=sim_c), rep(sim_d, each=sim_d))))