The exponential distribution can be simulated in R with rexp(n, lambda), where lambda is the rate parameter. The mean and the standard deviation of the exponential distribution are both 1/lambda. Lambda is set to 0.2 for all of the simulations. We investigate the distribution of averages of 40 exponentials over a thousand simulations.
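As a quick sanity check of the 1/lambda claim, the following minimal sketch draws one large sample and compares its mean and standard deviation with 1/lambda. The sample size of 100,000, the seed, and the names `lambda_check` and `x_check` are illustrative assumptions and are not part of the analysis itself.

```r
# Illustrative check: for Exp(rate = lambda), mean and sd should both be near 1/lambda
set.seed(1)                      # arbitrary seed, used only for reproducibility
lambda_check <- 0.2              # same rate as in the report
x_check <- rexp(100000, rate = lambda_check)

# Compare the empirical mean and sd with the theoretical value 1/lambda
c(mean = mean(x_check), sd = sd(x_check), theoretical = 1 / lambda_check)
```

Both empirical values should land close to 5, which is the value used as the theoretical mean in the rest of the report.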
Running Simulations of 40 Exponentials and Calculating the Means and Variances
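library(ggplot2)    # plotting
library(grid)       # textGrob, gpar
library(gridExtra)  # grid.arrange
set.seed(2017)
simulations1 <- 1000
n <- 40
lambda <- 0.2
# One simulation per row: 1000 rows of 40 exponentials each
sims <- matrix(rexp(n * simulations1, lambda), simulations1)
mns <- apply(sims, 1, mean)  # mean of each sample of 40
vr <- apply(sims, 1, var)    # variance of each sample of 40
dat <- data.frame(Means = mns, Variance = vr)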
# DRGcolorstran is a colour vector assumed to be defined earlier in the report
g4 <- ggplot(dat, aes(x = Means)) +
  geom_histogram(binwidth = 0.3, fill = DRGcolorstran[2], colour = "black") +
  geom_vline(xintercept = mean(dat$Means), size = 1, color = "red")
g4 + labs(title = "Figure 1. Histogram of Simulated Means", x = "Simulation Means", y = "Frequency")
SampleMean <- round(mean(dat$Means), 2)
TheoreticalMean <- 1 / lambda
Sample mean: 4.98. Theoretical mean: 5. The sample mean is within 0.02 of the theoretical mean, so the two agree closely.
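To put the gap between 4.98 and 5 in context, it can be compared with the standard error of the average of all simulated means. The sketch below assumes the simulated values are independent, so the average of 1000 means of 40 exponentials is an average of 40,000 exponentials. The names `se_grand_mean` and `observed_gap` are introduced here for illustration only.

```r
# Values taken from the simulation setup in the report
lambda <- 0.2
n <- 40
simulations1 <- 1000

# Standard error of the average of all n * simulations1 exponentials:
# sd of one exponential (1/lambda) divided by sqrt of the total count
se_grand_mean <- (1 / lambda) / sqrt(n * simulations1)
se_grand_mean                      # 0.025

# Gap between the reported sample mean (4.98) and the theoretical mean (5)
observed_gap <- abs(4.98 - 1 / lambda)
observed_gap / se_grand_mean       # about 0.8 standard errors
```

A gap of less than one standard error is what one would expect from sampling noise alone.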
g3 <- ggplot(dat, aes(x = Variance)) +
  geom_histogram(binwidth = 2, fill = DRGcolorstran[1], colour = "black") +
  geom_vline(xintercept = mean(dat$Variance), size = 1, color = "red")
g3 + labs(title = "Figure 2. Histogram of Simulated Variances", x = "Simulation Variances", y = "Frequency")
SampleVariance <- round(mean(dat$Variance), 2)
TheoreticalVariance <- (1 / lambda)^2
Sample variance: 24.95. Theoretical variance: 25. The average of the simulated sample variances is within 0.05 of the theoretical variance, so the two agree closely.
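The comparison above concerns the variance within each sample of 40. A related check, sketched below, looks at the variance of the 1000 sample means themselves, which should be close to (1/lambda)^2 / n = 0.625 (the square of the standard deviation 1/(lambda*sqrt(n))). The sketch reruns the simulation with the same seed and parameters so that it is self-contained; the names `sim` and `sample_means` are assumptions introduced here.

```r
# Same seed and parameters as the simulation in the report
set.seed(2017)
lambda <- 0.2
n <- 40
simulations1 <- 1000

# One simulation per row, then the mean of each row
sim <- matrix(rexp(n * simulations1, lambda), simulations1)
sample_means <- rowMeans(sim)

# Variance of the sample means versus the theoretical value sigma^2 / n
c(observed = var(sample_means), theoretical = (1 / lambda)^2 / n)
```

The observed value should be near 0.625, far smaller than the variance of 25 for individual exponentials.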
In Figure 3, the red line marks the sample mean, the black curve is the normal distribution with the theoretical mean and variance, and the blue curve is the density of the simulated values.
set.seed(2017)
simulations2<-40000
dat2 <- data.frame(random = rexp(simulations2, lambda))
g1 <- ggplot(dat, aes(x = Means)) +
  geom_histogram(aes(y = ..density..), binwidth = 0.2, fill = DRGcolorstran[2], colour = "black") +
  geom_density(size = 1, color = "blue") +
  geom_vline(xintercept = mean(dat$Means), size = 1, color = "red")
# Normal curve for the sample means: mean 1/lambda, sd 1/(lambda * sqrt(n))
g1 <- g1 + stat_function(fun = dnorm, args = list(mean = 1 / lambda, sd = 1 / (lambda * sqrt(n))), size = 1, color = "black") +
  labs(x = "Simulation Means", y = "Density")
g2 <- ggplot(dat2, aes(x = random)) +
  geom_histogram(aes(y = ..density..), binwidth = 0.6, fill = DRGcolorstran[2], colour = "black") +
  geom_density(size = 1, color = "blue") +
  geom_vline(xintercept = mean(dat2$random), size = 1, color = "red")
# Normal curve with the exponential's own theoretical mean and sd (both 1/lambda);
# the sd of the sample mean, 1/(lambda * sqrt(n)), applies only to the averages in g1
g2 <- g2 + stat_function(fun = dnorm, args = list(mean = 1 / lambda, sd = 1 / lambda), size = 1, color = "black") +
  labs(x = "Simulation Exponentials", y = "")
title1 <- textGrob("Figure 3. Difference between the distribution \n of 40,000 random exponentials (right) \n and the distribution of 1,000 averages of 40 exponentials (left)", gp = gpar(fontface = "bold", fontsize = 10))
grid.arrange(g1, g2, ncol=2,top=title1)
We see that the distribution of the means of the sampled exponentials appears to follow a normal distribution, but that is not the case for the simulated exponentials themselves.
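The visual impression can be backed by a simple numeric summary. The sketch below compares the sample skewness of the raw exponentials with that of the averages of 40. A normal distribution has skewness 0, an exponential has skewness 2, and an average of n independent exponentials has skewness 2/sqrt(n). The `skewness` helper is defined here because base R does not provide one, and the names `raw` and `means` are likewise assumptions for illustration.

```r
# Simple moment-based skewness estimate (helper defined here, not a base R function)
skewness <- function(x) mean((x - mean(x))^3) / sd(x)^3

set.seed(2017)
lambda <- 0.2
n <- 40

# 40,000 raw exponentials, then 1000 averages of 40 values each
raw <- rexp(40000, lambda)
means <- rowMeans(matrix(raw, ncol = n))

# Empirical skewness next to the theoretical values
c(raw = skewness(raw), means = skewness(means),
  theory_raw = 2, theory_means = 2 / sqrt(n))
```

The averages should show a skewness much closer to 0 than the raw exponentials, which is consistent with the approximately normal shape seen on the left of Figure 3.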