The aim of this report is to illustrate, via simulation and exploratory analysis, the properties of the distribution of the mean of 40 exponentials. The report will highlight the following:
The sample mean, compared with the theoretical mean of the distribution.
How variable the sample is (via its variance), compared with the theoretical variance of the distribution.
That the distribution is approximately normal.
require(ggplot2)
We need to simulate 1000 samples from an exponential distribution with rate lambda = 0.2, each containing 40 observations. The exponential distribution has mean 1/lambda and standard deviation 1/lambda, so its variance is (1/lambda)^2.
set.seed(1)
Having called set.seed(), we can reproduce our exponential results at any time. We now create our objects, i.e. the mean, variance, etc.
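n<-40                # observations per sample
lambda<-0.2          # rate of the exponential distribution
mu<-1/lambda         # theoretical mean of a single observation
var<-(1/lambda)^2    # theoretical variance of a single observation
nsim<-1000           # number of simulated samples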
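As a quick sanity check on the theoretical values defined above, the sketch below draws one large exponential sample and compares its empirical mean, standard deviation and variance with 1/lambda, 1/lambda and (1/lambda)^2. The names draws, emp.mean, emp.sd and emp.var, and the sample size of 100,000, are assumptions made for illustration and are not part of the original analysis.

```r
# Illustrative check only: draws, emp.* and the 1e5 sample size are assumed names/values.
set.seed(1)
lambda <- 0.2

# One large sample from the exponential distribution with rate lambda
draws <- rexp(1e5, rate = lambda)

emp.mean <- mean(draws)  # should be close to 1/lambda
emp.sd   <- sd(draws)    # should be close to 1/lambda
emp.var  <- var(draws)   # should be close to (1/lambda)^2

# Empirical values followed by their theoretical counterparts
print(c(mean = emp.mean, sd = emp.sd, var = emp.var))
print(c(mean = 1/lambda, sd = 1/lambda, var = (1/lambda)^2))
```

The empirical values will differ slightly from the theoretical ones because of sampling noise.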
We can now run our simulation:
sim<-matrix(rexp(n*nsim,lambda),nsim)
i.mean<-apply(sim,1,mean)
dim(sim)
## [1] 1000 40
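The row-wise means computed with apply() can also be obtained with rowMeans(), which should give the same values and is usually faster on large matrices. The sketch below rebuilds the matrix so that it runs on its own; means.apply and means.fast are names assumed for illustration.

```r
# Illustrative comparison: means.apply and means.fast are assumed names.
set.seed(1)
n <- 40
lambda <- 0.2
nsim <- 1000

# Each row is one simulated sample of n exponential observations
sim <- matrix(rexp(n * nsim, lambda), nsim)

means.apply <- apply(sim, 1, mean)  # approach used in this report
means.fast  <- rowMeans(sim)        # vectorised alternative

# TRUE when the two vectors agree up to numerical tolerance
print(all.equal(means.apply, means.fast))
```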
We now have a data set consisting of 1000 samples of 40 exponential observations each. We can now begin our exploratory analysis to achieve the objectives mentioned earlier.
samp.mean<-round(mean(apply(sim,1,mean)),4)
samp.var<-round(var(apply(sim,1,mean)),4)
samp.sd<-sqrt(samp.var)
pop.mean<-1/lambda
pop.var<-((1/lambda)^2)*((1/n))
pop.sd<-sqrt(pop.var)
diff.mean<-round(samp.mean-pop.mean,4)
diff.var<-round(samp.var-pop.var,4)
Having run the simulations, we obtained a sample mean of 4.99 (rounded to 4 dp), compared with the theoretical mean of 5, a difference of -0.01. The sample mean is therefore close to the centre of the distribution. The graph illustrates this:
hist(i.mean,xlab="Mean",main="Sample Mean vs Theoretical Mean")
abline(v=samp.mean,col="blue",lwd=2)
abline(v=pop.mean,col="green",lwd=2,lty=3)
The sample mean is shown with a solid blue line, while the theoretical mean is shown with a dotted green line.
We also obtained a sample variance of 0.6177 (rounded to 4 dp), compared with the theoretical variance of 0.625, a difference of -0.0073.
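The theoretical variance of 0.625 follows from the fact that the mean of n independent observations has variance sigma^2/n, here (1/lambda)^2/n = 25/40. A minimal sketch of that calculation, together with the relative error of the simulated variance, is given below; theo.var, theo.se and rel.err are names assumed for illustration, and the simulation is repeated so that the block runs on its own.

```r
# Illustrative sketch: theo.var, theo.se and rel.err are assumed names.
set.seed(1)
n <- 40
lambda <- 0.2
nsim <- 1000

# Means of nsim simulated samples of size n
i.mean <- rowMeans(matrix(rexp(n * nsim, lambda), nsim))

theo.var <- (1/lambda)^2 / n  # variance of the mean: sigma^2 / n
theo.se  <- sqrt(theo.var)    # standard error of the mean

# Relative difference between simulated and theoretical variance
rel.err <- (var(i.mean) - theo.var) / theo.var

print(c(theo.var = theo.var, theo.se = theo.se, rel.err = rel.err))
```

A relative error of a few percent is what one would expect from 1000 simulations.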
We will now fit a normal distribution curve with mean 4.99 and standard deviation 0.7859389.
h.mean <- hist(i.mean, breaks=n, col= "Orange", xlab= "Means",main= "Plot of Simulated Distribution" )
xfit <- seq(min(i.mean),max(i.mean),length=100)
yfit <- dnorm(xfit,mean=samp.mean, sd=samp.sd)
yfit <- yfit*diff(h.mean$mids[1:2])*length(i.mean)
lines(xfit, yfit, col="blue",lwd=2)
We can see that the distribution of the means follows a bell-shaped curve and closely matches a normal distribution with mean 4.99 and standard deviation 0.7859389. We can therefore conclude that the distribution is approximately normal.
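The visual impression of normality can be supplemented with two simple checks, sketched below under stated assumptions. The means are standardised using the theoretical mean and standard error. If they are approximately standard normal, about 95% of them should fall within the central 95% normal interval, and a normal Q-Q plot should be close to a straight line. The names z and coverage are assumed for illustration, and the simulation is repeated so that the block runs on its own.

```r
# Illustrative normality check: z and coverage are assumed names.
set.seed(1)
n <- 40
lambda <- 0.2
nsim <- 1000

# Means of nsim simulated samples of size n
i.mean <- rowMeans(matrix(rexp(n * nsim, lambda), nsim))

# Standardise with the theoretical mean (1/lambda) and standard error ((1/lambda)/sqrt(n))
z <- (i.mean - 1/lambda) / ((1/lambda) / sqrt(n))

# Proportion of standardised means inside the central 95% normal interval
coverage <- mean(abs(z) <= qnorm(0.975))
print(coverage)

# Points lying near the reference line suggest approximate normality
qqnorm(z, main = "Normal Q-Q Plot of Standardised Means")
qqline(z, col = "blue", lwd = 2)
```

Some departure from the line in the upper tail would not be surprising, since the exponential distribution is right-skewed and n = 40 is only moderately large.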