Generating The Averages of 40 Exponentials Through 1000 Runs

1- Loading the required libraries

library(ggplot2)

2- Defining the simulation variables

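lambda <- 0.2        # rate of the exponential distribution
pMean <- 1/lambda    # theoretical mean
pSD <- 1/lambda      # theoretical standard deviation
n <- 40              # observations per run
nosim <- 1000        # number of runs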

3- Generating data

data <- replicate(nosim, rexp(n, lambda))

The result is a 40 x 1000 matrix with one run per column. Taking the column means gives the sample mean of each run.
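To make the layout concrete, here is a minimal sketch on a deliberately tiny case (3 runs of 4 draws rather than 1000 runs of 40). The names `small` and `smallMeans` and the seed are assumptions introduced only for this illustration.

```r
# Hypothetical small-scale check, not part of the original analysis
set.seed(1)                                 # arbitrary seed, for reproducibility only
small <- replicate(3, rexp(4, rate = 0.2))  # 3 runs of 4 exponential draws each
dim(small)                                  # 4 rows (draws per run), 3 columns (runs)
smallMeans <- colMeans(small)               # one mean per column, i.e. per run
length(smallMeans)                          # 3, one value per run
```

Because `replicate` binds each run as a column, `colMeans` (not `rowMeans`) is the function that averages within a run.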

4- Calculating sample means for the 1000 runs

dataMeans <- colMeans(data)

Comparing the sample mean to the theoretical mean of the distribution

1- Compute the sample mean

sMean <- mean(dataMeans)

The mean of the 1000 sample means is 4.9781977, close to the population mean 1/lambda = 5.

2- Compute the z confidence interval of the sample mean

zConf <- sMean + c(-1,1)*qnorm(0.95)*(pSD/sqrt(n))

Because qnorm(0.95) leaves 5% in each tail, this is a two-sided 90% interval: we are 90% confident that the population mean lies within the z confidence interval (3.6778267, 6.2785686).
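As a hedged illustration of how the multiplier sets the confidence level, the sketch below computes the interval with qnorm(0.95) (two-sided 90%) and with qnorm(0.975) (two-sided 95%). It hard-codes the sMean value reported above; `se`, `z90`, `z95`, `conf90` and `conf95` are names introduced here and are not part of the original analysis.

```r
# Illustrative comparison of two-sided 90% and 95% z intervals
sMean <- 4.9781977            # mean of the sample means, as reported above
se <- (1 / 0.2) / sqrt(40)    # pSD / sqrt(n), standard error of a mean of 40 draws

z90 <- qnorm(0.95)            # 5% in each tail   -> two-sided 90% interval
z95 <- qnorm(0.975)           # 2.5% in each tail -> two-sided 95% interval

conf90 <- sMean + c(-1, 1) * z90 * se
conf95 <- sMean + c(-1, 1) * z95 * se

print(conf90)
print(conf95)                 # wider than conf90
```

The first interval reproduces the one computed by zConf; the second is what a two-sided 95% statement would require.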

Comparing the sample variance to the theoretical variance

1- Calculating the sample variance

sVar <- var(dataMeans)
pVar <- pSD*pSD

The variance of the sample means is 0.6244459. The theoretical variance of the distribution is 25, and dividing it by the sample size n = 40 gives 0.625, the theoretical variance of the sample mean. The two values agree closely, as the central limit theorem predicts: the standard deviation of the sample means matches the standard error of the mean, pSD/sqrt(n). We will show below that the distribution of the sample means is approximately normal.
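The comparison can be sketched as a self-contained check. This is an illustration under assumptions: the seed is arbitrary, so the empirical value will differ slightly from the 0.6244459 reported above, and `means`, `theoVarMean`, `empVarMean` and `ratio` are names introduced here.

```r
# Illustrative check that var(sample means) is close to sigma^2 / n
set.seed(42)                         # arbitrary seed, an assumption
lambda <- 0.2
n <- 40
nosim <- 1000

means <- colMeans(replicate(nosim, rexp(n, lambda)))

theoVarMean <- (1 / lambda)^2 / n    # sigma^2 / n = 25 / 40 = 0.625
empVarMean <- var(means)             # empirical variance of the 1000 means
ratio <- empVarMean / theoVarMean    # should be close to 1

print(c(theoretical = theoVarMean, empirical = empVarMean, ratio = ratio))
```

A ratio near 1 is what the agreement between 0.6244459 and 0.625 expresses.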

Showing that the distribution is approximately normal

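# One row per run; size is a constant factor used only for the fill legend
dat <- data.frame(x = dataMeans, size = factor(rep(n, nosim)))

# Histogram on the density scale
# (recent ggplot2 versions deprecate ..density.. in favour of after_stat(density))
g <- ggplot(dat, aes(x = x, fill = size)) + geom_histogram(binwidth = .3, colour = "black", aes(y = ..density..))
# Kernel density estimate overlaid on the histogram
g <- g + geom_density(size = 2, colour = "black", alpha = .1)
# Vertical line at the mean of the sample means (geom_vline takes xintercept, not x)
g <- g + geom_vline(xintercept = sMean, colour = "black", size = 2)
g <- g + xlab("Sample Mean")
print(g)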
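A histogram is a visual argument only. As a complementary and equally informal check, the sketch below standardizes the sample means and looks at how many fall within the central 95% range of a standard normal, then draws a normal Q-Q plot with base R. The seed and the names `means`, `z` and `within` are assumptions made for this illustration.

```r
# Illustrative normality check on standardized sample means
set.seed(42)                              # arbitrary seed, an assumption
lambda <- 0.2
n <- 40
nosim <- 1000

means <- colMeans(replicate(nosim, rexp(n, lambda)))

# Standardize with the theoretical mean and standard error
z <- (means - 1 / lambda) / ((1 / lambda) / sqrt(n))

# Share of standardized means inside +/- qnorm(0.975);
# roughly 0.95 is expected if the distribution is close to normal
within <- mean(abs(z) < qnorm(0.975))
print(within)

# Points close to the reference line indicate approximate normality
qqnorm(z)
qqline(z)
```

Some departure in the right tail is expected, since a mean of 40 exponentials is still slightly right-skewed.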