First, we generate 1000 random numbers from the exponential distribution with rate lambda = 0.2.

lambda <- 0.2
pop <- rexp(1000, lambda)
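
For reference, the theoretical mean and standard deviation of an exponential distribution with rate lambda can be written down directly, which gives a benchmark for the simulated values. The symbol X, standing for a single draw, is introduced here only for illustration; R's rexp takes the rate as its second argument.

```latex
\mathbb{E}[X] = \frac{1}{\lambda} = \frac{1}{0.2} = 5,
\qquad
\operatorname{sd}(X) = \sqrt{\operatorname{Var}(X)} = \sqrt{\frac{1}{\lambda^{2}}} = \frac{1}{\lambda} = 5
```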

Next, we draw 40 random numbers from the same exponential distribution (lambda = 0.2) and take their average. We repeat this 1000 times and investigate the distribution of the 1000 averages and the variance of that distribution.

# Mean of 40 exponential draws, repeated 1000 times
sim <- replicate(1000, mean(rexp(40, lambda)))
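
The same simulation could also be written without repeated calls, by drawing all the numbers at once and averaging by row. This is only a sketch: the seed and the names n_sims, n_per_sample, draws, and sim_vec are assumptions for illustration and are not part of the original analysis, so the resulting numbers will differ from those reported here.

```r
# Sketch only: the seed and all variable names below are illustrative assumptions.
set.seed(42)
lambda <- 0.2
n_sims <- 1000        # number of simulated samples
n_per_sample <- 40    # draws per sample

# One row per simulated sample of 40 exponential draws.
draws <- matrix(rexp(n_sims * n_per_sample, rate = lambda),
                nrow = n_sims, ncol = n_per_sample)

# Row means give the 1000 simulated sample means in a single call.
sim_vec <- rowMeans(draws)

length(sim_vec)  # 1000
mean(sim_vec)    # should land near 1 / lambda = 5
```

Fixing a seed makes the run reproducible, which the plots and summary numbers in a report like this benefit from.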

The following histogram shows the distribution of these 1000 means, each computed from 40 random numbers.

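hist(sim, main="Histogram of Means from Simulated Data",
     xlab="Simulated Number", prob=TRUE)
# Vertical lines mark the two means being compared
abline(v=mean(sim), col="red")
abline(v=mean(pop), col="blue")
# Label the lines with a legend (the y-axis is a density because prob=TRUE)
legend("topright", legend=c("Mean of Simulated Means", "Population Mean"),
       col=c("red", "blue"), lty=1)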

As the figure above shows, the red line marks the mean of the 1000 simulated means, 4.99. The blue line marks the mean of the 1000 random numbers drawn from the exponential distribution with lambda = 0.2, which is 5.05. The two numbers are very close to each other, and both are close to the theoretical mean 1/lambda = 5, so the mean of the simulated means is a good estimate of the true population mean.

Here we compare the distribution of the 1000 exponential random numbers with the distribution of the means from the 1000 simulations.

par(mfrow=c(1,2))
hist(pop, main="1000 Random Exp Numbers", xlab="Simulated Number",prob=T)
text(20,0.1,paste("Std:",round(sd(pop),2)),col="blue")
hist(sim, main="Means from Simulated Data", xlab="Simulated Number",prob=T)
text(3.3,0.38,paste("Std:",round(sd(sim),2)),col="red")

As the figure shows, the 1000 simulations give a standard deviation of the means of 0.8, which describes how much the means vary. For comparison, the standard deviation of the population is 4.94. The relation between these two numbers is:

the population standard deviation divided by the square root of the sample size should be approximately equal to the standard deviation of the simulated means.

sd(pop)/sqrt(40)
## [1] 0.780548

is very close to

sd(sim)
## [1] 0.7970354
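
The relation can be motivated with a short derivation, assuming the 40 draws in each sample are independent and share a common standard deviation. The symbols X_i (a single draw), X-bar (the sample mean), sigma (the population standard deviation), and n (the sample size) are introduced here for illustration.

```latex
\operatorname{Var}(\bar{X})
  = \operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
  = \frac{1}{n^{2}} \sum_{i=1}^{n} \operatorname{Var}(X_i)
  = \frac{\sigma^{2}}{n}
\quad\Longrightarrow\quad
\operatorname{sd}(\bar{X}) = \frac{\sigma}{\sqrt{n}}
```

With the theoretical value sigma = 1/lambda = 5 and n = 40, this gives 5/sqrt(40), or about 0.79, in line with both numbers above.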

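Therefore, we conclude that the means of a large number of random samples from a population behave as the population itself predicts: they center on the population mean, and their standard deviation is close to the population standard deviation divided by the square root of the sample size.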

The distribution of the means from a large number of random samples is also approximately normal, as shown in the following figure.

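hist(sim, main="Means from Simulated Data",
     xlab="Simulated Number", prob=TRUE, ylim=c(0,0.5))
# Grid of x values spanning the simulated means
x <- seq(floor(min(sim)), ceiling(max(sim)), 0.01)
# Blue: normal density using the mean and sd of the simulated means
lines(x, dnorm(x, mean=mean(sim), sd=sd(sim)), col="blue")
# Red: normal density using the population mean and sd(pop)/sqrt(40)
lines(x, dnorm(x, mean=mean(pop), sd=sd(pop)/sqrt(40)), col="red")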

The blue line is the normal density with the mean and standard deviation of the simulated means. It follows the histogram closely, which suggests that the simulated means are approximately normally distributed. The red line is the normal density whose mean is the mean of the 1000 exponential random numbers and whose standard deviation is their standard deviation divided by the square root of the sample size. Because the blue and red curves are close to each other, the distribution of the sample means is well predicted by the population mean and standard deviation, as the central limit theorem leads us to expect.
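
As a complementary check on normality, a Q-Q plot and a simple coverage count could be used alongside the histogram. The following is a minimal sketch under stated assumptions: the seed and the names sim_check, z, and within_95 are illustrative and not part of the original analysis, and the simulation is regenerated so the block runs on its own.

```r
# Sketch only: the seed and all variable names below are illustrative assumptions.
set.seed(42)
lambda <- 0.2
sim_check <- replicate(1000, mean(rexp(40, rate = lambda)))

# Q-Q plot: points close to the reference line indicate approximate normality.
qqnorm(sim_check, main = "Normal Q-Q Plot of Simulated Means")
qqline(sim_check, col = "red")

# Share of simulated means within 1.96 standard deviations of their mean;
# for a normal distribution this share would be about 0.95.
z <- (sim_check - mean(sim_check)) / sd(sim_check)
within_95 <- mean(abs(z) < 1.96)
within_95
```

Means of 40 exponential draws are still slightly right-skewed, so small departures from the reference line in the upper tail would not be surprising.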