Written by RDZ on 9 October 2018, Nanyang Technological University LWN Library

Overview

In this project, I investigate the exponential distribution in R and compare the behaviour of its sample means with what the Central Limit Theorem predicts. The exponential distribution is simulated in R with rexp(n, lambda), where lambda is the rate parameter. The mean of the exponential distribution is 1/lambda, and its standard deviation is also 1/lambda. I set lambda = 0.2 for all of the simulations.
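
As a quick sanity check of the two theoretical values quoted above, the sketch below draws one large sample and compares its mean and standard deviation with 1/lambda. The variable names (rate_check, draws, mean_draws, sd_draws) and the sample size of 100000 are assumptions made for this illustration, not part of the original analysis.

```r
# Illustrative check only; names and sample size are hypothetical.
set.seed(1)
rate_check = 0.2
draws = rexp(100000, rate_check)   # large sample, so estimates sit close to theory

mean_draws = mean(draws)   # expected to be near 1 / rate_check
sd_draws = sd(draws)       # expected to be near 1 / rate_check as well

print(c(mean = mean_draws, sd = sd_draws, theory = 1 / rate_check))
```

Both estimates should land close to 5, which is the value used as theoryMean and theorySd in the rest of the report.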


First, I set a seed so that the simulation results are reproducible. Then I set up the basic variables.

In this section, I take the size of each sample to be 40 and run 1000 simulations. The mean and standard deviation of each sample are stored in two variables named sampleMean and sampleSd respectively.
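
For reference, the comparison with the Central Limit Theorem rests on the standard approximation below for the mean of n independent draws, evaluated with n = 40 and lambda = 0.2 as used in this report. The symbols \bar{X}_n, \mu and \sigma are notation introduced for this note only.

```latex
\bar{X}_n \;\overset{\text{approx.}}{\sim}\; \mathcal{N}\!\left(\mu,\ \frac{\sigma^2}{n}\right),
\qquad \mu = \sigma = \frac{1}{\lambda} = 5,
\qquad \frac{\sigma}{\sqrt{n}} = \frac{5}{\sqrt{40}} \approx 0.79
```

In words, the sample means should centre on 5 with a spread of roughly 0.79, far narrower than the spread of 5 for individual draws.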

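set.seed(1)                        # fix the seed for reproducibility
lambda = 0.2                       # rate parameter of the exponential distribution
sampleSize = 40                    # observations per sample
theoryMean = 1 / lambda            # theoretical mean
theorySd = 1 / lambda              # theoretical standard deviation
simulation = 1000                  # number of simulated samples
sampleMean = numeric(simulation)   # preallocate the result vectors
sampleSd = numeric(simulation)
for (i in 1:simulation)
{
        random = rexp(sampleSize, lambda)   # one sample of 40 exponentials
        sampleMean[i] = mean(random)
        sampleSd[i] = sd(random)
}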
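
The same simulation can also be sketched without an explicit loop, by drawing all values at once and arranging them as one sample per matrix row. This is an alternative formulation, not the method used in the report; the names lambda_alt, n_alt, sims_alt, draws_alt, means_alt and sds_alt are hypothetical.

```r
# Loop-free sketch of the simulation; all names here are hypothetical.
set.seed(1)
lambda_alt = 0.2
n_alt = 40        # observations per sample
sims_alt = 1000   # number of samples

# Fill row by row so that each row holds one sample of n_alt consecutive draws
draws_alt = matrix(rexp(sims_alt * n_alt, lambda_alt),
                   nrow = sims_alt, ncol = n_alt, byrow = TRUE)

means_alt = rowMeans(draws_alt)       # mean of each sample
sds_alt = apply(draws_alt, 1, sd)     # standard deviation of each sample

print(c(mean_of_means = mean(means_alt), mean_of_sds = mean(sds_alt)))
```

With the same seed, this should consume the random number stream in the same order as the loop, so the summary statistics are expected to agree with the loop-based results.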
  1. Is the sample mean a good estimator of the population mean? Yes. The red vertical line marks the theoretical population mean, and it sits close to the centre of the distribution of sample means.
hist(sampleMean, main = "distribution of mean value of 1000 samples")
abline(v=theoryMean,col="red",lwd=10)

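  2. Is the sample sd a good estimator of the population sd? Yes, approximately. The red vertical line marks the theoretical population sd, and it sits close to the centre of the distribution of sample sds. The sample sd is a slightly biased estimator, so the average of the 1000 sample sds (about 4.90 in the table below) falls a little short of the theoretical value of 5.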
hist(sampleSd, main = "distribution of sd value of 1000 samples")
abline(v=theorySd,col="red",lwd=10)

Here is a table summarising the key statistics. The "sample" rows are averages over the 1000 simulated samples.

x_hat = mean(sampleMean)
s = mean(sampleSd)

rbind(c("theoretical mean", theoryMean),
      c("sample mean", x_hat),
      c("theoretical sd", theorySd),
      c("sample sd", s))
##      [,1]               [,2]              
## [1,] "theoretical mean" "5"               
## [2,] "sample mean"      "4.99002520077716"
## [3,] "theoretical sd"   "5"               
## [4,] "sample sd"        "4.89577686514373"
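
The table above compares centres. The Central Limit Theorem also makes a prediction about spread: the standard deviation of the sample means should be close to the population sd divided by the square root of the sample size. The sketch below checks this under the same settings as the report; it regenerates its own data, and the names lambda_se, n_se, sims_se, means_se, theory_se and observed_se are hypothetical.

```r
# Spread of the sample means versus the CLT prediction; names are hypothetical.
set.seed(1)
lambda_se = 0.2
n_se = 40
sims_se = 1000

# One mean per simulated sample of n_se exponentials
means_se = replicate(sims_se, mean(rexp(n_se, lambda_se)))

theory_se = (1 / lambda_se) / sqrt(n_se)   # sigma / sqrt(n)
observed_se = sd(means_se)                 # empirical sd of the sample means

# A data frame keeps the values numeric instead of coercing them to strings
print(data.frame(quantity = c("theoretical sd of sample mean",
                              "observed sd of sample means"),
                 value = c(theory_se, observed_se)))
```

Using a data frame here is a design choice for the sketch: unlike rbind on mixed character and numeric vectors, it leaves the numbers available for further computation.
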
  3. Is the distribution of the sample means normal? Yes, approximately. As the diagrams below show, the means of random exponentials (right diagram) are approximately normally distributed, in contrast to the right-skewed distribution of the random exponentials themselves (left diagram).
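# Two panels side by side. Margins only take effect when passed to par();
# the values here are chosen to leave room for axis labels and titles.
par(mfcol = c(1, 2), mar = c(4, 4, 2, 1))
set.seed(1)
hist(rexp(1000, lambda), main = "1000 random exponentials")
hist(sampleMean, main = "1000 means of random exponentials")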
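
A visual comparison of two histograms can be complemented with checks aimed directly at normality. The sketch below overlays the normal curve predicted by the Central Limit Theorem on the histogram of means, draws a normal Q-Q plot, and counts how many means fall within 1.96 standard errors of the theoretical mean (about 95% would be expected under normality). It regenerates its own data, and all names ending in _qq are hypothetical.

```r
# Normality checks for the sample means; names are hypothetical.
set.seed(1)
lambda_qq = 0.2
n_qq = 40
means_qq = replicate(1000, mean(rexp(n_qq, lambda_qq)))

mu_qq = 1 / lambda_qq                  # theoretical mean of the sample mean
se_qq = (1 / lambda_qq) / sqrt(n_qq)   # theoretical sd of the sample mean

old_par = par(mfcol = c(1, 2))         # two panels; previous settings are saved
hist(means_qq, freq = FALSE, main = "Means with CLT normal curve",
     xlab = "sample mean")
curve(dnorm(x, mean = mu_qq, sd = se_qq), add = TRUE, col = "red", lwd = 2)
qqnorm(means_qq, main = "Normal Q-Q plot of means")
qqline(means_qq, col = "red")
par(old_par)                           # restore the previous settings

# Share of means within 1.96 standard errors of the theoretical mean
coverage_qq = mean(abs(means_qq - mu_qq) < 1.96 * se_qq)
print(coverage_qq)
```

Points lying close to the red line in the Q-Q plot, together with a coverage near 0.95, would support the approximate normality seen in the histogram. Some curvature in the upper tail is expected, because means of 40 exponentials remain slightly right-skewed.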