Investigate the Exponential Distribution and Verify the Central Limit Theorem

author: liuyubobobo
date: Sunday, March 22, 2015

Overview

In this project, we investigate the exponential distribution in R and compare its behavior with the predictions of the Central Limit Theorem. The exponential distribution is simulated with rexp(n, lambda), where lambda is the rate parameter. The mean of the exponential distribution is 1/lambda, and its standard deviation is also 1/lambda. We set lambda = 0.2 for all of the simulations.
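
As a quick sanity check of the two facts above, the following minimal sketch draws one large exponential sample and compares its mean and standard deviation with 1/lambda. The sample size of 100000, the seed, and the name x are illustrative assumptions and are not part of the original analysis.

```r
# Illustrative check only: the seed and sample size are assumed values
set.seed(1)
lambda <- 0.2
x <- rexp(100000, lambda)   # 100000 draws with rate lambda

mean(x)   # should be close to 1 / lambda = 5
sd(x)     # should also be close to 1 / lambda = 5
```

With this many draws, both values should land within a small margin of 5; the exact digits depend on the seed.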

Simulations

We run 1000 simulations. In each simulation, we draw 40 exponentials with lambda = 0.2 and compute their mean; we also compute the variance of a separate draw of 40 exponentials. For further investigation, we store the 1000 means and the 1000 variances in two separate vectors, mns and vars.

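set.seed(19851216)   # fix the seed so the results are reproducible
num <- 40            # number of exponentials per simulation
lambda <- 0.2        # rate parameter
mns <- c()           # sample means, one per simulation
vars <- c()          # sample variances, one per simulation
for(i in 1:1000){
    # each call to rexp() draws a fresh sample of size num
    mns <- c( mns , mean( rexp( num, lambda ) ) )
    vars <- c( vars , var( rexp( num, lambda ) ) )
}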
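
The loop above can also be written without growing vectors. The following is a minimal sketch of one alternative, assuming we are willing to compute the mean and the variance from the same sample of 40 in each simulation. The names sims, mns2, and vars2 are hypothetical, and because the random numbers are consumed in a different order, the results will not match the values reported in this document digit for digit.

```r
# Alternative sketch, not the code used to produce the results in this report
set.seed(19851216)
num <- 40
lambda <- 0.2

# replicate() repeats the expression 1000 times and binds the
# results into a 2 x 1000 matrix with rows "mean" and "var"
sims <- replicate(1000, {
    x <- rexp(num, lambda)           # one sample of 40 exponentials
    c(mean = mean(x), var = var(x))  # summary statistics of that sample
})

mns2 <- sims["mean", ]   # 1000 sample means
vars2 <- sims["var", ]   # 1000 sample variances

mean(mns2)    # should be near 1 / lambda = 5
mean(vars2)   # should be near (1 / lambda)^2 = 25
```

Preallocating or using replicate() avoids copying the result vector on every iteration, which matters little for 1000 simulations but becomes noticeable for much larger runs.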

Sample Mean versus Theoretical Mean

First, we calculate the mean of the 1000 sample means.

sampleMean <- mean( mns )
sampleMean
## [1] 4.977941

We know that the theoretical mean is 1/lambda.

theoreticalMean <- 1 / lambda
theoreticalMean
## [1] 5

Comparing these two means, we can see that they are very close: 4.978 versus 5.

To see this more clearly, we plot the histogram of the sample means and mark both the mean of the sample means and the theoretical mean on the same plot.

hist( mns , main="Distribution of Sample Means" , xlab = "Sample Means")
abline( v = sampleMean , col = "blue" )
abline( v = theoreticalMean , col = "red" )

In the figure, the blue line (the mean of the sample means) lies close to 5, which is marked by the red line for the theoretical mean. We can also see that the sample means are approximately normally distributed.

Sample Variance versus Theoretical Variance

First, we calculate the average of the 1000 sample variances.

sampleVar <- mean( vars )
sampleVar
## [1] 25.40612

We know that the theoretical variance is (1/lambda)^2.

theoreticalVar <- ( 1 / lambda )^2
theoreticalVar
## [1] 25

Comparing these two variances, we can see that they are close: 25.406 versus 25.

We can plot the histogram of the sample variances and highlight the theoretical variance in the same plot.

hist( vars , main="Distribution of Sample Variances" , xlab = "Sample Variances")
abline( v = sampleVar , col = "blue" )
abline( v = theoreticalVar , col = "red" )

In the figure, the blue line (the average of the sample variances) lies close to 25, which is marked by the red line for the theoretical variance.

Distribution

To see the distribution of the sample means, we plot their histogram with freq = FALSE, so that the histogram represents probability densities. On the same plot, we draw the probability density of a normal distribution whose mean is 1/lambda and whose standard deviation equals sd(mns).

hist( mns , main="Distribution of Sample Means" , xlab = "Sample Means" , freq = FALSE)
x <- seq( 2 , 8 , by = 0.01 )
y <- dnorm( x , mean=1/lambda , sd = sd(mns) )
lines( x , y )

From the figure above, we can see that the histogram closely follows the normal density curve. The distribution of the sample means is approximately normal, as the Central Limit Theorem predicts.
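
The visual comparison can be complemented with two numeric checks. The sketch below is self-contained and rests on assumptions that go beyond the original analysis: it regenerates the sample means with replicate(), and it uses the standard result that the mean of num independent draws has standard deviation (1/lambda)/sqrt(num). The names means, theoreticalSE, and within are hypothetical.

```r
# Illustrative numeric check of the normal approximation (assumed setup)
set.seed(19851216)
num <- 40
lambda <- 0.2

# 1000 means, each computed from a fresh sample of 40 exponentials
means <- replicate(1000, mean(rexp(num, lambda)))

# standard deviation of the sample mean predicted by theory
theoreticalSE <- (1 / lambda) / sqrt(num)   # about 0.79

sd(means)       # empirical spread of the sample means
theoreticalSE   # should be close to the value above

# share of sample means within 1.96 standard errors of 1 / lambda;
# for an exactly normal distribution this share would be about 0.95
within <- mean(abs(means - 1 / lambda) <= 1.96 * theoreticalSE)
within
```

If the normal approximation holds, the empirical standard deviation should sit near 0.79 and roughly 95% of the sample means should fall inside the interval. Small deviations are expected because averages of only 40 exponentials still carry some of the right skew of the underlying distribution.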