Overview:

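In this project we investigate the exponential distribution, simulated in R with rexp(n, lambda), and compare the distribution of its sample means with what the Central Limit Theorem predicts. Here lambda is the rate parameter; the mean and the standard deviation of the distribution are both 1/lambda.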
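
For reference, the comparison can be written out as a short derivation. This is a sketch that assumes independent draws and uses the values lambda = 0.2 and n = 40 from the simulations in this report; the symbols X_i and \bar{X}_n are introduced here only for illustration.

```latex
\begin{aligned}
X_1, \dots, X_n &\overset{\text{iid}}{\sim} \mathrm{Exp}(\lambda),
  \qquad \mathbb{E}[X_i] = \frac{1}{\lambda},
  \qquad \operatorname{Var}(X_i) = \frac{1}{\lambda^{2}} \\[4pt]
\bar{X}_n &= \frac{1}{n}\sum_{i=1}^{n} X_i,
  \qquad \mathbb{E}[\bar{X}_n] = \frac{1}{\lambda},
  \qquad \operatorname{Var}(\bar{X}_n) = \frac{1}{\lambda^{2} n} \\[4pt]
\frac{\sqrt{n}\,\bigl(\bar{X}_n - 1/\lambda\bigr)}{1/\lambda}
  &\xrightarrow{\;d\;} \mathcal{N}(0, 1)
  \qquad \text{(Central Limit Theorem)} \\[4pt]
\lambda = 0.2,\; n = 40:\quad
\bar{X}_{40} &\approx \mathcal{N}\!\left(5,\ \tfrac{25}{40}\right)
  = \mathcal{N}(5,\ 0.625)
\end{aligned}
```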

Simulations:

  1. Show the sample mean and compare it to the theoretical mean of the distribution.
  2. Show how variable the sample is (via variance) and compare it to the theoretical variance of the distribution.
  3. Show that the distribution is approximately normal.

Sample Mean versus Theoretical Mean:

The histogram shows the simulated distribution of the mean of 40 exponentials over 1000 simulations, and the red line marks the theoretical mean, 1/lambda = 5.

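lambda = 0.2                  # rate parameter
TheoreticalMean = 1/lambda    # mean of the exponential distribution
TheoreticalSD = 1/lambda      # standard deviation of the exponential distribution

# 1000 simulated means of 40 exponentials each
mns = NULL
for (i in 1 : 1000) mns = c(mns, mean(rexp(40,lambda)))
hist(mns,col="green",freq=FALSE,main="Distribution of sample means",xlab="Sample mean")
abline(v=TheoreticalMean,col = "red", lwd = 2)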
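
As a numeric complement to the histogram, the comparison can also be printed directly. The following is a minimal sketch, not part of the original analysis: the seed and the names n_obs, n_sims and sim_means are assumptions chosen for illustration, and replicate() stands in for the explicit loop.

```r
# Hypothetical seed, used only so that the run is reproducible
set.seed(1)

lambda <- 0.2
n_obs  <- 40     # exponentials per sample, as in the simulation above
n_sims <- 1000   # number of simulated samples

# replicate() returns a numeric vector with one sample mean per simulation
sim_means <- replicate(n_sims, mean(rexp(n_obs, lambda)))

mean_of_means    <- mean(sim_means)
theoretical_mean <- 1 / lambda

cat("Mean of sample means:", round(mean_of_means, 3), "\n")
cat("Theoretical mean:    ", theoretical_mean, "\n")
```

The two printed values should be close; the gap shrinks as n_sims grows.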

Sample Variance versus Theoretical Variance:

The histogram shows the distribution of the sample variance of 40 exponentials over 1000 simulations, and the red line marks the theoretical variance, 1/lambda^2 = 25.

# 1000 simulated variances of 40 exponentials each
vars = NULL
for (i in 1 : 1000) vars = c(vars, var(rexp(40,lambda)))
hist(vars,col="grey",freq=FALSE,main="Distribution of sample variances",xlab="Sample variance")
abline(v=TheoreticalSD^2,col = "red", lwd = 2)
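
The variability comparison can be checked numerically as well. This sketch is an illustration under stated assumptions: it works on the variance scale (var() rather than sd()), and the seed and the names n_obs, n_sims and sim_vars are hypothetical.

```r
# Hypothetical seed, used only so that the run is reproducible
set.seed(2)

lambda <- 0.2
n_obs  <- 40     # exponentials per sample
n_sims <- 1000   # number of simulated samples

# One sample variance per simulated sample of n_obs exponentials
sim_vars <- replicate(n_sims, var(rexp(n_obs, lambda)))

mean_of_vars    <- mean(sim_vars)
theoretical_var <- 1 / lambda^2   # variance of the exponential distribution

cat("Mean of sample variances:", round(mean_of_vars, 3), "\n")
cat("Theoretical variance:    ", theoretical_var, "\n")
```

Individual sample variances scatter widely around the theoretical value because each one is based on only 40 observations; their average over many simulations is what should land near it.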

Distribution:

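We first look at the distribution of a large collection of random exponentials, then at the distribution of a large collection of averages of exponentials, and compare each with the normal distribution using the Shapiro-Wilk test and normal Q-Q plots.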

par( mfrow = c(1,2) )
xsim = rexp(1000,lambda)
shapiro.test(xsim)
## 
##  Shapiro-Wilk normality test
## 
## data:  xsim
## W = 0.81696, p-value < 2.2e-16
mns = NULL
for (i in 1 : 100) mns = c(mns, mean(rexp(1000,lambda)))
shapiro.test(mns)
## 
##  Shapiro-Wilk normality test
## 
## data:  mns
## W = 0.98764, p-value = 0.4821
qqnorm(xsim)
qqline(xsim,col='red')

qqnorm(mns)
qqline(mns,col='red',lwd=2,lty=2)

For the large collection of random exponentials, the Shapiro-Wilk p-value (< 2.2e-16) is far below 0.05, so normality is rejected, and the Q-Q plot confirms that. For the collection of averages of exponentials, on the other hand, the p-value (0.4821) is above 0.05, so the test does not reject normality, and the Q-Q plot also indicates that the averages are approximately normally distributed, as the Central Limit Theorem predicts.
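
Another way to look at the same conclusion is to overlay the normal density implied by the Central Limit Theorem on a histogram of simulated averages. The sketch below is illustrative only: it assumes samples of 40 exponentials (rather than the 1000 used in the Shapiro-Wilk step), and the seed and the names sim_means, clt_mean and clt_sd are hypothetical.

```r
# Hypothetical seed, used only so that the run is reproducible
set.seed(3)

lambda <- 0.2
n_obs  <- 40     # exponentials per average (an assumption for this sketch)
n_sims <- 1000   # number of simulated averages

sim_means <- replicate(n_sims, mean(rexp(n_obs, lambda)))

# Mean and standard deviation of the limiting normal distribution
clt_mean <- 1 / lambda
clt_sd   <- (1 / lambda) / sqrt(n_obs)

# Density-scaled histogram so the normal curve is on the same scale
hist(sim_means, freq = FALSE, breaks = 30, col = "grey",
     main = "Averages of 40 exponentials", xlab = "Sample mean")
curve(dnorm(x, mean = clt_mean, sd = clt_sd),
      add = TRUE, col = "red", lwd = 2)
```

If the averages are approximately normal, the histogram bars should follow the red curve closely.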