Overview

This project uses simulation in R to compare the exponential distribution with the predictions of the Central Limit Theorem. The exponential distribution is simulated with rexp(n, lambda), where lambda is the rate parameter. The mean of the exponential distribution is 1/lambda, and its standard deviation is also 1/lambda. All simulations use lambda = 0.2. The distribution of averages of 40 exponentials is examined across 1,000 simulations.
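As a quick sanity check of the stated mean and standard deviation, the sketch below draws a large exponential sample and compares both statistics with 1/lambda. The seed, the sample size of 100,000, and the name `draws` are arbitrary choices made for this illustration, not part of the original analysis.

```r
# Illustrative check only: seed and sample size are arbitrary assumptions.
set.seed(1)
lambda <- 0.2
draws <- rexp(100000, rate = lambda)

# Both values should land near 1/lambda = 5.
c(mean = mean(draws), sd = sd(draws))
```

With this many draws, both statistics should fall within a few hundredths of 5, although the exact values depend on the seed.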

The process is as follows:

  1. Show the sample mean and compare it to the theoretical mean of the distribution.

  2. Show how variable the sample is (via variance) and compare it to the theoretical variance of the distribution.

  3. Show that the distribution of the sample means is approximately normal.

Simulations

The simulations use the following settings:

set.seed(1234)
n <- 40
lambda <- 0.2
mns <- NULL

Sample Mean versus Theoretical Mean

Theoretical mean: \[\lambda^{-1}\]

tmean <- 1/lambda
print(tmean)
## [1] 5

Sample Mean:

# Build the distribution of 1000 averages of 40 exponentials (sample)
for (i in 1:1000) mns <- c(mns, mean(rexp(n, rate = lambda)))
smean <- mean(mns)
print(smean)
## [1] 4.974239

Histogram of Means:

hist(mns, breaks = 50, col="gray", main="Histogram of Means", xlab="Means")
abline(v = tmean, col = "red", lwd = 2)
abline(v = smean, col = "blue", lwd = 2)

The sample mean (4.974, blue line) is very close to the theoretical mean (5, red line): the two differ by about 0.026, or roughly 0.5%.
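One way to judge whether this gap is small is to compare it with the Monte Carlo standard error of the average of 1,000 simulated means. The sketch below does that arithmetic using the values reported above. The names `nsim`, `se_smean`, and `z` are introduced here for illustration.

```r
# Illustrative arithmetic using the values reported in this section.
lambda <- 0.2
n <- 40
nsim <- 1000               # number of simulated averages
tmean <- 1 / lambda        # theoretical mean, 5
smean <- 4.974239          # sample mean reported above

# Standard error of the average of nsim means, each a mean of n exponentials.
se_smean <- (1 / lambda) / sqrt(n) / sqrt(nsim)

# Gap between sample and theoretical mean, in standard-error units.
z <- (smean - tmean) / se_smean
print(c(se = se_smean, z = z))
```

The standard error works out to 0.025, so the observed gap is about one standard error, which is well within ordinary simulation noise.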

Sample Variance versus Theoretical Variance

Theoretical standard deviation of the mean of n exponentials: \[\frac{\lambda^{-1}}{\sqrt{n}}\]

tsd <- (1/lambda)/sqrt(n)
print(tsd)
## [1] 0.7905694

Theoretical variance (tsd^2):

tvar <- tsd^2
print(tvar)
## [1] 0.625
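The value 0.625 follows from a short derivation, assuming the 40 draws are independent and each has variance \(\lambda^{-2}\). The symbols \(X_i\) for the individual draws and \(\bar{X}\) for their average are introduced here for illustration.

```latex
\[
\operatorname{Var}(\bar{X})
  = \operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
  = \frac{1}{n^{2}}\sum_{i=1}^{n}\operatorname{Var}(X_i)
  = \frac{\lambda^{-2}}{n}
  = \frac{25}{40}
  = 0.625
\]
```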

Sample variance:

svar <- var(mns)
print(svar)
## [1] 0.5706551

The sample variance (0.571) is about 8.7% below the theoretical variance (0.625), so it is not as close to its theoretical value as the sample mean is to the theoretical mean.
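To get a rough sense of how much the sample variance moves from run to run, the sketch below repeats the whole simulation under several seeds and collects the resulting variances. The helper `sim_var`, the choice of seeds 1 to 20, and the use of `replicate` in place of the loop are assumptions made for this illustration.

```r
# Illustrative sketch: seeds 1:20 and the helper name are arbitrary choices.
lambda <- 0.2
n <- 40

# Run one full simulation (1000 averages of n exponentials) and return
# the variance of those averages.
sim_var <- function(seed) {
  set.seed(seed)
  var(replicate(1000, mean(rexp(n, rate = lambda))))
}

vars <- sapply(1:20, sim_var)
print(range(vars))   # spread of the sample variance across runs
print(mean(vars))    # should sit near the theoretical 0.625
```

If the range spans several hundredths around 0.625, a single-run value such as 0.571 is plausibly explained by sampling variability alone.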

Distribution

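The Central Limit Theorem states that the distribution of averages of iid variables with finite variance approaches a normal distribution as the sample size increases. Once the averages are standardized (centered at the population mean and scaled by the standard error), the limit is the standard normal. A Q-Q plot with an added qqline is used to check this: points lying close to the line suggest that the sample means are approximately normally distributed.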

qqnorm(mns, ylab = "Sample quantiles of the means", xlab = "Normal theoretical quantiles")
qqline(mns, col="red")
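As a complementary view, the sketch below standardizes the simulated means and overlays the standard normal density on their histogram. It regenerates the means with the same settings so that it runs on its own. The use of `replicate` and the name `z` are choices made for this illustration.

```r
# Illustrative sketch: regenerate the means, then standardize them.
set.seed(1234)
n <- 40
lambda <- 0.2
mns <- replicate(1000, mean(rexp(n, rate = lambda)))

# Center at the theoretical mean and scale by the theoretical standard error.
z <- (mns - 1 / lambda) / ((1 / lambda) / sqrt(n))

# Density-scaled histogram with the standard normal curve on top.
hist(z, breaks = 50, freq = FALSE, col = "gray",
     main = "Standardized Means", xlab = "Standardized mean")
curve(dnorm(x), add = TRUE, col = "red", lwd = 2)
```

If the normal approximation holds, the histogram should track the red curve reasonably well, with a slight right skew possibly remaining because each mean averages only 40 skewed draws.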