Part 1 of Class Project

The goal of this simulation exercise was to investigate the exponential distribution and compare its behavior with what the Central Limit Theorem predicts. Specifically, we investigated the distribution of averages of 40 iid variables randomly drawn from the exponential distribution. Our simulation parameters were lambda = 0.2 and n = 40.

## Setting parameters
n <- 40
lambda <- 0.2
mu <- 1/lambda
s <- 1/lambda
theo_var <- 1/lambda^2   ## theoretical variance; named theo_var so it does not share a name with the var() function

First Step: Show the sample mean and compare it to the theoretical mean of the distribution.

Here we draw 40 iid values from the exponential distribution with lambda = 0.2, take their mean, and repeat this 1000 times.

SampleMeans <- NULL   ## here we initialize the variable for our simulation
for (i in 1:1000) SampleMeans <- c(SampleMeans, mean(rexp(n, lambda)))
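The same simulation can be written without growing a vector inside a loop. The following is a minimal sketch of one alternative using base R's `replicate()`. The names `n_sims` and `sample_means` and the seed value are assumptions made for illustration and are not part of the original analysis, so the numbers it produces will differ from those reported here.

```r
set.seed(1)        # assumed seed, used only to make this sketch reproducible
n <- 40            # size of each sample
lambda <- 0.2      # rate of the exponential distribution
n_sims <- 1000     # hypothetical name for the number of simulated samples

# replicate() evaluates the expression n_sims times and returns a numeric vector
sample_means <- replicate(n_sims, mean(rexp(n, lambda)))

length(sample_means)   # one mean per simulated sample
mean(sample_means)     # should land near 1/lambda = 5
```

For 1000 iterations the loop is fast enough, so this is mostly a readability choice.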

Now let's check the mean.

SimMean <- mean(SampleMeans)
SimMean
## [1] 4.99492

The mean of the 1000 simulated averages is 4.9949199, which differs from the theoretical mean of 5 by only about 0.005. We know the theoretical value because the mean of an exponential distribution is \(1/\lambda\), in this case \(1/0.2=5\).

Below is a histogram of the 1000 means. You can see that they cluster around the theoretical mean of 5 and look approximately Gaussian (normally distributed about 5).

par(family = "mono")
hist(SampleMeans, 100)
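To make the comparison with the normal distribution easier to judge by eye, the histogram can be drawn on the density scale with the normal curve predicted by the Central Limit Theorem on top. This is a sketch only: the seed, the number of breaks, and the names `sample_means`, `clt_mean` and `clt_sd` are assumptions, not part of the original analysis.

```r
set.seed(1)        # assumed seed, used only to make this sketch reproducible
n <- 40
lambda <- 0.2
sample_means <- replicate(1000, mean(rexp(n, lambda)))

# The CLT predicts the sample mean is approximately normal with
# mean 1/lambda and standard deviation (1/lambda)/sqrt(n)
clt_mean <- 1 / lambda
clt_sd <- (1 / lambda) / sqrt(n)

# freq = FALSE puts the histogram on the density scale so the curve is comparable
hist(sample_means, breaks = 40, freq = FALSE,
     main = "Means of 40 exponential draws", xlab = "Sample mean")
curve(dnorm(x, mean = clt_mean, sd = clt_sd), add = TRUE, lwd = 2)
abline(v = clt_mean, lty = 2)   # dashed line at the theoretical mean
```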

Second Step: Show how variable the sample is (via variance) and compare it to the theoretical variance of the distribution.

We run our simulation again, this time computing the variance of each sample of 40 values.

SampleVar <- NULL
for (i in 1:1000) SampleVar <- c(SampleVar, var(rexp(n, lambda)))

So, what is the average variance from the 1000 iterations?

AvgVar <- mean(SampleVar)

We can see that the average variance, 24.4605226, is close to the theoretical variance of \(1/\lambda^2\), which in this case is 25 (a difference of about 2%).

Below is a histogram of the 1000 variances. Individual sample variances are widely spread, but they are distributed around the theoretical variance of 25.

hist(SampleVar, 100)
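The step above compares the variance within each sample to the population variance \(1/\lambda^2\). A related check, sketched here as an assumption about what might also be of interest, is the variance of the sample means themselves, which theory says should be \(\sigma^2/n = (1/\lambda)^2/n\). The seed and the names `sample_means`, `theo_var_of_mean` and `sim_var_of_mean` are hypothetical.

```r
set.seed(1)        # assumed seed, used only to make this sketch reproducible
n <- 40
lambda <- 0.2
sample_means <- replicate(1000, mean(rexp(n, lambda)))

# Theoretical variance of the mean of n iid draws: sigma^2 / n
theo_var_of_mean <- (1 / lambda)^2 / n    # 25 / 40
sim_var_of_mean <- var(sample_means)      # variance across the 1000 simulated means

c(simulated = sim_var_of_mean, theoretical = theo_var_of_mean)
```

The two quantities answer different questions: the average of the per-sample variances estimates the spread of individual draws, while the variance of the means estimates how much the sample mean itself fluctuates.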

Third Step: Show that the distribution is approximately normal.

First, we compare the average of 10,000 iid values drawn from the exponential distribution with the average of the means of 40 iid draws repeated 1000 times:

LargeSample <- mean(rexp(10000,lambda))
SampleMeans <- NULL
for (i in 1:1000) SampleMeans <- c(SampleMeans, mean(rexp(n, lambda)))

And our values are:

Average from LargeSample = 4.9931831

Average from Many Small Samples = 4.9925269

And we can see that they are almost identical values.
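One way to see why the two averages agree so closely: when every sample has the same size, the mean of the sample means equals the mean of all the pooled draws, so both numbers are simply averages over many exponential values. The sketch below illustrates this and computes the theoretical standard error of each estimate. The seed and the names `draws`, `row_means`, `se_large` and `se_small` are assumptions for illustration.

```r
set.seed(1)        # assumed seed, used only to make this sketch reproducible
n <- 40
lambda <- 0.2

# 1000 samples of size n, stored one sample per row
draws <- matrix(rexp(1000 * n, lambda), nrow = 1000)
row_means <- rowMeans(draws)

# With equal-sized samples, the mean of the means equals the pooled mean
all.equal(mean(row_means), mean(draws))

# Theoretical standard errors: sigma / sqrt(number of draws averaged)
se_large <- (1 / lambda) / sqrt(10000)      # single sample of 10,000 draws
se_small <- (1 / lambda) / sqrt(1000 * n)   # 1000 samples of 40 draws, pooled
c(se_large = se_large, se_small = se_small)
```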

Next, let’s standardize the averages of the iid values to show that their distribution approaches the standard normal. Each average is centered by the mean \(\mu\) and scaled by the standard error \(\sigma/\sqrt{n}\) (remember that both the mean and the standard deviation are \(1/\lambda\) for an exponential distribution).

SampleMeans <- NULL
lambda <- 0.2
mu <- 1/lambda
sigma <- 1/lambda
n <- 40
for (i in 1:1000) SampleMeans <- c(SampleMeans, mean(rexp(n, lambda)))
Standard <- (SampleMeans - mu)/(sigma/sqrt(n))
hist(Standard, 200)

So the mean of Standard is 0.017, which is very close to 0, and the variance is 1.001, which is very close to 1. These are the values expected for a standard normal distribution.
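Matching the mean and variance does not by itself show normality, so a shape check can help. The sketch below draws a normal Q-Q plot and computes the share of standardized means within 1.96 of zero, which would be about 0.95 for a standard normal. The seed and the names `sample_means`, `standardized` and `coverage` are assumptions and are not part of the original analysis.

```r
set.seed(1)        # assumed seed, used only to make this sketch reproducible
n <- 40
lambda <- 0.2
mu <- 1 / lambda
sigma <- 1 / lambda

sample_means <- replicate(1000, mean(rexp(n, lambda)))
standardized <- (sample_means - mu) / (sigma / sqrt(n))

# Points close to the reference line indicate an approximately normal shape
qqnorm(standardized)
qqline(standardized)

# Proportion of standardized means within +/- 1.96
coverage <- mean(abs(standardized) <= 1.96)
coverage
```

With n = 40 some right skew from the exponential distribution may still be visible in the upper tail of the Q-Q plot, so the agreement should be read as approximate.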