Overview

In this simulation, I demonstrate the behavior of an exponential distribution compared with the distribution of averages of 40 exponentials. I compare the means and the variability (via standard deviation), then show histograms comparing the two distributions.

Simulation

Define the number of simulations (nosim), the size (n) of each simulation, and lambda, then calculate the theoretical mean (mean) and standard deviation (sd) of the exponential distribution for that lambda; both equal 1/lambda. Run a single simulation of size 1000, and run 1000 simulations of size 40, taking the mean of each. This gives two distributions of size 1000 to compare.

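nosim <- 1000        # number of simulations
n <- 40              # size of each simulation
lambda <- 0.2        # rate of the exponential distribution
mean <- 1/lambda     # theoretical mean (reuses the name of mean(); calls to mean() still work)
sd <- 1/lambda       # theoretical standard deviation (equal to the mean for an exponential)

# 1000 individual draws from the exponential distribution
expDist <- rexp(nosim,lambda)

# 1000 averages, each taken over a fresh sample of 40 exponentials
avgDist <- NULL
for (i in 1:nosim) {
  avgDist <- c(avgDist,mean(rexp(n,lambda)))
}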
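
The loop above grows avgDist one element at a time. As a minimal sketch of an alternative, assuming the same nosim, n and lambda, all draws can be generated at once and averaged by row. The seed and the names draws and avgDistVec are illustrative only and are not part of the original analysis; the individual numbers will differ from the loop's because the draws land in a different order, but the distribution of the averages is the same.

```r
set.seed(2024)   # hypothetical seed, used only to make this sketch repeatable

nosim  <- 1000
n      <- 40
lambda <- 0.2

# Draw all nosim * n exponentials in one call; each row is one simulated sample.
draws <- matrix(rexp(nosim * n, rate = lambda), nrow = nosim, ncol = n)

# One average per row: 1000 means of 40 exponentials each.
avgDistVec <- rowMeans(draws)

length(avgDistVec)
```

This avoids repeatedly copying a growing vector, which matters little at 1000 simulations but becomes noticeable for much larger runs.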

Sample Mean versus Theoretical Mean

Sample mean

mean(avgDist)
## [1] 5.012434

Theoretical mean

mean
## [1] 5

Comparison

data <- c(round(mean(avgDist),3),mean)
bp <- barplot(data,
              names.arg=c("Sample Mean","Theoretical Mean"),
              main="Sample/Theoretical Means")
text(x=bp,y=data+0.5,labels=as.character(data),xpd=TRUE)

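The numbers above are very similar because the sample mean is an unbiased estimator of the population mean: each average of 40 exponentials is centered on the theoretical mean 1/lambda = 5. By the Law of Large Numbers, the mean of the 1000 sample means converges to that population/theoretical mean as the number of simulations grows. (The Central Limit Theorem describes the shape of the distribution of the means, which is covered in the Distribution section.)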
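
The convergence of a sample mean toward the theoretical mean, usually described by the law of large numbers, can be watched directly with a running average. The following is a small sketch, assuming the same lambda of 0.2; the seed, the sample size of 10000 and the name runningMean are illustrative choices rather than part of the original analysis.

```r
set.seed(1)      # hypothetical seed for repeatability

lambda <- 0.2
x <- rexp(10000, rate = lambda)

# Running mean after 1, 2, ..., 10000 draws.
runningMean <- cumsum(x) / seq_along(x)

# Running mean at a few checkpoints; it should settle near 1/lambda = 5.
runningMean[c(10, 100, 1000, 10000)]

# Absolute distance from the theoretical mean after all draws.
abs(runningMean[10000] - 1 / lambda)
```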

Sample Variance versus Theoretical Variance

Sample Variance (as the standard deviation of the 1000 means)

sd(avgDist)
## [1] 0.7956717

Theoretical Variance (as the standard deviation of a single exponential)

sd
## [1] 5

Comparison

data <- c(round(sd(avgDist),3),sd)
bp <- barplot(data,
              names.arg=c("Sample SD","Theoretical SD"),
              main="Sample/Theoretical Standard Deviation")
text(x=bp,y=data+0.5,labels=as.character(data),xpd=TRUE)

The values above are so different because sd(avgDist) is the standard deviation of the means, whereas the theoretical value of 5 is the standard deviation of individual exponential values. The matching theoretical value for a mean of 40 exponentials is (1/lambda)/sqrt(n) = 5/sqrt(40), about 0.79, which is close to the 0.796 observed. The distribution of means is much less variable (and approximately normal) than the distribution of individual exponentials (which is not normal).
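
A like-for-like comparison uses the standard deviation of the mean rather than of a single exponential. The sketch below assumes the same lambda, n and nosim as the simulation; the seed and the names sdSingle, sdOfMean and simMeans are hypothetical and chosen only for this example.

```r
set.seed(7)      # hypothetical seed for repeatability

lambda <- 0.2
n      <- 40
nosim  <- 1000

sdSingle <- 1 / lambda            # sd of one exponential draw
sdOfMean <- sdSingle / sqrt(n)    # sd of the mean of n draws: 5 / sqrt(40), about 0.79

# Simulate 1000 means of 40 exponentials each.
simMeans <- replicate(nosim, mean(rexp(n, rate = lambda)))

# Standard deviations side by side.
c(theoretical = sdOfMean, simulated = sd(simMeans))

# The same comparison on the variance scale (theoretical value is 25/40 = 0.625).
c(theoretical = sdOfMean^2, simulated = var(simMeans))
```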

Distribution

Distribution of a large collection of random exponentials

hist(expDist,main="Random Exponentials",xlab="Exponential Values")

Distribution of averages of 40 exponentials

hist(avgDist,main = "Averages of Random Exponentials",xlab="Means of 40 Exponential Values")

The first histogram above shows the exponential distribution. One can tell from its heavily right-skewed shape. This was built by taking 1000 random draws from the exponential distribution.
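
The skew visible in the histogram can also be put into a number. The sketch below uses a simple moment-based skewness estimate; the helper skewness(), the seed and the names rawDraws and simMeans are assumptions made for illustration. In theory an exponential distribution has skewness 2, and the mean of n independent draws has skewness 2/sqrt(n).

```r
set.seed(11)     # hypothetical seed for repeatability

lambda <- 0.2
n      <- 40
nosim  <- 1000

# Moment-based sample skewness: the average cubed z-score.
skewness <- function(x) {
  z <- (x - mean(x)) / sd(x)
  mean(z^3)
}

rawDraws <- rexp(nosim, rate = lambda)
simMeans <- replicate(nosim, mean(rexp(n, rate = lambda)))

# Individual exponentials are strongly right-skewed; averages of 40 much less so.
c(raw = skewness(rawDraws), means = skewness(simMeans))

# Theoretical values for comparison.
c(raw = 2, means = 2 / sqrt(n))
```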

The second histogram shows a roughly normal distribution. One can tell from its bell-like shape. This was built by taking the means of 1000 samples of 40 exponential random variables. It is roughly normal because, by the Central Limit Theorem, the distribution of the sample mean approaches a normal distribution as N increases, whatever the distribution of the underlying data, provided that distribution has finite variance.
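
One way to check the bell shape more closely is to overlay the normal density with mean 1/lambda and standard deviation (1/lambda)/sqrt(n) on a density-scaled histogram, and to count how many simulated means fall within 1.96 standard errors of the theoretical mean. This is a sketch under the same lambda, n and nosim; the seed and the names simMeans, mu, se and coverage are illustrative only.

```r
set.seed(99)     # hypothetical seed for repeatability

lambda <- 0.2
n      <- 40
nosim  <- 1000

simMeans <- replicate(nosim, mean(rexp(n, rate = lambda)))

mu <- 1 / lambda                 # theoretical mean of the averages
se <- (1 / lambda) / sqrt(n)     # theoretical standard deviation of the averages

# Histogram on the density scale so the normal curve is directly comparable.
hist(simMeans, freq = FALSE, breaks = 30,
     main = "Averages of Random Exponentials",
     xlab = "Means of 40 Exponential Values")
curve(dnorm(x, mean = mu, sd = se), add = TRUE, lwd = 2)

# Share of simulated means within 1.96 standard errors of mu;
# a value near 0.95 is what an approximately normal distribution would give.
coverage <- mean(abs(simMeans - mu) <= 1.96 * se)
coverage
```

With only 40 values per average, a slight right skew usually remains, so the fit to the curve is close but not exact.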