Investigate the exponential distribution in R and compare it with the Central Limit Theorem. Illustrate, via simulation and accompanying explanatory text, the properties of the distribution of the mean of 40 exponentials. Simulate with rexp(n, lambda), setting lambda = 0.2; the exponential distribution has mean 1/lambda and standard deviation 1/lambda. Run a thousand simulations.

  1. Show the sample mean and compare it to the theoretical mean of the distribution.
  2. Show how variable the sample is (via variance) and compare it to the theoretical variance of the distribution.
  3. Show that the distribution is approximately normal.

Required packages:

library(ggplot2)

Simulation

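# Simulate 1000 rows of 40 exponentials each (rate = lambda = 0.2)
set.seed(123)
matrixSIM <- matrix(rexp(1000 * 40, rate = 0.2), 1000, 40)
# Calculate the mean of each row (one mean per simulation)
meanMatrixSIM <- rowMeans(matrixSIM)
# Plot a histogram of the 1000 simulated means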
ggplot(data.frame(meanMatrixSIM), aes(meanMatrixSIM)) +
        geom_histogram(fill = "lightblue", 
                       color = "black", bins = 15) +
        labs(title = "Histogram of Means from the Simulation", x = "Means", y = "Frequency") +
        theme_classic() +
        theme(plot.title = element_text(color = "#666666", face = "bold")) +
        theme(axis.title = element_text(color = "#666666", face = "bold")) +
        theme(axis.text = element_text(color = "#666666"))


1. Show the sample mean and compare it to the theoretical mean of the distribution.

# Mean of the 1000 simulated sample means
mean(meanMatrixSIM)
## [1] 5.011911
# Theoretical mean of the distribution
1/0.2
## [1] 5

The sample mean (5.012) is very close to the theoretical mean (5); the two differ by about 0.012.
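One way to put a number on "close" is to express the gap in units of the standard error of the simulated average. The following is a minimal sketch that re-creates the simulation above; the names lambda, n, nsim, sd_of_mean, se_of_average and gap are illustrative and not part of the original analysis.

```r
# Re-create the simulation (same seed and dimensions as above)
set.seed(123)
lambda <- 0.2   # rate of the exponential distribution
n      <- 40    # exponentials averaged per simulation
nsim   <- 1000  # number of simulations
meanMatrixSIM <- rowMeans(matrix(rexp(nsim * n, rate = lambda), nsim, n))

# Theoretical standard deviation of one mean of n exponentials
sd_of_mean <- (1 / lambda) / sqrt(n)
# Standard error of the average of nsim such means
se_of_average <- sd_of_mean / sqrt(nsim)
# Gap between the simulated and theoretical mean, in standard-error units
gap <- mean(meanMatrixSIM) - 1 / lambda
gap / se_of_average
```

The standard error here is sqrt(0.625 / 1000) = 0.025, so the gap of about 0.012 reported above is roughly half a standard error, which is well within ordinary sampling noise.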

2. Show how variable the sample is (via variance) and compare it to the theoretical variance of the distribution.

# Variance of the 1000 simulated sample means
var(meanMatrixSIM)
## [1] 0.6088292
# Theoretical variance of the mean of 40 exponentials: (1/lambda)^2 / n
(1/0.2)^2/40
## [1] 0.625

The sample variance (0.609) is close to the theoretical variance (0.625); the two differ by about 0.016, or roughly 2.6%.
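The theoretical value comes from dividing the variance of a single exponential, (1/lambda)^2, by the number of values averaged. The sketch below illustrates that 1/n scaling by repeating the simulation for a few sample sizes; the sizes 10 and 160 and the names sample_sizes, sim_var and theo_var are assumptions added for illustration. Because the random stream is shared across the three sizes, the n = 40 row will not reproduce the exact figure above.

```r
set.seed(123)
lambda <- 0.2
nsim   <- 1000
# 40 is the sample size used above; 10 and 160 are illustrative extras
sample_sizes <- c(10, 40, 160)

# Variance of nsim simulated means for each sample size
sim_var <- sapply(sample_sizes, function(n) {
  var(rowMeans(matrix(rexp(nsim * n, rate = lambda), nsim, n)))
})
# Theoretical variance of the mean: (1/lambda)^2 / n
theo_var <- (1 / lambda)^2 / sample_sizes

data.frame(n = sample_sizes, simulated = sim_var, theoretical = theo_var)
```

Each fourfold increase in n should cut the variance of the mean by roughly a factor of four.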

3. Show that the distribution is approximately normal.

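# Plot a histogram and a kernel density curve
# (after_stat(density) replaces the deprecated ..density.. notation; needs ggplot2 >= 3.3.0)
ggplot(data.frame(meanMatrixSIM), aes(meanMatrixSIM)) +
        geom_histogram(aes(y = after_stat(density)), 
                       fill = "lightblue", color = "black", bins = 15) +
        geom_density(colour = "red") +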
        labs(title = "Density plot", x = "Means", y = "Density") +
        theme_classic() +
        theme(plot.title = element_text(color = "#666666", face = "bold")) +
        theme(axis.title = element_text(color = "#666666", face = "bold")) +
        theme(axis.text = element_text(color = "#666666"))

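The histogram and its kernel density estimate (red) are roughly bell-shaped and centered near 5, which is consistent with an approximately normal distribution.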
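The red curve in the plot is a kernel density estimate of the simulated means, not a normal curve. A more direct comparison overlays the normal density that the Central Limit Theorem predicts. The sketch below is one way to do that, assuming ggplot2 3.3.0 or later for after_stat(); the names mu, sigma and p_normal and the blue colour are illustrative choices.

```r
library(ggplot2)

# Re-create the simulated means (same seed and dimensions as above)
set.seed(123)
meanMatrixSIM <- rowMeans(matrix(rexp(1000 * 40, rate = 0.2), 1000, 40))

mu    <- 1 / 0.2               # theoretical mean
sigma <- (1 / 0.2) / sqrt(40)  # theoretical sd of the mean of 40 exponentials

p_normal <- ggplot(data.frame(meanMatrixSIM), aes(meanMatrixSIM)) +
        geom_histogram(aes(y = after_stat(density)),
                       fill = "lightblue", color = "black", bins = 15) +
        # Normal density with the theoretical mean and standard deviation
        stat_function(fun = dnorm, args = list(mean = mu, sd = sigma),
                      colour = "blue") +
        labs(title = "Simulated means vs. theoretical normal density",
             x = "Means", y = "Density") +
        theme_classic()
p_normal
```

If the approximation holds, the blue curve should track the tops of the histogram bars, with some residual right skew inherited from the exponential distribution.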

# Draw a normal Q-Q plot
ggplot(data.frame(meanMatrixSIM), aes(sample = meanMatrixSIM)) +
        stat_qq() +
        labs(title = "Q-Q Plot", x = "Theoretical Quantiles", y = "Sample Quantiles") +
        theme_classic() +
        theme(plot.title = element_text(color = "#666666", face = "bold")) +
        theme(axis.title = element_text(color = "#666666", face = "bold")) +
        theme(axis.text = element_text(color = "#666666"))

The sample quantiles closely follow the theoretical normal quantiles, which supports approximate normality.
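A reference line makes departures from normality easier to judge by eye. The sketch below adds one with stat_qq_line(), which is available in ggplot2 3.0.0 and later and by default draws the line through the first and third quartiles; the name p_qq and the red colour are illustrative.

```r
library(ggplot2)

# Re-create the simulated means (same seed and dimensions as above)
set.seed(123)
meanMatrixSIM <- rowMeans(matrix(rexp(1000 * 40, rate = 0.2), 1000, 40))

p_qq <- ggplot(data.frame(meanMatrixSIM), aes(sample = meanMatrixSIM)) +
        stat_qq() +
        # Reference line through the first and third quartiles
        stat_qq_line(colour = "red") +
        labs(title = "Q-Q Plot with reference line",
             x = "Theoretical Quantiles", y = "Sample Quantiles") +
        theme_classic()
p_qq
```

Points lying along the line indicate agreement with the normal distribution; points that curve away from it in the tails indicate skewness that remains at n = 40.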