Simulation of the Exponential Distribution and Comparison with the Central Limit Theorem

Author : Ramamoorthy Vanamamalai Nanguneri

Overview:

This report illustrates, via simulation and associated explanatory text, the properties of the distribution of the mean of 40 exponentials.

 1. Show the sample mean and compare it to the theoretical mean of the distribution.
 2. Show how variable the sample is (via variance) and compare it to the theoretical variance of the distribution.
 3. Show that the distribution is approximately normal.

Simulations

Initial Setup

library(knitr)
opts_chunk$set(echo = TRUE, results = "markup", warning = TRUE, cache = TRUE, tidy = TRUE)
set.seed(2297)
lambda_var <- 0.2

Simulating Exponential Distributions and Exploring

Single samples of exponential distributions with n = 40 and n = 1000

single_40 <- rexp(n = 40, lambda_var)
single_1000 <- rexp(n = 1000, lambda_var)
summary(single_40)
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
##  0.01227  1.36200  3.06100  4.45400  5.46200 30.21000
summary(single_1000)
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
##  0.00187  1.49900  3.61900  5.04200  7.03800 46.31000
We may observe that as the sample size increases, the sample mean moves closer to 5 (4.454 for n = 40 versus 5.042 for n = 1000).
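
The drift of the sample mean toward 5 can be made more visible by tracking a running mean over a longer sequence of draws. The following is a minimal sketch, not part of the original analysis: the names lambda_demo, draws and running_mean, the seed, and the choice of 10000 draws are all assumptions made for illustration.

```r
# Illustrative sketch: the seed and the *_demo names are assumptions, not from the report.
# Note that calling set.seed() here resets the random number stream.
set.seed(1)
lambda_demo <- 0.2

# Draw a long sequence of exponentials and compute the mean of the first k draws for every k
draws <- rexp(n = 10000, rate = lambda_demo)
running_mean <- cumsum(draws) / seq_along(draws)

# Running mean after 40, 1000 and 10000 draws; it should settle near 1 / lambda_demo = 5
running_mean[c(40, 1000, 10000)]
```

With more draws the running mean fluctuates less around 1 / lambda, which is the behavior the two single samples above hint at.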

1. Show the sample mean and compare it to the theoretical mean of the distribution.

# Preallocate the result vector, then store the mean of each sample of 40 exponentials
mean_1000 <- numeric(1000)
for (i in 1:1000) {
    mean_1000[i] <- mean(rexp(n = 40, lambda_var))
}
par(mfrow = c(1, 2), mar = c(6, 4, 4, 2))
hist(mean_1000, main = "1000 samples of n = 40", xlab = "Means", col = "green")
boxplot(mean_1000, main = "1000 samples of n = 40", xlab = "Means", horizontal = TRUE)

summary(mean_1000)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   2.748   4.481   4.996   5.025   5.542   7.898

The 1000 sample means are centered at 5.025, closer to 5 than the mean of either single sample above. The theoretical center is 1 / lambda:
1/lambda_var
## [1] 5
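
As an alternative to the loop, the same kind of simulation can be sketched in vectorized form by drawing all values at once and averaging by row. This is only an illustration under stated assumptions: the names lambda_demo, sim_matrix and means_demo are hypothetical, and because the draws are assigned to samples in a different order, the numbers will not reproduce the output above exactly.

```r
# Illustrative sketch: names ending in _demo are assumptions, not from the report
set.seed(2297)
lambda_demo <- 0.2

# 1000 rows (samples) by 40 columns (observations per sample)
sim_matrix <- matrix(rexp(n = 1000 * 40, rate = lambda_demo), nrow = 1000, ncol = 40)

# One mean per simulated sample
means_demo <- rowMeans(sim_matrix)

# Compare the center of the simulated means with the theoretical mean 1 / lambda
c(simulated = mean(means_demo), theoretical = 1 / lambda_demo)
```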

2. Show how variable the sample is (via variance) and compare it to the theoretical variance of the distribution.

sd(mean_1000)
## [1] 0.7933147
(1/lambda_var)/sqrt(40)
## [1] 0.7905694
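The values above are standard deviations: the simulated and theoretical standard deviations of the sample mean (1000 samples of n = 40) agree at about 0.79. Squaring them gives variances of about 0.629 (simulated) and 0.625 (theoretical, (1 / lambda)^2 / 40).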
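
Since the question is posed in terms of variance, the comparison can also be made directly on the variance scale. The sketch below is an illustration rather than the original analysis: lambda_demo, n_demo and means_demo are hypothetical names, and the simulated means are regenerated here, so the simulated value will differ slightly from the figures above.

```r
# Illustrative sketch: names ending in _demo are assumptions, not from the report
set.seed(2297)
lambda_demo <- 0.2
n_demo <- 40

# 1000 means, each from a sample of n_demo exponentials
means_demo <- replicate(1000, mean(rexp(n = n_demo, rate = lambda_demo)))

# Variance of the sample mean: simulated value versus the theoretical (1 / lambda)^2 / n
c(simulated = var(means_demo), theoretical = (1 / lambda_demo)^2 / n_demo)
```

The theoretical value follows from the variance of a single exponential, (1 / lambda)^2 = 25, divided by the sample size of 40.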

3. Show that the distribution is approximately normal.

par(mfrow = c(1, 1), mar = c(6, 4, 4, 2))
qqnorm(mean_1000)
qqline(mean_1000, col = "green", lwd = 3)
legend("topleft", "Normal", lwd = 3, col = "green", inset = c(0.1, 0.1))

As shown above, the simulated distribution falls almost exactly on the theoretical normal line in the quantile-quantile (Q-Q) plot, although points at the extreme lower and upper ends fall above the line, which indicates a slight right skew.

With narrower bins, the histogram of the means also looks approximately normal:
mu <- mean(mean_1000)
s <- sd(mean_1000)
hist(mean_1000, breaks = 40, xlab = "Means", main = "Means of 1000 samples of n = 40", 
    col = "blue")
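
One way to make the visual comparison with the normal distribution more explicit is to draw the histogram on the density scale and overlay a normal curve with the same mean and standard deviation. This is a hedged sketch, not the original figure: means_demo, mu_demo and s_demo are hypothetical names, the means are regenerated for the example, and the colors and title are arbitrary choices.

```r
# Illustrative sketch: names ending in _demo are assumptions, not from the report
set.seed(2297)
lambda_demo <- 0.2
means_demo <- replicate(1000, mean(rexp(n = 40, rate = lambda_demo)))

mu_demo <- mean(means_demo)
s_demo <- sd(means_demo)

# freq = FALSE puts the histogram on the density scale so the curve is comparable
hist(means_demo, breaks = 40, freq = FALSE, xlab = "Means",
     main = "Means of 1000 samples of n = 40", col = "lightblue")

# Normal density with the simulated mean and standard deviation
curve(dnorm(x, mean = mu_demo, sd = s_demo), add = TRUE, lwd = 2, col = "red")
```

If the sample means are approximately normal, the bars should follow the red curve closely, with a somewhat longer tail on the right.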

Peer Assignment Part 1 is Concluded