Synopsis

In this project, we will simulate the exponential distribution in R and compare it with the Central Limit Theorem.

The exponential distribution can be simulated in R with rexp(n, lambda) where lambda is the rate parameter. The mean of an exponential distribution is 1/lambda and the standard deviation is also 1/lambda. Set lambda = 0.2 for all of the simulations. You will investigate the distribution of averages of 40 exponentials. Note that you will need to do a thousand simulations.

Tasks for this project

  1. Show the sample mean and compare it to the theoretical mean of the distribution.
  2. Show how variable the sample is (via variance) and compare it to the theoretical variance of the distribution.
  3. Show that the distribution is approximately normal.

Loading ggplot2 library

library("ggplot2")

Calculating the mean of simulated exponentials

#set seed for reproducibility
set.seed(54)

#set lambda to 0.2
lambda <- 0.2

#40 samples
n <- 40

#1000 simulations
simulations <- 1000

Running Simulations

#simulation
simulated_Exp <- replicate(simulations, rexp(n, lambda))

#mean of simulated exponentials
mean_Exp <- apply(simulated_Exp, 2, mean)

Task 1: Sample Mean versus Theoretical Mean

Show the sample mean and compare it to the theoretical mean of the distribution.

#sample mean
Samp_mean <- mean(mean_Exp)
Samp_mean
## [1] 4.969918
#theoretical mean
theor_mean <- 1/lambda
theor_mean
## [1] 5
#plot showing the exponential function simulations of sample means
hist(mean_Exp, xlab = "Means", main = "Exponential Function Simulations of Sample Means", col = "yellow")

abline(v = Samp_mean, col = "red")
abline(v = theor_mean, col = "blue")

The Sample mean is 4.9699178 while the theoretical mean is 5. The center of the distribution of averages for 40 exponentials is very close to the theoretical center of the distribution.

Task 2: Sample Variance versus Theoretical Variance

Show how variable the sample is (via variance) and compare it to the theoretical variance of the distribution.

#sample standard deviation
samp_sd <- sd(mean_Exp)
samp_sd
## [1] 0.7826315
#theoretical standard deviation
theor_sd <- (1/lambda)/sqrt(n)
theor_sd
## [1] 0.7905694
#sample variance
samp_var <- samp_sd^2
samp_var
## [1] 0.612512
#theoretical variance
theor_var <- theor_sd^2
theor_var
## [1] 0.625

The sample varies by 0.612512, which is very close to the theoretical variance of 0.625

Task 3: Showing that the distribution is approximately normal

#distribution density of means
xfit <- seq(min(mean_Exp), max(mean_Exp), length=100)

yfit <- dnorm(xfit, mean=1/lambda, sd=(1/lambda/sqrt(n)))

hist(mean_Exp,breaks=n, prob=T, col="yellow", xlab = "Means", main="Density Distribution of Means",ylab="Density")

lines(xfit, yfit, pch=22, col="black", lty=1)

#compare the distribution for averages of 40 exponentials to a normal distribution

qqnorm(mean_Exp)
qqline(mean_Exp, col = 2)