Synopsis

This project investigates the exponential distribution using R. It compares the distribution of averages of 40 exponentials, obtained from 1000 simulations, against the theoretical results given by the Central Limit Theorem.

The simulation results show that the distribution of the sample averages tends towards the theoretical distribution, as predicted by the Central Limit Theorem.

Set Up and Run the Simulation

The exponential distribution is simulated in R with rexp(n, lambda), where lambda is the rate parameter. The mean of the exponential distribution is 1/lambda, and its standard deviation is also 1/lambda.
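As a quick sanity check of the two closed-form values above, the mean and standard deviation can also be obtained by numerical integration of the density. This is only a sketch: the names exp_mean_num and exp_var_num are hypothetical and not part of the original analysis, and lambda = 0.2 is the value used later in this report.

```r
# Rate parameter used throughout this report
lambda <- 0.2

# Mean: integral of x * f(x) over [0, Inf), where f is the exponential density
exp_mean_num <- integrate(function(x) x * dexp(x, rate = lambda), 0, Inf)$value

# Variance: integral of (x - mean)^2 * f(x) over [0, Inf)
exp_var_num <- integrate(function(x) (x - exp_mean_num)^2 * dexp(x, rate = lambda),
                         0, Inf)$value

# Compare the numerical values with the closed forms 1/lambda
c(numeric_mean = exp_mean_num, theory_mean = 1 / lambda)
c(numeric_sd = sqrt(exp_var_num), theory_sd = 1 / lambda)
```

Both pairs should agree up to the tolerance of integrate().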

For the simulation exercise,

  1. lambda is set as 0.2.
  2. sample size (n) is set as 40.
  3. the number of simulations is set as 1000.

# Set seed
set.seed(100)

# Set Lambda as 0.2
lambda <- 0.2

# Set sample size, n
n <- 40

# Set no. simulations
sim <- 1000

# Run simulation: each column of simulate_exp holds one sample of n exponentials

simulate_exp <- replicate(sim, rexp(n, lambda))

# Mean of each sample, and the standard deviation of those means
simulate_mean <- apply(simulate_exp, 2, mean)
simulate_sd <- sd(simulate_mean)

Question 1 - Compare the sample mean from the simulation to the theoretical mean of the exponential distribution.

1.1 Calculate sample mean from simulation exercise.

# Show sample mean obtained from simulation exercise

sample_mean <- mean(simulate_mean)
sample_mean
## [1] 4.999702

1.2 Calculate theoretical mean of exponential distribution.

# Calculate and show theoretical mean of exponential distribution

exp_mean <- 1/lambda

exp_mean
## [1] 5

The chart below illustrates the result of the simulation exercise: the sample mean is practically indistinguishable from the theoretical mean of the exponential distribution (the blue and red lines almost coincide).

hist(simulate_mean, breaks = 40, xlab = "Mean", main = "Distribution of Sample Means", col = "green")
abline(v = sample_mean, col = "blue", lwd=3)
abline(v = exp_mean, col = "red", lwd=1)

Conclusion:

The mean of the 1000 simulated averages (4.9997) is a good approximation of the theoretical mean of the exponential distribution (5). This is what the Central Limit Theorem suggests: the averages are centered on 1/lambda.
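One way to put a number on "good approximation" is an approximate 95% interval for the center of the simulated averages. The sketch below re-runs the same simulation so that it is self-contained. The names sim_means, se_mean and ci are hypothetical, and the normal quantile is an assumption that relies on the averages being roughly normal.

```r
# Same settings as in the report
set.seed(100)
lambda <- 0.2
n <- 40
sim <- 1000

# One average per simulated sample of n exponentials
sim_means <- colMeans(replicate(sim, rexp(n, lambda)))

# Standard error of the mean of the simulated averages
se_mean <- sd(sim_means) / sqrt(sim)

# Approximate 95% interval, to be compared with the theoretical mean 1/lambda
ci <- mean(sim_means) + c(-1, 1) * qnorm(0.975) * se_mean
ci
```

If the interval covers 1/lambda = 5, the simulation is consistent with the theoretical mean at that level.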

Question 2 - Compare the variance of the sample means from the simulation to the theoretical variance of the sample mean.

2.1 Calculate the variance of the sample means.

# Show sample variance obtained from the simulation exercise

sample_var <- simulate_sd^2
sample_var
## [1] 0.6432442

2.2 Calculate the theoretical variance of the sample mean.

# Calculate and show the theoretical variance of the mean of n exponentials

exp_sd <- (1/lambda)/sqrt(n)
exp_var <- exp_sd^2
exp_var
## [1] 0.625

Conclusion:

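The variance of the simulated averages (0.643) is a good approximation of the theoretical variance of the mean of 40 exponentials, (1/lambda)^2/n = 0.625. Note that this is the variance of the sample mean, not the variance of the exponential distribution itself, which is (1/lambda)^2 = 25. The result is again in line with the Central Limit Theorem.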
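The theoretical value computed in 2.2 can be derived in a few steps. This is a short worked derivation, assuming the n = 40 draws in each sample are independent and identically distributed with variance \sigma^2 = 1/\lambda^2:

```latex
\begin{aligned}
\operatorname{Var}(\bar{X})
  &= \operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
   = \frac{1}{n^2}\sum_{i=1}^{n}\operatorname{Var}(X_i)
   = \frac{\sigma^2}{n} \\
  &= \frac{(1/\lambda)^2}{n}
   = \frac{(1/0.2)^2}{40}
   = \frac{25}{40}
   = 0.625
\end{aligned}
```

This matches exp_var, which the code obtains as the square of (1/lambda)/sqrt(n).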

Question 3 - Show that the distribution of the sample means is approximately normal.

# Plot distribution of sample means from simulation

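hist(simulate_mean, breaks = 80, freq = FALSE, xlab = "Mean", main = "Sample Distribution vs Normal Distribution", col = "yellow")


# Overlay the normal density with mean 1/lambda and sd (1/lambda)/sqrt(n).
# The histogram is drawn on the density scale (freq = FALSE), so no manual rescaling is needed.

xfit <- seq(min(simulate_mean), max(simulate_mean), length.out = 100)
yfit <- dnorm(xfit, mean = 1/lambda, sd = (1/lambda)/sqrt(n))
lines(xfit, yfit, col = "red", lty = 1)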

Conclusion:

The simulation shows that the distribution of averages of 40 exponentials is approximately normal, even though the exponential distribution itself is strongly skewed. By the Central Limit Theorem, this approximation improves as the sample size increases.
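The effect of sample size can be sketched by repeating the simulation for several values of n and tracking the skewness of the averages, which is 0 for a normal distribution. This goes beyond the original exercise: the helper skewness(), the sample sizes 5, 40 and 320, and the larger number of simulations are all assumptions chosen for illustration.

```r
set.seed(100)
lambda <- 0.2
sims <- 10000  # assumption: more simulations than the report, for steadier estimates

# Hypothetical helper: sample skewness (third standardized moment)
skewness <- function(x) mean((x - mean(x))^3) / sd(x)^3

# Skewness of the simulated averages for each sample size
sample_sizes <- c(5, 40, 320)
skew_by_n <- sapply(sample_sizes, function(n) {
  # Each column is one sample of n exponentials; colMeans gives one average per sample
  means <- colMeans(matrix(rexp(n * sims, rate = lambda), nrow = n))
  skewness(means)
})
names(skew_by_n) <- paste0("n=", sample_sizes)
skew_by_n
```

In theory the skewness of the mean of n exponentials is 2/sqrt(n), so the estimates should shrink towards 0 as n grows.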