Overview

This project investigates the exponential distribution in R and compares the distribution of its sample means with what the Central Limit Theorem predicts. The exponential distribution is simulated with the R function rexp(n, lambda), where lambda is the rate parameter. The project runs one thousand simulations and examines the distribution of averages of 40 exponentials.

Simulations

Run 1000 simulations, each drawing 40 exponential samples. Use the R function rexp(n, lambda), where lambda is the rate parameter. Set lambda = 0.2 for all of the simulations. The mean of the exponential distribution is 1/lambda and its standard deviation is also 1/lambda.

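set.seed(12345)   # fix the seed so the results are reproducible
lambda <- 0.2     # rate parameter
n <- 40           # number of exponentials per average
nosim <- 1000     # number of simulations
sample_means <- numeric(nosim)   # preallocate instead of growing the vector
for (i in seq_len(nosim)) {
  sample_means[i] <- mean(rexp(n, lambda))
}
hist(sample_means)   # quick look at the distribution of the 1000 averages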

sample_means_df <- as.data.frame(sample_means)
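The loop above can also be written without an explicit loop. The following is a minimal sketch, not the method used for the reported results: it draws all values in one call and averages each row. The names draws and sample_means_alt are hypothetical and do not appear elsewhere in this report, and the individual values are not claimed to match the loop output exactly.

```r
# Sketch: the same kind of simulation, vectorized.
set.seed(12345)
lambda <- 0.2   # rate parameter, as in the report
n <- 40         # exponentials per average
nosim <- 1000   # number of simulated averages

# Draw n * nosim values at once; each row holds one simulated sample of size n.
draws <- matrix(rexp(n * nosim, lambda), nrow = nosim, ncol = n, byrow = TRUE)

# One average per row (hypothetical name, not used elsewhere in the report).
sample_means_alt <- rowMeans(draws)

length(sample_means_alt)
```

Drawing everything in one matrix avoids repeated calls inside a loop, which matters little at this size but scales better for larger simulations.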

Sample Mean versus Theoretical Mean

Compare the sample mean with the theoretical mean.

simulation_mean <- mean(sample_means)
round(simulation_mean, 3)
## [1] 4.972
theoretical_mean <- 1/lambda
round(theoretical_mean, 3)
## [1] 5

Conclusion:

As Figure-1 in the Appendix indicates, the sample mean of 4.972 (blue dashed line) is close to the theoretical mean of 5 (red dashed line).
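One way to make "close" more concrete is to express the gap between the two means in units of the Monte Carlo standard error. The sketch below is an illustrative addition, not part of the original analysis; it reruns a simulation with the same parameters, and the names se_of_simulation_mean and gap_in_se are hypothetical.

```r
# Sketch: how many standard errors separate the simulated and theoretical means?
set.seed(12345)
lambda <- 0.2
n <- 40
nosim <- 1000
sample_means <- replicate(nosim, mean(rexp(n, lambda)))

theoretical_mean <- 1 / lambda

# Standard error of the average of nosim sample means,
# based on the theoretical sd of a single sample mean.
se_of_simulation_mean <- (1 / lambda) / sqrt(n) / sqrt(nosim)

# Gap between simulated and theoretical mean, in standard-error units.
gap_in_se <- (mean(sample_means) - theoretical_mean) / se_of_simulation_mean
round(gap_in_se, 2)
```

Using the rounded values reported above, the gap is 5 - 4.972 = 0.028, and the standard error is about 0.791 / sqrt(1000), or roughly 0.025, so the two means differ by about one standard error.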

Sample Variance versus Theoretical Variance

Compare sample variance and theoretical variance.

simulation_variance <- var(sample_means)
round(simulation_variance, 3)
## [1] 0.595
theoretical_variance <- (1/lambda)^2/n
round(theoretical_variance, 3)
## [1] 0.625
simulation_sd <- sd(sample_means)
round(simulation_sd, 3)
## [1] 0.772
theoretical_sd <- (1/lambda)/sqrt(n)
round(theoretical_sd, 3)
## [1] 0.791

Conclusion:

As the calculations above show, the sample variance (0.595) is close to the theoretical variance (0.625). Figure-2 in the Appendix shows that one sample standard deviation on either side of the sample mean (green vertical lines) nearly coincides with one theoretical standard deviation on either side of the theoretical mean (orange vertical lines).
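The theoretical values used above follow from the standard result for the average of n independent draws with common variance. Written out for lambda = 0.2 and n = 40, with \bar{X} denoting the average of 40 exponentials and \sigma = 1/\lambda (notation introduced here for illustration):

```latex
\begin{align*}
E[\bar{X}] &= \frac{1}{\lambda} = \frac{1}{0.2} = 5, \\
\operatorname{Var}(\bar{X}) &= \frac{\sigma^2}{n} = \frac{(1/\lambda)^2}{n} = \frac{25}{40} = 0.625, \\
\operatorname{sd}(\bar{X}) &= \frac{1/\lambda}{\sqrt{n}} = \frac{5}{\sqrt{40}} \approx 0.791.
\end{align*}
```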

Distribution

Investigate whether the distribution of the sample means is approximately normal.

Conclusion:

As Figure-3 in the Appendix shows, the sample density curve (green curve) is similar to the normal distribution curve (orange curve).
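The visual comparison can be supplemented with a simple numeric check. The sketch below is an assumption-laden illustration rather than part of the original analysis: it reruns a simulation with the same parameters, standardizes the sample means with the theoretical mean and standard deviation, and compares the share falling within one and two standard deviations with the roughly 68% and 95% expected under a normal distribution. The names z and coverage are hypothetical.

```r
# Sketch: numeric and graphical checks of approximate normality.
set.seed(12345)
lambda <- 0.2
n <- 40
nosim <- 1000
sample_means <- replicate(nosim, mean(rexp(n, lambda)))

theoretical_mean <- 1 / lambda
theoretical_sd <- (1 / lambda) / sqrt(n)

# Standardize the sample means using the theoretical mean and sd.
z <- (sample_means - theoretical_mean) / theoretical_sd

# Share of standardized means within 1 and 2 sd of zero;
# a normal distribution gives about 0.683 and 0.954.
coverage <- c(within_1_sd = mean(abs(z) <= 1),
              within_2_sd = mean(abs(z) <= 2))
round(coverage, 3)

# Normal Q-Q plot: points near the reference line suggest approximate normality.
qqnorm(z)
qqline(z)
```

Because averages of 40 exponentials are still slightly right-skewed, small departures from the reference line in the upper tail of the Q-Q plot would not be surprising.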

Appendix

Sample Mean versus Theoretical Mean

library(ggplot2)
g <- ggplot(sample_means_df, aes(x=sample_means))
g <- g + geom_histogram(binwidth = .3, color="black") +
  geom_vline(aes(xintercept = theoretical_mean, 
                 color="theoretical_mean"), size=1, linetype=2) +
  geom_vline(aes(xintercept = simulation_mean, 
                 color="simulation_mean"), size=1, linetype=2) +
  scale_color_manual(values = c(simulation_mean = "blue", theoretical_mean = "red"))+
  labs(x="Sample means distribution", y= "Frequency", 
       title="Figure-1: Comparing theoretical and simulated means")
g

Sample Variance versus Theoretical Variance

g <- ggplot(sample_means_df, aes(x=sample_means))
g <- g + geom_histogram(binwidth = .3, color="black") +
  geom_vline(aes(xintercept = theoretical_mean, 
                 color="theoretical_mean"), size=1, linetype=2) +
  geom_vline(aes(xintercept = simulation_mean, 
                 color="simulation_mean"), size=1, linetype=2) +
  geom_vline(aes(xintercept = simulation_mean+simulation_sd, 
                 color="simulation_sd"), size=1, linetype=1) +
  geom_vline(aes(xintercept = theoretical_mean+theoretical_sd, 
                 color="theoretical_sd"), size=1, linetype=1) +
  geom_vline(aes(xintercept = simulation_mean-simulation_sd, 
                 color="simulation_sd"), size=1, linetype=1) +
  geom_vline(aes(xintercept = theoretical_mean-theoretical_sd, 
                 color="theoretical_sd"), size=1, linetype=1) +
  scale_color_manual(values = c(simulation_mean = "blue", 
                                theoretical_mean = "red",
                                simulation_sd = "green",
                                theoretical_sd = "orange"))+
  labs(x="Sample means distribution", y= "Frequency", 
       title="Figure-2: Comparing theoretical and simulated variances")
g

Sample distribution versus Theoretical distribution

g <- ggplot(sample_means_df, aes(x=sample_means))
g <- g + geom_histogram(binwidth = .3, color="black", aes(y=after_stat(density))) +
  stat_function(fun=dnorm, args=list(mean=theoretical_mean, sd=theoretical_sd), 
                aes(color="normal_distribution"), size =1) +
  stat_density(geom = "line", aes(color = "simulation_density"), size =1)  +
  geom_vline(aes(xintercept = theoretical_mean, 
                 color="theoretical_mean"), size=1, linetype=2) +
  geom_vline(aes(xintercept = simulation_mean, 
                 color="simulation_mean"), size=1, linetype=2)+ 
  scale_color_manual(values = c(simulation_mean = "blue", 
                                theoretical_mean = "red",
                                simulation_density = "green",
                                normal_distribution = "orange"))+
  labs(x="Sample means distribution", y= "density", 
       title="Figure-3: Density of Simulated Exponential Sample Means")
g