In this simulation exercise, we check whether the Central Limit Theorem (CLT) applies to the means of random exponential samples.
  1. Generate a 40 x 1000 matrix of random exponential draws, so that each column is one sample of 40 observations.
  2. Take the mean of each 40-observation sample to produce 1000 sample means.
  3. Plot the 40-sample means.
  4. Check whether their distribution resembles a Normal distribution.
In addition, the simulated mean and variance are compared with the theoretical mean (1/lambda) and variance (1/lambda^2) of the exponential distribution.
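For reference, the quantities used in this check follow from the standard statement of the CLT. A short derivation, assuming the draws $X_1, \dots, X_n$ are i.i.d. exponential with rate $\lambda$ and writing $\bar{X}_n$ for their mean (this notation is introduced here and is not part of the original analysis):

```latex
\begin{align*}
E[X_i] &= \frac{1}{\lambda}, &
\operatorname{Var}(X_i) &= \frac{1}{\lambda^2}, \\
E[\bar{X}_n] &= \frac{1}{\lambda}, &
\operatorname{Var}(\bar{X}_n) &= \frac{1}{\lambda^2 n}, \\
\bar{X}_n &\overset{\text{approx.}}{\sim}
  N\!\left(\frac{1}{\lambda},\ \frac{1}{\lambda^2 n}\right)
  && \text{for large } n.
\end{align*}
```

With lambda = 0.2 and n = 40, this gives a mean of 5 for the sample means, a variance of 25/40 = 0.625, and a standard error of 5/sqrt(40), roughly 0.79.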
# the parameters
lambda <- 0.2
n <- 40
nos <- 1000 # no. of simulations

# 40 by 1000 matrix of rexp variables
set.seed(3)
m <- matrix(rexp(n*nos, lambda), n, nos)

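# mean of each of the 1000 40-observation samples (one sample per column)
mns <- colMeans(m)
# variance estimates: MARGIN = 1 takes the variance along each of the 40 rows
# (1000 draws per row), not along each 40-observation column (MARGIN = 2).
# All draws are i.i.d., so both estimate the same population variance 1/lambda^2.
v <- apply(m, 1, var)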


data.frame("Simulation" = c(mean(mns), mean(v)),
           "Theoretical" = c(1/lambda, 1/lambda^2),
           row.names = c("Mean", "Variance"))
##          Simulation Theoretical
## Mean        4.98662           5
## Variance   24.74123          25
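
The table above compares the variance of the individual draws with 1/lambda^2. The CLT itself concerns the spread of the sample means, whose theoretical variance is 1/(lambda^2 * n). A minimal sketch of that second comparison follows; it regenerates the same matrix so that it runs on its own, and the names `sim_var_of_means` and `theo_var_of_means` are illustrative rather than taken from the original analysis.

```r
# Same parameters and seed as the analysis above
lambda <- 0.2
n <- 40
nos <- 1000  # no. of simulations

set.seed(3)
m <- matrix(rexp(n * nos, lambda), n, nos)
mns <- colMeans(m)  # one mean per 40-observation column

# Variance of the 1000 sample means versus the CLT prediction
sim_var_of_means  <- var(mns)
theo_var_of_means <- (1 / lambda)^2 / n  # 25 / 40 = 0.625

data.frame("Simulation"  = c(mean(mns), sim_var_of_means),
           "Theoretical" = c(1 / lambda, theo_var_of_means),
           row.names     = c("Mean of sample means",
                             "Variance of sample means"))
```

If the CLT approximation is reasonable at n = 40, the two columns should be close, up to simulation noise.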
# plot the simulated sample means
mnsdf <- data.frame(mns)
library(ggplot2)
g <- ggplot(mnsdf, aes(x = mns))
g <- g + geom_histogram(binwidth = .1, fill="blue", 
                        color = "black", aes(y = ..density..))
g <- g + geom_vline(xintercept = mean(mns), color = "blue", 
                    linetype = "dotted", size = 1)

# overlay the normal density with the simulated mean and standard error
g <- g + stat_function(fun = dnorm, 
                       args = list(mean = mean(mns), sd = sqrt(mean(v)/n)),
                       color = "purple", size = 2)
g <- g + geom_vline(xintercept = 1/lambda, color = "purple", size = 1)

# naming the figure
g <- g + labs(title = "Means from Random Exponential Samples", 
              x = "40-sample means")
g
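
The histogram gives a visual check only. A normal Q-Q plot of the standardised sample means is a common complementary check. The sketch below uses base R graphics and regenerates the sample means so that it is self-contained; the variable `z` and the 1.96 cut-off are choices made for illustration, not part of the original analysis.

```r
# Same parameters and seed as the analysis above
lambda <- 0.2
n <- 40
nos <- 1000

set.seed(3)
mns <- colMeans(matrix(rexp(n * nos, lambda), n, nos))

# Standardise with the theoretical mean (1/lambda) and
# theoretical standard error ((1/lambda) / sqrt(n))
z <- (mns - 1 / lambda) / ((1 / lambda) / sqrt(n))

# Points close to the reference line suggest approximate normality
qqnorm(z, main = "Normal Q-Q plot of standardised 40-sample means")
abline(0, 1, col = "purple")  # line y = x, i.e. exact N(0, 1) quantiles

# Share of standardised means within +/- 1.96; about 0.95 under normality
mean(abs(z) <= 1.96)
```

Because the exponential distribution is right-skewed, some departure from the line in the upper tail is plausible at n = 40.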

*** Disclaimer: The suggestions and remarks in this page are based on personal research experience. Research practices and approaches vary. Exercise your own judgment regarding the suitability of the content.
*** Analysis environment
sessionInfo()
## R version 3.3.2 (2016-10-31)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 14393)
## 
## locale:
## [1] LC_COLLATE=English_Singapore.1252  LC_CTYPE=English_Singapore.1252   
## [3] LC_MONETARY=English_Singapore.1252 LC_NUMERIC=C                      
## [5] LC_TIME=English_Singapore.1252    
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] ggplot2_2.2.1
## 
## loaded via a namespace (and not attached):
##  [1] Rcpp_0.12.8      assertthat_0.1   digest_0.6.10    rprojroot_1.2   
##  [5] plyr_1.8.4       grid_3.3.2       gtable_0.2.0     backports_1.0.5 
##  [9] magrittr_1.5     evaluate_0.10    scales_0.4.1     stringi_1.1.2   
## [13] lazyeval_0.2.0   rmarkdown_1.3    labeling_0.3     tools_3.3.2     
## [17] stringr_1.1.0    munsell_0.4.3    yaml_2.1.14      colorspace_1.3-2
## [21] htmltools_0.3.5  knitr_1.15.1     tibble_1.2