Final Exam - Course 6 - Statistical Inference - First Part

In this project you will investigate the exponential distribution in R and compare it with the Central Limit Theorem. The exponential distribution can be simulated in R with rexp(n, lambda), where lambda is the rate parameter. The mean of the exponential distribution is 1/lambda, and the standard deviation is also 1/lambda. Set lambda = 0.2 for all of the simulations. You will investigate the distribution of averages of 40 exponentials. Note that you will need to run one thousand (1000) simulations.

Illustrate via simulation and associated explanatory text the properties of the distribution of the mean of 40 exponentials.
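As a reference point for the simulations, the theoretical values implied by the statement above can be computed directly. This is only a minimal sketch; the names `lambda`, `n`, `theo_mean`, `theo_sd`, and `theo_var` are illustrative and do not appear in the original report.

```r
# Theoretical properties of the mean of n exponentials.
# All names below are illustrative, not taken from the report.
lambda <- 0.2   # rate parameter, as set in the assignment
n      <- 40    # number of exponentials averaged per simulation

theo_mean <- 1 / lambda               # mean of the sample mean: 1/lambda
theo_sd   <- (1 / lambda) / sqrt(n)   # standard error: sigma / sqrt(n)
theo_var  <- theo_sd^2                # variance of the sample mean: sigma^2 / n

print(c(mean = theo_mean, sd = theo_sd, var = theo_var))
```

With lambda = 0.2 and n = 40 this gives a theoretical mean of 5, a standard error of about 0.79, and a variance of 0.625.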

1. Loading data and exploratory analysis

sessionInfo()
## R version 3.2.5 (2016-04-14)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 16.04.1 LTS
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=es_ES.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=es_ES.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=es_ES.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=es_ES.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## loaded via a namespace (and not attached):
##  [1] magrittr_1.5    formatR_1.3     tools_3.2.5     htmltools_0.3.5
##  [5] yaml_2.1.13     Rcpp_0.12.5     stringi_1.1.1   rmarkdown_0.9.6
##  [9] knitr_1.12.3    stringr_1.0.0   digest_0.6.9    evaluate_0.9
set.seed(32768)  # fix the seed so that all simulations are reproducible

lambda_v <- 0.2

Processing data (single samples)

s40   <- rexp( 40,   lambda_v)
s1000 <- rexp( 1000, lambda_v)

# Summary
summary(s40)   # summary 40
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
##  0.06064  1.59600  4.31700  5.12500  6.38300 24.54000
summary(s1000) # summary 1000
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
##  0.01116  1.44800  3.62600  5.11200  7.06500 31.87000

Both sample means (5.125 for n = 40 and 5.112 for n = 1000) are close to the theoretical mean 1/lambda = 5.

Processing data in groups of 40

g1000 <- numeric(1000)  # preallocate one slot per simulation

# Store, 1000 times, the mean of 40 values generated with rexp
for (i in 1:1000) {
  g1000[i] <- mean(rexp(n = 40, rate = lambda_v))
}

m1000 <- mean(g1000)  # mean of the 1000 simulated means
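The same collection of means can be obtained without an explicit loop. The following is a sketch of one alternative, assuming its own seed and the hypothetical names `lambda_alt`, `n_sims`, `n_per`, `draws`, and `means_alt`; because it is a separate run with a different seed, it will not reproduce the exact numbers shown in this report.

```r
# Loop-free sketch: one row per simulation, one column per draw.
# Seed and all names are hypothetical, chosen only for this example.
set.seed(1)
lambda_alt <- 0.2
n_sims     <- 1000   # number of simulations
n_per      <- 40     # exponentials averaged in each simulation

# Draw all values at once and arrange them as a 1000 x 40 matrix
draws <- matrix(rexp(n_sims * n_per, rate = lambda_alt),
                nrow = n_sims, ncol = n_per)

# Each row mean is the average of 40 exponentials
means_alt <- rowMeans(draws)

length(means_alt)  # number of simulated means
mean(means_alt)    # should fall close to 1/lambda
```

Drawing everything in one call and using `rowMeans` avoids growing or indexing a vector inside a loop, which tends to be both shorter and faster in R.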

Simulations

par(mfrow = c(1, 3))  # three plots side by side

# Mean of a single sample of 40 exponentials
mean(s40)
## [1] 5.125419
barplot(s40, xlab="values x", main="values for 40")

# Mean of a single sample of 1000 exponentials
mean(s1000)
## [1] 5.112376
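# Distribution of the 1000 means of 40 exponentials
hist(g1000, xlab = "Means", main = "Means of 1000 samples of n = 40")

# Normal Q-Q plot of the simulated means, with reference line
qqnorm(g1000)
qqline(g1000)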
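One further way to judge how close the simulated means are to a normal distribution is to overlay the normal density suggested by the Central Limit Theorem on a density-scaled histogram. This is a self-contained sketch under its own seed; the names `lambda_demo`, `n_demo`, `means_demo`, `mu_demo`, and `se_demo` are hypothetical and the resulting plot is not part of the original report.

```r
# Histogram of simulated means with the CLT normal density overlaid.
# Seed and all names are hypothetical, chosen only for this example.
set.seed(2)
lambda_demo <- 0.2
n_demo      <- 40
means_demo  <- replicate(1000, mean(rexp(n_demo, rate = lambda_demo)))

mu_demo <- 1 / lambda_demo                   # theoretical mean
se_demo <- (1 / lambda_demo) / sqrt(n_demo)  # theoretical standard error

# freq = FALSE puts the histogram on the density scale,
# so the normal curve is directly comparable
hist(means_demo, freq = FALSE, breaks = 30, xlab = "Means",
     main = "Means of 40 exponentials with normal density")
curve(dnorm(x, mean = mu_demo, sd = se_demo), add = TRUE, lwd = 2)
abline(v = mu_demo, lty = 2)  # dashed line at the theoretical mean
```

If the Central Limit Theorem approximation is good, the bars should follow the curve closely, with a slight right skew still visible at n = 40.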

The theoretical center is 1/lambda = 1/0.2 = 5.

sd(g1000) # observed: near 0.8160; theory gives (1/lambda)/sqrt(40), about 0.7906
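The comparison of spread can be made explicit by placing the observed standard deviation next to the theoretical standard error, and by checking how many simulated means fall within 1.96 standard errors of 1/lambda (roughly 95% would be expected under a normal approximation). This is a sketch with its own seed; `lambda_chk`, `n_chk`, `means_chk`, `se_theo`, `sd_obs`, and `inside` are hypothetical names, and the numbers it prints will differ from those in the report.

```r
# Observed versus theoretical spread of the simulated means.
# Seed and all names are hypothetical, chosen only for this example.
set.seed(3)
lambda_chk <- 0.2
n_chk      <- 40
means_chk  <- replicate(1000, mean(rexp(n_chk, rate = lambda_chk)))

se_theo <- (1 / lambda_chk) / sqrt(n_chk)  # theoretical standard error
sd_obs  <- sd(means_chk)                   # observed standard deviation

# Share of simulated means within 1.96 standard errors of 1/lambda
inside <- mean(abs(means_chk - 1 / lambda_chk) <= 1.96 * se_theo)

print(c(theoretical_sd = se_theo, observed_sd = sd_obs, share_within = inside))
```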
