library(knitr)
library(ggplot2)
opts_chunk$set(echo=TRUE, results='asis', cache=TRUE)
The exponential distribution can be simulated in R with rexp(n, lambda) where lambda is the rate parameter. The mean of exponential distribution is 1/lambda and the standard deviation is also 1/lambda. For these simulations, we set lambda to 0.2. We investigate the distribution of averages of 40 exponentials. For this purpose, we perform a thousand or so simulated averages of 40 exponentials.
set.seed(123)
lamda <- 0.2
nSamples <- 40
nSimulation <- 1000
exp_simulation <- replicate(nSimulation, expr = rexp(nSamples, lamda))
## Histogram of total simulated values
plot1 <- qplot(as.vector(exp_simulation), type = "histogram")+ xlab("values") + ylab("count")
plot1
## stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.
Now we’ll take the average of every simulation, we need to remember that this is a random variable itself.
exp_simulation_means <- apply(exp_simulation, MARGIN = 2, FUN = mean)
plot2 <-ggplot(as.data.frame(exp_simulation_means), aes(x = exp_simulation_means))+geom_histogram(aes(y=..density..)) + xlab("means of simulations") + geom_vline(xintercept= mean(exp_simulation_means), colour = "white", linetype = 4) + stat_function(fun = dnorm, args = list(mean = 1/lamda, sd = (1/lamda)/sqrt(nSamples)), size = 2)
plot2
## stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.
On the latter graph you can see a vertical line that represent the average of the means of every simulation.
## Value of mean of simulation means
mean(exp_simulation_means)
[1] 5.012
## Value from expression
1/lamda
[1] 5
## Value of standard diviation of simulation means
sd(exp_simulation_means)
[1] 0.7749
## Value from expression
(1/lamda)/sqrt(nSamples)
[1] 0.7906
We have already shown on our second graph how close are the average of means from the simulation to the normal distibution. We can also see the same thing trough a QQ plot:
qqnorm(exp_simulation_means)
qqline(exp_simulation_means, col = "red")