This report describes a brief experiment to evaluate the relationship between a exponential distribution and the Central Limit Theorem. In this brief experiment, the data was generated by working with 1000 averages of exponential distributions (n = 40) and lambda = 0.2. These values are part of the project’s instructions.
The code below performs the simulation of the 1000 averages of the 40 exponentials with a fixed seed. A table is produced as output, which contains the Simulated and Theoretical means. The package pander was used to produce the table.
library(pander)
Sim <- 1000
Vals.sim <- 40
lambda <- 0.2
set.seed(640)
Data.Sim <- t(replicate(Sim, rexp(Vals.sim, lambda)))
d.frame <- data.frame(Mean=c(mean(rowMeans(Data.Sim)), 1/lambda),
Variance=c(mean(apply(Data.Sim, 1, var)), 1/lambda^2))
rownames(d.frame) <- c("Simulated", "Theoretical")
pander(d.frame, round=2)
| Mean | Variance | |
|---|---|---|
| Simulated | 5.03 | 24.74 |
| Theoretical | 5 | 25 |
Plot to compare the two distributions (Empirical vs Normal) produced as result of the simulation performed above.
library(ggplot2)
Vals.Means <- rowMeans(Data.Sim)
Z.mean <- (Vals.Means - mean(Vals.Means)) / sd(Vals.Means)
qplot(Z.mean, geom = "blank") +
geom_line(aes(y = ..density.., colour = 'Empirical'), stat = 'density') +
stat_function(fun = dnorm, aes(colour = 'Normal')) +
geom_histogram(aes(y = ..density..), alpha = 0.4, binwidth=.35) +
geom_vline(xintercept=0, colour="blue", linetype="longdash") +
scale_colour_manual(name = 'Density', values = c('brown', 'black')) +
ylab("Density") + xlab("Z") + ggtitle("Mean Values of the Distribution") +
theme_bw() + theme(legend.position = c(0.80, 0.80))
The code below 1) first verifies that the mean is within the confidence interval, 2) then estimates each simulations’ coverage, and 3) finally performs a simulation and calculates its mean.
set.seed(640)
lambda <- 0.2
## 1)
withinconfint <- function(lambda) {
events <- rexp(1000, lambda)
stderr <- sd(events)/sqrt(1000)
l1 <- mean(events) - 1.96 * stderr
l2 <- mean(events) + 1.96 * stderr
(l1 < 1/lambda & l2 > 1/lambda)
}
## 2)
Sim.coverage <- function(lambda) {
Vals.coverage <- replicate(100, withinconfint(lambda))
mean(Vals.coverage)
}
## 3)
repsim <- replicate(100, Sim.coverage(lambda))
paste("Mean of the Simulation is", mean(repsim))
## [1] "Mean of the Simulation is 0.9496"
94.84% of the time the confidence interval contains the value, which supports the expectation (close to the 95% expected).