This document reports the first part of the peer-graded assignment for the Statistical Inference course. In this part of the project, the exponential distribution is investigated and compared with the Central Limit Theorem. The exponential distribution is simulated in R with the command rexp(...), and the simulated results are compared with the corresponding theoretical values (mean, variance and normal distribution).
R’s rexp(...) generates random samples from the exponential distribution, as shown in this histogram:
hist(rexp(1000, 0.2), breaks = 50)
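As a quick sanity check on rexp(...), the sketch below compares the sample mean and standard deviation of one large draw with the theoretical value 1/rate, which is both the mean and the standard deviation of the exponential distribution. The seed and the names `rate` and `draws` are illustrative assumptions and are not part of the original analysis.

```r
# Illustrative check (seed and variable names are assumptions, not from the report)
set.seed(1)
rate  <- 0.2
draws <- rexp(1000, rate)

# For an exponential distribution, mean and standard deviation both equal 1/rate
c(sample_mean = mean(draws), sample_sd = sd(draws), theoretical = 1 / rate)

# Histogram on the density scale with the theoretical exponential density overlaid
hist(draws, breaks = 50, prob = TRUE, main = "rexp(1000, 0.2)", xlab = "x")
curve(dexp(x, rate), add = TRUE, lwd = 2)
```

The printed values should be near 5, and the histogram should roughly follow the overlaid curve.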
Before proceeding we set the parameters:
lambda <- 0.2
n <- 40
simulations <- 10000
The distribution of 10,000 averages of 40 random exponentials with rate lambda, which we are going to compare with the Central Limit Theorem, is:
# Each element is the mean of n exponential draws; replicate() avoids growing a vector in a loop
means <- replicate(simulations, mean(rexp(n, lambda)))
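The simulation above can also be written in vectorized form, with a fixed seed so that the reported numbers are reproducible from run to run. This is a minimal sketch: the seed value and the name `sim_matrix` are assumptions made for illustration, and the original report does not set a seed.

```r
# Sketch: vectorized version of the simulation
# (the seed and the name sim_matrix are assumptions for illustration)
set.seed(1)
lambda <- 0.2
n <- 40
simulations <- 10000

# One row per simulation, n exponential draws per row
sim_matrix <- matrix(rexp(simulations * n, lambda), nrow = simulations, ncol = n)

# Row means give the 10,000 averages of 40 exponentials
means <- rowMeans(sim_matrix)
length(means)
```

Drawing all values in one call and averaging by row gives the same kind of sample as the loop, without repeatedly reallocating the result vector.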
h<-hist(means, breaks=50, density=10, col="lightgray", xlab="Means", ylab="Frequency", main="Distribution of Averages of rexp(n, lambda)")
The theoretical mean and the sample mean are respectively:
theoretical_X <- 1/lambda
sample_X <- mean(means)
As the graph below shows, the two values are very close to each other:
h<-hist(means, breaks=50, density=10, col="lightgray", xlab="Means", ylab="Frequency", main="Sample vs Theoretical Mean")
abline(v = sample_X, lwd = 4, col = "red")
abline(v = theoretical_X, lwd = 2, col = "black")
legend("topright", legend=c(paste("Sample Mean = ", round(sample_X,3)), paste("Theoretical Mean = ", theoretical_X)), col=c("red", "black"), lty=1, cex=0.8)
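To put a number on how close the two means are, one option is to express their difference in units of the Monte Carlo standard error of the simulated grand mean. The sketch below re-creates the simulation so that it runs on its own; the seed and the names `mc_se` and `z` are assumptions introduced for illustration.

```r
# Sketch: quantify the gap between sample and theoretical mean
# (seed, mc_se and z are illustrative assumptions, not from the report)
set.seed(1)
lambda <- 0.2
n <- 40
simulations <- 10000
means <- replicate(simulations, mean(rexp(n, lambda)))

theoretical_X <- 1 / lambda
sample_X <- mean(means)

# Monte Carlo standard error of the average of the simulated means
mc_se <- sd(means) / sqrt(simulations)

# Difference between sample and theoretical mean, in standard-error units
z <- (sample_X - theoretical_X) / mc_se
c(sample_X = sample_X, theoretical_X = theoretical_X, mc_se = mc_se, z = z)
```

A value of `z` within a few units of zero indicates that the difference is of the size expected from simulation noise alone.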
The theoretical variance of the sample mean is the square of its standard error, (1 / lambda) / sqrt(n), and is given by:
std <- (1 / lambda) / sqrt ( n )
theoretical_variance <- std ^ 2
which evaluates to 0.625.
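The value 0.625 can be checked by hand. For the mean of n independent draws from a distribution with standard deviation sigma, the variance is sigma^2 / n, and for the exponential distribution sigma = 1 / lambda. With lambda = 0.2 and n = 40 as set above:

```latex
\operatorname{Var}(\bar{X})
  = \frac{\sigma^{2}}{n}
  = \frac{(1/\lambda)^{2}}{n}
  = \frac{1}{\lambda^{2}\, n}
  = \frac{1}{0.2^{2} \cdot 40}
  = \frac{25}{40}
  = 0.625
```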
For the sample variance we have:
sample_variance <- var(means)
which returns the value 0.621. The sample variance and the theoretical variance are close to each other.
To check that averages of exponential samples behave as the Central Limit Theorem predicts, we have to see whether the distribution of our sample means is approximately normal. To see this, we plot the histogram on the density (probability) scale instead of the frequency scale:
h <- hist(means, prob=TRUE, breaks=50, density=10, col="lightgray", xlab="Means", ylab="Density", main="Mean Distribution for rexp() based on Density")
lines(density(means), col="black", lwd=2)
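As you can see, the distribution of the sample means is approximately bell-shaped, which is consistent with the Central Limit Theorem. It is the distribution of the averages, not the exponential distribution itself, that approaches the normal distribution.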
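A more direct comparison overlays the normal density that the Central Limit Theorem predicts, with mean 1/lambda and standard deviation (1/lambda)/sqrt(n), on the histogram of the means, and adds a normal Q-Q plot. The sketch below re-creates the simulation so that it runs on its own; the seed and the names `clt_mean` and `clt_sd` are assumptions introduced for illustration.

```r
# Sketch: compare the simulated means with the normal curve predicted by the CLT
# (seed, clt_mean and clt_sd are illustrative assumptions, not from the report)
set.seed(1)
lambda <- 0.2
n <- 40
simulations <- 10000
means <- replicate(simulations, mean(rexp(n, lambda)))

# Parameters of the normal distribution predicted by the Central Limit Theorem
clt_mean <- 1 / lambda
clt_sd   <- (1 / lambda) / sqrt(n)

hist(means, prob = TRUE, breaks = 50, col = "lightgray",
     xlab = "Means", ylab = "Density", main = "Sample Means vs Normal Curve")
lines(density(means), col = "black", lwd = 2)      # empirical density
curve(dnorm(x, mean = clt_mean, sd = clt_sd),      # theoretical normal density
      add = TRUE, col = "red", lwd = 2, lty = 2)
legend("topright", legend = c("Empirical density", "Normal (CLT)"),
       col = c("black", "red"), lty = c(1, 2), lwd = 2, cex = 0.8)

# Normal Q-Q plot: points close to the line indicate approximate normality
qqnorm(means)
qqline(means, col = "red")
```

With n = 40 a slight right skew can remain in the upper tail of the Q-Q plot, since the normal approximation improves as n grows.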