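In this project we investigate the exponential distribution, which can be simulated in R with rexp(n, lambda), where lambda is the rate parameter, and compare the distribution of its sample means with what the Central Limit Theorem predicts.
The histogram shows the simulated distribution of the mean of 40 exponentials over 1000 simulations, and the red line marks the theoretical mean, 1/lambda = 5.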
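lambda = 0.2                      # rate parameter of the exponential distribution
TheoreticalMean = 1/lambda        # theoretical mean, 1/lambda = 5
TheoreticalSD = 1/lambda          # theoretical standard deviation, also 1/lambda = 5
# Simulate 1000 means, each of 40 exponentials
mns = NULL
for (i in 1 : 1000) mns = c(mns, mean(rexp(40,lambda)))
# Histogram of the simulated means on the density scale
hist(mns,col="green",freq=FALSE)
# Red vertical line at the theoretical mean
abline(v=TheoreticalMean,col = "red", lwd = 2)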
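As a numeric companion to the histogram, the following minimal sketch compares the average of the simulated means with the theoretical mean. It assumes the same settings as above (lambda = 0.2, 40 exponentials per sample, 1000 simulations). The seed and the names n, nsim, sim_means, and theoretical_mean are illustrative choices, not part of the original analysis.

```r
set.seed(1)                      # illustrative seed, only for reproducibility
lambda <- 0.2                    # rate of the exponential distribution
n      <- 40                     # exponentials per sample
nsim   <- 1000                   # number of simulated samples

# replicate() collects one sample mean per simulation
# without growing a vector inside a loop
sim_means <- replicate(nsim, mean(rexp(n, lambda)))

theoretical_mean <- 1 / lambda   # mean of the exponential distribution

# Print the two values side by side
c(simulated = mean(sim_means), theoretical = theoretical_mean)
```

The two printed values should be close to each other, since the sample mean is an unbiased estimator of 1/lambda.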
# Sample Variance versus Theoretical Variance
Variability is compared here on the standard-deviation scale: the histogram shows the standard deviation of each of 1000 samples of 40 exponentials, and the red line marks the theoretical standard deviation, 1/lambda = 5.
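# Standard deviation of each of 1000 samples of 40 exponentials
sds = NULL
for (i in 1 : 1000) sds = c(sds, sd(rexp(40,lambda)))
hist(sds,col="grey",freq=FALSE)
# Red vertical line at the theoretical standard deviation
abline(v=TheoreticalSD,col = "red", lwd = 2)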
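The code above works with the standard deviation of individual samples. A related comparison is the variance of the sample mean itself: for n independent exponentials with rate lambda, the theoretical variance of their mean is (1/lambda)^2 / n, which is 25/40 = 0.625 for lambda = 0.2 and n = 40. The sketch below checks this by simulation. The seed and the names n, nsim, sim_means, sample_var, and theoretical_var are assumptions made for illustration.

```r
set.seed(2)                      # illustrative seed, only for reproducibility
lambda <- 0.2                    # rate of the exponential distribution
n      <- 40                     # exponentials per sample
nsim   <- 1000                   # number of simulated samples

# One sample mean per simulation
sim_means <- replicate(nsim, mean(rexp(n, lambda)))

# Variance of the mean of n exponentials: sigma^2 / n with sigma = 1 / lambda
theoretical_var <- (1 / lambda)^2 / n
sample_var      <- var(sim_means)

# Print the two variances side by side
c(sample = sample_var, theoretical = theoretical_var)
```

Taking the square root of both values gives the same comparison on the standard-deviation scale.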
We first look at the distribution of a large collection of random exponentials to see how it compares with a normal distribution, and then do the same for a large collection of averages of exponentials.
par( mfrow = c(1,2) )
xsim = rexp(1000,lambda)
shapiro.test(xsim)
##
## Shapiro-Wilk normality test
##
## data: xsim
## W = 0.81696, p-value < 2.2e-16
mns = NULL
for (i in 1 : 100) mns = c(mns, mean(rexp(1000,lambda)))
shapiro.test(mns)
##
## Shapiro-Wilk normality test
##
## data: mns
## W = 0.98764, p-value = 0.4821
qqnorm(xsim)
qqline(xsim,col='red')
qqnorm(mns)
qqline(mns,col='red',lwd=2,lty=2)
For the large collection of random exponentials, the Shapiro-Wilk p-value (< 2.2e-16) is far below 0.05, so we reject normality, and the Q-Q plot confirms this. For the large collection of averages of exponentials, on the other hand, the p-value (0.4821) is above 0.05, so the test does not reject normality, and the Q-Q plot also confirms that the averages are approximately normally distributed, as the Central Limit Theorem predicts.
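Another way to check the normal approximation numerically is to standardize the averages and count how many fall within 1.96 standard errors of the theoretical mean. Under an exact normal distribution that share would be 95%. The sketch below is only an illustration under assumed settings (lambda = 0.2, 40 exponentials per average, 1000 averages). The seed and the names n, nsim, sim_means, z, and coverage are hypothetical.

```r
set.seed(3)                      # illustrative seed, only for reproducibility
lambda <- 0.2                    # rate of the exponential distribution
n      <- 40                     # exponentials per average
nsim   <- 1000                   # number of simulated averages

# One average per simulation
sim_means <- replicate(nsim, mean(rexp(n, lambda)))

# Standardize with the theoretical mean (1/lambda)
# and the theoretical standard error ((1/lambda) / sqrt(n))
z <- (sim_means - 1 / lambda) / ((1 / lambda) / sqrt(n))

# Share of standardized averages inside the central 95% normal interval
coverage <- mean(abs(z) <= 1.96)
coverage
```

A share close to 0.95 is consistent with the averages being approximately normal. It complements the Shapiro-Wilk test and the Q-Q plot rather than replacing them.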