This report compares the sample mean and variance of simulated averages of exponential random variables with their theoretical values.
A histogram is also plotted to compare the distribution of the sample means with the normal distribution.
#First set the seed for reproducibility.
set.seed(21)
#Now, simulate 1000 averages of 40 random exponentials with lambda = 0.2 and assign these 1000 values to mns.
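#Preallocate mns instead of growing it with c() on every iteration; the random draws are unchanged.
mns <- numeric(1000)
for(i in 1:1000){
  mns[i] <- mean(rexp(40, 0.2))
}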
#Theoretical mean = 1/lambda
tmean <- 1/0.2
#Sample mean = mean of the 1000 averages of 40 random exponentials
smean <- mean(mns)
#Plot the histogram of the 1000 averages of 40 random exponentials
hist(mns)
#The theoretical mean is highlighted with a blue line
abline(v = tmean, lwd = 3, lty = 3, col = "blue")
#The sample mean is highlighted with a red line
abline(v = smean, lwd = 3, lty = 4, col = "red")
The histogram shows that the sample mean (red line) lies very close to the theoretical mean (blue line), which it estimates. Across the 1000 simulated averages, the sample mean, 4.9814404, closely approximates the theoretical mean, 5.
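One rough way to judge how close the two means are is to compare the gap between them with the standard error of the mean of the 1000 averages. The sketch below is an illustration rather than part of the original analysis: `lambda`, `n`, `nsim`, `tmean_check`, `smean_check`, `se_grand`, and `z_score` are names introduced here, and the sample mean is copied from the value reported above rather than recomputed.

```r
# Illustrative check (not from the original analysis): express the gap between
# the reported sample mean and the theoretical mean in standard-error units.
lambda <- 0.2      # rate of the exponential distribution
n      <- 40       # exponentials per average
nsim   <- 1000     # number of simulated averages

tmean_check <- 1 / lambda      # theoretical mean
smean_check <- 4.9814404       # sample mean as reported in the text

# Each average has variance (1/lambda^2)/n, so the mean of nsim independent
# averages has standard error sqrt((1/lambda^2) / n / nsim).
se_grand <- sqrt((1 / lambda^2) / n / nsim)

# Gap between the two means, measured in standard errors
z_score <- (smean_check - tmean_check) / se_grand
print(c(gap = smean_check - tmean_check, se = se_grand, z = z_score))
```

With the reported value, the gap is about 0.019 against a standard error of 0.025, so the sample mean sits within one standard error of the theoretical mean.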
#Theoretical variance of the average of n = 40 exponentials = (1/lambda^2)/n
tvar <- (1/0.2^2)/40
#Sample variance = variance of the 1000 averages of 40 random exponentials
svar <- var(mns)
The sample variance, 0.5904182, is close to the theoretical variance of the sample mean, 0.625.
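The theoretical values used above follow from standard properties of the exponential distribution. As a short derivation, write $X_1, \dots, X_n$ for the $n = 40$ independent exponential draws with rate $\lambda = 0.2$ and $\bar{X}$ for their average (this notation is introduced here for illustration):

$$
E[\bar{X}] = \frac{1}{n}\sum_{i=1}^{n} E[X_i] = \frac{1}{\lambda} = 5,
\qquad
\operatorname{Var}(\bar{X}) = \frac{1}{n^2}\sum_{i=1}^{n} \operatorname{Var}(X_i) = \frac{1/\lambda^2}{n} = \frac{25}{40} = 0.625 .
$$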
#Plot a density histogram (freq = FALSE) of the 1000 averages of 40 random exponentials
hist(mns, breaks = 40, xlab = "Averages of 40 random exponentials", main = "Normal curve over the Histogram", freq = FALSE)
#Overlay the normal density with mean 1/lambda and standard deviation (1/lambda)/sqrt(n)
xf <- seq(min(mns), max(mns), length.out = 100)
yf <- dnorm(xf, mean = 1/0.2, sd = (1/0.2)/sqrt(40))
lines(xf, yf, lty = 2, col = "blue")
Overlaying a normal curve with mean 5 and standard deviation 5/sqrt(40) (about 0.79) on the histogram shows that the distribution of the sample means is approximately normal, as the central limit theorem predicts.
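A normal Q-Q plot is another common way to inspect the same approximation. The sketch below is a supplementary illustration, not part of the original analysis: it repeats the simulation so that it runs on its own, and `lambda`, `n`, `nsim`, `mns_check`, and `z` are names introduced here. The averages are standardized with the theoretical mean and standard deviation, so points near the line y = x indicate agreement with the standard normal distribution.

```r
# Illustrative normality check (not from the original analysis).
set.seed(21)
lambda <- 0.2      # rate of the exponential distribution
n      <- 40       # exponentials per average
nsim   <- 1000     # number of simulated averages

# Simulate nsim averages of n exponentials
mns_check <- replicate(nsim, mean(rexp(n, lambda)))

# Standardize with the theoretical mean 1/lambda and sd (1/lambda)/sqrt(n)
z <- (mns_check - 1 / lambda) / ((1 / lambda) / sqrt(n))

# Q-Q plot against the standard normal, with the reference line y = x
qqnorm(z, main = "Normal Q-Q plot of standardized averages")
abline(0, 1, lty = 2, col = "blue")
```

Because averages of exponentials are still slightly right-skewed at n = 40, some departure from the line in the tails would not be surprising.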