Title: Comparing Sample Mean & Variance and Theoretical Mean & Variance

Author: Jun Yitt

Overview

This report compares the sample mean and variance of averages of 40 exponential random variables (lambda = 0.2) with their theoretical values.

A histogram is plotted to compare the distribution of the sample mean with the normal distribution.

Simulations

#First set the seed for reproducibility.
set.seed(21)

#Now, simulate 1000 averages of 40 random exponentials with lambda = 0.2 and assign these 1000 values to mns.
mns <- numeric(1000)
for (i in 1:1000) {
    mns[i] <- mean(rexp(40, 0.2))
}
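
The loop above can also be written without explicit iteration. The following is a minimal sketch, not the method used in this report: it draws all 40,000 exponentials in one call and averages each row of a 1000 x 40 matrix. The names n_sims, n_obs, draws and mns_vec are hypothetical. Because the draws are consumed in the same order as in the loop, it should reproduce the same averages up to floating-point rounding.

```r
# Hypothetical vectorized alternative to the loop (illustrative names only).
set.seed(21)
n_sims <- 1000  # number of averages to simulate
n_obs  <- 40    # exponentials per average

# Fill the matrix row by row so that row i holds the i-th batch of 40 draws.
draws <- matrix(rexp(n_sims * n_obs, rate = 0.2), nrow = n_sims, byrow = TRUE)

# One average per row: 1000 averages of 40 exponentials.
mns_vec <- rowMeans(draws)
```

Filling with byrow = TRUE is the design choice that keeps the batches aligned with the loop version; the default column-wise fill would group the draws differently.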

Sample Mean versus Theoretical Mean

#Theoretical mean = 1/lambda
tmean <- 1/0.2  
#Sample mean = mean of the 1000 averages of 40 random exponentials
smean <- mean(mns)

#Plot the histogram of the 1000 averages of 40 random exponentials
hist(mns)

#The theoretical mean is highlighted with a blue line
abline(v = tmean, lwd = 3, lty = 3, col = "blue")
#The sample mean is highlighted with a red line
abline(v = smean, lwd = 3, lty = 4, col = "red")

In the histogram, the sample mean (red line) lies almost on top of the theoretical mean (blue line), which is the population mean it estimates. Across the 1000 simulated averages, the sample mean, 4.9814404, closely approximates the theoretical mean, 5.
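
To put the gap between the two means in context, it can be measured in standard errors. The sketch below is an illustration under stated assumptions rather than part of the original analysis: it reruns the simulation with the same seed and uses hypothetical names se and z. The standard error of a mean of 1000 averages is sqrt(0.625/1000) = 0.025, so the reported gap of 5 - 4.9814404 = 0.0185596 is less than one standard error.

```r
# Rerun the simulation so this block is self-contained.
set.seed(21)
mns <- replicate(1000, mean(rexp(40, 0.2)))

tmean <- 1 / 0.2     # theoretical mean
smean <- mean(mns)   # sample mean of the 1000 averages

# Theoretical standard error of the mean of 1000 averages:
# variance of one average is (1/0.2^2)/40, divided again by 1000.
se <- sqrt((1 / 0.2^2) / 40 / 1000)

# Distance between sample and theoretical mean, in standard errors.
z <- (smean - tmean) / se
print(c(smean = smean, se = se, z = z))
```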

Sample Variance versus Theoretical Variance

#Theoretical variance of the average of n exponentials = (1/lambda^2)/n
tvar <- (1/0.2^2)/40
#Sample variance = variance of the 1000 averages of 40 random exponentials
svar <- var(mns)

The sample variance, 0.5904182, approximates the theoretical variance, 0.625; it is about 5.5% below the theoretical value.
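
The theoretical value used here can be derived in a few steps. This is a sketch that assumes the 40 exponentials in each average are independent and that each has variance 1/lambda^2; the symbols X_i, \bar{X}_n and \sigma^2 are notation introduced for this derivation only.

```latex
\begin{align*}
\operatorname{Var}(\bar{X}_n)
  &= \operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
   = \frac{1}{n^2}\sum_{i=1}^{n}\operatorname{Var}(X_i)
   = \frac{\sigma^2}{n}
   = \frac{1}{\lambda^2 n} \\
  &= \frac{1}{0.2^2 \times 40} = \frac{25}{40} = 0.625
\end{align*}
```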

Distribution of Sample Mean

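#Plot the histogram of the 1000 averages of 40 random exponentials on a density scale
hist(mns, breaks = 40, xlab = "Averages of 40 random exponentials", main = "Normal curve over the Histogram", freq = FALSE)
#Grid of 100 points spanning the range of the simulated averages
xf <- seq(min(mns), max(mns), length.out = 100)
#Normal density with mean 1/lambda and standard deviation (1/lambda)/sqrt(n)
yf <- dnorm(xf, mean = 1/0.2, sd = (1/0.2)/sqrt(40))
#Overlay the normal curve as a dashed blue line
lines(xf, yf, lty = 2, col = "blue")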

With a normal curve (mean 5, standard deviation 5/sqrt(40)) overlaid on the histogram, we can see that the distribution of the sample means is approximately normal, as the central limit theorem predicts.
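
The visual comparison can be complemented with a rough numerical check. The sketch below is illustrative only and is not a formal normality test: it reruns the simulation with the same seed, standardizes the averages using the theoretical mean and standard deviation, and compares a few sample quantiles with standard normal quantiles. The names z, probs, comparison and coverage are hypothetical.

```r
# Rerun the simulation so this block is self-contained.
set.seed(21)
mns <- replicate(1000, mean(rexp(40, 0.2)))

# Standardize with the theoretical mean 1/lambda and sd (1/lambda)/sqrt(n).
z <- (mns - 1 / 0.2) / ((1 / 0.2) / sqrt(40))

# Compare selected sample quantiles with standard normal quantiles.
probs <- c(0.025, 0.25, 0.5, 0.75, 0.975)
comparison <- cbind(sample = quantile(z, probs), normal = qnorm(probs))
print(round(comparison, 3))

# Share of standardized averages within +/- 1.96; about 0.95 if normal.
coverage <- mean(abs(z) <= 1.96)
print(coverage)
```

If the two quantile columns are close and the coverage is near 0.95, that supports the reading of the histogram; large gaps in the tails would point to the residual skewness of the exponential distribution.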