Library used:

library(ggplot2)

Simulate data using rexp(n, lambda), where lambda is the rate parameter and n is the sample size; both the mean and the standard deviation of the exponential distribution are 1/lambda. In this project, we plot the distribution of the mean of 40 exponentials with lambda = 0.2.
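As a quick sanity check of the claim that the mean and standard deviation are both 1/lambda, the following minimal sketch draws a large exponential sample and compares its summary statistics to 5. The seed and the sample size of 100000 are assumptions chosen only for illustration; they are not part of the original analysis.

```r
# Sketch: check empirically that rexp() draws have mean and sd near 1/lambda.
set.seed(1)                      # assumed seed, only for reproducibility
lambda <- 0.2
draws <- rexp(100000, rate = lambda)   # assumed (large) sample size

# Empirical mean and sd next to the theoretical value 1/lambda = 5
check <- c(mean = mean(draws), sd = sd(draws), theoretical = 1 / lambda)
print(round(check, 3))
```

Both empirical values should land close to 5, with small differences due to sampling noise.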

Given the following:

lambda <- 0.2; n <- 40; simulation <- 1000; mu <- 1/lambda; sigma <- 1/lambda

Next, simulate the sampling distribution of the mean using rexp, and calculate the mean and standard deviation of the simulated sample means.

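# Each simulation draws n exponentials and keeps their mean
sample_dist <- sapply(1:simulation, FUN=function(x) { mean(rexp(n, lambda))})
# Mean of the simulated sample means (compare with mu)
sd_mean <- mean(sample_dist)
# Standard deviation of the simulated sample means (compare with sigma / sqrt(n))
sd_stddev <- sd(sample_dist)
# Data frame for plotting with ggplot2
df <- data.frame(mns=sample_dist)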

Now answer the following four questions:

  1. Show where the distribution is centered and compare it to the theoretical center of the distribution.

[Plot: distribution of the sample means with the theoretical mean and the sample mean marked]

Here the theoretical mean is shown in red, and the mean of the sample distribution is in blue. They are nearly identical.
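The code for this plot is not shown in the report. A minimal sketch of how such a figure could be produced with ggplot2 follows; the histogram geometry, the bin count, the dashed line style, the seed, and the name p_center are all assumptions, and only the red and blue colours come from the description above.

```r
library(ggplot2)

set.seed(1)  # assumed seed, only for reproducibility
lambda <- 0.2; n <- 40; simulation <- 1000
mu <- 1 / lambda
sample_dist <- sapply(1:simulation, function(x) mean(rexp(n, lambda)))
df <- data.frame(mns = sample_dist)

# Histogram of the simulated means with the two centers marked
p_center <- ggplot(df, aes(x = mns)) +
  geom_histogram(bins = 30, fill = "grey80", colour = "black") +
  geom_vline(xintercept = mu, colour = "red") +                     # theoretical mean
  geom_vline(xintercept = mean(sample_dist), colour = "blue",
             linetype = "dashed") +                                 # simulated mean
  labs(x = "Mean of 40 exponentials", y = "Count")
print(p_center)
```

The blue line is dashed here so that it stays visible when the two lines nearly overlap.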

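  2. Show how variable it is and compare it to the theoretical variance of the distribution. The theoretical variance of the sample mean is sigma^2 / n = 25 / 40 = 0.625, so its standard deviation (the standard error) is sigma / sqrt(n), about 0.791. Dividing both the theoretical and the simulated standard error by sqrt(n) gives sigma / n = 0.125 and sd_stddev / sqrt(n) = 0.1219, which corresponds to sd_stddev of about 0.771 and a simulated variance of about 0.594. They are very close.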
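The variance comparison can be computed directly, as in the hedged sketch below. It reruns the simulation with an assumed seed, so its simulated values will differ slightly from the figures quoted in the text; the names theoretical_var, sample_var, theoretical_se and sample_se are introduced here for illustration.

```r
set.seed(1)  # assumed seed; results will not match the report exactly
lambda <- 0.2; n <- 40; simulation <- 1000
sigma <- 1 / lambda
sample_dist <- sapply(1:simulation, function(x) mean(rexp(n, lambda)))

theoretical_var <- sigma^2 / n        # variance of the mean of n exponentials
sample_var      <- var(sample_dist)   # variance of the simulated means
theoretical_se  <- sigma / sqrt(n)    # standard error of the mean
sample_se       <- sd(sample_dist)    # sd of the simulated means

print(round(c(theoretical_var = theoretical_var, sample_var = sample_var,
              theoretical_se = theoretical_se, sample_se = sample_se), 3))
```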

  3. Show that the distribution is approximately normal.

[Plot: distribution of the sample means with a normal distribution line]

The distribution of the sample means closely follows the normal distribution line.
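The plotting code is not included in the report. One way to draw such a comparison is sketched below: a density-scaled histogram of the simulated means with the normal density N(mu, sigma / sqrt(n)) overlaid. The geometry, bin count, seed, and the name p_normal are assumptions, and after_stat() requires ggplot2 3.3.0 or later.

```r
library(ggplot2)

set.seed(1)  # assumed seed, only for reproducibility
lambda <- 0.2; n <- 40; simulation <- 1000
mu <- 1 / lambda; sigma <- 1 / lambda
sample_dist <- sapply(1:simulation, function(x) mean(rexp(n, lambda)))
df <- data.frame(mns = sample_dist)

# Histogram on the density scale so it is comparable with dnorm()
p_normal <- ggplot(df, aes(x = mns)) +
  geom_histogram(aes(y = after_stat(density)), bins = 30,
                 fill = "grey80", colour = "black") +
  stat_function(fun = dnorm,
                args = list(mean = mu, sd = sigma / sqrt(n)),
                colour = "red") +        # normal density predicted by the CLT
  labs(x = "Mean of 40 exponentials", y = "Density")
print(p_normal)
```

A normal Q-Q plot, for example qqnorm(sample_dist) followed by qqline(sample_dist), is an alternative check that makes departures in the tails easier to see.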

  4. Evaluate the coverage of the confidence interval for 1/lambda: mu ± 1.96 (sample standard deviation) / sqrt(n). Using the simulated distribution, the interval comes out to be [4.761, 5.239], that is, 5 ± 1.96 * 0.1219. Because 1.96 is the 97.5th percentile of the standard normal distribution, this is a symmetric 95% normal interval rather than a t interval. The goal is to estimate the theoretical mean 1/lambda = 5, and the interval contains it.
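Coverage is usually evaluated per sample rather than from a single interval: build the interval from each simulated sample and count how often it contains 1/lambda. The sketch below does this under the assumption that each interval is the sample mean ± 1.96 * s / sqrt(n), with s the standard deviation of that sample; the seed and the names covered and coverage are introduced here for illustration.

```r
set.seed(1)  # assumed seed, only for reproducibility
lambda <- 0.2; n <- 40; simulation <- 1000
mu <- 1 / lambda

# For each simulated sample, check whether its interval contains mu
covered <- sapply(1:simulation, function(i) {
  x <- rexp(n, lambda)
  half_width <- 1.96 * sd(x) / sqrt(n)
  (mean(x) - half_width <= mu) && (mu <= mean(x) + half_width)
})

coverage <- mean(covered)   # fraction of intervals that contain mu
print(coverage)
```

With skewed data such as exponentials and n = 40, the observed coverage may come out somewhat below the nominal 95%.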