Part 1: Simulation Exercise Instructions

1. Show the sample mean and compare it to the theoretical mean of the distribution.

Consider \(n\) independent random variables \(X_1, \ldots, X_n\), each following an Exponential distribution with rate parameter \(\lambda\): \[ X_{i} \sim \text{Exponential}(x \mid \lambda), \quad i = 1, \ldots, n. \] The corresponding mean \(\mu\) and variance \(\sigma^2\) are \[ \mathbb{E}\left[ X_i \right] = \mu = \frac{1}{\lambda} ; \quad \mathbb{V}\left[ X_i \right] = \sigma^2 = \frac{1}{\lambda^2}. \] By the Central Limit Theorem, the sample average of the \(n\) exponentially distributed random variables, \[ \bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i, \]

approximately follows a Normal distribution with mean \(\mu\) and variance \(\sigma^2/n\) when \(n\) is large:

\[ \bar{X} \sim \text{Normal}\left(x \mid \mu, \frac{\sigma^2}{n}\right) \] For the case where the rate parameter of the Exponential distribution is set to \(\lambda = 0.2\) and the number of samples per average is set to \(n = 40\), the theoretical mean and the theoretical standard deviation of the Normal distribution governing the average random variable \(\bar{X}\) are \[ \mu = \frac{1}{\lambda} = \frac{1}{0.2} = 5; \quad \text{sd} = \frac{\sigma}{\sqrt{n}} = \frac{5}{\sqrt{40}} \approx 0.79. \]
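
The arithmetic above can be checked with a few lines of R. This is a minimal sketch: `nSamples` is the name used in the text, while `lambda`, `mu`, `sigma`, `meanOfMean`, `sdOfMean` and `varOfMean` are names assumed here for illustration.

```r
# Parameters taken from the text
lambda   <- 0.2   # rate of the Exponential distribution
nSamples <- 40    # number of exponentials per average

# Theoretical moments of a single exponential variable
mu    <- 1 / lambda   # mean
sigma <- 1 / lambda   # standard deviation (equal to the mean for an Exponential)

# Theoretical moments of the sample average
meanOfMean <- mu                      # 5
sdOfMean   <- sigma / sqrt(nSamples)  # approximately 0.79
varOfMean  <- sigma^2 / nSamples      # 0.625

round(c(mean = meanOfMean, sd = sdOfMean, var = varOfMean), 3)
```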

In each of 1000 simulations (nSimulations), 40 exponential samples (nSamples) are drawn and their average is computed. The histogram of the 1000 simulated averages is presented together with the theoretical Normal distribution \(\text{Normal}\left( x \mid \mu = 5, \text{sd} = 0.79 \right)\), and the theoretical mean (\(\mu = 5\), black solid line) is compared with the sample mean of the 1000 simulated averages (\(\bar{x}\), black dotted line).
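
One way the simulation and the figure described above might be produced is sketched below. The seed, the number of histogram bins and the use of base graphics (chosen to keep the sketch free of package dependencies) are assumptions, not details taken from the original analysis.

```r
set.seed(1)            # assumed seed, for reproducibility only
lambda       <- 0.2
nSamples     <- 40     # exponentials per average
nSimulations <- 1000   # number of averages

# Each simulation draws nSamples exponentials and keeps their average
expMeanSamples1 <- replicate(nSimulations, mean(rexp(nSamples, rate = lambda)))

mu       <- 1 / lambda
sdOfMean <- (1 / lambda) / sqrt(nSamples)

# Histogram on the density scale, with the theoretical Normal curve on top
hist(expMeanSamples1, breaks = 30, freq = FALSE,
     main = "Averages of 40 exponentials", xlab = "simulated average")
curve(dnorm(x, mean = mu, sd = sdOfMean), add = TRUE)

abline(v = mu, lty = "solid")                      # theoretical mean
abline(v = mean(expMeanSamples1), lty = "dotted")  # sample mean of the averages
```

With 1000 averages, the dotted line should fall very close to the solid one; the exact value depends on the seed.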

2. Show how variable the sample is (via variance) and compare it to the theoretical variance of the distribution.

This time the histogram of the 1000 simulated averages is presented together with the theoretical Normal distribution \(\text{Normal}\left( x \mid \mu = 5, \text{sd} = 0.79 \right)\), and the theoretical mean plus/minus the theoretical standard deviation (\(\mu \pm \sigma/\sqrt{n}\), black solid lines) is compared with the sample mean plus/minus the sample standard deviation, both computed from the 1000 simulated averages (\(\bar{x} \pm s\), black dotted lines). The corresponding theoretical variance is \(\sigma^2/n = 25/40 = 0.625\), to be compared with the sample variance \(s^2\).
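
The comparison of spread can also be made numerically. The sketch below re-simulates the averages so that it runs on its own; the seed and the names `theoreticalVar`, `sampleVar` and `sampleSd` are assumptions made for illustration.

```r
set.seed(1)            # assumed seed, for reproducibility only
lambda       <- 0.2
nSamples     <- 40
nSimulations <- 1000

expMeanSamples1 <- replicate(nSimulations, mean(rexp(nSamples, rate = lambda)))

# Theoretical variance of the average: sigma^2 / n
theoreticalVar <- (1 / lambda)^2 / nSamples

# Empirical variance and standard deviation of the simulated averages
sampleVar <- var(expMeanSamples1)
sampleSd  <- sd(expMeanSamples1)

data.frame(
  quantity    = c("variance", "standard deviation"),
  theoretical = c(theoreticalVar, sqrt(theoreticalVar)),
  simulated   = c(sampleVar, sampleSd)
)
```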

3. Show that the distribution is approximately normal.

The histogram and its corresponding estimated density are compared with the density of the (theoretical) Normal distribution, \(\text{Normal}(x \mid \mu, \sigma^2/n)\), when the sample size (the number of Exponential variables averaged) is \(40\) and the number of simulations is 100, 1000 and 10000. As the number of simulations increases, the estimated density becomes smoother and approaches the density of the sampling distribution of \(\bar{X}\), which for \(n = 40\) is already close to the limiting Normal distribution.
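
A rough way to reproduce this comparison is sketched below: for each number of simulations, a kernel density estimate of the averages is drawn against the Normal density, and the largest vertical gap between the two is reported. The helper `simulateMeans`, the evaluation grid and the seed are assumptions of this sketch. The gap is not expected to shrink to zero, because the average of 40 exponentials is only approximately Normal.

```r
set.seed(1)   # assumed seed, for reproducibility only
lambda   <- 0.2
nSamples <- 40
mu       <- 1 / lambda
sdOfMean <- (1 / lambda) / sqrt(nSamples)

# Hypothetical helper: nSimulations averages of nSamples exponentials
simulateMeans <- function(nSimulations, nSamples, lambda) {
  replicate(nSimulations, mean(rexp(nSamples, rate = lambda)))
}

simulationSizes <- c(100, 1000, 10000)
xGrid <- seq(mu - 4 * sdOfMean, mu + 4 * sdOfMean, length.out = 200)

meansList <- lapply(simulationSizes, simulateMeans,
                    nSamples = nSamples, lambda = lambda)

# Kernel density estimates evaluated on the same grid as the Normal density
estimates <- lapply(meansList, density,
                    from = min(xGrid), to = max(xGrid), n = length(xGrid))

# Largest vertical distance between each estimate and the Normal density
maxGap <- sapply(estimates, function(est) {
  max(abs(est$y - dnorm(est$x, mean = mu, sd = sdOfMean)))
})
names(maxGap) <- simulationSizes
maxGap

# Overlay: solid line is the Normal density, other line types are the estimates
plot(xGrid, dnorm(xGrid, mean = mu, sd = sdOfMean), type = "l", lwd = 2,
     xlab = "average of 40 exponentials", ylab = "density")
for (i in seq_along(estimates)) lines(estimates[[i]], lty = i + 1)
legend("topright", lty = 1:4,
       legend = c("Normal", paste(simulationSizes, "simulations")))
```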

The same comparison is repeated with the sample size set to \(100\). Note that in this case, compared with the previous one and for an equal number of simulations, the Normal approximation is better, because more samples are averaged in each simulation.
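
The effect of the sample size can be quantified in several ways. One option, which is not part of the original analysis, is the skewness of the standardised averages: it is 0 for an exact Normal distribution, and for an average of \(n\) exponentials its theoretical value is \(2/\sqrt{n}\), about 0.32 for \(n = 40\) and 0.2 for \(n = 100\). The sketch below estimates it by simulation; the helper name, the seed and the large number of simulations are assumptions.

```r
set.seed(1)             # assumed seed, for reproducibility only
lambda       <- 0.2
nSimulations <- 50000   # assumed; large so that the comparison is stable

# Third moment of the averages after standardising with the theoretical
# mean (1 / lambda) and standard deviation ((1 / lambda) / sqrt(nSamples))
standardisedSkewness <- function(nSamples) {
  draws <- matrix(rexp(nSimulations * nSamples, rate = lambda),
                  nrow = nSimulations)
  z <- (rowMeans(draws) - 1 / lambda) / ((1 / lambda) / sqrt(nSamples))
  mean(z^3)
}

skew40  <- standardisedSkewness(40)
skew100 <- standardisedSkewness(100)

round(c(n40 = skew40, n100 = skew100), 3)
```

A smaller value for the larger sample size indicates that the distribution of the average is more symmetric, and so closer to the Normal.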

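library(ggplot2)
# expMeanSamples1 is assumed to be the numeric vector of simulated averages
qqData <- data.frame(meanValue = expMeanSamples1)
# Normal Q-Q plot: points close to the reference line suggest approximate normality
ggplot(qqData, aes(sample = meanValue)) + stat_qq() + stat_qq_line()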