Overview

One of the great advantages of R is how easily it can simulate samples from various probability distributions and statistical models, a task that is otherwise computationally intensive. In this report we analyse the exponential distribution, which is simulated in R with rexp(n, lambda), where lambda (\(\lambda\)) is the rate parameter. The mean and the standard deviation of the exponential distribution are both \(1/\lambda\). We investigate the distribution of averages of 40 exponentials with \(\lambda = 0.2\).
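As a quick sanity check of these properties, the sketch below draws one large sample with rexp and compares its mean and standard deviation with \(1/\lambda\). The seed and the 100,000 draws are arbitrary choices made for this illustration and are not part of the analysis that follows.

```r
# Illustrative check: the mean and sd of Exp(lambda) should both be near 1/lambda.
set.seed(1)                      # arbitrary seed, only for reproducibility
lambda <- 0.2                    # rate parameter used throughout the report
draws <- rexp(100000, lambda)    # 100,000 draws is an assumption for this check

check <- c(mean = mean(draws), sd = sd(draws), theoretical = 1 / lambda)
print(check)
```

With a sample this large, both estimates should land close to the theoretical value of 5.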

In this project we illustrate the distribution of the mean of 40 exponentials by addressing the following points:
  1. Show the sample mean and compare it to the theoretical mean of the distribution.
  2. Show how variable the sample is (via variance) and compare it to the theoretical variance of the distribution.
  3. Show that the distribution is approximately normal.
   

Simulations

First we compute the means of 1000 simulations, each containing 40 observations, and compare them against the theoretical mean.

1. Sample Mean vs Theoretical Mean

Generate the sample data first using lambda = 0.2, n = 40, simulations = 1000.

 set.seed(284)
 lambda <- 0.2; n <- 40; numsim <- 1000   # rate, sample size, number of simulations
 means <- data.frame(x = sapply(1:numsim, function(x) {mean(rexp(n, lambda))}))

Now calculate the mean of the 1000 simulated sample means and the theoretical mean of the exponential distribution:

        sample_mean <- mean(means$x)
        theor_mean <- 1/lambda

The mean of the 1000 sample means is 4.969464, which is very close to the theoretical mean of 5.

Histogram of the 1000 sample means (each the mean of 40 exponentials)

The histogram shows that the sample mean and the theoretical mean are very close to each other.
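The plotting code for this histogram is not shown, so the following is only a sketch of how such a figure could be drawn. It assumes base R graphics; the bin count, colours, and labels are illustrative choices.

```r
# Sketch: histogram of the simulated means with both means marked.
set.seed(284)
lambda <- 0.2; n <- 40; numsim <- 1000
means <- data.frame(x = sapply(1:numsim, function(i) mean(rexp(n, lambda))))

sample_mean <- mean(means$x)   # mean of the 1000 simulated means
theor_mean <- 1 / lambda       # theoretical mean of Exp(lambda)

# breaks = 30 is a suggestion to hist(), not an exact bin count
h <- hist(means$x, breaks = 30, col = "grey85",
          main = "Means of 40 exponentials (1000 simulations)",
          xlab = "Sample mean")
abline(v = sample_mean, col = "blue", lwd = 2)          # simulated mean
abline(v = theor_mean, col = "red", lwd = 2, lty = 2)   # theoretical mean
legend("topright", legend = c("Sample mean", "Theoretical mean"),
       col = c("blue", "red"), lty = c(1, 2), lwd = 2)
```

The two vertical lines should nearly coincide, which is the visual counterpart of the numerical comparison above.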

2. Sample Variance vs Theoretical Population Variance

Next we compare the variance of the sample means from the 1000 simulations with the theoretical variance. The population variance is estimated by the variance of the 1000 entries in the means vector times the sample size, 40. That is, \(\sigma^2 \approx \mathrm{Var}(\text{sample means}) \times n\).

As with the mean, the variance of the sample means, 0.5905787, is close to its theoretical value, \(\sigma^2 / n = 1/(\lambda^2 n) = 1/(0.04 \times 40) = 0.625\).
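The code for the variance comparison is not shown, so here is a minimal sketch of how it might be computed. The names sample_var, theor_var, and pop_var_est are introduced for this illustration only.

```r
# Sketch: variance of the simulated means vs. the theoretical sigma^2 / n.
set.seed(284)
lambda <- 0.2; n <- 40; numsim <- 1000
means <- data.frame(x = sapply(1:numsim, function(i) mean(rexp(n, lambda))))

sample_var <- var(means$x)          # variance of the 1000 sample means
theor_var <- (1 / lambda)^2 / n     # sigma^2 / n = 25 / 40 = 0.625
pop_var_est <- sample_var * n       # rescaled estimate of sigma^2 = 25

print(c(sample_var = sample_var, theor_var = theor_var,
        pop_var_est = pop_var_est))
```

Multiplying the variance of the means by n gives an estimate of the population variance \(\sigma^2 = 1/\lambda^2 = 25\), which is the rescaling described above.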

3. Show that the distribution is approximately normal

The CLT states that as the sample size increases, the distribution of averages of iid variables (properly normalized) becomes that of a standard normal. We therefore compare the distribution of a large collection of random exponentials with the distribution of a large collection of averages of 40 exponentials by plotting them.
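To make "properly normalized" concrete, the normalization can be written out for this setting. This is a worked illustration using the values already given (\(\mu = \sigma = 1/\lambda = 5\), \(n = 40\)); \(\bar{X}_n\) denotes the mean of \(n\) exponentials.

\[
\frac{\bar{X}_n - \mu}{\sigma / \sqrt{n}} \;\approx\; N(0, 1),
\qquad
\frac{\sigma}{\sqrt{n}} = \frac{5}{\sqrt{40}} \approx 0.79,
\qquad
\left(\frac{\sigma}{\sqrt{n}}\right)^2 = \frac{25}{40} = 0.625 .
\]

Equivalently, the averages themselves are approximately normal with mean 5 and variance 0.625.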

First we generate sample data containing 10,000 simulations, each of sample size 40, using lambda = 0.2 as below. We then compare the distributions by plotting them as histograms, as shown in the figure below.

      set.seed(284)
      bignumsim <- 10000
      bigmeans <- data.frame(x = sapply(1:bignumsim, function(x) {mean(rexp(n, lambda))}))

The fitted curve shows that the distribution of the means of randomly sampled exponentials is approximately normal: it overlaps very closely with the normal density for \(\lambda = 0.2\).
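The plotting code for this comparison is not shown. The sketch below is one possible way to draw it, assuming base R graphics: raw exponentials on the left, means of 40 exponentials on the right with the \(N(\mu, \sigma^2/n)\) density overlaid. The vector raw and the coverage check at the end are additions for illustration.

```r
# Sketch: raw exponentials vs. means of 40 exponentials, with a normal overlay.
set.seed(284)
lambda <- 0.2; n <- 40; bignumsim <- 10000
bigmeans <- data.frame(x = sapply(1:bignumsim, function(i) mean(rexp(n, lambda))))
raw <- rexp(bignumsim, lambda)     # raw exponentials, for contrast (assumption)

mu <- 1 / lambda                   # theoretical mean of the sample mean
se <- (1 / lambda) / sqrt(n)       # theoretical sd of the sample mean

op <- par(mfrow = c(1, 2))
hist(raw, breaks = 50, freq = FALSE,
     main = "10,000 exponentials", xlab = "x")
hist(bigmeans$x, breaks = 50, freq = FALSE,
     main = "10,000 means of 40 exponentials", xlab = "Sample mean")
curve(dnorm(x, mean = mu, sd = se), add = TRUE, col = "red", lwd = 2)
par(op)

# Numerical check: share of means within 1.96 standard errors of mu.
# For a normal distribution this share is about 0.95.
coverage <- mean(abs(bigmeans$x - mu) <= 1.96 * se)
print(coverage)
```

The left panel should look strongly right-skewed while the right panel should be close to symmetric around 5, which is the contrast the CLT predicts.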

Conclusion

For \(\lambda = 0.2\), the distribution of the sample means is approximately normal, and its mean and variance closely match the theoretical mean \(\mu = \frac{1}{\lambda}\) and the theoretical variance \(\sigma^2 / n\). We also notice that as we increase the number of simulations, the histogram of the means follows the bell curve of a normal distribution more closely, and we conclude that the distribution is approximately normal.