Statistical Inference Course Project

Overview

In this project I will investigate the exponential distribution in R and compare it with the Central Limit Theorem. The exponential distribution can be simulated in R with rexp(n, lambda) where lambda is the rate parameter. The mean of exponential distribution is 1/lambda and the standard deviation is also 1/lambda. I will set lambda = 0.2 for all of the simulations. I will investigate the distribution of averages of 40 exponentials. Note that I will need to do a thousand simulations.

Simulations

# load neccesary libraries
library(ggplot2)

# set constants
λ <- 0.2# lambda for rexp
n <- 40 # number of exponetials
numberOfSimulations <- 1000 # number of tests

# set the seed to create reproducability
set.seed(11081979)

# run the test resulting in n x numberOfSimulations matrix
exponentialDistributions <- matrix(data=rexp(n * numberOfSimulations, λ), nrow=numberOfSimulations)
exponentialDistributionMeans <- data.frame(means=apply(exponentialDistributions, 1, mean))

Sample Mean versus Theoretical Mean

The expected mean \(μ\) of a exponential distribution of rate \(λ\) is

\(μ = \frac{1}{λ}\)

μ <- 1/λ
μ
## [1] 5

Let \(\bar X\) be the average sample mean of 1000 simulations of 40 randomly sampled exponential distributions.

meanOfMeans <- mean(exponentialDistributionMeans$means)
meanOfMeans
## [1] 5.027126

As you can see the expected mean and the avarage sample mean are very close

Sample Variance versus Theoretical Variance

The expected standard deviation \(σ\) of a exponential distribution of rate \(λ\) is

\(σ = \frac{1/λ}{\sqrt{n}}\)

The e

σ <- 1/λ/sqrt(n)
σ
## [1] 0.7905694

The variance \(Var\) of standard deviation \(σ\) is

\(Var = σ^2\)

Var <- σ^2
Var
## [1] 0.625

Let \(Var_x\) be the variance of the average sample mean of 1000 simulations of 40 randomly sampled exponential distribution, and \(σ_x\) the corresponding standard deviation.

σ_x <- sd(exponentialDistributionMeans$means)
σ_x
## [1] 0.8020334
Var_x <- var(exponentialDistributionMeans$means)
Var_x
## [1] 0.6432577

As you can see the standard deviations are very close Since variance is the square of the standard deviations, minor differnces will we enhanced, but are still pretty close.

Distribution

Comparing the population means & standard deviation with a normal distribution of the expected values. Added lines for the calculated and expected means

As you can see from the graph, the calculated distribution of means of random sampled exponantial distributions, overlaps quite nice with the normal distribution with the expected values based on the given lamba