Synopsis

This study investigates the exponential distribution in R and compares the behaviour of its sample means with what the Central Limit Theorem predicts. It examines the distribution of averages of 40 exponentials, simulated one thousand times.

Introduction

In probability theory and statistics, the exponential distribution or negative exponential distribution is the probability distribution that describes the time between events in a Poisson process, i.e. a process in which events occur continuously and independently at a constant average rate. The probability density function (pdf) of an exponential distribution is \[ f(x;\lambda) = \begin{cases} \lambda e^{-\lambda x} & x \ge 0, \\ 0 & x < 0. \end{cases}\]

The exponential distribution can be simulated in R with rexp(n, lambda), where n is the number of draws and lambda is the rate parameter. The mean of the exponential distribution is \[1/\lambda\] and the standard deviation is also \[1/\lambda.\]
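As a quick illustration of the two facts above, the following sketch draws a large exponential sample and compares its mean and standard deviation with 1/lambda. The seed, the sample size of 100000 and the names rate and draws are arbitrary choices made for this example only; they are not part of the analysis that follows.

```r
# Illustrative check: for an exponential distribution, both the mean and the
# standard deviation should be close to 1/rate.
set.seed(42)                  # arbitrary seed, used only for reproducibility
rate  <- 0.2                  # same rate the report uses in its simulation
draws <- rexp(100000, rate)   # large sample so the estimates are stable

empirical_mean <- mean(draws)
empirical_sd   <- sd(draws)

cat("1/rate:        ", 1 / rate, "\n")
cat("empirical mean:", empirical_mean, "\n")
cat("empirical sd:  ", empirical_sd, "\n")
```

Both empirical values should land close to 5, up to sampling error.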

Simulation

This section simulates 1000 samples of 40 exponential random variables each, with rate lambda = 0.2. The averages of these samples are analysed in the sections that follow.

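set.seed(1)          # fix the random seed for reproducibility
lambda <- 0.2        # rate parameter of the exponential distribution
num_samples <- 40    # number of exponentials in each sample
num_sim <- 1000      # number of simulated samples
# 40 x 1000 matrix: each column holds one simulated sample of 40 exponentials
simulation <- replicate(num_sim, rexp(num_samples, lambda))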
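The layout that replicate() returns matters for the rest of the analysis: each column is one simulated sample, so column-wise means are sample means. A small-scale sketch of this, using a hypothetical toy matrix of 3 samples of 4 draws rather than the full simulation:

```r
# Small-scale illustration of the matrix layout produced by replicate().
# The names toy and toy_means are used only in this example.
toy <- replicate(3, rexp(4, rate = 0.2))   # 3 simulated samples of 4 draws each

print(dim(toy))              # rows = draws per sample, columns = samples

toy_means <- colMeans(toy)   # one mean per simulated sample
print(toy_means)

# colMeans() gives the same result as apply(toy, 2, mean)
print(all.equal(toy_means, apply(toy, 2, mean)))
```

The report uses apply(simulation, 2, mean) for the same purpose; colMeans(simulation) is an equivalent, shorter alternative.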

Comparing Sample Mean versus Theoretical Mean

This section compares the mean of the simulated sample means with the theoretical mean of 1/lambda.

Sample_Mean<- mean(apply(simulation,2,mean))

The sample mean (the average of the 1000 simulated means) is 4.9900252

Theoretical_Mean <-1/lambda 

The theoretical mean of the exponential distribution is 5

Note: the sample mean (4.9900252) is approximately equal to the theoretical mean (5). The histogram below marks both values on the distribution of the simulated means.

hist(apply(simulation,2,mean),breaks=50,main="Histogram of 1000 means of 40 exponentials (lambda=0.2)",xlab="Sample mean",col=4)
abline(v=Sample_Mean,lwd=9,col="red")          # wide red line: sample mean
abline(v=Theoretical_Mean,lwd=5,col="green")   # narrower green line drawn on top: theoretical mean
legend('topright', c("Sample mean", "Theoretical mean"), lty=1, col=c("red", "green"))

Comparing Sample Variance versus Theoretical Variance

This section compares the variance of the 1000 simulated sample means with the theoretical variance of the mean of 40 exponentials.
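For reference, the theoretical variance of the sample mean can be worked out from the standard deviation \[\sigma = 1/\lambda\] given in the Introduction, assuming the n = 40 draws in each sample are independent and identically distributed:

```latex
\[
\operatorname{Var}(\bar{X})
  = \operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
  = \frac{1}{n^2}\sum_{i=1}^{n}\operatorname{Var}(X_i)
  = \frac{\sigma^2}{n}
  = \frac{1}{\lambda^2 n}
  = \frac{1}{0.2^2 \times 40}
  = 0.625
\]
\[
\operatorname{SD}(\bar{X}) = \frac{\sigma}{\sqrt{n}} = \frac{1}{\lambda\sqrt{n}} = \frac{5}{\sqrt{40}} \approx 0.7906
\]
```

The variance of the mean is therefore 0.625, while 0.7906 is its standard deviation (the standard error); the two should not be confused when comparing against the sample variance.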

Sample_Stdev <- sd(apply(simulation,2,mean))
Sample_variance <- Sample_Stdev^2

The sample variance of the simulated means is 0.6111165

Theoretical_Variance <- (1/lambda)^2/num_samples

The theoretical variance of the mean of 40 exponentials is 0.625

Note: the sample variance (0.6111165) is approximately equal to the theoretical variance (0.625). The corresponding theoretical standard deviation of the mean is (1/lambda)/sqrt(num_samples) = 0.7905694. The histogram below overlays a normal density with the theoretical mean and this standard deviation.

x <- seq(min(apply(simulation,2,mean)), max(apply(simulation,2,mean)), length=1000)
y <- dnorm(x, mean=Theoretical_Mean, sd=sqrt(Theoretical_Variance))
hist(apply(simulation,2,mean),breaks=num_samples,prob=TRUE,main="Histogram of 1000 means of 40 exponentials (lambda=0.2)",xlab="Sample mean",col="blue")
lines(x,y,lwd=5,col="red")

Conclusion

The simulated distribution of means of randomly sampled exponential data closely matches a normal distribution, as shown in the histogram above. This supports the central limit theorem, which states that, given certain conditions, the arithmetic mean of a sufficiently large number of iterates of independent random variables, each with a well-defined expected value and well-defined variance, will be approximately normally distributed, regardless of the underlying distribution [1].
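Beyond the visual comparison, one simple numerical check of approximate normality is to count how many simulated means fall within 1.96 theoretical standard errors of the theoretical mean; under a normal distribution this share is about 95%. The sketch below is an additional illustration, not part of the original analysis. It regenerates the means so that it runs on its own, and the names n, means, mu, se and coverage are chosen for this example only.

```r
# Illustrative normality check: share of simulated means lying within
# 1.96 standard errors of the theoretical mean (about 0.95 if normal).
set.seed(1)                      # same seed as the report; any seed would do
lambda <- 0.2                    # rate parameter
n      <- 40                     # exponentials per sample
means  <- replicate(1000, mean(rexp(n, lambda)))   # 1000 simulated sample means

mu <- 1 / lambda                 # theoretical mean of the sample mean
se <- (1 / lambda) / sqrt(n)     # theoretical standard deviation of the sample mean

coverage <- mean(abs(means - mu) <= 1.96 * se)
print(coverage)
```

A value near 0.95 is consistent with the normal approximation, although this check alone does not establish normality; a normal Q-Q plot via qqnorm(means) is a common complement.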

Reference

[1]: Rice, John (1995), Mathematical Statistics and Data Analysis (Second ed.), Duxbury Press, ISBN 0-534-20934-3