Author : Stephanie Roark

Overview:

The exponential distribution is the probability distribution that describes the time between events in a Poisson process, a process in which events occur continuously and independently at a constant average rate. If X is an absolutely continuous random variable taking values in the non-negative real numbers, then X has an exponential distribution with rate parameter \(λ > 0\) if its probability density function is \(f(x) = λ e^{-λx}\) for \(x ≥ 0\). A random variable having an exponential distribution is also called an exponential random variable. As the number of exponential random variables being averaged, \(n\), increases, the distribution of their mean becomes approximately normal.
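As a quick check of the density formula, the sketch below compares R's built-in `dexp` with \(λ e^{-λx}\) evaluated directly, and confirms numerically that the density integrates to 1. The names `rate_check`, `x_grid`, `manual_density`, `builtin_density`, `densities_match`, and `total_area` are illustrative and not part of the original analysis.

```r
# Illustrative check of the exponential density (all variable names are hypothetical)
rate_check <- 0.2                          # rate parameter, assumed here for illustration
x_grid <- seq(0, 30, by = 0.5)             # non-negative points at which to evaluate f(x)

manual_density <- rate_check * exp(-rate_check * x_grid)   # f(x) = lambda * exp(-lambda * x)
builtin_density <- dexp(x_grid, rate = rate_check)         # R's exponential density

# TRUE if the two agree up to floating-point tolerance
densities_match <- isTRUE(all.equal(manual_density, builtin_density))

# numerical integral of the density over [0, Inf); should be close to 1
total_area <- integrate(dexp, lower = 0, upper = Inf, rate = rate_check)$value

print(densities_match)
print(total_area)
```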

The Simulated Exponential Distribution:

The Exponential Distribution as n gets larger:

Simulation of the Exponential Distribution:

The exponential distribution can be simulated in R with \(rexp(n, λ)\) where \(λ\) is the rate parameter. The mean or expected value of an exponential distribution is \(1/ λ\) and the standard deviation is also \(1/ λ\). The variance of an exponential distribution is given by \(1/ λ^2\).
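The moments stated above can be checked empirically. The following is a minimal sketch, assuming a rate of 0.2 and a large number of draws; the seed and the names `rate_demo`, `draws`, `draw_mean`, `draw_sd`, and `draw_var` are chosen only for illustration.

```r
# Illustrative check that rexp() draws have mean and sd close to 1/lambda
set.seed(1)                      # arbitrary seed, only so the sketch is repeatable
rate_demo <- 0.2                 # hypothetical rate; 1/rate_demo = 5
draws <- rexp(100000, rate = rate_demo)

draw_mean <- mean(draws)         # should be near 1/rate_demo
draw_sd <- sd(draws)             # should also be near 1/rate_demo
draw_var <- var(draws)           # should be near 1/rate_demo^2

print(c(mean = draw_mean, sd = draw_sd, var = draw_var))
```

With this many draws the three values should land close to 5, 5, and 25 respectively, although the exact figures depend on the seed.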

Generating the sample exponential distribution:

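n <- 40          # number of exponentials averaged in each simulation
lambda <- 0.2    # rate parameter chosen for the simulations
simnum <- 10000  # number of simulations

# each row of simdata holds one simulation of n exponentials
simdata <- matrix(rexp(simnum * n, rate = lambda), simnum)
sim_rowmean <- apply(simdata, 1, mean)   # mean of each simulation
simdata_mean <- mean(sim_rowmean)        # mean of the sample means
sim_sd <- sd(sim_rowmean)                # standard deviation of the sample means
sim_var <- var(sim_rowmean)              # variance of the sample means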

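Calculating the theoretical values for the mean of 40 exponentials:

# Theoretical mean, standard deviation, and variance of the mean of n exponentials
Tmean <- 1/lambda                        # mean of the sample means equals 1/lambda
Tsd <- ((1/lambda) * (1/sqrt(n)))        # standard error: (1/lambda)/sqrt(n)
Tvar <- Tsd^2                            # variance of the sample means: 1/(n * lambda^2)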
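
The theoretical values follow from the independence of the \(n\) exponentials. A short derivation, writing \(\bar{X}\) for the mean of \(n\) IID exponentials \(X_1, \dots, X_n\) (this notation is introduced here for illustration), with \(λ = 0.2\) and \(n = 40\):

\[
E[\bar{X}] = \frac{1}{n}\sum_{i=1}^{n} E[X_i] = \frac{1}{λ} = \frac{1}{0.2} = 5
\]

\[
\operatorname{Var}(\bar{X}) = \frac{1}{n^2}\sum_{i=1}^{n} \operatorname{Var}(X_i) = \frac{1}{nλ^2} = \frac{25}{40} = 0.625
\]

\[
\operatorname{SD}(\bar{X}) = \sqrt{\operatorname{Var}(\bar{X})} = \frac{1}{λ\sqrt{n}} = \frac{5}{\sqrt{40}} \approx 0.7906
\]

These correspond to the quantities stored in Tmean, Tvar, and Tsd.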

Inferential Data Analysis:

Statistic             Theoretical Value    Sample Value
Mean                  5                    5.0104725
Variance              0.625                0.6215941
Standard Deviation    0.7905694            0.7884124

Plotting histograms of the simulated exponential values and of the sample means:

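par(mfrow=c(1,2))   # two plots side by side
# left: all simulated exponential values (simnum * n draws)
hist(simdata, col = "blue", main = "Simulated Exp Distribution", xlab = "Simulated Exponential Values")
# right: the simnum averages of 40 exponentials
hist(sim_rowmean, col = "blue", main = "Means of Simulated Exponentials", xlab = "Average of 40 Exponentials")
abline(v = simdata_mean, col = "red", lwd = 2)   # mean of the sample means
abline(v = Tmean, col = "green", lwd = 2)        # theoretical mean, 1/lambda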

Sample Mean versus Theoretical Mean:

The distribution of sample means is centered at 5.0104725, while the theoretical mean of the exponential distribution, \(1/λ\), is 5.

Sample Variance versus Theoretical Variance:

The variance of the sample means, 0.6215941, is approximately equal to the theoretical variance of the mean of 40 exponentials, \(1/(nλ^2) =\) 0.625.

Distribution:

The histograms above compare the distribution of the individual simulated exponentials with the distribution of the averages of 40 exponentials. The individual values are strongly right-skewed, whereas the averages approximate a normal distribution: the area under the density curve is equal to 1.0, with greater density in the center than in the tails. The distribution of the averages approaches the normal as n grows, displaying the characteristic symmetrical, bell-shaped curve centered around the mean.

library(ggplot2)   # needed for ggplot() and the geoms below

simdata_rowmean <- data.frame(sim_rowmean)

ggplot(simdata_rowmean,aes(x=sim_rowmean)) +
    # histogram of the sample means, scaled to a density
    geom_histogram(binwidth = lambda,fill="black",color="blue",aes(y = ..density..)) +
    labs(title="The Probability Density of the Exponential Distribution with Large n", x="Mean of 40 Exponentials", y="Density") +
    geom_vline(xintercept=simdata_mean,size=1.0, color="blue") + # add a line for the sample mean
    # normal density using the sample mean and sample standard deviation
    stat_function(fun=dnorm,args=list(mean=simdata_mean, sd=sim_sd),color = "yellow", size = 1.0) +
    geom_vline(xintercept=Tmean,size=1.0,color="yellow",linetype = "longdash") + # add a line for the theoretical mean
    # normal density using the theoretical mean and standard deviation
    stat_function(fun=dnorm,args=list(mean=Tmean, sd=Tsd),color = "red", size = 1.0)
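
The visual comparison with the normal curve can be complemented by a numeric check. The sketch below is self-contained and regenerates its own averages instead of reusing the objects above; the seed and the names `n_demo`, `rate_demo`, `sims_demo`, `demo_means`, `theory_mean`, `theory_se`, and `coverage` are assumptions made for illustration. It computes the share of averages that fall within 1.96 theoretical standard errors of \(1/λ\), which would be about 0.95 for an exactly normal distribution.

```r
# Illustrative numeric check of approximate normality (names are hypothetical)
set.seed(2)                                   # arbitrary seed for repeatability
n_demo <- 40                                  # exponentials per average
rate_demo <- 0.2                              # rate parameter
sims_demo <- 10000                            # number of simulated averages

# one average of n_demo exponentials per row
demo_means <- rowMeans(matrix(rexp(sims_demo * n_demo, rate = rate_demo), sims_demo))

theory_mean <- 1 / rate_demo                  # 1/lambda
theory_se <- 1 / (rate_demo * sqrt(n_demo))   # 1/(lambda * sqrt(n))

# share of averages within 1.96 standard errors of the theoretical mean;
# for an exactly normal distribution this share would be about 0.95
coverage <- mean(abs(demo_means - theory_mean) < qnorm(0.975) * theory_se)
print(coverage)
```

A normal Q-Q plot, for example `qqnorm(demo_means)` followed by `qqline(demo_means)`, gives a visual version of the same check.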

Summary:

The Central Limit Theorem, or CLT, states that averages of independent and identically distributed (IID) random variables with finite variance, once centered and scaled, converge in distribution to the normal. In practice, the averages are approximately normally distributed when the number of random variables, \(n\), is sufficiently large. The simulation shows this convergence: the averages of 40 exponentials are approximately normal, with mean close to \(1/λ\) and standard deviation close to \(1/(λ\sqrt{n})\).
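
One way to see the convergence is to track how the skewness of the averages shrinks as \(n\) grows, since a normal distribution has skewness 0. The following is a minimal sketch under stated assumptions: the helper `mean_skewness`, the seed, and the chosen values of \(n\) (5, 40, 200) are hypothetical and not part of the original analysis.

```r
# Illustrative sketch: skewness of averaged exponentials for increasing n
# (mean_skewness and the other names are hypothetical helpers)
set.seed(3)                                   # arbitrary seed for repeatability
rate_demo <- 0.2                              # rate parameter
sims_demo <- 10000                            # simulated averages per value of n

mean_skewness <- function(n) {
    # simulate sims_demo averages of n exponentials each
    avgs <- rowMeans(matrix(rexp(sims_demo * n, rate = rate_demo), sims_demo))
    # sample skewness: third central moment divided by the cubed standard deviation
    mean((avgs - mean(avgs))^3) / sd(avgs)^3
}

n_values <- c(5, 40, 200)
skews <- sapply(n_values, mean_skewness)

# a normal distribution has skewness 0; the theoretical skewness of the
# average of n exponentials is 2/sqrt(n)
print(data.frame(n = n_values, simulated = skews, theoretical = 2 / sqrt(n_values)))
```

The simulated skewness should decrease toward 0 as \(n\) increases, in line with the \(2/\sqrt{n}\) rate for averages of exponentials.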