Project Description

In this project you will investigate the exponential distribution in R and compare it with the Central Limit Theorem. The exponential distribution can be simulated in R with rexp(n, lambda), where lambda is the rate parameter. The mean of the exponential distribution is 1/lambda and the standard deviation is also 1/lambda. Set lambda = 0.2 for all of the simulations. You will investigate the distribution of averages of 40 exponentials. Note that you will need to do a thousand simulations.
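As a quick sanity check on the stated mean and standard deviation, the following minimal sketch draws one large exponential sample and compares its summary statistics with 1/lambda. The seed and the sample size of 100000 are illustrative assumptions and are not part of the assignment.

```r
set.seed(1)                  # assumed seed, used only to make the check repeatable
lambda <- 0.2
x <- rexp(100000, lambda)    # sample size chosen for illustration
sample_mean <- mean(x)       # expected to be close to 1/lambda = 5
sample_sd <- sd(x)           # expected to be close to 1/lambda = 5 as well
c(mean = sample_mean, sd = sample_sd)
```

Both values should land near 5, which is what makes the exponential a convenient test case: its mean and standard deviation coincide.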

Setting the parameters

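n <- 40             # number of exponentials averaged in each simulation
lambda <- 0.2       # rate parameter of the exponential distribution
simulation <- 1000  # number of simulations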

Calculation of theoretical values

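mean_t <- 1/lambda                      # theoretical mean of the sample mean
sd_t <- ((1/lambda) * (1/sqrt(n)))      # theoretical standard deviation of the sample mean
var_t <- sd_t^2                         # theoretical variance of the sample mean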
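The values above follow from the standard result for the mean of n independent draws. Written out with the parameters used here (lambda = 0.2, n = 40), the arithmetic is:

```latex
\begin{align*}
E[\bar{X}] &= \frac{1}{\lambda} = \frac{1}{0.2} = 5 \\
\mathrm{SD}(\bar{X}) &= \frac{1/\lambda}{\sqrt{n}} = \frac{5}{\sqrt{40}} \approx 0.7906 \\
\mathrm{Var}(\bar{X}) &= \frac{1}{\lambda^{2}\, n} = \frac{25}{40} = 0.625
\end{align*}
```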

Calculation of actual values

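# One simulation per row: 1000 rows of 40 exponential draws each
data <- matrix(rexp(n*simulation, lambda), simulation)
row_means <- rowMeans(data)    # mean of each simulation (same result as apply(data, 1, mean))
mean_act <- mean(row_means)    # center of the simulated sample means
sd_act <- sd(row_means)        # standard deviation of the simulated sample means
var_act <- var(row_means)      # variance of the simulated sample means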

Questions

1. Show where the distribution is centered and compare it to the theoretical center of the distribution.

The simulated distribution of sample means is centered at 5.0450596, close to the theoretical center of 1/lambda = 5.

2. Show how variable the sample is (via variance) and compare it to the theoretical variance of the distribution.

Variable            Actual Value  Theoretical Value
Mean                5.0450596     5
Standard Deviation  0.80881       0.7905694
Variance            0.6541736     0.625
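A comparison like the one above can also be assembled directly in R. The sketch below is one way to do it, not the report's original code: the seed, the name sim_means, and the data frame name comparison are assumptions introduced for illustration, and because the original run did not record a seed, the simulated values will differ slightly from those in the table.

```r
set.seed(1)   # assumed seed; the report's own run did not record one
n <- 40
lambda <- 0.2
simulation <- 1000

# 1000 simulated means, each the average of 40 exponential draws
sim_means <- rowMeans(matrix(rexp(n * simulation, lambda), simulation))

# Simulated ("Actual") values next to the theoretical ones
comparison <- data.frame(
  Variable    = c("Mean", "Standard Deviation", "Variance"),
  Actual      = c(mean(sim_means), sd(sim_means), var(sim_means)),
  Theoretical = c(1 / lambda, (1 / lambda) / sqrt(n), 1 / (lambda^2 * n))
)
print(comparison)
```

Building the table from the computed objects, rather than typing the numbers in, keeps the columns from drifting out of sync with the code.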

3. Show that the distribution is approximately normal.

library(ggplot2)  # required for ggplot(); after_stat() and linewidth need ggplot2 >= 3.4.0
dfrow_means <- data.frame(row_means)
# Map x to the row_means column, not to the data frame itself
pp <- ggplot(dfrow_means, aes(x = row_means))
pp <- pp + geom_histogram(binwidth = 0.2, fill = "pink", color = "black", aes(y = after_stat(density)))
pp <- pp + labs(title = "Density of Means of 40 Exponentials", x = "Mean of 40 Selections", y = "Density")
pp <- pp + geom_vline(xintercept = mean_act, linewidth = 1.0, color = "black") # actual mean line
pp <- pp + stat_function(fun = dnorm, args = list(mean = mean_act, sd = sd_act), color = "blue", linewidth = 1.0) # normal curve from actual mean and sd
pp <- pp + geom_vline(xintercept = mean_t, linewidth = 1.0, color = "yellow", linetype = "longdash") # theoretical mean line
pp <- pp + stat_function(fun = dnorm, args = list(mean = mean_t, sd = sd_t), color = "green", linewidth = 1.0) # normal curve from theoretical mean and sd
pp

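The histogram of the 1000 simulated means closely follows both normal density curves (blue: normal curve with the actual mean and standard deviation; green: normal curve with the theoretical mean and standard deviation). This is what the Central Limit Theorem predicts: averages of 40 exponentials are approximately normally distributed, even though the exponential distribution itself is strongly skewed.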
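Beyond the visual comparison, approximate normality can be checked numerically. The sketch below is one possible check and was not part of the original analysis: it standardizes the simulated means with the theoretical mean and standard deviation, then measures how many fall within 1.96 standard deviations, which would be about 95% for a normal distribution. The seed and the names sim_means, z, and coverage are assumptions introduced for illustration.

```r
set.seed(1)   # assumed seed, used only to make the check repeatable
n <- 40
lambda <- 0.2
simulation <- 1000

sim_means <- rowMeans(matrix(rexp(n * simulation, lambda), simulation))

# Standardize with the theoretical mean and standard deviation of the sample mean
z <- (sim_means - 1 / lambda) / ((1 / lambda) / sqrt(n))

# Share of standardized means inside the central 95% band of a standard normal
coverage <- mean(abs(z) < 1.96)
coverage

# Normal Q-Q plot: points near the reference line indicate approximate normality
qqnorm(sim_means)
qqline(sim_means)
```

A coverage value near 0.95 and Q-Q points lying close to the line are both consistent with approximate normality. Some departure in the upper tail is to be expected, since averages of 40 exponentials remain slightly right-skewed.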