Exponential distribution in R vs. Central Limit Theorem

Jill Beck

March 28 2016

Project Overview

In this project you will investigate the exponential distribution in R and compare it with the Central Limit Theorem. The exponential distribution can be simulated in R with rexp(n, lambda), where lambda is the rate parameter. The mean of the exponential distribution is 1/lambda and the standard deviation is also 1/lambda. Set lambda = 0.2 for all of the simulations. You will investigate the distribution of averages of 40 exponentials. Note that you will need to do a thousand simulations.
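As a quick sanity check on the statement that both the mean and the standard deviation equal 1/lambda, the following sketch draws a large sample with rexp and compares the two. The sample size of 100000, the seed, and the variable names are illustrative assumptions, not part of the project brief.

```r
set.seed(1)                      # seed chosen only for reproducibility
lambda <- 0.2                    # rate parameter from the project brief
draws <- rexp(100000, lambda)    # sample size is an illustrative assumption

theoretical <- 1 / lambda        # theoretical mean and standard deviation

# Empirical mean and standard deviation next to the theoretical value
c(mean = mean(draws), sd = sd(draws), theoretical = theoretical)
```

Both empirical values should land close to 5, which is 1/lambda for lambda = 0.2.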

Illustrate via simulation and associated explanatory text the properties of the distribution of the mean of 40 exponentials. You should:

  1. Show the sample mean and compare it to the theoretical mean of the distribution.
  2. Show how variable the sample is (via variance) and compare it to the theoretical variance of the distribution.
  3. Show that the distribution is approximately normal.

Project Simulation

Sample Mean vs. Theoretical Mean

First we analyze the sample mean and compare it to the theoretical mean.

set.seed(1)
lambda <- 0.2 ## rate parameter, set as per instructions
nexp <- 40 ## number of exponentials averaged in each simulation
nsim <- 1000 ## number of simulations
mns <- NULL ## start with an empty vector of means
for (i in 1 : nsim) mns <- c(mns, mean(rexp(nexp, lambda)))
hist(mns, col="red", main="Distribution of Means of rexp")
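The histogram shows where the simulated means are centered, but the comparison with the theoretical mean can also be made numerically. A minimal sketch follows. It uses replicate in place of the loop above as a stylistic assumption, and the names sample_mean and theoretical_mean are introduced here for illustration.

```r
set.seed(1)
lambda <- 0.2   # rate parameter
nexp <- 40      # exponentials averaged in each simulation
nsim <- 1000    # number of simulations

# One mean of nexp exponentials per simulation
mns <- replicate(nsim, mean(rexp(nexp, lambda)))

sample_mean <- mean(mns)          # center of the simulated means
theoretical_mean <- 1 / lambda    # mean of the exponential distribution

c(sample = sample_mean,
  theoretical = theoretical_mean,
  difference = sample_mean - theoretical_mean)
```

The difference should be small relative to the theoretical mean of 5.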

lambda <- 0.2 ## Set lambda as per instructions (does not change)
nexp <- 40 ## number of exponentials per simulation (does not change)
nsim <- 1000 ## number of simulations (does not change)
varxp <- ((1/lambda)^2)/nexp ## theoretical variance
varmean <- var(mns) ## variance of the means
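The variance comparison asked for in the overview can be sketched in the same way. The theoretical variance of the mean of 40 exponentials is (1/lambda)^2 / 40, which is 0.625 for lambda = 0.2. The sketch below regenerates the means so that it runs on its own. The use of replicate and the names theoretical_var and sample_var are assumptions made for illustration.

```r
set.seed(1)
lambda <- 0.2   # rate parameter
nexp <- 40      # exponentials averaged in each simulation
nsim <- 1000    # number of simulations

# One mean of nexp exponentials per simulation
mns <- replicate(nsim, mean(rexp(nexp, lambda)))

theoretical_var <- (1 / lambda)^2 / nexp   # sigma^2 / n
sample_var <- var(mns)                     # variance of the simulated means

c(sample = sample_var,
  theoretical = theoretical_var,
  ratio = sample_var / theoretical_var)
```

A ratio near 1 indicates that the simulated means vary about as much as the theory predicts.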

library(ggplot2)
plotdata <- data.frame(mns)
plot1 <- ggplot(plotdata, aes(x = mns))
plot1 <- plot1 + geom_histogram(aes(y=..density..), colour="black", fill="blue")
plot1 <- plot1 + labs(title="Distribution of Means of rexp", y="Density")
plot1 <- plot1 + stat_function(fun=dnorm, args=list(mean=1/lambda, sd=sqrt(varxp)), color = "red", size = 1.0)
plot1 <- plot1 + stat_function(fun=dnorm, args=list(mean=mean(mns), sd=sqrt(varmean)), color = "black", size = 1.0)
print(plot1)

set.seed(1)
lambda <- 0.2 ## rate parameter, set as per instructions
nexp <- 40 ## number of exponentials averaged in each simulation
nsim <- 100000 ## number of simulations
mns <- NULL ## start with an empty vector of means
for (i in 1 : nsim) mns <- c(mns, mean(rexp(nexp, lambda)))
varxp <- ((1/lambda)^2)/nexp ## theoretical variance
varmean <- var(mns) ## variance of the means

library(ggplot2)
plotdata <- data.frame(mns)
plot1 <- ggplot(plotdata, aes(x = mns))
plot1 <- plot1 + geom_histogram(aes(y=..density..), colour="black", fill="blue")
plot1 <- plot1 + labs(title="Distribution of Means of rexp", y="Density")
plot1 <- plot1 + stat_function(fun=dnorm, args=list(mean=1/lambda, sd=sqrt(varxp)), color = "red", size = 1.0)
plot1 <- plot1 + stat_function(fun=dnorm, args=list(mean=mean(mns), sd=sqrt(varmean)), color = "black", size = 1.0)
print(plot1)
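Besides overlaying normal curves on the histogram, approximate normality can be checked with a Q-Q plot and a simple coverage count. The sketch below is one possible way to do this and is not part of the original analysis. It uses 1000 simulations, replicate instead of the loop, and the names z and within_95, all of which are assumptions made for illustration.

```r
set.seed(1)
lambda <- 0.2   # rate parameter
nexp <- 40      # exponentials averaged in each simulation
nsim <- 1000    # number of simulations (assumed here; any large value works)

# One mean of nexp exponentials per simulation
mns <- replicate(nsim, mean(rexp(nexp, lambda)))

# Standardize with the theoretical mean and standard error
z <- (mns - 1 / lambda) / ((1 / lambda) / sqrt(nexp))

# Points close to the reference line suggest approximate normality
qqnorm(z, main = "Normal Q-Q Plot of Standardized Means")
qqline(z, col = "red")

# Share of standardized means within 1.96; about 0.95 under normality
within_95 <- mean(abs(z) < 1.96)
within_95
```

Some departure from the line in the right tail is to be expected, since the mean of 40 exponentials is still slightly right-skewed.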