This report performs simulation to investigate the exponential distribution analysis using R.

Synopsis

The exponential distribution can be simulated in R with rexp(n, lambda) where lambda is the rate parameter. The mean of exponential distribution is 1/lambda and the standard deviation is also 1/lambda. Set lambda = 0.2 for all of the simulations. This simulation investigates the distribution of averages of 40 exponential(0.2)s.

Analysis

Show where the distribution is centered at and compare it to the theoretical center of the distribution.

The estimated or theoretical mean for the distribution is 1/0.2, which is 5. The below code performs 40 and 1000 simulations for the above distribution and stores the mean in variables forty and thousand respectively.

est<-1/0.2
set.seed(1000)
forty<-mean(rexp(40,0.2))
thousand<-mean(rexp(1000,0.2))

Performing 40 simulations generates mean 4.457 and performing 1000 simulations generates mean 5.06, which is closer to the estimated/theoretical mean. Hence, concluding that as the number of simulations increases, the mean becomes closer to the theoretical mean.

Show how variable it is and compare it to the theoretical variance of the distribution.

Theoretical standard deviation is 1/(0.2*sqrt(n)), which is 0.79 for 40 simulations, and 0.158 for 1000 simulations. The estimated variance is 0.6241 for 40 simulations and 0.025 for 1000 simulations. The below code calculates the variance for 40 and 1000 simulations and stores them in variables f and t respectively.

f<-var(data.frame(x = sapply(1:40, function(x) {mean(rexp(40, 0.2))})))
t<-var(data.frame(x = sapply(1:1000, function(x) {mean(rexp(1000, 0.2))})))

The calculated variance is 0.68 for 40 simulations and 0.0247 for 1000 simulations, which are both very close to the estimated variances.

Show that the distribution is approximately normal.

For standard distribution, standard deviation is 1/sqrt(n). at n = 40, this is 0.158, which is the same as the standard deviation received above for 1000 simulations.

The below code generates the bell curve for the same.

library(ggplot2)
temp<-data.frame(x = sapply(1:1000, function(x) {mean(rexp(1000, 0.2))}))
qplot(temp, geom="blank") + geom_histogram(aes(y = ..density..), colour = "black", fill = "salmon") +stat_density(geom = "line", colour = "black") 
## Don't know how to automatically pick scale for object of type data.frame. Defaulting to continuous
## stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.
## ymax not defined: adjusting position using y instead

plot of chunk unnamed-chunk-2

Evaluate the coverage of the confidence interval for 1/lambda: X¯±1.96S/√n.

The below code uses seed 1000 which has been set previously to generate lower and upper limits of the confidence interval at 4.877 and 4.94 respectively.

upper<-thousand + 1.96*sqrt(t/100)
lower<-thousand - 1.96*sqrt(t/100)