Part 1: Simulation Exercise Instructions

OVERVIEW

This report investigates the exponential distribution in R and compares it with the Central Limit Theorem. The exponential distribution can be simulated in R with rexp(n, lambda), where lambda is the rate parameter. The mean of the exponential distribution is 1/lambda and its standard deviation is also 1/lambda. lambda = 0.2 is used for all of the simulations. The sections below investigate the distribution of averages of 40 exponentials, using one thousand simulations.

Through simulation and accompanying explanatory text, the report illustrates the properties of the distribution of the mean of 40 exponentials. It will:

1. Show the sample mean and compare it to the theoretical mean of the distribution.
2. Show how variable the sample is (via variance) and compare it to the theoretical variance of the distribution.
3. Show that the distribution is approximately normal.
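For reference, the theoretical values that the three points above are compared against can be written out as follows. This is a short worked derivation, assuming the 40 exponentials in each sample are independent; the symbols \bar{X} and X_i are notation introduced here, not names used in the code.

```latex
% X_1, ..., X_n independent Exp(\lambda), with n = 40 and \lambda = 0.2
\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i, \qquad
E[\bar{X}] = \frac{1}{\lambda} = 5, \qquad
\operatorname{Var}(\bar{X}) = \frac{1}{\lambda^{2} n} = \frac{25}{40} = 0.625, \qquad
\operatorname{SD}(\bar{X}) = \frac{1}{\lambda\sqrt{n}} \approx 0.791

% Central Limit Theorem: for large n, approximately
\bar{X} \sim N\!\left(\frac{1}{\lambda},\; \frac{1}{\lambda^{2} n}\right)
```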

Simulations

First, we run 1000 simulations, each drawing 40 exponentials with lambda = 0.2:

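set.seed(101)
sims <- 1000     ## number of simulations
n <- 40          ## number of exponentials per simulation
lambda <- 0.2    ## rate parameter
means <- numeric(sims)       ## mean of each simulated sample
means_sum <- numeric(sims)   ## running sum of the means
means_cum <- numeric(sims)   ## running average of the means

## mean of n exponentials for each simulation
for (i in 1:sims) { means[i] <- mean(rexp(n, lambda)) }
## running sum, then running average, of the simulated means
means_sum[1] <- means[1]
for (i in 2:sims) { means_sum[i] <- means_sum[i-1] + means[i] }
for (i in 1:sims) { means_cum[i] <- means_sum[i]/i }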
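The loops above can also be written in vectorized form. The following is a minimal sketch, not the report's original code: it assumes replicate() and cumsum() are acceptable substitutes for the explicit loops, and it reuses the same variable names so the later sections would work unchanged.

```r
set.seed(101)
sims <- 1000     # number of simulations
n <- 40          # number of exponentials per simulation
lambda <- 0.2    # rate parameter

# One mean per simulated sample of n exponentials
means <- replicate(sims, mean(rexp(n, lambda)))

# Running average of the first i means, for i = 1, ..., sims
means_cum <- cumsum(means) / seq_along(means)

# The last running average is the overall mean of the simulated means
means_cum[sims]
```

Plotting the running average, for example with plot(means_cum, type = "l") followed by abline(h = 1 / lambda), is one way to see it settle near the theoretical mean.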

Sample Mean versus Theoretical Mean:

As the plot shows, the sample mean is close to the theoretical mean. In more detail:

sample_means <- means_cum[sims]   ## average of all simulated means
theoretical_mean <- 1 / lambda    ## mean of the exponential distribution

sample_means | theoretical_mean
5.030641 | 5.0

Sample Variance vs Theoretical Population Variance

Now look at the histogram of the simulated means:

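library(ggplot2)
## histogram of the simulated means on a density scale
## (after_stat() needs ggplot2 >= 3.3.0; older versions use ..density..)
g2 <- ggplot(data = data.frame(x = means), aes(x = x))
g2 <- g2 + geom_histogram(binwidth = 0.1, aes(y = after_stat(density)))
g2 <- g2 + labs(x = "Means")
g2 <- g2 + labs(y = "Density")
g2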

Next, compare the variance of the simulated means with the theoretical variance of the mean of 40 exponentials, (1/lambda)^2 / n = 25/40 = 0.625:

theoreticalVariance | sampleVariance
0.625 | 0.624848

The two values agree closely. The graph in the next section shows the same agreement visually: the distribution of the simulated means (blue) approaches the normal distribution (red), and their means (blue and red vertical lines, respectively) approach each other as well.
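The report does not show how the two variances were computed. The following is a minimal sketch of one plausible way, assuming sampleVariance is the variance of the 1000 simulated means and that sampleSD (used in the plotting code of the next section) is its square root. The simulation is repeated here only to keep the example self-contained.

```r
set.seed(101)
sims <- 1000     # number of simulations
n <- 40          # number of exponentials per simulation
lambda <- 0.2    # rate parameter
means <- replicate(sims, mean(rexp(n, lambda)))

# Variance of the mean of n exponentials: (1/lambda)^2 / n = 25 / 40
theoreticalVariance <- (1 / lambda)^2 / n

# Variance of the simulated means (assumed definition)
sampleVariance <- var(means)

# Standard deviation of the simulated means (assumed definition of sampleSD)
sampleSD <- sqrt(sampleVariance)

data.frame(theoreticalVariance, sampleVariance)
```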

Distribution of Sample Means vs Normal Distribution

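## sampleSD is not defined earlier in this report; it is assumed here to be
## the standard deviation of the simulated means
sampleSD <- sd(means)
## normal density (red) centred on the theoretical mean
g3 <- g2 + stat_function(fun = dnorm, args = list(mean = 1/lambda, sd = sampleSD), colour = "red", size = 2)
g3 <- g3 + geom_vline(xintercept = 1/lambda, size = 1, colour = "red")        ## theoretical mean
g3 <- g3 + geom_vline(xintercept = sample_means, size = 1, colour = "blue")   ## sample mean
g3 <- g3 + geom_density(colour = "blue", size = 2)                            ## density of the simulated means
g3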
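The visual comparison can be backed by a quick numeric check. The following is a rough sketch, not part of the original analysis: it standardizes the simulated means with the Central Limit Theorem values and reports the share that falls inside the central 95% interval of a standard normal. The names z and coverage are introduced here for illustration.

```r
set.seed(101)
sims <- 1000     # number of simulations
n <- 40          # number of exponentials per simulation
lambda <- 0.2    # rate parameter
means <- replicate(sims, mean(rexp(n, lambda)))

# Standardize with mean 1/lambda and standard error (1/lambda) / sqrt(n)
z <- (means - 1 / lambda) / ((1 / lambda) / sqrt(n))

# Share of standardized means within the central 95% normal interval
coverage <- mean(abs(z) <= qnorm(0.975))
coverage
```

If the normal approximation is reasonable, coverage should come out near 0.95.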