This project is for the Coursera - Statistical Inference class (Data Science specialization). It consists of two parts:
1. Simulation exercises
2. Basic inferential data analysis
The exponential distribution can be simulated in R with rexp(n, lambda), where n is the number of observations and lambda is the rate parameter. The mean of the exponential distribution is 1/lambda and the standard deviation is also 1/lambda.
In these simulation exercises, we investigate the distribution of averages of 40 exponentials over a thousand simulations (n = 1000), assuming lambda = 0.2.
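As a quick sanity check of these two properties, the following sketch draws one large sample and compares its mean and standard deviation with 1/lambda. The sample size of 100000, the seed, and the variable names are assumptions chosen for illustration; they are not part of the original analysis.

```r
# Illustrative check; the seed and sample size are assumed, not from the report
set.seed(123)
lambda <- 0.2
x <- rexp(100000, rate = lambda)

# Both values should be close to 1/lambda = 5
sample_mean <- mean(x)
sample_sd <- sd(x)
print(c(mean = sample_mean, sd = sample_sd, theoretical = 1 / lambda))
```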
Let us create a thousand simulated averages, each the mean of 40 exponentials drawn with rexp(40, 0.2):
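# Preallocate a vector for the 1000 simulated averages
expdist <- rep(NA, 1000)
for (i in 1:1000) {
  # Average of 40 draws from an exponential distribution with rate 0.2
  expdist[i] <- mean(rexp(40, 0.2))
}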
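The same simulation can also be written without an explicit loop. This is a minimal sketch, not the code used to produce the numbers reported here: the seed and the names n_sims, n_obs, sim_matrix and expdist_vec are assumptions introduced for illustration.

```r
# Vectorized alternative to the loop; names and seed are hypothetical
set.seed(1)
n_sims <- 1000   # number of simulated averages
n_obs <- 40      # exponentials per average
lambda <- 0.2    # rate parameter

# Draw all values at once: one row per simulation, one column per observation
sim_matrix <- matrix(rexp(n_sims * n_obs, rate = lambda),
                     nrow = n_sims, ncol = n_obs)

# Each row mean is one simulated average of 40 exponentials
expdist_vec <- rowMeans(sim_matrix)
```

Drawing everything in one call to rexp() and summarizing with rowMeans() avoids growing or indexing a vector inside a loop, and keeps the raw samples available for later steps.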
Let’s calculate where the distribution of the simulated averages is centered. The theoretical center is the population mean, 1/lambda = 1/0.2 = 5.
calcmean <- mean(expdist)
From the above, the calculated mean is 5.0008 and the theoretical mean is 5, so the difference is negligible.
Let’s now see how variable the simulated distribution is compared to the theoretical one. The theoretical variance of the average of 40 exponentials is ((1/0.2)^2)/40 = 0.625.
calcvar <- var(expdist)
From above, we can see that the calculated variance is 0.6132 and the theoretical variance is 0.625; therefore both distributions have similar variability.
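The theoretical value of 0.625 can be derived as follows, assuming the 40 draws are independent with common variance (1/lambda)^2. The symbols X_i, sigma and n = 40 are notation introduced here for the derivation.

```latex
\begin{aligned}
\operatorname{Var}(\bar{X})
  &= \operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
   = \frac{1}{n^2}\sum_{i=1}^{n}\operatorname{Var}(X_i)
   = \frac{\sigma^2}{n} \\
  &= \frac{(1/\lambda)^2}{n}
   = \frac{(1/0.2)^2}{40}
   = \frac{25}{40}
   = 0.625
\end{aligned}
```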
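Let’s now investigate whether the distribution resembles a normal distribution. We use the scale() function to standardize the simulated averages (subtract their mean and divide by their standard deviation), so that the histogram can be compared directly with the standard normal density.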
expscale <- scale(expdist)
hist(expscale, probability = TRUE, main = "", ylim = c(0, 0.5))
lines(density(expscale))
# Compare with the standard normal distribution
curve(dnorm(x, 0, 1), -3, 3, col = "red", add = TRUE)
As can be seen from the plot above, the distribution of the averages is approximately normal.
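A normal Q-Q plot is another common way to judge normality. The sketch below is an addition rather than part of the original analysis: it regenerates the averages with an assumed seed and uses the hypothetical names sim_means, z and qq.

```r
# Q-Q plot of standardized averages; seed and names are assumptions
set.seed(42)
sim_means <- replicate(1000, mean(rexp(40, rate = 0.2)))

# Standardize and drop the matrix attributes returned by scale()
z <- as.numeric(scale(sim_means))

# qqnorm() draws the plot and invisibly returns the plotted coordinates
qq <- qqnorm(z, main = "")
qqline(z, col = "red")
```

Points lying close to the red line indicate approximate normality. Some curvature in the tails would be expected, since averages of 40 exponentials remain slightly right-skewed.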
Let’s now evaluate the coverage of the confidence interval for 1/lambda: $\bar{X} \pm 1.96 \, \frac{S}{\sqrt{n}}$
# Calculate upper and lower limits using standard deviation as 1/lambda
lowercl <- expdist - qnorm(0.975) * (1/0.2)/sqrt(40)
uppercl <- expdist + qnorm(0.975) * (1/0.2)/sqrt(40)
expci <- mean(lowercl < (1/0.2) & uppercl > (1/0.2))
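The coverage of the confidence interval is thus expci = 95.9%, meaning that about 95.9% of the 1000 simulated intervals contain the true mean 1/lambda = 5.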
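The interval is stated in terms of the sample standard deviation S, while the code above plugs in the known value 1/lambda. The following sketch estimates coverage when S is computed separately for each sample of 40. It is an illustration under assumed settings (the seed and the names samples, xbar, s, half_width and coverage_s are hypothetical), and its result may differ from the 95.9% reported above.

```r
# Coverage using each sample's own standard deviation; names and seed are assumed
set.seed(2)
lambda <- 0.2
n_obs <- 40
n_sims <- 1000

# Keep the raw samples so that S can be computed per simulation
samples <- matrix(rexp(n_sims * n_obs, rate = lambda), nrow = n_sims)
xbar <- rowMeans(samples)
s <- apply(samples, 1, sd)

# Interval half-width based on S rather than on 1/lambda
half_width <- qnorm(0.975) * s / sqrt(n_obs)

# Share of intervals that contain the true mean 1/lambda
coverage_s <- mean(xbar - half_width < 1 / lambda &
                   xbar + half_width > 1 / lambda)
print(coverage_s)
```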