In this project I investigate the exponential distribution in R and compare the distribution of its sample means with what the Central Limit Theorem predicts. I simulate the exponential distribution with rexp(n, lambda), where lambda is the rate parameter.
The theoretical mean and standard deviation of an exponential distribution are equal: both are 1/lambda.
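As a quick sanity check of the statement that the mean and standard deviation both equal 1/lambda, the sketch below draws one large exponential sample and compares its empirical mean and standard deviation with 1/lambda. The sample size of 100000, the seed, and the names rate, n_draws, draws, and check are illustrative assumptions, not part of the project's analysis.

```r
# Illustrative check: the mean and sd of an exponential sample
# should both be close to 1 / rate.
rate <- 0.2          # same value as the project's lambda
n_draws <- 100000    # assumed sample size, chosen only for this check
set.seed(1)          # arbitrary seed, for reproducibility

draws <- rexp(n_draws, rate)

check <- c(theoretical    = 1 / rate,
           empirical_mean = mean(draws),
           empirical_sd   = sd(draws))
print(round(check, 3))
```

With a sample this large, both empirical values should land close to 5, the value of 1/lambda for lambda = 0.2.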
For this project I set lambda = 0.2 for all of the simulations. I investigate the distribution of averages of 40 exponentials over a thousand simulations.
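Here we aim to compare the mean and variance of the simulated sample means with their theoretical values, and to check whether the distribution of the sample means is approximately normal.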
Here we define the variables that remain constant throughout this project.
lambda <- 0.2
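Theoritical_mean <- 1 / lambda # Theoretical mean of the exponential distribution
Theoritical_sd <- 1 / lambda # Theoretical standard deviation (equal to the mean)
sim_num <- 1000 # Number of simulations
sample_size <- 40 # Number of exponentials in each sample (one sample per simulation)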
set.seed(10)
# Generate 40000 values from an exponential distribution
exp_dist <- rexp(sim_num * sample_size, lambda)
# Arrange the values into 1000 rows of 40 and take the mean of each row
exp_distmeans <- apply(matrix(exp_dist, nrow = sim_num), 1, mean)
simulated_mean <- mean(exp_distmeans)
hist(exp_distmeans, main = "Distribution of Simulated Means", xlab = "Sample Means", col = "pink")
abline(v = Theoritical_mean, col = "red")
abline(v = simulated_mean, col = "green")
In the plot, the vertical red line marks the theoretical mean of the exponential distribution, while the green line marks the simulated mean. From this we can conclude that the simulated mean closely approximates the theoretical mean.
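The visual comparison can also be expressed numerically. The sketch below recreates the simulation so that it runs on its own, then reports the gap between the simulated and theoretical means together with an approximate 95% interval for the mean of the sample means. The names sample_means, abs_gap, rel_gap, and ci are assumptions introduced for illustration, and rowMeans stands in for apply(..., 1, mean).

```r
# Recreate the simulation so this snippet is self-contained.
lambda <- 0.2
sim_num <- 1000
sample_size <- 40
set.seed(10)
sample_means <- rowMeans(matrix(rexp(sim_num * sample_size, lambda),
                                nrow = sim_num))

theoretical_mean <- 1 / lambda
simulated_mean <- mean(sample_means)

# Absolute and relative gap between simulated and theoretical mean.
abs_gap <- abs(simulated_mean - theoretical_mean)
rel_gap <- abs_gap / theoretical_mean

# Approximate 95% interval for the mean of the 1000 sample means.
ci <- simulated_mean +
  c(-1, 1) * qnorm(0.975) * sd(sample_means) / sqrt(sim_num)

cat("Simulated mean:  ", round(simulated_mean, 4), "\n")
cat("Theoretical mean:", theoretical_mean, "\n")
cat("Relative gap:    ", round(100 * rel_gap, 2), "%\n")
cat("Approx. 95% CI:  ", round(ci, 4), "\n")
```

A small relative gap supports the reading of the plot above.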
Theoritical_se <- Theoritical_sd / sqrt(sample_size) # Theoretical standard error of the mean
Theoritical_var <- Theoritical_se^2 # Theoretical variance of the sample means
simulated_sd <- sd(exp_distmeans) # Standard deviation of the simulated sample means
simulated_var <- simulated_sd^2 # Variance of the simulated sample means
The standard deviation of the simulated sample means is 0.80881, while the theoretical standard error is 0.7905694.
The variance of the simulated sample means is 0.6541736, while the theoretical variance is 0.625. From this we can conclude that the simulated variance closely approximates the theoretical variance.
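The theoretical variance used here is (1/lambda)^2 / n, so it should shrink as the sample size n grows. The sketch below repeats the simulation for a few sample sizes and tabulates the theoretical and simulated variances side by side. The sizes 10 and 160 are assumptions added only for illustration (the project itself uses 40), as are the names sizes and var_table.

```r
# Compare theoretical and simulated variance of the sample mean
# for several sample sizes.
lambda <- 0.2
sim_num <- 1000
set.seed(10)
sizes <- c(10, 40, 160)   # assumed sizes; only 40 is used in the project

var_table <- t(sapply(sizes, function(n) {
  # One row per simulation, n exponentials per row.
  means <- rowMeans(matrix(rexp(sim_num * n, lambda), nrow = sim_num))
  c(sample_size     = n,
    theoretical_var = (1 / lambda)^2 / n,
    simulated_var   = var(means))
}))

print(round(var_table, 4))
```

Each fourfold increase in the sample size should cut both variances to roughly a quarter.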
qqnorm(exp_distmeans)
qqline(exp_distmeans)
The Q-Q plot suggests that the distribution of the simulated sample means is approximately normal.
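One way to put a number on what the Q-Q plot shows is sample skewness, which is 0 for a normal distribution, 2 for an exponential distribution, and 2/sqrt(n) for the mean of n independent exponentials. The sketch below compares the skewness of the raw exponential values with that of the sample means. Base R has no skewness function, so the helper skewness() is a hypothetical one defined here for illustration.

```r
# Hypothetical helper: sample skewness as the mean cubed z-score.
skewness <- function(x) {
  z <- (x - mean(x)) / sd(x)
  mean(z^3)
}

# Recreate the simulation so this snippet is self-contained.
lambda <- 0.2
sim_num <- 1000
sample_size <- 40
set.seed(10)
raw_values <- rexp(sim_num * sample_size, lambda)
sample_means <- rowMeans(matrix(raw_values, nrow = sim_num))

skew_raw <- skewness(raw_values)
skew_means <- skewness(sample_means)

cat("Skewness of raw exponentials:", round(skew_raw, 3), "\n")
cat("Skewness of sample means:    ", round(skew_means, 3), "\n")
cat("Theory for means, 2/sqrt(n): ", round(2 / sqrt(sample_size), 3), "\n")
```

The sample means should be far less skewed than the raw values, though with n = 40 a small right skew is still expected.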
hist(exp_distmeans, main = "Distributions Normality", probability = TRUE,
breaks = 20, xlab = "")
# Density for the sample means
lines(density(exp_distmeans), col = "red")
# Theoretical normal density of the sample means
xfit <- seq(min(exp_distmeans), max(exp_distmeans), length.out = 200)
yfit <- dnorm(xfit, mean = Theoritical_mean, sd = Theoritical_sd/sqrt(sample_size))
lines(xfit, yfit, col = "green", lty = 2)
In this plot, the dashed green line is the theoretical normal density, while the red line is the density estimated from the simulated sample means; the red line closely follows the normal curve.
These plots lead us to conclude that the distribution of the simulated sample means is approximately normal.
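As a numerical complement to the plots, the sketch below checks how many of the simulated sample means fall within 1 and 1.96 theoretical standard errors of the theoretical mean. Under a normal distribution these proportions would be about 68.3% and 95%. The simulation is recreated so the snippet runs on its own, and the names sample_means, within_1se, and within_196se are assumptions introduced for illustration.

```r
# Recreate the simulation so this snippet is self-contained.
lambda <- 0.2
sim_num <- 1000
sample_size <- 40
set.seed(10)
sample_means <- rowMeans(matrix(rexp(sim_num * sample_size, lambda),
                                nrow = sim_num))

mu <- 1 / lambda                        # theoretical mean
se <- (1 / lambda) / sqrt(sample_size)  # theoretical standard error

# Share of sample means within 1 and 1.96 standard errors of mu.
within_1se   <- mean(abs(sample_means - mu) <= se)
within_196se <- mean(abs(sample_means - mu) <= 1.96 * se)

# Proportions a normal distribution would give.
normal_1se   <- pnorm(1) - pnorm(-1)
normal_196se <- pnorm(1.96) - pnorm(-1.96)

cat("Within 1 SE:   ", within_1se, "(normal:", round(normal_1se, 3), ")\n")
cat("Within 1.96 SE:", within_196se, "(normal:", round(normal_196se, 3), ")\n")
```

Proportions close to the normal reference values are consistent with the conclusion drawn from the plots.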