library("data.table")
library("ggplot2")
# set seed for random number consistency
set.seed(69)
# set lambda to 0.2
lambda <- 0.2
# assign samples (n) and simulation (sims) are allocated
n <- 40
sims <- 1000
# simulate the setup for repeated random distribution
sim_exp <- replicate(sims, rexp(n, lambda))
# calculate mean of exponential
means_exp <- apply(sim_exp, 2, mean)
Demonstrate distribution’s center line for experimental and theoretical mean
# calculate mean of the experimental simulation
e_mean <- mean(means_exp)
e_mean
## [1] 5.015158
# calculate mean from theoretical expression
t_mean <- 1/lambda
t_mean
## [1] 5
# Generate the histogram for the experimental mean simulation while add the averages of
hist(means_exp,
xlab = "Mean",
main = "Exponential Function Simulation",
col = "#FEFFAC")
abline(v = e_mean, col = "#BB0000")
abline(v = t_mean, col = "#1191D3")
The experimental mean is 5.015158 and the theoretical mean 5. The center of distribution of averages of 40 exponentials is very close to the theoretical center of the distribution.
Demonstrate distribution’s variance for experimental and theoretical distribution.
# standard deviation of the exponential simulation
stdev_e <- sd(means_exp)
stdev_e
## [1] 0.7982937
# standard deviation from theoretical expression
stdev_t <- (1/lambda)/sqrt(n)
stdev_t
## [1] 0.7905694
# variance of distribution
var_dist <- stdev_e^2
var_dist
## [1] 0.6372728
# variance from analytical expression
var_t <- ((1/lambda)*(1/sqrt(n)))^2
var_t
## [1] 0.625
Experimental stdev is 0.7982937 with the theoretical stdev calculated as 0.7905694. The Theoretical variance is calculated as ((1 / lambda) * (1/sqrt(n))^2 = 0.625 while the experimental variance is 0.6372728
Indicate the distribution is approximately normal.
x_set <- seq(min(means_exp), max(means_exp), length=100)
y_set <- dnorm(x_set, mean=1/lambda, sd=(1/lambda/sqrt(n)))
hist(means_exp,
breaks=n,
prob=TRUE,
col="#1191D3",
xlab = "Means",
main="Density of Means",
ylab="Density")
lines(x_set, y_set, pch=22, col="#BB0000", lty=5)
# Compare the distribution of averages of 40 exponentials to a normal distribution
qqnorm(means_exp)
qqline(means_exp, col = 2)
gc()
## used (Mb) gc trigger (Mb) max used (Mb)
## Ncells 498590 26.7 1102717 58.9 643651 34.4
## Vcells 938393 7.2 8388608 64.0 1649449 12.6
Based on Central Limit Theorem (CLT), the distribution of 40 exponentials averages is very close to a normal distribution.