This report studies the exponential distribution: I compare the sample mean and sample variance with their theoretical values, and I check the Central Limit Theorem, which states that averages are approximately normally distributed.
Perform 1000 simulations of 40 random exponentials with lambda = 0.2
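library(ggplot2)  # needed for the plots below
set.seed(1)       # arbitrary seed, added only so the run is reproducible
nosim <- 1000     # number of simulations
n <- 40           # exponentials per simulation
lambda <- 0.2     # rate parameter
# One simulation per row: nosim rows, n columns
data <- matrix(rexp(nosim * n, lambda), nosim)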
# Theoretical mean
t_mean <- 1 / lambda
# Simulated means (one per row of data)
s_mean <- data.frame(mean = rowMeans(data))
ggplot(s_mean, aes(x = mean)) +
  geom_histogram(colour = "black", fill = "white", binwidth = 0.1) +
  geom_vline(xintercept = t_mean, size = 1)
From the plot we can see that the distribution of the means of the exponential samples is centered around the theoretical mean, i.e. 1/lambda = 5.
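As a numeric complement to the plot, the comparison can be sketched as follows. This is a minimal, self-contained sketch that reruns the simulation with an arbitrary seed; the names `check_data`, `check_means`, `theoretical_mean` and `simulated_center` are illustrative and are not part of the analysis above.

```r
# Self-contained check: rerun the simulation with an arbitrary seed (assumption)
set.seed(42)
nosim <- 1000
n <- 40
lambda <- 0.2
check_data <- matrix(rexp(nosim * n, lambda), nosim)

# Mean of each simulated sample of n exponentials
check_means <- rowMeans(check_data)

# Compare the center of the simulated means with the theoretical mean 1/lambda
theoretical_mean <- 1 / lambda
simulated_center <- mean(check_means)
cat("Theoretical mean:        ", theoretical_mean, "\n")
cat("Average of sample means: ", simulated_center, "\n")
cat("Absolute difference:     ", abs(simulated_center - theoretical_mean), "\n")
```

The exact numbers depend on the seed, but the difference should be small compared with the theoretical mean.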
# Theoretical variance
t_var <- (1 / lambda)^2
# Simulated variances (one per row of data)
s_var <- data.frame(var = apply(data, 1, var))
ggplot(s_var, aes(x = var)) +
  geom_histogram(colour = "black", fill = "white", binwidth = 1) +
  geom_vline(xintercept = t_var, size = 1)
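From the plot we can see that the distribution of the sample variances is roughly centered around the theoretical variance, i.e. \(1/\lambda^2 = 25\). The distribution is wide and right-skewed, so its peak tends to sit somewhat below 25. This is explained by the small sample size \(n=40\) in each simulation rather than by bias, since the sample variance computed by var is an unbiased estimator.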
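The same comparison can be made numerically. The sketch below is self-contained and reruns the simulation with an arbitrary seed; `check_data`, `check_vars`, `theoretical_var` and `mean_of_vars` are illustrative names, not objects from the analysis above. It reports both the mean and the median of the simulated variances, since the two differ for a skewed distribution.

```r
# Self-contained check: rerun the simulation with an arbitrary seed (assumption)
set.seed(42)
nosim <- 1000
n <- 40
lambda <- 0.2
check_data <- matrix(rexp(nosim * n, lambda), nosim)

# Sample variance (n - 1 denominator) of each simulated sample
check_vars <- apply(check_data, 1, var)

# Compare the simulated variances with the theoretical variance 1/lambda^2
theoretical_var <- 1 / lambda^2
mean_of_vars <- mean(check_vars)
cat("Theoretical variance:        ", theoretical_var, "\n")
cat("Mean of sample variances:    ", mean_of_vars, "\n")
cat("Median of sample variances:  ", median(check_vars), "\n")
```

If the mean is close to 25 while the median is lower, that is consistent with a right-skewed but unbiased estimate.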
# Theoretical mean and standard error of the mean
t_mean <- 1 / lambda
t_se <- (1 / lambda) / sqrt(n)
# Simulated normalized means
s_norm_mean <- data.frame(mean = (rowMeans(data) - t_mean) / t_se)
ggplot(s_norm_mean, aes(x = mean)) +
  geom_histogram(aes(y = ..density..), colour = "black", fill = "white", binwidth = 0.1) +
  stat_function(fun = dnorm, size = 2)
From the plot we can see that the distribution of the normalized means of the exponential samples closely follows the standard normal density, as the Central Limit Theorem predicts.
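A rough numeric check of this visual impression is sketched below, assuming a fresh simulation with an arbitrary seed. The names `check_z`, `z_mean`, `z_sd`, `share_within` and `normal_share` are illustrative. For a standard normal variable the mean is 0, the standard deviation is 1, and the share of values within ±1.96 is given by `2 * pnorm(1.96) - 1`.

```r
# Self-contained check: rerun the simulation with an arbitrary seed (assumption)
set.seed(42)
nosim <- 1000
n <- 40
lambda <- 0.2
check_data <- matrix(rexp(nosim * n, lambda), nosim)

# Normalize each sample mean with the theoretical mean and standard error
t_mean <- 1 / lambda
t_se <- (1 / lambda) / sqrt(n)
check_z <- (rowMeans(check_data) - t_mean) / t_se

# Summary statistics to compare with the standard normal distribution
z_mean <- mean(check_z)
z_sd <- sd(check_z)
share_within <- mean(abs(check_z) < 1.96)
normal_share <- 2 * pnorm(1.96) - 1

cat("Mean of normalized means (normal: 0):", z_mean, "\n")
cat("SD of normalized means (normal: 1):  ", z_sd, "\n")
cat("Share within +/- 1.96:               ", share_within, "\n")
cat("Same share for a standard normal:    ", normal_share, "\n")
```

This is only a summary comparison, not a formal test of normality.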