Exponential distribution

author: Daria Alekseeva

In this project I investigate the exponential distribution and compare the distribution of its sample means with what the Central Limit Theorem predicts. I simulate the exponential distribution in R. Both the mean and the standard deviation of the exponential distribution are 1/lambda. The main parameters are: lambda = 0.2, sample size = 40 exponentials, number of simulations = 1000.


library(ggplot2)
# sample size
n = 40

# lambda
lambda = 0.2

# draw n * 1000 exponential values and arrange them as 1000 samples (rows) of size n
dist <- rexp(n*1000, lambda)
exp <- matrix(dist, ncol = n)

# check data dimensions
dim(exp)
## [1] 1000   40

Sample mean and theoretical mean of the distribution

# mean of each simulated sample (one row per sample)
sample_mean <- apply(exp, 1, mean)
mean(sample_mean)
## [1] 4.994471

The sample means are concentrated around 5. The mean of the sample means (4.994) is very close to 5.

theor_mean <- 1/lambda
theor_mean
## [1] 5

The theoretical mean of the exponential distribution is 1/lambda = 5.

How variable is the sample (via variance)? Comparison with the theoretical variance of the distribution

var_of_sample_mean <- var(sample_mean)
var_of_sample_mean
## [1] 0.5931883

The variance of the sample means is about 0.593, which is close to the theoretical value of 0.625.

theor_var_of_sample <- (1 / lambda^2) / n
theor_var_of_sample
## [1] 0.625

The theoretical variance of the sample mean of 40 exponentials is 0.625.
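The value 0.625 follows from the standard result for the variance of the mean of n independent draws. Written out with the parameters used here (the symbols \bar{X} for the sample mean and \sigma^2 for the population variance are notation introduced for this sketch):

```latex
\[
\operatorname{Var}(\bar{X}) = \frac{\sigma^2}{n}
  = \frac{1/\lambda^2}{n}
  = \frac{1/0.2^2}{40}
  = \frac{25}{40}
  = 0.625
\]
```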

Distribution of sample means is approximately normal

# plot exponential distribution
qplot(dist, main ="Exponential Distribution Histogram")
## stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.

The original simulated dataset contains 40,000 exponential values (1000 samples of 40). As the first plot shows, these values are not normally distributed. The second plot shows the 1000 sample means, which appear to be approximately normally distributed.

According to the Central Limit Theorem, the sample means are approximately normally distributed, with mean close to the population mean and variance equal to the population variance divided by the sample size.

qplot(sample_mean, main = "Sample Mean Histogram")
## stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.
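One way to check the normal approximation more directly is to draw the normal curve implied by the Central Limit Theorem over the histogram of sample means and to look at a normal Q-Q plot. The following is a minimal sketch in base R, not the code used for the results above: the seed and the names sims, sim_means, clt_mean and clt_sd are assumptions introduced here, and the simulation is rerun so the block stands alone.

```r
# Hypothetical seed, chosen only so that this sketch is reproducible
set.seed(1)

n <- 40          # sample size, as in the text
lambda <- 0.2    # rate parameter, as in the text
n_sim <- 1000    # number of simulated samples, as in the text

# 1000 samples of 40 exponentials, one sample per row
sims <- matrix(rexp(n * n_sim, rate = lambda), ncol = n)
sim_means <- rowMeans(sims)

# Mean and standard deviation of the normal approximation
clt_mean <- 1 / lambda
clt_sd <- (1 / lambda) / sqrt(n)

# Histogram on the density scale with the normal curve drawn on top
hist(sim_means, breaks = 30, freq = FALSE,
     main = "Sample means with normal curve", xlab = "Sample mean")
curve(dnorm(x, mean = clt_mean, sd = clt_sd), add = TRUE, lwd = 2)

# Normal Q-Q plot: points close to the line suggest approximate normality
qqnorm(sim_means)
qqline(sim_means)
```

If the approximation is good, the histogram should follow the curve and the Q-Q points should lie near the line. With a sample size of 40, a slight right skew may still be visible in the upper tail.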