This is the project for the statistical inference class. Simulation is used to explore inference and inferential data analysis. The project consists of two parts.
In this project we will investigate the exponential distribution in R and compare it with the Central Limit Theorem. The exponential distribution can be simulated in R with rexp(n, lambda) where lambda is the rate parameter.
The mean of the exponential distribution is 1/lambda and its standard deviation is also 1/lambda. We set lambda = 0.2 for all of the simulations and investigate the distribution of averages of 40 exponentials.
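As a quick illustration of the claim that both the mean and the standard deviation equal 1/lambda, the sketch below draws one large exponential sample and compares its summary statistics with 1/lambda. It is not part of the original analysis: the seed, the sample size of 100000, and the names `lambda_demo` and `draws` are assumptions chosen only for this example.

```r
# Illustrative check, separate from the project simulation:
# a large exponential sample should have mean and SD close to 1/lambda.
set.seed(1)                      # arbitrary seed, used only in this sketch
lambda_demo <- 0.2               # same rate as in the report
draws <- rexp(100000, rate = lambda_demo)

# Sample mean, sample SD and the theoretical value 1/lambda side by side
c(sample_mean = mean(draws),
  sample_sd   = sd(draws),
  theory      = 1 / lambda_demo)
```

Both sample statistics should land near 5, up to simulation noise.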
# set seed for reproducibility
set.seed(31)
# set lambda to 0.2
lambda <- 0.2
# sample size: 40 exponentials per simulation
n <- 40
# 1000 simulations
simulations <- 1000
# simulate the exponential samples
simulated_exponentials <- replicate(simulations, rexp(n, lambda))
# calculate mean of exponentials
means_exponentials <- apply(simulated_exponentials, 2, mean)
head(means_exponentials)
## [1] 5.631731 6.657405 5.450215 5.109764 4.916998 3.523230
With this simulation we can now answer the questions asked in the project.
Sample mean :-
sample_mean<- mean(means_exponentials)
sample_mean
## [1] 4.993867
Theoretical mean :-
theoretical_mean <- 1/lambda
theoretical_mean
## [1] 5
The histogram below compares the two means graphically, with the sample mean in red and the theoretical mean in blue.
hist(means_exponentials, xlab = "Mean", main = "Exponential Function Simulations")
abline(v = sample_mean, col = "red")
abline(v = theoretical_mean, col = "blue")
The sample mean 4.9938666 and the theoretical mean 5 are approximately the same. Hence the distribution of averages of 40 exponentials is centred very close to the theoretical mean.
Standard Deviation of the distribution :-
SD_dist <- sd(means_exponentials)
SD_dist
## [1] 0.7931608
Standard Deviation of the theoretical expression :-
SD_theory <- 1/lambda/sqrt(n)
SD_theory
## [1] 0.7905694
Variance of Distribution :-
var_dist <- SD_dist^2
var_dist
## [1] 0.6291041
Variance of theoretical expression :-
var_theory <- SD_theory ^2
var_theory
## [1] 0.625
The distribution SD and the theoretical SD are approximately the same: 0.7931608 ~ 0.7905694.
The distribution variance and the theoretical variance are approximately the same: 0.6291041 ~ 0.625.
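To put a number on how close these values are, the relative difference between each simulated value and its theoretical counterpart can be computed. The sketch below is an addition to the original analysis; it re-enters the values reported above as literals so that it runs on its own, and the names `sd_sim`, `sd_theo`, `var_sim`, `var_theo` and `rel_diff` are hypothetical.

```r
# Values reported above, entered as literals so the sketch runs standalone
sd_sim   <- 0.7931608
sd_theo  <- 0.7905694
var_sim  <- 0.6291041
var_theo <- 0.625

# Relative difference = (simulated - theoretical) / theoretical
rel_diff <- c(sd  = (sd_sim - sd_theo) / sd_theo,
              var = (var_sim - var_theo) / var_theo)

# Express as percentages, rounded to two decimals
round(100 * rel_diff, 2)
```

Both relative differences come out below one percent.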
xfit <- seq(min(means_exponentials), max(means_exponentials), length=100)
yfit <- dnorm(xfit, mean=1/lambda, sd=(1/lambda/sqrt(n)))
hist(means_exponentials, breaks = n, prob = TRUE, col = "blue", xlab = "means", main = "Density of means", ylab = "density")
lines(xfit, yfit, pch=22, col="black", lty=5)
Compare the distribution of averages of 40 exponentials to a normal distribution
qqnorm(means_exponentials)
qqline(means_exponentials, col = 2)
The distribution of averages of 40 exponentials is approximately normal, as the Central Limit Theorem predicts.
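A rough numerical complement to the Q-Q plot is to standardise the simulated means with the theoretical mean and standard deviation, then check what share falls within 1.96 standard deviations of the centre. For a normal distribution that share is about 95%. This check is not part of the original analysis; the sketch repeats the simulation settings so that it is self-contained, and the names `means`, `z` and `coverage` are assumptions made for this example.

```r
# Same settings as the simulation in this report
set.seed(31)
lambda <- 0.2
n <- 40
simulations <- 1000

# One average of n exponentials per simulation
means <- replicate(simulations, mean(rexp(n, lambda)))

# Standardise using the theoretical mean 1/lambda and SD (1/lambda)/sqrt(n)
z <- (means - 1 / lambda) / (1 / lambda / sqrt(n))

# Share of standardised means within +/- 1.96; about 0.95 if roughly normal
coverage <- mean(abs(z) <= 1.96)
coverage
```

A value near 0.95 is consistent with approximate normality, although it only examines one aspect of the distribution and does not replace the graphical comparison.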