This is the project for the statistical inference class. Simulation is used to explore inference and inferential data analysis. The project consists of two parts:

  1. A simulation exercise
  2. Basic inferential data analysis

1. A simulation exercise

In this project we investigate the exponential distribution in R and compare the distribution of its sample averages with what the Central Limit Theorem predicts. The exponential distribution can be simulated in R with rexp(n, lambda), where lambda is the rate parameter.
The mean of the exponential distribution is 1/lambda and the standard deviation is also 1/lambda. We set lambda = 0.2 for all of the simulations and investigate the distribution of averages of 40 exponentials over 1000 simulations.
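For reference, the theoretical values used in the comparisons that follow can be written out explicitly. This is a sketch of the standard results, assuming the 40 draws X_1, ..., X_n are independent exponentials with rate lambda; the symbol \bar{X}_n for their average is introduced here for illustration only.

```latex
\begin{align*}
E[\bar{X}_n] &= \frac{1}{n}\sum_{i=1}^{n} E[X_i] = \frac{1}{\lambda} = \frac{1}{0.2} = 5, \\
\operatorname{Var}(\bar{X}_n) &= \frac{1}{n^2}\sum_{i=1}^{n} \operatorname{Var}(X_i) = \frac{1}{n\lambda^2} = \frac{25}{40} = 0.625, \\
\operatorname{SD}(\bar{X}_n) &= \frac{1}{\lambda\sqrt{n}} = \frac{5}{\sqrt{40}} \approx 0.7906.
\end{align*}
```

The Central Limit Theorem then suggests that, for n = 40, the averages should be approximately normal with mean 5 and standard deviation about 0.79.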

# set seed for reproducibility
set.seed(31)

# set lambda to 0.2
lambda <- 0.2

# sample size: 40 exponentials per simulation
n <- 40

# 1000 simulations
simulations <- 1000

# simulate the exponential samples (one column of n values per simulation)
simulated_exponentials <- replicate(simulations, rexp(n, lambda))

# calculate the mean of each simulated sample (one mean per column)
means_exponentials <- apply(simulated_exponentials, 2, mean)
head(means_exponentials)
## [1] 5.631731 6.657405 5.450215 5.109764 4.916998 3.523230

With this simulation we now answer the questions asked in the project.

1. Show the sample mean and compare it to the theoretical mean of the distribution.

Sample mean :-

sample_mean <- mean(means_exponentials)
sample_mean
## [1] 4.993867

Theoretical mean :-

theoretical_mean <- 1/lambda
theoretical_mean
## [1] 5

The histogram below compares the two means graphically: the red line marks the sample mean and the blue line marks the theoretical mean.

hist(means_exponentials, xlab = "Mean", main = "Exponential Function Simulations")
abline(v = sample_mean, col = "red")
abline(v = theoretical_mean, col = "blue")

Inference 1

The sample mean 4.9938666 and the theoretical mean 5 are approximately equal. Hence the distribution of averages of 40 exponentials is centred very close to the theoretical mean.
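One way to put a number on "approximately equal" is the Monte Carlo standard error of the simulated mean. The sketch below is illustrative rather than part of the original analysis: it re-creates the simulation with the same seed and settings, and the names mc_se and gap_in_se are hypothetical.

```r
# Illustrative sketch: re-create the simulation with the report's settings
set.seed(31)
lambda <- 0.2
n <- 40
simulations <- 1000
means_exponentials <- colMeans(replicate(simulations, rexp(n, lambda)))

# Monte Carlo standard error of the mean of the simulated averages
# (mc_se and gap_in_se are hypothetical names used only for this example)
mc_se <- sd(means_exponentials) / sqrt(simulations)

# Distance between sample mean and theoretical mean, in standard errors
gap_in_se <- abs(mean(means_exponentials) - 1 / lambda) / mc_se

mc_se
gap_in_se
```

Working from the values printed in this report (standard deviation 0.7931608, sample mean 4.993867), the standard error is roughly 0.025, so the gap of about 0.006 is about a quarter of one standard error.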

2. Show how variable the sample is (via variance) and compare it to the theoretical variance of the distribution.

Standard deviation of the simulated averages :-

SD_dist <- sd(means_exponentials)
SD_dist
## [1] 0.7931608

Theoretical standard deviation :-

SD_theory <- 1/lambda/sqrt(n)
SD_theory
## [1] 0.7905694

Variance of the simulated averages :-

var_dist <- SD_dist^2
var_dist
## [1] 0.6291041

Theoretical variance :-

var_theory <- SD_theory^2
var_theory
## [1] 0.625

Inference 2

The simulated and theoretical standard deviations are approximately equal: 0.7931608 ~ 0.7905694.
The simulated and theoretical variances are approximately equal: 0.6291041 ~ 0.625.
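The size of the gap can also be expressed as a relative difference. The sketch below is an illustration under the same seed and settings as the report, not part of the original analysis; it computes the variance directly with var() and the name rel_diff_var is hypothetical.

```r
# Illustrative sketch: re-create the simulation with the report's settings
set.seed(31)
lambda <- 0.2
n <- 40
simulations <- 1000
means_exponentials <- colMeans(replicate(simulations, rexp(n, lambda)))

# var() gives the same value as squaring sd()
var_dist <- var(means_exponentials)

# Theoretical variance of the average of n exponentials: (1/lambda)^2 / n
var_theory <- (1 / lambda)^2 / n

# Relative difference between simulated and theoretical variance
# (rel_diff_var is a hypothetical name used only for this example)
rel_diff_var <- (var_dist - var_theory) / var_theory
rel_diff_var
```

Using the values printed above, the variances differ by about 0.7% (0.0041 out of 0.625) and the standard deviations by about 0.3%.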

3. Show that the distribution is approximately normal.

xfit <- seq(min(means_exponentials), max(means_exponentials), length=100)
yfit <- dnorm(xfit, mean=1/lambda, sd=(1/lambda/sqrt(n)))
hist(means_exponentials, breaks = n, prob = TRUE, col = "blue", xlab = "means", main = "Density of means", ylab = "density")
# overlay the theoretical normal density as a dashed line
lines(xfit, yfit, col = "black", lty = 5)

Compare the distribution of averages of 40 exponentials to a normal distribution using a normal Q-Q plot:

qqnorm(means_exponentials)
qqline(means_exponentials, col = 2)

Inference 3

The distribution of averages of 40 exponentials is approximately normal, as the Central Limit Theorem predicts.
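A simple numeric complement to the plots is to check how many of the simulated averages fall within one and two theoretical standard deviations of the theoretical mean, and compare those shares with the normal reference values. This is a sketch under the report's seed and settings, not part of the original analysis; the names mu, sigma, within_1sd and within_2sd are hypothetical.

```r
# Illustrative sketch: re-create the simulation with the report's settings
set.seed(31)
lambda <- 0.2
n <- 40
simulations <- 1000
means_exponentials <- colMeans(replicate(simulations, rexp(n, lambda)))

# Theoretical mean and standard deviation of the averages
mu <- 1 / lambda
sigma <- 1 / lambda / sqrt(n)

# Share of simulated averages within 1 and 2 standard deviations of the mean
within_1sd <- mean(abs(means_exponentials - mu) <= sigma)
within_2sd <- mean(abs(means_exponentials - mu) <= 2 * sigma)
c(within_1sd = within_1sd, within_2sd = within_2sd)

# Reference shares for an exactly normal distribution
c(normal_1sd = pnorm(1) - pnorm(-1), normal_2sd = pnorm(2) - pnorm(-2))
```

If the averages are close to normal, the two simulated shares should sit near the reference values of roughly 68% and 95%. Some deviation is expected because averages of only 40 exponentials are still slightly right-skewed.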