Synopsis

The project consists of two parts: (1) a simulation exercise to explore statistical inference and (2) a basic inferential data analysis.

Simulation Exercise

This section runs simulations and data analyses to illustrate the central limit theorem.
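The code in the following sections uses `lambda`, `n` and `mean_data` without showing how they were created. A minimal sketch of one plausible setup is given here: `lambda = 0.2` and `n = 40` come from the text, while the seed, the number of simulations (`n_sims`) and the helper name `sim_matrix` are assumptions made for illustration, not the original author's values.

```r
# Sketch of the simulation setup. lambda and n are taken from the text;
# the seed and n_sims are assumptions chosen for illustration only.
set.seed(2024)
lambda <- 0.2    # rate of the exponential distribution
n <- 40          # number of exponentials averaged in each simulation
n_sims <- 1000   # assumed number of simulated averages

# Each row holds one simulation of n exponential draws;
# rowMeans() then gives one average per simulation.
sim_matrix <- matrix(rexp(n_sims * n, rate = lambda), nrow = n_sims, ncol = n)
mean_data <- rowMeans(sim_matrix)
```

With a different seed or number of simulations the printed values will differ slightly from those reported in this document.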

1 Show the sample mean and compare it to the theoretical mean of the distribution

Sample Mean

sampleMean <- mean(mean_data) # Mean of sample means
print (paste("Sample Mean result = ", sampleMean))
## [1] "Sample Mean result =  4.99336179838295"

Theoretical Mean

# the expected mean of an exponential distribution with rate lambda is 1/lambda
theoretical_mean <- (1/lambda)
print (paste("Theoretical Mean = ", theoretical_mean))
## [1] "Theoretical Mean =  5"

# Histogram shows differences

hist(mean_data, col="light green", xlab = "Mean Average", main="Distribution of Exponential Average")
abline(v = theoretical_mean, col="yellow")
abline(v = sampleMean, col="blue")

2 Show how variable the sample is (via variance) and compare it to the theoretical variance of the distribution

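The theoretical standard deviation of the exponential distribution is also 1/lambda, which for lambda = 0.2 equals 5, so its variance is 25. The quantity simulated here, however, is the mean of n = 40 exponentials, whose theoretical variance is (1/lambda)^2 / n = 25/40 = 0.625 (standard deviation about 0.79). That is the value the sample variance should be compared against.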

# sample deviation & variance
sampledev <- sd(mean_data)
print(sampledev)
## [1] 0.8235759
# The variance is the square of the standard deviation
samplevar <- sampledev^2
print(samplevar)
## [1] 0.6782773
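The sample variance printed above is best compared with the theoretical variance of the mean of n exponentials, (1/lambda)^2 / n, rather than with the variance of a single exponential. A minimal sketch of that arithmetic, assuming lambda = 0.2 and n = 40 as stated in the text (the variable names are illustrative):

```r
lambda <- 0.2   # rate, from the text
n <- 40         # exponentials per average, from the text

# Theoretical standard deviation and variance of the mean of n exponentials
theoretical_sd_of_mean <- (1 / lambda) / sqrt(n)
theoretical_var_of_mean <- theoretical_sd_of_mean^2

# Sample variance as printed in this document
reported_sample_var <- 0.6782773

print(theoretical_sd_of_mean)
print(theoretical_var_of_mean)
print(reported_sample_var - theoretical_var_of_mean)
```

The theoretical variance works out to 25/40 = 0.625, so the reported sample variance of about 0.678 is close to the value the central limit theorem predicts.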

3 Show that the distribution is approximately normal

hist(mean_data, breaks = n, prob = TRUE, col = "light green", xlab = "Means")
x <- seq(min(mean_data), max(mean_data), length.out = 100)
# Overlay the normal density predicted by the CLT: mean 1/lambda, sd (1/lambda)/sqrt(n)
lines(x, dnorm(x, mean = 1/lambda, sd = (1/lambda)/sqrt(n)), col = "blue")

### Means

qqnorm(mean_data)
qqline(mean_data, col = "green")

The distribution of averages of 40 exponentials is very close to a normal distribution.
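As a rough numeric complement to the plots, one can check how many simulated averages fall within 1.96 theoretical standard errors of 1/lambda. For a normal distribution that share is about 95%. The sketch below re-creates the data under assumed settings (the seed and `n_sims = 1000` are assumptions, not values from the original analysis), so its result will not match the original run exactly.

```r
# Assumed setup: lambda and n are from the text; seed and n_sims are illustrative.
set.seed(2024)
lambda <- 0.2
n <- 40
n_sims <- 1000
mean_data <- rowMeans(matrix(rexp(n_sims * n, rate = lambda), nrow = n_sims))

# Standardize each average using the theoretical mean and standard error
se <- (1 / lambda) / sqrt(n)
z <- (mean_data - 1 / lambda) / se

# Share of averages within the central 95% band of a standard normal
coverage <- mean(abs(z) <= qnorm(0.975))
print(coverage)
```

A share near 0.95 is consistent with the approximate normality seen in the histogram and Q-Q plot. It is an informal check, not a formal test.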