Course Statistical Inference

Simulations

One thousand simulations are performed. The rate parameter lambda is set to 0.2, and each simulation takes the mean of 40 random draws from the exponential distribution. The distribution of these 1,000 means is then examined.

The simulations illustrate:

how the sample mean compares with the theoretical mean of the distribution,
how variable the sample is compared with the theoretical variance of the distribution,
that the distribution of the sample means is approximately normal.

Sample vs Theoretical means

Setup

set.seed(2612)
library(tidyverse)
library(ggrepel)
#library(qqplotr)

#install.packages("ggrepel")
#install.packages("qqplotr")

# given by the assignment
lambda <- 0.2

# number of exponential draws (sample size) in each simulation
n_of_distributions <- 40

# number of simulations
number_of_simulations <- 1000

# run the simulations: each column holds the draws of one simulation
simulations <- replicate(number_of_simulations, 
                         rexp(n_of_distributions, lambda))
glimpse(simulations)

##  num [1:40, 1:1000] 7.75 1.68 6.89 6.86 21.05 ...

Mean comparison

The theoretical mean of the exponential distribution is 1/lambda:

theoretical_mean <- 1/lambda
paste0('Theoretical mean: ', theoretical_mean)

## [1] "Theoretical mean: 5"

For the sample mean, I compute the mean of each simulation (one per column) and then take the mean of those sample means.

simulated_means <- apply(simulations, 2, mean)

mean_of_sampled_means <- mean(simulated_means)
paste0('Sample mean: ', mean_of_sampled_means)

## [1] "Sample mean: 5.00834889322193"

Visually this can be displayed as:
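
A minimal sketch of one such plot, assuming a histogram of the sample means with vertical reference lines for the two means; the original figure may have looked different. The block is self-contained and the names sample_size, n_simulations, means_df, reference_lines and mean_plot are illustrative, not taken from the report:

```r
library(ggplot2)

set.seed(2612)
lambda <- 0.2
sample_size <- 40        # draws per simulation (assumed name)
n_simulations <- 1000    # number of simulations (assumed name)

# one sample mean per simulation
sample_means <- replicate(n_simulations, mean(rexp(sample_size, lambda)))
means_df <- data.frame(sample_mean = sample_means)

# reference lines: theoretical mean (1/lambda) and the mean of the sample means
reference_lines <- data.frame(
  label = c("theoretical mean", "mean of sample means"),
  value = c(1 / lambda, mean(sample_means))
)

mean_plot <- ggplot(means_df, aes(x = sample_mean)) +
  geom_histogram(bins = 40, fill = "grey80", colour = "white") +
  geom_vline(data = reference_lines,
             aes(xintercept = value, colour = label, linetype = label)) +
  labs(x = "Sample mean of 40 exponentials", y = "Count",
       colour = NULL, linetype = NULL)

print(mean_plot)
```

If the two vertical lines nearly coincide, the simulated means are centred close to the theoretical value.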

Sample vs Theoretical variances

The theoretical variance of the exponential distribution is 1/lambda^2:

theoretical_variance <- 1/(lambda^2)
paste0('Theoretical variance: ', theoretical_variance)

## [1] "Theoretical variance: 25"

The variance of the simulated data is computed the same way as the mean: one variance per simulation, then the mean of those variances:

simulated_variances <- apply(simulations, 2, var)

mean_of_sampled_variances <- mean(simulated_variances)
paste0('Sample data variance: ', mean_of_sampled_variances)

## [1] "Sample data variance: 25.3161795497843"
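
The comparison above concerns the variance of the underlying exponential data. As a complementary check that is not part of the original analysis, the variance of the sample means themselves can be compared with its theoretical value, (1/lambda^2)/n, where n = 40 is the number of draws per simulation. A minimal, self-contained sketch; the variable names are illustrative:

```r
set.seed(2612)
lambda <- 0.2
sample_size <- 40        # draws per simulation (assumed name)
n_simulations <- 1000    # number of simulations (assumed name)

# one sample mean per simulation
sample_means <- replicate(n_simulations, mean(rexp(sample_size, lambda)))

# theoretical variance of the mean of n independent draws: sigma^2 / n
theoretical_variance_of_mean <- (1 / lambda^2) / sample_size

# observed variance across the simulated sample means
observed_variance_of_means <- var(sample_means)

paste0("Theoretical variance of the sample mean: ", theoretical_variance_of_mean)
paste0("Observed variance of the sample means: ", observed_variance_of_means)
```

The sample means vary far less than the individual draws because averaging 40 values divides the variance by 40.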

Visually, the difference between the theoretical variance and the sample variances can be displayed, for example, as follows:
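
One possible version of this plot, sketched under the assumption that it shows a histogram of the 1,000 sample variances with the theoretical variance and the mean of the sample variances marked. The names sample_variances, variances_df and variance_plot are illustrative:

```r
library(ggplot2)

set.seed(2612)
lambda <- 0.2
sample_size <- 40        # draws per simulation (assumed name)
n_simulations <- 1000    # number of simulations (assumed name)

# one sample variance per simulation
sample_variances <- replicate(n_simulations, var(rexp(sample_size, lambda)))
variances_df <- data.frame(sample_variance = sample_variances)

variance_plot <- ggplot(variances_df, aes(x = sample_variance)) +
  geom_histogram(bins = 40, fill = "grey80", colour = "white") +
  # solid red line: theoretical variance 1/lambda^2
  geom_vline(xintercept = 1 / lambda^2, colour = "red") +
  # dashed blue line: mean of the simulated sample variances
  geom_vline(xintercept = mean(sample_variances), colour = "blue",
             linetype = "dashed") +
  labs(x = "Sample variance of 40 exponentials", y = "Count")

print(variance_plot)
```

The histogram is expected to be right-skewed, since the sample variance of a small exponential sample can occasionally be very large.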

Distribution of the means is approximately normal

Quantile-quantile (Q-Q) plot
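
A Q-Q plot places the quantiles of the sample means against the quantiles of a normal distribution; points close to the reference line suggest approximate normality. A minimal sketch using ggplot2's stat_qq() and stat_qq_line(), which may differ from the plot in the original report (the setup mentions qqplotr, which is not used here). Variable names are illustrative:

```r
library(ggplot2)

set.seed(2612)
lambda <- 0.2
sample_size <- 40        # draws per simulation (assumed name)
n_simulations <- 1000    # number of simulations (assumed name)

# one sample mean per simulation
sample_means <- replicate(n_simulations, mean(rexp(sample_size, lambda)))
means_df <- data.frame(sample_mean = sample_means)

qq_plot <- ggplot(means_df, aes(sample = sample_mean)) +
  stat_qq() +                        # sample quantiles vs. normal quantiles
  stat_qq_line(colour = "red") +     # line through the first and third quartiles
  labs(x = "Theoretical normal quantiles", y = "Sample-mean quantiles")

print(qq_plot)
```

Some departure from the line in the upper tail would not be surprising, because the mean of 40 exponentials is still slightly right-skewed.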

Density of the sample means compared with a normal density with the same mean and standard deviation

If we take a normal distribution with the same mean and standard deviation as our simulated means and compare the two densities, we can “eyeball” the plot to see how close they are.
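
The comparison described here can be sketched as a kernel density estimate of the sample means with a normal density curve overlaid. This is an illustrative reconstruction, not the original figure, and the names means_df and density_plot are assumptions:

```r
library(ggplot2)

set.seed(2612)
lambda <- 0.2
sample_size <- 40        # draws per simulation (assumed name)
n_simulations <- 1000    # number of simulations (assumed name)

# one sample mean per simulation
sample_means <- replicate(n_simulations, mean(rexp(sample_size, lambda)))
means_df <- data.frame(sample_mean = sample_means)

density_plot <- ggplot(means_df, aes(x = sample_mean)) +
  # solid line: kernel density estimate of the simulated means
  geom_density(colour = "black") +
  # dashed red line: normal density with the same mean and standard deviation
  stat_function(fun = dnorm,
                args = list(mean = mean(sample_means), sd = sd(sample_means)),
                colour = "red", linetype = "dashed") +
  labs(x = "Sample mean of 40 exponentials", y = "Density")

print(density_plot)
```

The closer the two curves lie to each other, the better the normal approximation describes the simulated means.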

Course “Statistical Inference” - Project: Part 1

Alexander Barrantes Herrera