Central Limit Theorem (CLT) Demo

26 December 2017

Central Limit Theorem

The Central Limit Theorem establishes that, in most situations, when independent random variables are added, their properly normalized sum tends toward a normal distribution (a bell curve), even if the original variables themselves are not normally distributed.

The theorem is a key concept in probability theory because it implies that probabilistic and statistical methods that work for normal distributions can be applied to many problems involving other types of distributions.
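To make "properly normalized" concrete, here is a minimal sketch in R. It assumes exponential draws with rate 0.2, for which the mean and the standard deviation are both 1/0.2 = 5. The names `rate`, `n`, `nsim`, `draws` and `z` are illustrative and not part of the project code.

```r
set.seed(1)
rate  <- 0.2        # assumed rate; mean and sd of the exponential are both 1 / rate
mu    <- 1 / rate
sigma <- 1 / rate
n     <- 50         # number of variables averaged in each simulation
nsim  <- 5000       # number of simulated averages

# each row holds one sample of n independent exponential draws
draws <- matrix(rexp(n * nsim, rate), nrow = nsim, ncol = n)

# standardized sample means: sqrt(n) * (xbar - mu) / sigma
z <- sqrt(n) * (rowMeans(draws) - mu) / sigma

# if the normal approximation is reasonable, these should be near 0 and 1
mean(z)
sd(z)
```

If the approximation holds, a histogram of `z` should look roughly like a standard normal curve, even though each individual draw is strongly skewed.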

Description

In this project, we generate random numbers from an exponential distribution in R and compare the behavior of their averages with what the Central Limit Theorem predicts. We investigate the distribution of averages of n random numbers, where n is selected by the user with the first slider. The number of simulations is set with the second slider.
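The two slider inputs can be thought of as the arguments of a single function. The sketch below is one possible way to package that idea. `simulate_means`, `n`, `nsim` and `rate` are hypothetical names chosen for illustration, and the way the project actually connects the sliders to the computation may differ.

```r
# Hypothetical helper: n stands in for slider 1, nsim for slider 2
simulate_means <- function(n, nsim, rate = 0.2) {
  stopifnot(n >= 1, nsim >= 1)
  # n exponential random numbers
  distribution <- rexp(n, rate)
  # resample indices with replacement; sample.int() avoids the special
  # behavior of sample() when its first argument has length one
  idx <- sample.int(n, n * nsim, replace = TRUE)
  resample <- matrix(distribution[idx], nrow = nsim, ncol = n)
  # one average per simulation
  rowMeans(resample)
}

set.seed(0)
means <- simulate_means(n = 50, nsim = 1000)
length(means)
```

Changing `n` or `nsim` in the call plays the role of moving the corresponding slider.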

The value from slider 1 sets how many exponential random numbers are generated. Here we assume some input values: 50 random numbers and a rate of 0.2.

        set.seed(0)                                # fix the seed for reproducibility
        randomNumCount <- 50                       # assumed value for slider 1
        distribution <- rexp(randomNumCount, 0.2)  # exponential draws with rate 0.2
        mean(distribution)

## [1] 4.787154

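The mean of the 50 generated values is 4.7871537. The theoretical mean of an exponential distribution with rate 0.2 is 1/0.2 = 5, so the sample mean is close to it but not equal.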

Plot of exponential distribution

We see that it is not a normal distribution.
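The code behind this plot is not shown, so the following is only a sketch of one way to draw it with base R graphics. The number of breaks, the title and the overlaid theoretical density are assumptions made for illustration.

```r
set.seed(0)
randomNumCount <- 50
distribution <- rexp(randomNumCount, 0.2)

# histogram on the density scale so that a density curve can be overlaid
h <- hist(distribution, freq = FALSE, breaks = 10,
          main = "Exponential sample (rate = 0.2)", xlab = "value")

# theoretical exponential density for comparison
curve(dexp(x, rate = 0.2), from = 0, to = max(distribution),
      add = TRUE, lwd = 2)
```

The bars are expected to pile up near zero with a long tail to the right, which is the opposite of a symmetric bell shape.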

SIMULATIONS

We will run one thousand simulations. The code below draws 50 * 1000 = 50,000 values with replacement from the 50 exponential numbers generated above and arranges them in a matrix with 1000 rows (one per simulation) and 50 columns (one per value in each simulated sample).

        set.seed(0)
        # resample the 50 values with replacement: 1000 rows (simulations) x 50 columns
        resample <- matrix(sample(distribution, randomNumCount * 1000, replace = TRUE), 1000, randomNumCount)
        # apply the mean function to each row of the simulated matrix
        resamplemean <- apply(resample, 1, mean)
        df <- data.frame(x = resamplemean)
        # calculate the mean of the 1000 simulated sample means
        mean(df$x)

## [1] 4.77257

The mean of the 1000 simulated sample means is 4.77257, close to the mean of the original 50 values (4.7871537).
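Besides the center, the spread of the averages can be checked as well. For averages of n values, the standard deviation should be roughly the standard deviation of the source values divided by sqrt(n). The sketch below repeats the steps above and compares the two. The names `observed_sd` and `predicted_sd` are illustrative, and the comparison is only approximate.

```r
set.seed(0)
randomNumCount <- 50
distribution <- rexp(randomNumCount, 0.2)

set.seed(0)
resample <- matrix(sample(distribution, randomNumCount * 1000, replace = TRUE),
                   1000, randomNumCount)
resamplemean <- apply(resample, 1, mean)

# spread of the 1000 simulated averages
observed_sd <- sd(resamplemean)

# spread suggested by the sd of the 50 source values, scaled by sqrt(n)
predicted_sd <- sd(distribution) / sqrt(randomNumCount)

c(observed = observed_sd, predicted = predicted_sd)
```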

Plot of the averages of simulated distribution

This histogram shows that the averages of the resampled values follow an approximately normal distribution, and that their mean is close to the mean of the original values. The overlaid density curve supports the same conclusion.
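The plotting code is not included here. A histogram with a density curve like the one described could be produced along these lines with base R graphics. This is a sketch: the number of breaks, the line styles and the added normal reference curve are assumptions, and the original plot may have been made with a different plotting library.

```r
set.seed(0)
randomNumCount <- 50
distribution <- rexp(randomNumCount, 0.2)

set.seed(0)
resample <- matrix(sample(distribution, randomNumCount * 1000, replace = TRUE),
                   1000, randomNumCount)
resamplemean <- apply(resample, 1, mean)

# histogram of the simulated averages on the density scale
hist(resamplemean, freq = FALSE, breaks = 30,
     main = "Averages of 50 resampled values", xlab = "average")

# kernel density estimate of the averages
lines(density(resamplemean), lwd = 2)

# normal curve with the same mean and sd, for visual comparison
curve(dnorm(x, mean = mean(resamplemean), sd = sd(resamplemean)),
      add = TRUE, lty = 2)

# vertical line at the mean of the original 50 values
abline(v = mean(distribution), lty = 3)
```

If the kernel density and the dashed normal curve lie close together, that supports the approximately normal shape described above.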