2017 June 29

What is the normal distribution?

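According to Wikipedia, the normal (or Gaussian) distribution is "a very common continuous probability distribution. Normal distributions are important in statistics and are often used in the natural and social sciences to represent real-valued random variables whose distributions are not known". A normal distribution is fully described by two parameters, its mean and its standard deviation. The standard normal distribution has mean 0 and standard deviation 1, so the sample mean and sample standard deviation of numbers drawn from it get closer to 0 and 1 as the sample grows: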

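# Draw 1,000 values from the standard normal distribution
num_norm <- rnorm(1000, mean = 0, sd = 1)
mean(num_norm)  # sample mean, close to 0
## [1] -0.01360619
sd(num_norm)    # sample standard deviation, close to 1
## [1] 0.9878264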
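
The values 0 and 1 are specific to the standard normal distribution. As a small sketch, the same check can be run with other parameters; the mean of 50, the standard deviation of 10, the seed, and the name `num_shifted` are arbitrary choices made for this illustration.

```r
set.seed(1)  # assumption: fixed seed so the run is repeatable

# Hypothetical parameters, chosen only for illustration
num_shifted <- rnorm(100000, mean = 50, sd = 10)

mean(num_shifted)  # close to 50, the mean passed to rnorm()
sd(num_shifted)    # close to 10, the sd passed to rnorm()
```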

CLT and normal distribution

One of the most important results behind the normal distribution is the Central Limit Theorem (CLT). According to this theorem, when independent random values with finite variance are added together, their sum, once centered and scaled, tends towards a normal distribution. This is true even when the original values are not normally distributed.

There are several versions of the CLT for independent sequences:

  • Classical CLT
  • Lyapunov CLT
  • Lindeberg CLT
  • Multidimensional CLT
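
The classical case can be tried out with a short simulation. This is only a sketch: the exponential distribution, the number of terms per sum, the seed, and all variable names are assumptions picked for the illustration. Each row of the matrix holds 30 independent draws from a clearly skewed distribution, and the row sums are standardized using the known mean and variance of the exponential distribution with rate 1 (both equal to 1).

```r
set.seed(1)  # assumption: fixed seed so the run is repeatable

n_terms <- 30     # values added together in each sum
n_sums  <- 10000  # number of sums to simulate

# Each row holds n_terms independent draws from Exp(1), a skewed distribution
draws <- matrix(rexp(n_terms * n_sums, rate = 1), nrow = n_sums)
sums  <- rowSums(draws)

# Exp(1) has mean 1 and variance 1, so each sum has mean n_terms
# and standard deviation sqrt(n_terms)
z <- (sums - n_terms) / sqrt(n_terms)

mean(z)  # should be near 0
sd(z)    # should be near 1

# Histogram of the standardized sums with the standard normal density on top
hist(z, breaks = 50, freq = FALSE, main = "Standardized sums of Exp(1) draws")
curve(dnorm(x), add = TRUE, col = "red")
```

Although the individual draws are far from normal, the histogram of `z` should sit close to the red curve.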

The famous Bell curve

The normal distribution is often represented by the bell curve. The curve has its own characteristics, which can be described, for example, using the standard deviation: about 68% of normally distributed values fall within mean ± 1 sd (standard deviation), about 95% within mean ± 2 sd, and about 99.7% within mean ± 3 sd.
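
These percentages can be checked with `pnorm()`, the cumulative distribution function of the normal distribution in base R. The names `k` and `coverage` below are just labels for this sketch.

```r
k <- 1:3  # number of standard deviations around the mean

# Probability that a standard normal value lies between -k and +k
coverage <- pnorm(k) - pnorm(-k)
round(coverage, 4)
## [1] 0.6827 0.9545 0.9973
```

Because any normal distribution is a shifted and rescaled version of the standard one, the same proportions hold for every mean and standard deviation.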

The evolution of the Bell curve

My Shiny application simulates the evolution of the bell curve by letting the user set the sample size. As the sample size grows (10, 100, 1,000, and 10,000), the shape of the sample's distribution looks more and more like the curve associated with the normal distribution.
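
The same idea can be sketched without Shiny, in plain base R. This is not the application's code, only a rough stand-in: the seed, the number of histogram bins, and the names `sample_sizes` and `ks_distance` are assumptions made for this illustration. For each sample size it draws a histogram with the standard normal density on top and records the Kolmogorov-Smirnov distance between the sample and the standard normal distribution as a rough measure of how close the two are.

```r
set.seed(2017)  # assumption: fixed seed so the run is repeatable

sample_sizes <- c(10, 100, 1000, 10000)
ks_distance  <- numeric(length(sample_sizes))

old_par <- par(mfrow = c(2, 2))  # one panel per sample size
for (i in seq_along(sample_sizes)) {
  n <- sample_sizes[i]
  draws <- rnorm(n)

  # Histogram on the density scale, with the theoretical curve in red
  hist(draws, breaks = 30, freq = FALSE, main = paste("n =", n), xlab = "value")
  curve(dnorm(x), add = TRUE, col = "red")

  # Largest gap between the empirical CDF and the standard normal CDF
  ks_distance[i] <- unname(ks.test(draws, "pnorm")$statistic)
}
par(old_par)

names(ks_distance) <- sample_sizes
ks_distance
```

With only 10 values the histogram is ragged; by 10,000 values it should follow the red curve closely, and the recorded distance should be small.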