2023-04-15

What Is a Normal Distribution?

A normal distribution, also called a Gaussian distribution, is a continuous probability distribution for a real-valued random variable.

Its probability density function is:

\({f(x)=}{1\over{\sigma\sqrt{2\pi}}}{e^{-{1\over2}{({{x-\mu}\over{\sigma}})^2}}}\)

Where \({\mu}\) is the mean, \({\sigma}\) is the standard deviation, and \({\sigma^2}\) is the variance of the distribution.
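As a quick check on the formula, here is a minimal sketch that writes the density out by hand and compares it with R's built-in dnorm(). The function name normal_pdf and the evaluation points are assumptions made for illustration only.

```r
# Normal density written out term by term from the formula above.
# normal_pdf is an illustrative name, not a base R function.
normal_pdf <- function(x, mu = 0, sigma = 1) {
  1 / (sigma * sqrt(2 * pi)) * exp(-0.5 * ((x - mu) / sigma)^2)
}

x <- c(-2, -1, 0, 1, 2)
manual  <- normal_pdf(x, mu = 0, sigma = 1)
builtin <- dnorm(x, mean = 0, sd = 1)

# TRUE when the two agree up to floating-point tolerance
print(all.equal(manual, builtin))

# The total area under a density should be close to 1
area <- integrate(normal_pdf, lower = -Inf, upper = Inf)$value
print(area)
```

In practice dnorm() is the function to use; the hand-written version is only there to show where each term of the equation ends up.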

Because of the shape of its density, the normal distribution is often referred to as a bell curve.

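Normal distributions are central to statistics and analytics: by the central limit theorem, averages of many independent observations with finite variance are approximately normally distributed, which is why many common statistical methods assume normality.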

Mean

The mean \({\mu_x}\) of a random variable \(X\) is calculated as follows:

  • Discrete:
    \(\displaystyle R_{x} = \{x_1,x_2,x_3,\dots\},{\mu_x= \sum_{i}x_{i}\cdot P(X=x_{i})}\)
  • Continuous (density):
    \(\displaystyle{\mu_x= \int_{-\infty}^{\infty}x \cdot f(x)dx }\)
  • Sample Mean:
    \(\displaystyle{\bar{X}= {1\over n}\sum_{i=1}^{n}X_i}\)
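The three definitions can be tried out in R. This is a small sketch under assumed inputs: the fair six-sided die, the normal density with mean 2, and the simulated sample are all illustrative choices, not part of the definitions.

```r
# Discrete case: a fair six-sided die (illustrative example)
x <- 1:6
p <- rep(1 / 6, 6)
mu_discrete <- sum(x * p)   # sum of x_i * P(X = x_i)

# Continuous case: numerically integrate x * f(x) for a normal density
# with mean 2 and standard deviation 1 (values chosen for illustration)
mu_continuous <- integrate(function(x) x * dnorm(x, mean = 2, sd = 1),
                           lower = -Inf, upper = Inf)$value

# Sample mean: (1 / n) * sum of the observations, compared with mean()
set.seed(42)
s <- rnorm(100, mean = 0, sd = 1)
xbar <- sum(s) / length(s)

print(mu_discrete)                 # 3.5 for a fair die
print(mu_continuous)               # close to 2, up to numerical error
print(all.equal(xbar, mean(s)))    # TRUE
```

Note that the sample mean of the simulated data will be near 0 but not exactly 0, since it is an estimate of the population mean.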

Variance and Standard Deviation

  • Discrete:
    \({\displaystyle {\sigma_{x}^{2}=\sum_{i}(x_{i}-\mu_{x})^{2}{\cdot}P(X=x_i)}}\)
  • Continuous (density):
    \({\displaystyle{\sigma_{x}^{2}}=\int_{-\infty}^{\infty}(x-\mu_{x})^{2}{\cdot}f(x)dx}\)
  • Standard Deviation:
    \({\sigma_{x}=\sqrt{\sigma_{x}^{2}}}\)
  • Sample Variance:
    \({\displaystyle S^{2}={1\over{n-1}}\sum_{i=1}^{n}(X_{i}-{\bar{X}})^{2}}\)
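The variance formulas can be checked the same way. The sketch below again assumes a fair six-sided die for the discrete case and an arbitrarily chosen normal density (mean 2, standard deviation 1.5) for the continuous case; neither comes from the text above.

```r
# Discrete case: variance of a fair six-sided die (illustrative example)
x <- 1:6
p <- rep(1 / 6, 6)
mu <- sum(x * p)
var_discrete <- sum((x - mu)^2 * p)   # equals 35/12

# Continuous case: integrate (x - mu)^2 * f(x) for a normal density
# with mean 2 and standard deviation 1.5 (values chosen for illustration)
var_continuous <- integrate(
  function(x) (x - 2)^2 * dnorm(x, mean = 2, sd = 1.5),
  lower = -Inf, upper = Inf
)$value
sd_continuous <- sqrt(var_continuous)   # standard deviation, close to 1.5

# Sample variance with the n - 1 denominator, compared with var() and sd()
set.seed(42)
s <- rnorm(100, mean = 0, sd = 1)
n <- length(s)
s2 <- sum((s - mean(s))^2) / (n - 1)

print(var_discrete)
print(var_continuous)
print(all.equal(s2, var(s)))          # TRUE
print(all.equal(sqrt(s2), sd(s)))     # TRUE
```

R's var() and sd() use the n - 1 denominator, which is why they match the sample variance formula rather than the population one.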

A Histogram with Normal Distribution

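library(ggplot2)  # provides ggplot(), geom_histogram(), after_stat()

set.seed(42)  # fix the random seed so the sample is reproducible

# Draw 100 values from a standard normal distribution
df <- data.frame(x = rnorm(n = 100, mean = 0, sd = 1))

# Histogram scaled to density (bar areas sum to 1) rather than counts
ggplot(df, aes(x = x)) +
  geom_histogram(aes(y = after_stat(density)), bins = 50,
                 color = "blue", fill = "gold", na.rm = TRUE) +
  xlim(-4, 4) + ggtitle("Normal Distribution, n=100") +
  theme(plot.title = element_text(color = "blue", size = 14, hjust = 0.5))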

Multiple Distributions, with Normal in Red

An Interactive Plotly Plot

References