To understand the normal distribution, you first need to understand two concepts: the mean and the standard deviation.
Mean
The mean, also known as the average, is the value you get when you divide the sum of a series of observations by the total quantity of observations.
For example:
- Let’s say we have 4 observations: 5, 5, 5, 5
- The average of these 4 observations is (5+5+5+5)/4 = 20/4 = 5
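The same calculation can be sketched in R. The variable names `observations`, `manual_mean`, and `builtin_mean` are illustrative choices, not part of the original example; `sum()`, `length()`, and `mean()` are base R functions.

```r
# Observations from the example above
observations <- c(5, 5, 5, 5)

# Mean computed by hand: sum of the observations divided by their count
manual_mean <- sum(observations) / length(observations)

# Mean computed with R's built-in function
builtin_mean <- mean(observations)

print(manual_mean)   # 5
print(builtin_mean)  # 5
```

Both approaches give the same value; `mean()` is simply the shorter way to write it.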
Standard Deviation
The standard deviation is a measure used in statistics to evaluate how widely dispersed observations are from the mean.
- Let’s say we have 5 observations: 1, 1, 2, 1, 1
- The mean of these observations is (1+1+2+1+1)/5 = 1.2
- For each observation, subtract the mean and square the result:
- observation 1: (1 - 1.2)^2 = (-0.2)^2 = 0.04
- observation 2: (1 - 1.2)^2 = (-0.2)^2 = 0.04
- observation 3: (2 - 1.2)^2 = (0.8)^2 = 0.64
- observation 4: (1 - 1.2)^2 = (-0.2)^2 = 0.04
- observation 5: (1 - 1.2)^2 = (-0.2)^2 = 0.04
- The mean of these squared differences is (0.04+0.04+0.64+0.04+0.04)/5 = 0.80/5 = 0.16
- The square root of 0.16 is 0.40
The standard deviation for these 5 observations is 0.40.
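The steps above can be sketched in R as follows. Because the worked example divides by the number of observations, it yields what is usually called the population standard deviation. R's built-in `sd()` divides by n - 1 instead (the sample standard deviation), so it returns a slightly larger value for the same data. The variable names here are illustrative assumptions.

```r
# Observations from the example above
observations <- c(1, 1, 2, 1, 1)
n <- length(observations)

# Step 1: the mean of the observations
obs_mean <- mean(observations)

# Step 2: subtract the mean from each observation and square the result
squared_diffs <- (observations - obs_mean)^2

# Step 3: average the squared differences (dividing by n, as in the example)
population_variance <- sum(squared_diffs) / n

# Step 4: take the square root
population_sd <- sqrt(population_variance)

# For comparison: R's sd() divides by (n - 1) rather than n
sample_sd <- sd(observations)

print(round(population_sd, 2))  # 0.4
print(round(sample_sd, 2))      # 0.45
```

For large data sets the difference between dividing by n and by n - 1 becomes negligible, which is why the two are often used interchangeably in informal explanations.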
Normal Distribution
A normal distribution refers to a set of observations where the highest concentration of observations occurs around the mean (average), with fewer observations occurring towards the tails (far from the mean).
For data that follow a normal distribution:
- about 68% of observed values fall within 1 standard deviation of the mean
- about 95% of observed values fall within 2 standard deviations of the mean
- about 99.7% of observed values fall within 3 standard deviations of the mean
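These percentages can be checked with `pnorm()`, the cumulative distribution function of the normal distribution in base R. The sketch below uses the standard normal distribution (mean 0, standard deviation 1), for which "within k standard deviations" means "between -k and k". The names `sd_multiples` and `prob_within` are illustrative.

```r
# Number of standard deviations on either side of the mean
sd_multiples <- c(1, 2, 3)

# Probability of falling between -k and +k for a standard normal variable
prob_within <- pnorm(sd_multiples) - pnorm(-sd_multiples)

# Express as percentages, rounded to one decimal place
percent_within <- round(prob_within * 100, 1)
print(percent_within)  # 68.3 95.4 99.7
```

The commonly quoted figures of 68%, 95%, and 99.7% are rounded versions of these values.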
Normal Distribution, Mean, & Standard Deviation Visualized
Below is a visualization of a normal distribution.
- The blue line indicates the mean value.
- The red lines indicate 1 standard deviation above/below the mean.
library(ggplot2)
library(dplyr)
library(knitr)
# Draw 1500 random values from a normal distribution with mean 0 (sd defaults to 1)
groupA <- rnorm(1500, mean = 0)
normal_hist_A <- data.frame(groupA)

# Compute the sample mean and standard deviation once for the reference lines
mean_A <- mean(normal_hist_A$groupA)
sd_A <- sd(normal_hist_A$groupA)

ggplot(normal_hist_A) +
  geom_histogram(aes(x = groupA), bins = 30, fill = "salmon") +
  geom_vline(xintercept = mean_A, color = "blue") +  # mean
  geom_vline(xintercept = c(mean_A - sd_A, mean_A + sd_A), color = "red") +  # mean +/- 1 sd
  theme_minimal() +
  xlab("observed value") +
  ylab("number of observations")
