This is a summary of my page: https://dataZ4s.com/statistics/normal-distribution-characteristics/
As explained by the Empirical Rule, approximately 68% of the area under the normal distribution lies within one standard deviation of the mean, and approximately 95% lies within two standard deviations of the mean.
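As a quick numerical check of the Empirical Rule, the areas can be computed on the standard normal distribution with pnorm. This is only a minimal sketch; the variable names within_1sd and within_2sd are my own and are used for illustration only.

```r
# Area within one and two standard deviations of the mean (standard normal)
within_1sd <- pnorm(1) - pnorm(-1)
within_2sd <- pnorm(2) - pnorm(-2)
round(c(within_1sd, within_2sd), 4)
## [1] 0.6827 0.9545
```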
\[\mu = \text{Population mean}\]
\[\sigma^2=\text{Population variance}\]
\[\sigma=\text{Population standard deviation}\]
\[\bar x=\text{Sample mean}\]
\[s^2=\text{Sample variance}\]
\[s=\text{Sample standard deviation}\]
\[SE=\text{Standard error}\]
The higher n is, the higher the proportion of the data that centers around the mean.
This can also be deduced from the formula for the sample variance: \[s^2_{n-1}=\frac{\sum^n_{i=1} (x_i-\bar x)^2}{n-1}\]
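The sample variance formula can be traced step by step in R and compared with the built-in var function. This is a sketch on hypothetical data (x_obs is made up for illustration); the last line assumes the usual definition of the standard error of the mean, s divided by the square root of n.

```r
# Hypothetical data, chosen only for illustration
x_obs <- c(61, 63, 65, 67, 69)
n <- length(x_obs)

# Sample variance: sum of squared deviations from the sample mean, divided by n - 1
s2_manual <- sum((x_obs - mean(x_obs))^2) / (n - 1)
s2_manual
## [1] 10

# The built-in var() uses the same n - 1 denominator
var(x_obs)
## [1] 10

# Standard error of the sample mean, assuming SE = s / sqrt(n)
se <- sqrt(s2_manual) / sqrt(n)
se
## [1] 1.414214
```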
95% confidence interval examples
For a given confidence level, the critical value of the t-distribution is greater than that of the normal distribution. The two approach each other as the degrees of freedom increase.
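The two critical values for a two-sided 95% confidence interval can be compared with qnorm and qt. This is a minimal sketch; the degrees of freedom (9, 29 and 999) are arbitrary values picked for illustration.

```r
# Critical value of the standard normal for a two-sided 95% interval
z_crit <- qnorm(0.975)
z_crit
## [1] 1.959964

# Critical values of the t-distribution for increasing degrees of freedom
t_crit <- qt(0.975, df = c(9, 29, 999))
round(t_crit, 3)
## [1] 2.262 2.045 1.962
```

The t critical value stays above the normal one but moves toward it as the degrees of freedom grow.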
Calculating probabilities, percentiles and taking random samples from a normally distributed variable.
Example
I will use the example of X being normally distributed with a mean of 65 and a standard deviation of 4:
\[X \sim N\big(\mu=65, \sigma^2 = 4^2\big)\]
The pnorm function can be used to calculate probabilities for a normal random variable:
# P(X <= 60):
pnorm(q = 60, mean = 65, sd = 4, lower.tail = TRUE)
## [1] 0.1056498
# Can also be written:
pnorm(60,65,4)
## [1] 0.1056498
# P(X >= 75): upper tail, so lower.tail = FALSE
pnorm(75, 65, 4, lower.tail = FALSE)
## [1] 0.006209665
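The same lower-tail probabilities can be combined to get an upper tail or the probability of an interval. This is a small sketch; the interval from 60 to 70 is my own choice for illustration.

```r
# The upper tail equals one minus the lower tail
upper_75 <- pnorm(75, mean = 65, sd = 4, lower.tail = FALSE)
all.equal(1 - pnorm(75, mean = 65, sd = 4), upper_75)
## [1] TRUE

# P(60 <= X <= 70): difference of two lower-tail probabilities
p_between <- pnorm(70, mean = 65, sd = 4) - pnorm(60, mean = 65, sd = 4)
round(p_between, 4)
## [1] 0.7887
```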
pnorm can also be used to calculate probabilities for Z, the standard normal variable:
# P(Z >= 1.5)
pnorm(q = 1.5, mean = 0, sd = 1, lower.tail = FALSE)
## [1] 0.0668072
# Can also be written:
pnorm(1.5, 0, 1, lower.tail = FALSE)
## [1] 0.0668072
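The link between X and Z can be made explicit by standardizing a value of X by hand. As a sketch, taking x = 60 from the earlier example gives the same probability as pnorm(60, 65, 4):

```r
# Standardize x = 60 for X ~ N(65, 4^2): z = (x - mu) / sigma
z <- (60 - 65) / 4
z
## [1] -1.25

# P(Z <= -1.25) matches P(X <= 60)
pnorm(z)
## [1] 0.1056498
```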
The qnorm function can be used to calculate quantiles or percentiles for a normal random variable:
# Find the first quartile (Q1)
qnorm(p = 0.25, mean = 65, sd = 4, lower.tail = TRUE)
## [1] 62.30204
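Since qnorm is the inverse of pnorm, feeding a quantile back into pnorm returns the original probability. The sketch below also finds the bounds enclosing the middle 95% of X; the name q1 is only for illustration.

```r
# qnorm and pnorm are inverses of each other
q1 <- qnorm(0.25, mean = 65, sd = 4)
pnorm(q1, mean = 65, sd = 4)
## [1] 0.25

# 2.5th and 97.5th percentiles: the bounds enclosing the middle 95% of X
bounds_95 <- qnorm(c(0.025, 0.975), mean = 65, sd = 4)
bounds_95
## [1] 57.16014 72.83986
```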
The dnorm function can be used to find and/or plot the probability density function:
# First, we create a sequence and assign this to x
x <- seq(from=50, to=80, by=0.25)
x
## [1] 50.00 50.25 50.50 50.75 51.00 51.25 51.50 51.75 52.00 52.25 52.50
## [12] 52.75 53.00 53.25 53.50 53.75 54.00 54.25 54.50 54.75 55.00 55.25
## [23] 55.50 55.75 56.00 56.25 56.50 56.75 57.00 57.25 57.50 57.75 58.00
## [34] 58.25 58.50 58.75 59.00 59.25 59.50 59.75 60.00 60.25 60.50 60.75
## [45] 61.00 61.25 61.50 61.75 62.00 62.25 62.50 62.75 63.00 63.25 63.50
## [56] 63.75 64.00 64.25 64.50 64.75 65.00 65.25 65.50 65.75 66.00 66.25
## [67] 66.50 66.75 67.00 67.25 67.50 67.75 68.00 68.25 68.50 68.75 69.00
## [78] 69.25 69.50 69.75 70.00 70.25 70.50 70.75 71.00 71.25 71.50 71.75
## [89] 72.00 72.25 72.50 72.75 73.00 73.25 73.50 73.75 74.00 74.25 74.50
## [100] 74.75 75.00 75.25 75.50 75.75 76.00 76.25 76.50 76.75 77.00 77.25
## [111] 77.50 77.75 78.00 78.25 78.50 78.75 79.00 79.25 79.50 79.75 80.00
# Find the value of the probability density function for each of these x-values
dens <- dnorm(x, mean=65, sd=4)
# Plot the density curve, then add a vertical line at our mu with abline
plot(x, dens, type = "l", main = "Normal dist for X: Mean=65, sd=4", xlab = "x", ylab = "Probability density", las = 1)
abline(v = 65)
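To see what dnorm returns, the normal density formula can be evaluated by hand at a single point and compared with the function. This is a sketch at x0 = 65, the mean; the names mu, sigma, x0 and dens_manual are my own.

```r
mu <- 65
sigma <- 4
x0 <- 65

# Normal density: 1 / (sigma * sqrt(2 * pi)) * exp(-(x - mu)^2 / (2 * sigma^2))
dens_manual <- 1 / (sigma * sqrt(2 * pi)) * exp(-(x0 - mu)^2 / (2 * sigma^2))
dens_manual
## [1] 0.09973557

# dnorm gives the same value: the height of the curve at its peak
dnorm(x0, mean = mu, sd = sigma)
## [1] 0.09973557
```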
The rnorm function can be used to draw a random sample from a normally distributed population:
rand30 <- rnorm(n=30, mean=65, sd=4)
rand30
## [1] 64.24202 63.90724 63.18491 69.22113 60.49885 65.74011 62.94402
## [8] 54.97123 66.95889 64.87745 63.71434 62.96648 63.28726 66.32949
## [15] 64.39445 68.28887 62.09151 67.42206 62.19808 65.99918 58.14922
## [22] 67.31560 65.19037 58.80577 61.65911 62.14693 73.73459 62.77703
## [29] 53.21033 66.37164
hist(rand30)
Though the sample is taken from a normally distributed population, the sample might not look normally distributed, especially with small sample sizes like this.
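For comparison, a much larger sample from the same population tends to look closer to the bell shape. This is only a sketch: the sample size of 3000 and the seed are arbitrary choices of mine, and the seed is there only to make the draw reproducible.

```r
set.seed(123)  # arbitrary seed, only so the sketch gives the same draw each time
rand3000 <- rnorm(n = 3000, mean = 65, sd = 4)

# Sample mean and standard deviation should land near 65 and 4
c(mean = mean(rand3000), sd = sd(rand3000))

# The histogram of the larger sample
hist(rand3000)
```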
Carsten Grube
Sharing and freelancing from my site: https://dataZ4s.com
carsten@dataZ4s.com