The normal distribution is a continuous, symmetric, bell-shaped probability distribution. It is defined by its mean (μ), which sets its center, and its standard deviation (σ), which determines its spread.
2026-04-09
The bell shape is defined by the probability density function (PDF) \[ f(x; \mu, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{(x-\mu)^2}{2\sigma^2}} \]
The area under the curve to the left of a point x is given by the cumulative distribution function (CDF), which returns the probability that X takes a value less than or equal to x. \[ F(x) = P(X \le x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{(t-\mu)^2}{2\sigma^2}}\,dt \]
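As a rough numerical check of this definition, the sketch below integrates the normal density with R's integrate() and compares the area with the built-in normal CDF, pnorm(). The values of mu, sigma, and x are arbitrary illustrations and are not taken from the data in this document.

```r
# Illustrative parameters (assumed, not from the source data)
mu <- 0
sigma <- 1
x <- 1.5

# Area under the normal density from -Inf up to x
area <- integrate(dnorm, lower = -Inf, upper = x, mean = mu, sd = sigma)$value

# Built-in normal CDF evaluated at the same point
cdf <- pnorm(x, mean = mu, sd = sigma)

# The two numbers should agree up to the integrator's tolerance
c(integral = area, pnorm = cdf)
```

The agreement is only approximate because integrate() is a numerical routine, whereas pnorm() evaluates the CDF directly.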
In R (run here in RStudio), the PDF is calculated using the function “dnorm(x, mean = μ, sd = σ)” \[ f(x; \mu, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{(x-\mu)^2}{2\sigma^2}} \]
# Observed gas prices
gas_prices <- c(4.12, 4.35, 4.58, 4.41, 4.77, 4.29, 4.66,
                4.51, 4.38, 4.72, 4.19, 4.55, 4.63, 4.44,
                4.31, 4.69, 4.53, 4.47, 4.24, 4.60)
# Density of each price under a normal with the sample mean and sd
distribution <- dnorm(gas_prices, mean = mean(gas_prices), sd = sd(gas_prices))
distribution
##  [1] 0.3468815 1.7413370 1.8263275 2.0504065 0.5831476 1.3293805 1.2864139
##  [8] 2.1245425 1.9148993 0.8734603 0.6689962 1.9835487 1.5000159 2.1377892
## [15] 1.4718683 1.0742281 2.0650238 2.1703047 0.9785988 1.7031069
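As a sanity check, the densities printed above can be reproduced by writing out the PDF formula by hand. In the sketch below, normal_pdf() is a hypothetical helper introduced only for illustration; it is not part of R.

```r
# Gas prices copied from the example above
gas_prices <- c(4.12, 4.35, 4.58, 4.41, 4.77, 4.29, 4.66,
                4.51, 4.38, 4.72, 4.19, 4.55, 4.63, 4.44,
                4.31, 4.69, 4.53, 4.47, 4.24, 4.60)
mu <- mean(gas_prices)
sigma <- sd(gas_prices)

# Hypothetical helper: the normal PDF written out term by term
normal_pdf <- function(x, mu, sigma) {
  1 / (sqrt(2 * pi) * sigma) * exp(-(x - mu)^2 / (2 * sigma^2))
}

manual <- normal_pdf(gas_prices, mu, sigma)
builtin <- dnorm(gas_prices, mean = mu, sd = sigma)

# TRUE when both vectors agree up to floating-point tolerance
all.equal(manual, builtin)
```

Several of these values exceed 1. That is expected: a density is a height on the curve, not a probability, and only areas under the curve are probabilities.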
The density is easier to understand when plotted as the characteristic bell-shaped curve:
library(ggplot2)
ggplot(data.frame(x = gas_prices, y = distribution), aes(x = x, y = y)) +
  geom_line() +
  labs(x = "Gas Price", y = "Density")
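Because geom_line() only connects the 20 observed prices, the curve can look jagged. One possible refinement, sketched below, evaluates dnorm() on an evenly spaced grid instead. The grid width of four standard deviations and the 201 points are arbitrary choices, and mu and sigma are rounded stand-ins for mean(gas_prices) and sd(gas_prices).

```r
# Rounded stand-ins for mean(gas_prices) and sd(gas_prices) (assumed values)
mu <- 4.472
sigma <- 0.184

# Evenly spaced grid covering mu +/- 4 standard deviations
grid <- seq(mu - 4 * sigma, mu + 4 * sigma, length.out = 201)
curve_df <- data.frame(x = grid, y = dnorm(grid, mean = mu, sd = sigma))

# Draw the curve only if ggplot2 is installed
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(curve_df, aes(x = x, y = y)) +
    geom_line() +
    labs(x = "Gas Price", y = "Density")
  print(p)
}
```

The grid approach plots the fitted normal curve itself rather than the density at each observation, so the shape no longer depends on where the sample points happen to fall.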
R also lets us find the cumulative probability that a value less than or equal to a given number occurs. To do so, we use the function pnorm(q, mean = μ, sd = σ, lower.tail = TRUE, log.p = FALSE).
Here, q represents the value we are evaluating, lower.tail determines which side of the distribution the probability is taken from, and log.p controls whether the probability is returned on the natural scale or on the log scale.
# lower.tail = FALSE returns the upper-tail probability P(X > 4.3)
pnorm(4.3, mean = mean(gas_prices), sd = sd(gas_prices), lower.tail = FALSE)
## [1] 0.8253018
It is important to highlight that pnorm() is left-tailed by default, returning P(X ≤ q); setting lower.tail = FALSE returns P(X > q) instead.
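The effect of lower.tail and log.p can be seen in a small sketch. The values of mu and sigma below are rounded stand-ins for mean(gas_prices) and sd(gas_prices), used only to keep the example self-contained.

```r
# Rounded stand-ins for mean(gas_prices) and sd(gas_prices) (assumed values)
mu <- 4.472
sigma <- 0.184
q <- 4.3

lower <- pnorm(q, mean = mu, sd = sigma)                      # P(X <= q), the default
upper <- pnorm(q, mean = mu, sd = sigma, lower.tail = FALSE)  # P(X > q)
log_lower <- pnorm(q, mean = mu, sd = sigma, log.p = TRUE)    # log of P(X <= q)

lower + upper   # the two tails sum to 1
exp(log_lower)  # exponentiating the log-scale result recovers lower
```

The log scale is mainly useful for probabilities so small that they would otherwise lose precision.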
To find the probability that X falls between two points a and b, we use
\[ P(a < X < b) = P(X \le b) - P(X \le a) \] Therefore, we calculate the two cumulative probabilities and subtract the smaller from the larger.
# No mean/sd given: standard normal (mean = 0, sd = 1), not the gas price fit
pnorm(4.40) - pnorm(4.25)
## [1] 5.275982e-06
pnorm(4.40) - pnorm(-4.25)
## [1] 0.9999839
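The two calls above rely on pnorm()'s defaults of mean = 0 and sd = 1, so they describe the standard normal rather than the gas prices. A minimal sketch of the same subtraction under the fitted gas-price distribution, assuming the sample mean and standard deviation are the intended parameters, might look like this:

```r
# Gas prices copied from the example above
gas_prices <- c(4.12, 4.35, 4.58, 4.41, 4.77, 4.29, 4.66,
                4.51, 4.38, 4.72, 4.19, 4.55, 4.63, 4.44,
                4.31, 4.69, 4.53, 4.47, 4.24, 4.60)
mu <- mean(gas_prices)
sigma <- sd(gas_prices)

# P(4.25 < X < 4.40) = P(X <= 4.40) - P(X <= 4.25)
p_between <- pnorm(4.40, mean = mu, sd = sigma) -
  pnorm(4.25, mean = mu, sd = sigma)

# P(X <= 4.25) on its own
p_below <- pnorm(4.25, mean = mu, sd = sigma)

c(between = p_between, below = p_below)
```

With the mean and standard deviation supplied, the interval probability is far larger than the 5.275982e-06 returned by the default call, because 4.25 and 4.40 lie near the center of the gas-price distribution instead of far out in the tail of the standard normal.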
min(gas_prices)
## [1] 4.12
Probability of falling between 4.25 and 4.40
Probability of X ≤ 4.25