2026-04-09

What is the normal distribution

The normal distribution is a continuous, symmetric, bell-shaped probability distribution. It is defined by its mean (μ), which sets its center, and its standard deviation (σ), which determines its spread.

Math model

The bell shape is defined by the probability density function (PDF) \[ f(x; \mu, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{(x-\mu)^2}{2\sigma^2}} \]
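As a quick sanity check, the PDF formula can be typed out by hand and compared with R's built-in dnorm(). This is only a minimal sketch: normal_pdf is a hypothetical helper name, and the values of mu, sigma, and x are assumed purely for illustration.

```r
# Hand-written normal PDF, following the formula above.
# normal_pdf is a hypothetical helper name, not part of base R.
normal_pdf <- function(x, mu, sigma) {
  1 / (sqrt(2 * pi) * sigma) * exp(-(x - mu)^2 / (2 * sigma^2))
}

# Illustrative parameters (assumed values, chosen only for the demo)
mu <- 4.5
sigma <- 0.2
x <- c(4.1, 4.5, 4.9)

manual <- normal_pdf(x, mu, sigma)
builtin <- dnorm(x, mean = mu, sd = sigma)

# Both approaches should agree up to floating-point error
all.equal(manual, builtin)
```

Because 4.1 and 4.9 sit the same distance from the mean, their densities match, which reflects the symmetry of the curve.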

The area under the curve is given by the cumulative distribution function (CDF), which returns the probability that X takes a value less than or equal to x. \[ F(x) = P(X \le x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{(t-\mu)^2}{2\sigma^2}}\,dt \]
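The link between the PDF and the CDF can be checked numerically: integrating the density from negative infinity up to x should give roughly the same number as pnorm(). The sketch below uses base R's integrate(); the values of mu, sigma, and x are assumed for illustration only.

```r
mu <- 4.5      # assumed mean, for illustration
sigma <- 0.2   # assumed standard deviation, for illustration
x <- 4.3

# Numerically integrate the density from -Inf up to x
area <- integrate(dnorm, lower = -Inf, upper = x, mean = mu, sd = sigma)

area$value                       # numerical area under the PDF
pnorm(x, mean = mu, sd = sigma)  # CDF value from base R
```

The two values should agree to within the numerical tolerance of integrate().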

RStudio PDF

In RStudio, the PDF is calculated using the function dnorm(x, mean = μ, sd = σ), which evaluates \[ f(x; \mu, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{(x-\mu)^2}{2\sigma^2}} \]

gas_prices <- c(4.12, 4.35, 4.58, 4.41, 4.77, 4.29, 4.66,
                4.51, 4.38, 4.72, 4.19, 4.55, 4.63, 4.44,
                4.31, 4.69, 4.53, 4.47, 4.24, 4.60)
# Density of each observed price under a normal with the sample mean and sd
distribution <- dnorm(gas_prices, mean = mean(gas_prices), sd = sd(gas_prices))
distribution
##  [1] 0.3468815 1.7413370 1.8263275 2.0504065 0.5831476 1.3293805 1.2864139
##  [8] 2.1245425 1.9148993 0.8734603 0.6689962 1.9835487 1.5000159 2.1377892
## [15] 1.4718683 1.0742281 2.0650238 2.1703047 0.9785988 1.7031069

Gas prices plot

The distribution is easier to understand with the characteristic bell-shaped plot.

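library(ggplot2)  # provides ggplot(), aes(), geom_line(), labs()

# Plot each gas price against its density; geom_line() connects points in order of x
ggplot(data.frame(x = gas_prices, y = distribution)) +
  aes(x = x, y = y) +
  geom_line() +
  labs(x = "Gas Price", y = "Density")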

RStudio CDF

RStudio also allows us to find the cumulative probability that a given number, or a smaller number, occurs. To do so, we use the function pnorm(q, mean = μ, sd = σ, lower.tail = TRUE, log.p = FALSE).

Here, q represents the value we are evaluating, lower.tail determines which side of the distribution the probability is taken from (TRUE gives P(X ≤ q), FALSE gives P(X > q)), and log.p determines whether the probability is returned as-is or on the log scale.

# lower.tail = FALSE returns the upper tail, P(X > 4.3)
pnorm(4.3, mean = mean(gas_prices), sd = sd(gas_prices), lower.tail = FALSE)
## [1] 0.8253018
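To make the roles of lower.tail and log.p concrete, the sketch below evaluates the same point under each option. It reuses the gas price data from earlier; the variable names lower, upper, and log_lower are introduced here only for illustration.

```r
gas_prices <- c(4.12, 4.35, 4.58, 4.41, 4.77, 4.29, 4.66,
                4.51, 4.38, 4.72, 4.19, 4.55, 4.63, 4.44,
                4.31, 4.69, 4.53, 4.47, 4.24, 4.60)
mu <- mean(gas_prices)
sigma <- sd(gas_prices)

lower <- pnorm(4.3, mean = mu, sd = sigma)                      # P(X <= 4.3)
upper <- pnorm(4.3, mean = mu, sd = sigma, lower.tail = FALSE)  # P(X > 4.3)
lower + upper  # the two tails together cover the whole distribution

# log.p = TRUE returns the natural log of the same probability
log_lower <- pnorm(4.3, mean = mu, sd = sigma, log.p = TRUE)
all.equal(log_lower, log(lower))
```

The upper-tail value here is the same quantity computed with lower.tail = FALSE above, and the lower-tail value is its complement.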

Left-Tailed Default and Probability Between Two Points

It is important to highlight that pnorm() is left-tailed by default: it returns P(X ≤ q).

To find the probability that X falls between two points a and b, we use

\[ P(a < X < b) = P(X \le b) - P(X \le a) \] Therefore, we calculate the two cumulative probabilities and subtract the smaller from the larger.

# No mean or sd is supplied, so these calls use the standard normal (mean = 0, sd = 1)
pnorm(4.40) - pnorm(4.25)
## [1] 5.275982e-06
pnorm(4.40) - pnorm(-4.25)
## [1] 0.9999839
min(gas_prices)
## [1] 4.12
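The pnorm() calls above omit mean and sd, so they fall back to the standard normal defaults rather than the gas price distribution. A minimal sketch of the same subtraction using the sample mean and standard deviation might look like the following; the names p_a, p_b, p_between, and check are introduced here for illustration.

```r
gas_prices <- c(4.12, 4.35, 4.58, 4.41, 4.77, 4.29, 4.66,
                4.51, 4.38, 4.72, 4.19, 4.55, 4.63, 4.44,
                4.31, 4.69, 4.53, 4.47, 4.24, 4.60)
mu <- mean(gas_prices)
sigma <- sd(gas_prices)

p_b <- pnorm(4.40, mean = mu, sd = sigma)  # P(X <= 4.40)
p_a <- pnorm(4.25, mean = mu, sd = sigma)  # P(X <= 4.25)
p_between <- p_b - p_a                     # P(4.25 < X < 4.40)
p_between

# Cross-check: integrate the density directly over the same interval
check <- integrate(dnorm, lower = 4.25, upper = 4.40, mean = mu, sd = sigma)
check$value
```

Both numbers describe the same area under the gas price curve between 4.25 and 4.40, so they should agree closely.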

Figure: Area under the curve. Probability of falling between 4.25 and 4.40.

Figure: Area under the curve. Probability of X ≤ 4.25.

Normal Distribution Function