- What is a Probability Distribution Function?
- Visualization of the Area under curve
- Normal Curve
- R Code Implementation
- Interactive Density Plot
- 3D Probability Surface
- Real world application of the concept
- Summary
A probability distribution function is a curve that shows how likely the different outcomes of a random variable are. For a continuous variable, this curve is the probability density function \(f(x)\).
The total area under the curve must equal 1 (that is, 100%).
We find the area under a curve by integration. Given a density curve, the probability that an outcome falls within a certain range is the area under the curve over that range.
Let’s say we want the probability that a value \(X\) falls between two points \(a\) and \(b\):
\[P(a\le X\le b)=\int_{a}^{b}f(x)\,dx\]
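As a rough illustration of this formula, the sketch below uses base R's `integrate()` to compute such an area numerically. The density \(f(x)=3x^2\) on \([0,1]\) and the endpoints 0.2 and 0.5 are assumptions chosen for illustration; they do not come from the text.

```r
# Hypothetical density for illustration: f(x) = 3x^2 on [0, 1], 0 elsewhere
f <- function(x) ifelse(x >= 0 & x <= 1, 3 * x^2, 0)

a <- 0.2
b <- 0.5

# P(a <= X <= b) is the area under f between a and b
p_range <- integrate(f, lower = a, upper = b)$value

# Closed form for comparison: the antiderivative of 3x^2 is x^3
p_exact <- b^3 - a^3

print(c(numeric = p_range, exact = p_exact))
```

Both values should agree at about 0.117, since \(0.5^3 - 0.2^3 = 0.125 - 0.008\).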
For a function to be a valid probability density function, it must be non-negative everywhere, and the total area from negative infinity to infinity must be 1:
\[\int_{-\infty}^{\infty}f(x)\,dx=1\]
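One way to sanity-check this condition numerically is with base R's `integrate()`, which accepts infinite limits. The sketch uses R's built-in standard normal density `dnorm` as the example density; that choice is an assumption made for illustration.

```r
# Total area under the standard normal density over the whole real line
total_area <- integrate(dnorm, lower = -Inf, upper = Inf)$value

# A density must also never be negative; spot-check this on a grid
grid <- seq(-10, 10, by = 0.01)
is_nonnegative <- all(dnorm(grid) >= 0)

print(total_area)
print(is_nonnegative)
```

The numerical result is 1 up to a small integration error, which is as close as a numerical check can get to the exact statement.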
Consider a uniform distribution, where every outcome in an interval is equally likely: its density curve is simply a rectangle.
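For a rectangle, the area is just width times height, so no integration is needed. The sketch below assumes a uniform distribution on \([0, 10]\) and the range 2 to 5; both are illustrative choices, not values from the text.

```r
lo <- 0    # assumed lower bound of the uniform distribution
hi <- 10   # assumed upper bound

# The height of the rectangle is constant: 1 / (hi - lo)
height <- dunif(3, min = lo, max = hi)

# P(2 <= X <= 5) as width * height of the sub-rectangle
p_rect <- (5 - 2) * height

# The same probability from the cumulative distribution function
p_cdf <- punif(5, min = lo, max = hi) - punif(2, min = lo, max = hi)

print(c(height = height, p_rect = p_rect, p_cdf = p_cdf))
```

The height is 0.1 and both probability calculations give 0.3, and the full rectangle has area \(10 \times 0.1 = 1\), as any density must.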
Let’s assume that we have a normal distribution with mean = 0 and standard deviation = 1 (the standard normal distribution).
Question: find the probability that \(X < 1.96\).
## [1] "The probability (area) is: 0.975"
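The code that produced this output is not shown here. A minimal sketch that would print the same line, assuming the value was computed with `pnorm()` and rounded to three decimals:

```r
# P(X < 1.96) for a standard normal is the CDF evaluated at 1.96
prob <- pnorm(1.96, mean = 0, sd = 1)

# Round to three decimals for display
msg <- paste("The probability (area) is:", round(prob, 3))
print(msg)

# Cross-check: integrate the density from -Inf up to 1.96
prob_int <- integrate(dnorm, lower = -Inf, upper = 1.96)$value
```

Unrounded, `pnorm(1.96)` is about 0.9750021, which is why 1.96 is the familiar cutoff that leaves roughly 2.5% of the area in the upper tail.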
The probability density function of the standard normal distribution is
\[ f(x)=\frac{1}{\sqrt{2\pi}}e^{-x^2/2} \]
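To connect the formula with R, the sketch below codes it by hand and compares it with the built-in `dnorm()`. The function name `std_normal_pdf` is a hypothetical helper introduced only for this comparison.

```r
# Hand-written standard normal density, term by term from the formula
std_normal_pdf <- function(x) {
  (1 / sqrt(2 * pi)) * exp(-x^2 / 2)
}

x <- c(-2, -1, 0, 1, 2)

# Both columns should match up to floating-point error
comparison <- data.frame(x = x,
                         by_hand = std_normal_pdf(x),
                         built_in = dnorm(x))
print(comparison)

# The peak of the curve is at x = 0, with height 1 / sqrt(2 * pi)
peak <- std_normal_pdf(0)
```

The peak height is about 0.399. This is a density, not a probability: probabilities come only from areas under the curve.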
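library(ggplot2)

# Evaluate the standard normal density on a grid from -4 to 4
xValues <- seq(-4, 4, length.out = 200)
yValues <- dnorm(xValues)
dfnorm <- data.frame(x = xValues, y = yValues)

# Draw the curve and shade the area beneath it
# (linewidth needs ggplot2 >= 3.4.0; older versions use size instead)
ggplot(dfnorm, aes(x, y)) +
  geom_line(color = "lightpink", linewidth = 1) +
  geom_area(fill = "lightcoral", alpha = 0.2) +
  theme_minimal() +
  labs(title = "The Standard Normal Curve",
       subtitle = "Total Area=1")

This concept is used in many real-world applications, such as: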