Normal Distribution: This distribution illustrates how data points are symmetrically dispersed around a mean, with the spread influenced by the standard deviation. It applies to data naturally clustering around a central value without bias towards higher or lower extremes.
Binomial Distribution: Employed for counting the successes in a certain number of independent trials, each with the same probability of success. It’s pertinent for binary outcomes, such as win/lose or success/failure scenarios.
Poisson Distribution: Suitable for counting the number of events in a fixed interval of time or space when the events occur independently of one another (and of the time since the last event) at a known, constant average rate.
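The descriptions above can be checked numerically. The following is a minimal sketch, not part of the original exercise: the parameter values (`n`, `p`, `rate`, `mu`, `sigma`) are assumptions chosen only for illustration. It recovers the mean and variance of the binomial and Poisson distributions from their probability functions and confirms the symmetry of the normal density.

```r
# Illustrative parameter values (assumptions, not from the exercise)
n <- 100      # number of trials
p <- 0.1      # probability of success per trial
rate <- 10    # average number of events per interval

# Binomial: mean is n * p, variance is n * p * (1 - p)
k <- 0:n
binom_mean <- sum(k * dbinom(k, n, p))
binom_var  <- sum((k - binom_mean)^2 * dbinom(k, n, p))

# Poisson: mean and variance both equal the rate
# (support truncated at 200; the mass beyond that is negligible for rate = 10)
j <- 0:200
pois_mean <- sum(j * dpois(j, rate))
pois_var  <- sum((j - pois_mean)^2 * dpois(j, rate))

# Normal: the density is the same at equal distances either side of the mean
mu <- 0
sigma <- 1
sym_check <- all.equal(dnorm(mu - 1.5, mu, sigma), dnorm(mu + 1.5, mu, sigma))

c(binom_mean = binom_mean, binom_var = binom_var,
  pois_mean = pois_mean, pois_var = pois_var)
```

With these values the binomial variance comes out smaller than the Poisson variance even though the means agree, which is worth keeping in mind when one distribution is used to approximate the other.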
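PDF: For a continuous random variable, gives the probability density at a specific value rather than a probability; the probability of falling within an interval is the area under the PDF over that interval. Its discrete analogue, the probability mass function (PMF), gives the probability that a random variable takes exactly a specific value.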
CDF: Indicates the probability that a random variable takes a value less than or equal to a particular threshold, offering the cumulative probability for continuous or discrete distributions up to a certain point.
Using the Normal Distribution as an example, its PDF aligns with the expectation that values cluster around the mean, with the probability density decreasing as one moves away from the mean.
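As a small illustration of the difference between density and cumulative probability, the sketch below evaluates the standard normal with `dnorm` and `pnorm`, then checks that the CDF is the accumulated PDF (or PMF, for a discrete variable). The evaluation points and the threshold 1.96 are arbitrary choices made for the example.

```r
# Density of the standard normal at increasing distances from the mean
z <- c(0, 1, 2, 3)
dens <- dnorm(z)   # heights of the curve, not probabilities
dens               # roughly 0.399, 0.242, 0.054, 0.004: decreasing away from the mean

# CDF: probability of a value at or below the threshold
p_below <- pnorm(1.96)   # about 0.975

# Continuous case: the CDF is the integral of the PDF up to the threshold
area <- integrate(dnorm, lower = -Inf, upper = 1.96)$value

# Discrete case: the CDF is the running sum of the PMF
cdf_sum <- sum(dbinom(0:5, size = 100, prob = 0.1))
cdf_fun <- pbinom(5, size = 100, prob = 0.1)

c(p_below = p_below, area = area, cdf_sum = cdf_sum, cdf_fun = cdf_fun)
```

The `d` functions return a density or mass at a point, while the `p` functions return the accumulated probability up to that point.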
# Normal Distribution
# pdf
?dnorm
# cdf
?pnorm
# Binomial Distribution
# pmf (discrete analogue of the pdf)
?dbinom
# cdf
?pbinom
# Poisson Distribution
# pmf (discrete analogue of the pdf)
?dpois
# cdf
?ppois
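N <- 100   # number of procedures (trials)
x <- 5     # number of deaths of interest
pi <- 0.1  # probability of death per procedure; note that this masks R's built-in constant pi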
# Binomial
binom_prob <- dbinom(x, N, pi)
binom_prob
## [1] 0.0338658
# Poisson
lambda <- N * pi
poisson_prob <- dpois(x, lambda)
poisson_prob
## [1] 0.03783327
The two calculations give the probability of observing exactly 5 deaths in 100 neurosurgical procedures, given a national average death rate of 10%. The binomial probability is approximately 0.0339, and the Poisson approximation is approximately 0.0378.
The answers are quite similar but not identical.
Reason for Similarity: The Poisson distribution with \(\lambda = N\pi\) approximates the binomial distribution when \(N\) is large and \(\pi\) is small. Here \(N = 100\) is reasonably large, but \(\pi = 0.1\) is only moderately small, so the conditions are only partly met.
Reason for the Difference: The Poisson distribution is only an approximation to the binomial, so some discrepancy remains for finite \(N\). Its variance is \(\lambda = 10\), whereas the binomial variance is \(N\pi(1-\pi) = 9\). The Poisson is therefore slightly more spread out: it assigns slightly more probability to values well away from the mean of 10, such as \(x = 5\), and slightly less to values near the mean.
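One way to see how the conditions "large \(N\), small \(\pi\)" affect the approximation is to hold the expected count fixed at 10 and increase the number of trials. This is a rough sketch rather than part of the original exercise: the larger sample sizes and the names `rate`, `n_trials`, `p_each` and `abs_gap` are hypothetical and introduced only for illustration.

```r
x <- 5
rate <- 10                        # same expected count as in the example
n_trials <- c(100, 1000, 10000)   # hypothetical sample sizes
p_each <- rate / n_trials         # per-trial probability shrinks as n grows

binom_p <- dbinom(x, n_trials, p_each)   # exact binomial probabilities
pois_p  <- dpois(x, rate)                # Poisson approximation (same for all rows)

abs_gap <- abs(binom_p - pois_p)
data.frame(n_trials, p_each, binom_p, pois_p, abs_gap)
```

The first row reproduces the case above; in the later rows the gap between the two probabilities narrows as the per-trial probability becomes smaller.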
# Binomial
x_values <- 0:N
binom_distr <- dbinom(x_values, N, pi)
plot(x_values, binom_distr, type = "h", lwd = 2, col = "blue",
     main = "Binomial Distribution", xlab = "Number of Deaths", ylab = "Probability")
# Poisson
poisson_distr <- dpois(x_values, lambda)
plot(x_values, poisson_distr, type = "h", lwd = 2, col = "red",
     main = "Poisson Distribution", xlab = "Number of Deaths", ylab = "Probability")
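Because the two distributions are drawn in separate plots, small differences are hard to see. A possible alternative, sketched here with hypothetical names (`n_proc`, `death_rate`, `k`) and an assumed display range of 0 to 25, is to overlay both on one set of axes and report where they differ most.

```r
n_proc <- 100       # number of procedures, as in the example
death_rate <- 0.1   # per-procedure probability of death, as in the example
k <- 0:25           # assumed display range; almost all of the mass lies below 25

binom_pmf <- dbinom(k, n_proc, death_rate)
pois_pmf  <- dpois(k, n_proc * death_rate)

# Binomial as vertical bars, Poisson as bars offset slightly to the right
plot(k, binom_pmf, type = "h", lwd = 2, col = "blue",
     ylim = c(0, max(binom_pmf, pois_pmf)),
     main = "Binomial vs. Poisson", xlab = "Number of Deaths", ylab = "Probability")
points(k + 0.3, pois_pmf, type = "h", lwd = 2, col = "red")
legend("topright", legend = c("Binomial", "Poisson"),
       col = c("blue", "red"), lwd = 2)

# Count at which the two probabilities differ most in absolute terms
k_max_gap <- k[which.max(abs(binom_pmf - pois_pmf))]
k_max_gap
```

Restricting the horizontal axis to the region where the probabilities are non-negligible also makes the shape of each distribution easier to read than plotting all of 0 to 100.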