In the previous modules, we discussed discrete random variables. The Binomial Distribution is one of the most famous discrete probability distributions. It models the number of “successes” in a fixed number of independent trials.
Before understanding the Binomial distribution, we must define a Bernoulli trial. A Bernoulli trial is a random experiment with exactly two possible outcomes:

1. Success (S): Usually coded as 1.
2. Failure (F): Usually coded as 0.

Example: Flipping a coin once (Heads = Success, Tails = Failure).
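A single Bernoulli trial can be simulated in R as a binomial draw with one trial. The sketch below assumes a fair coin (`p_heads = 0.5`) and an arbitrary seed; neither value comes from the text.

```r
# One Bernoulli trial: a binomial draw with size = 1.
# p_heads = 0.5 is an assumed fair coin, not a value from the text.
set.seed(1)                                    # arbitrary seed, for reproducibility only
p_heads <- 0.5
flip <- rbinom(1, size = 1, prob = p_heads)    # 1 = Heads (success), 0 = Tails (failure)
flip

# Repeating the trial many times: the share of successes should be close to p_heads.
flips <- rbinom(10000, size = 1, prob = p_heads)
mean(flips)
```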
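For a process to follow a Binomial distribution, it must meet the BINS criteria:

* Binary: Each trial has exactly two possible outcomes (success or failure).
* Independent: The outcome of one trial does not affect any other trial.
* Number: The number of trials \(n\) is fixed in advance.
* Same probability: The probability of success \(p\) is the same on every trial.
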
If \(X\) is a random variable following a Binomial distribution, we denote it as: \[X \sim B(n, p)\]
The probability of getting exactly \(k\) successes in \(n\) trials is given by:
\[P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}\]
Where:

* \(n\): Total number of trials.
* \(k\): Number of successes (\(0, 1, 2, \ldots, n\)).
* \(p\): Probability of success on an individual trial.
* \(\binom{n}{k}\): The binomial coefficient, calculated as \(\frac{n!}{k!(n-k)!}\).

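To make the formula concrete, here is a minimal sketch that evaluates it term by term and compares the result with R's built-in `dbinom()`. The helper name `binom_pmf` and the values `n = 5`, `p = 0.3` are illustrative assumptions.

```r
# Binomial PMF written out directly from the formula:
# choose(n, k) is the binomial coefficient n! / (k! (n - k)!).
binom_pmf <- function(k, n, p) {
  choose(n, k) * p^k * (1 - p)^(n - k)
}

n <- 5       # illustrative number of trials
p <- 0.3     # illustrative success probability
k <- 0:n     # every possible number of successes

manual  <- binom_pmf(k, n, p)
builtin <- dbinom(k, size = n, prob = p)

all.equal(manual, builtin)   # TRUE: both give the same probabilities
sum(manual)                  # the probabilities over k = 0..n sum to 1
```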
R provides four essential functions for the binomial distribution:
* dbinom(k, n, p): Probability mass function, \(P(X = k)\).
* pbinom(k, n, p): Cumulative distribution function, \(P(X \le k)\).
* qbinom(q, n, p): Quantile function (finds the smallest \(k\) such that \(P(X \le k) \ge q\)).
* rbinom(m, n, p): Generates \(m\) random values from the distribution.

In R's own argument names, the number of trials \(n\) is called size and the success probability \(p\) is called prob.

Suppose a student takes a 10-question multiple-choice quiz. Each question has 4 options (only 1 correct). If the student guesses randomly on every question:

* \(n = 10\)
* \(p = 0.25\)

What is the probability of getting exactly 3 correct?
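The call that produced the result is not shown in the text. Assuming the quiz parameters above, a call such as the following reproduces it (the variable name `p_exactly_3` is an illustrative choice):

```r
# P(X = 3) for X ~ B(10, 0.25): exactly 3 correct answers out of 10 guesses
p_exactly_3 <- dbinom(3, size = 10, prob = 0.25)
p_exactly_3
```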
## [1] 0.2502823
What is the probability of getting 3 or fewer correct?
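Again the call itself is not shown. A sketch using the cumulative distribution function, with an illustrative variable name, would be:

```r
# P(X <= 3) for X ~ B(10, 0.25): 3 or fewer correct answers
p_at_most_3 <- pbinom(3, size = 10, prob = 0.25)
p_at_most_3

# The same value obtained by summing the PMF over k = 0, 1, 2, 3
sum(dbinom(0:3, size = 10, prob = 0.25))
```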
## [1] 0.7758751
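The remaining two functions, qbinom() and rbinom(), can be illustrated on the same quiz. This is only a sketch: the 0.5 quantile, the 1,000 simulated students, and the seed are assumptions chosen for illustration.

```r
n <- 10
p <- 0.25

# Smallest k with P(X <= k) >= 0.5, i.e. the median number of correct answers
median_score <- qbinom(0.5, size = n, prob = p)
median_score                      # 2, since P(X <= 1) < 0.5 <= P(X <= 2)

# Simulate 1,000 students who each guess on all 10 questions
set.seed(42)                      # arbitrary seed, for reproducibility only
scores <- rbinom(1000, size = n, prob = p)

# The simulated share of students with exactly 3 correct should be
# close to the exact probability dbinom(3, 10, 0.25)
mean(scores == 3)
```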
The shape of the Binomial distribution depends on \(p\).
library(dplyr)    # provides %>% and mutate()
library(ggplot2)  # provides ggplot(), geom_col(), facet_wrap()

n <- 20
k <- 0:n
p_values <- c(0.2, 0.5, 0.8)

# One row per (k, p) pair, with the binomial probability of k successes
data <- expand.grid(k = k, p = p_values) %>%
  mutate(prob = dbinom(k, n, p),
         p_label = paste0("p = ", p))

# One panel per value of p
ggplot(data, aes(x = k, y = prob, fill = p_label)) +
  geom_col(show.legend = FALSE) +
  facet_wrap(~p_label) +
  labs(title = "Binomial Distribution (n = 20)",
       x = "Number of Successes (k)",
       y = "Probability") +
  theme_minimal()

Visualizing Binomial Distributions with different probabilities
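For a Binomial distribution \(X \sim B(n, p)\), the mean, variance, and standard deviation are:
\[E[X] = np, \qquad \mathrm{Var}(X) = np(1-p), \qquad \sigma = \sqrt{np(1-p)}\]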
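The mean \(np\) and variance \(np(1-p)\) can be checked directly against the PMF. The sketch below reuses the quiz parameters (\(n = 10\), \(p = 0.25\)) as an illustrative choice and computes both moments as probability-weighted sums.

```r
n <- 10
p <- 0.25
k <- 0:n
pmf <- dbinom(k, size = n, prob = p)

# Mean and variance computed from the definition of expectation
mean_pmf <- sum(k * pmf)
var_pmf  <- sum((k - mean_pmf)^2 * pmf)

# Compare with the closed-form expressions
c(from_pmf = mean_pmf, formula = n * p)
c(from_pmf = var_pmf,  formula = n * p * (1 - p))
```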
A factory produces light bulbs with a 1% defect rate. If a sample of 100 bulbs is taken, what is the probability that more than 2 are defective?
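The call is not shown in the text. Assuming \(X \sim B(100, 0.01)\), "more than 2" is the complement of "2 or fewer", which could be computed as:

```r
# P(X > 2) = 1 - P(X <= 2) for X ~ B(100, 0.01)
p_more_than_2 <- 1 - pbinom(2, size = 100, prob = 0.01)
p_more_than_2

# Equivalent form: ask pbinom() for the upper tail directly
pbinom(2, size = 100, prob = 0.01, lower.tail = FALSE)
```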
## [1] 0.0793732
An email campaign has a historical click-through rate (CTR) of 5%. If you send the email to 500 customers, what is the expected number of clicks?
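Treating each customer as an independent trial with \(p = 0.05\) (an assumption the question implies but does not state), the expected number of clicks is \(np\). A minimal sketch, with illustrative variable names:

```r
n_emails <- 500
ctr <- 0.05

# Expected number of clicks: E[X] = n * p
expected_clicks <- n_emails * ctr
expected_clicks                      # 25

# Typical spread around that expectation: sqrt(n * p * (1 - p)), about 4.87
sd_clicks <- sqrt(n_emails * ctr * (1 - ctr))
sd_clicks
```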
| Feature | Description |
|---|---|
| Parameters | \(n\) (Trials), \(p\) (Prob of Success) |
| Domain | \(k \in \{0, 1, 2, ..., n\}\) |
| R Function (Exact) | dbinom() |
| R Function (Cumulative) | pbinom() |
| Shape | Left-skewed if \(p > 0.5\), Right-skewed if \(p < 0.5\), Symmetric if \(p = 0.5\) |
```
To work through this material in RStudio, create a new document via File -> New File -> R Markdown.
The visualization uses ggplot2 to create a comparison plot showing how the distribution shifts based on the probability \(p\).