1. Introduction to Binomial Distributions

In the previous modules, we discussed discrete random variables. The Binomial Distribution is one of the most widely used discrete probability distributions. It models the number of “successes” in a fixed number of independent trials, each with the same probability of success.

1.1 The Bernoulli Trial

Before understanding the Binomial distribution, we must define a Bernoulli Trial. A Bernoulli trial is a random experiment with exactly two possible outcomes:

  1. Success (S): Usually coded as 1.
  2. Failure (F): Usually coded as 0.

Example: Flipping a coin once (Heads = Success, Tails = Failure).
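
A single coin flip can be simulated in R as a Binomial draw with one trial. This is a minimal sketch assuming a fair coin (\(p = 0.5\)); the seed and the variable names `flip` and `flips` are illustrative only. Note that in `rbinom()` the argument `n` is the number of draws, while `size` is the number of trials per draw.

```r
# A Bernoulli trial is a Binomial experiment with size = 1.
set.seed(42)                                   # arbitrary seed, for reproducibility only

# One flip: 1 = Heads (Success), 0 = Tails (Failure)
flip <- rbinom(n = 1, size = 1, prob = 0.5)
flip

# Ten separate Bernoulli trials give a vector of 0s and 1s
flips <- rbinom(n = 10, size = 1, prob = 0.5)
flips
sum(flips)                                     # number of successes among the 10 flips
```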


2. Criteria for a Binomial Experiment

For a random process to be modeled by a Binomial distribution, it must meet the BINS criteria:

  • B - Binary? Outcomes must be classified as “Success” or “Failure”.
  • I - Independent? The outcome of one trial must not affect the others.
  • N - Number? The number of trials (\(n\)) must be fixed in advance.
  • S - Success Probability? The probability of success (\(p\)) must be the same for each trial.

3. The Probability Mass Function (PMF)

If \(X\) is a random variable following a Binomial distribution, we denote it as: \[X \sim B(n, p)\]

The probability of getting exactly \(k\) successes in \(n\) trials is given by:

\[P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}\]

Where:

  • \(n\): Total number of trials.
  • \(k\): Number of successes (\(0, 1, 2, \ldots, n\)).
  • \(p\): Probability of success on an individual trial.
  • \(\binom{n}{k}\): The binomial coefficient, calculated as \(\frac{n!}{k!(n-k)!}\).
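
The formula can be checked by writing it out directly in R. In this sketch, `binom_pmf` is a hypothetical helper (not a built-in function), and the values \(n = 10\), \(p = 0.25\), \(k = 3\) are chosen only for illustration. The result should agree with R's built-in `dbinom()`.

```r
# Hypothetical helper: the PMF written out term by term from the formula
binom_pmf <- function(k, n, p) {
  choose(n, k) * p^k * (1 - p)^(n - k)
}

by_hand  <- binom_pmf(3, 10, 0.25)
built_in <- dbinom(3, size = 10, prob = 0.25)
all.equal(by_hand, built_in)            # TRUE if the two agree

# The probabilities over k = 0, 1, ..., n should sum to 1
total <- sum(binom_pmf(0:10, 10, 0.25))
total
```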


4. Working with Binomial Distributions in R

R provides four essential functions for the binomial distribution:

  1. dbinom(k, n, p): Probability Mass Function \(P(X = k)\).
  2. pbinom(k, n, p): Cumulative Distribution Function \(P(X \le k)\).
  3. qbinom(q, n, p): Quantile function (finds the smallest \(k\) such that \(P(X \le k) \ge q\)).
  4. rbinom(m, n, p): Generates \(m\) random draws from the distribution.
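
The quantile and random-generation functions can be tried out as follows. This is a small sketch with illustrative values (\(n = 10\), \(p = 0.25\), an arbitrary seed); the variable names are assumptions made for the example.

```r
n <- 10
p <- 0.25

# Median: the smallest k whose cumulative probability reaches 0.5
k_med <- qbinom(0.5, size = n, prob = p)
k_med
pbinom(k_med, size = n, prob = p)       # at least 0.5
pbinom(k_med - 1, size = n, prob = p)   # still below 0.5

# Five simulated experiments, each counting successes in 10 trials
set.seed(1)                             # arbitrary seed, for reproducibility only
draws <- rbinom(5, size = n, prob = p)
draws
```

Because the distribution is discrete, the cumulative probability usually jumps past \(q\) rather than hitting it exactly, which is why `qbinom()` is defined through an inequality.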

Example 1: Calculating Probabilities

Suppose a student takes a 10-question multiple-choice quiz. Each question has 4 options (only 1 correct). If the student guesses randomly:

  • \(n = 10\)
  • \(p = 0.25\)

What is the probability of getting exactly 3 correct?

# P(X = 3)
dbinom(x = 3, size = 10, prob = 0.25)
## [1] 0.2502823

What is the probability of getting 3 or fewer correct?

# P(X <= 3)
pbinom(q = 3, size = 10, prob = 0.25)
## [1] 0.7758751
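
As a cross-check, the cumulative probability is the sum of the individual probabilities, and the complement gives the chance of 4 or more correct answers. This sketch reuses the quiz values; the variable names are illustrative.

```r
# P(X <= 3) as a sum of P(X = 0), ..., P(X = 3)
cum_by_sum <- sum(dbinom(0:3, size = 10, prob = 0.25))
cum_by_cdf <- pbinom(3, size = 10, prob = 0.25)
all.equal(cum_by_sum, cum_by_cdf)       # TRUE if the two agree

# P(X >= 4) = 1 - P(X <= 3)
at_least_4 <- 1 - cum_by_cdf
at_least_4
```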

5. Visualizing the Distribution

For a fixed \(n\), the shape of the Binomial distribution depends on \(p\). The code below plots the PMF for \(n = 20\) and three values of \(p\).

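library(dplyr)    # provides %>% and mutate()
library(ggplot2)  # provides ggplot(), geom_col(), facet_wrap()

n <- 20
k <- 0:n
p_values <- c(0.2, 0.5, 0.8)

# One row per (k, p) pair, with its Binomial probability
data <- expand.grid(k = k, p = p_values) %>%
  mutate(prob = dbinom(k, size = n, prob = p),
         p_label = paste0("p = ", p))

# One panel per value of p
ggplot(data, aes(x = k, y = prob, fill = p_label)) +
  geom_col(show.legend = FALSE) +
  facet_wrap(~p_label) +
  labs(title = "Binomial Distribution (n = 20)",
       x = "Number of Successes (k)",
       y = "Probability") +
  theme_minimal()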

Figure: Visualizing Binomial Distributions with different probabilities


6. Mean and Variance

For a Binomial distribution \(X \sim B(n, p)\):

  • Mean (\(\mu\)): \(E(X) = n \times p\)
  • Variance (\(\sigma^2\)): \(Var(X) = n \times p \times (1 - p)\)
  • Standard Deviation (\(\sigma\)): \(\sqrt{np(1-p)}\)
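
These formulas can be verified numerically from the definitions \(E(X) = \sum_k k \, P(X = k)\) and \(Var(X) = \sum_k (k - \mu)^2 P(X = k)\). The sketch below uses illustrative values \(n = 10\) and \(p = 0.25\); the variable names are assumptions made for the example.

```r
n <- 10
p <- 0.25
k <- 0:n
pmf <- dbinom(k, size = n, prob = p)

mu     <- sum(k * pmf)                  # mean from the definition
sigma2 <- sum((k - mu)^2 * pmf)         # variance from the definition

all.equal(mu, n * p)                    # matches n * p
all.equal(sigma2, n * p * (1 - p))      # matches n * p * (1 - p)
sqrt(sigma2)                            # standard deviation
```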

7. Real-Life Applications

7.1 Quality Control (Manufacturing)

A factory produces light bulbs with a 1% defect rate. If a sample of 100 bulbs is taken, what is the probability that more than 2 are defective?

  • \(n = 100\)
  • \(p = 0.01\)
  • Find \(P(X > 2) = 1 - P(X \le 2)\)

# P(X > 2) = 1 - P(X <= 2)
1 - pbinom(2, 100, 0.01)
## [1] 0.0793732
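
The same upper-tail probability can also be obtained without subtracting from 1, using the `lower.tail` argument of `pbinom()`, or by summing the PMF over \(k = 3, \ldots, 100\). A short sketch, with illustrative variable names:

```r
# P(X > 2) computed three equivalent ways
by_complement <- 1 - pbinom(2, size = 100, prob = 0.01)
by_upper_tail <- pbinom(2, size = 100, prob = 0.01, lower.tail = FALSE)
by_summation  <- sum(dbinom(3:100, size = 100, prob = 0.01))

all.equal(by_complement, by_upper_tail)   # TRUE if the two agree
all.equal(by_complement, by_summation)    # TRUE if the two agree
```

For very small tail probabilities, `lower.tail = FALSE` is generally the safer choice, since it avoids the loss of precision from subtracting a number close to 1.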

7.2 Digital Marketing (Click-Through Rate)

An email campaign has a historical click-through rate (CTR) of 5%. If you send the email to 500 customers, what is the expected number of clicks?

  • \(E(X) = n \times p = 500 \times 0.05 = 25\) clicks.
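
The same calculation in R, together with the standard deviation from the formula \(\sqrt{np(1-p)}\), gives a sense of how much the click count is likely to vary around 25. This sketch assumes that customers click independently of one another, each with probability 0.05; the variable names are illustrative.

```r
n <- 500     # emails sent
p <- 0.05    # historical click-through rate

expected_clicks <- n * p                  # E(X) = n * p
sd_clicks <- sqrt(n * p * (1 - p))        # standard deviation of the click count

expected_clicks
sd_clicks
```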

7.3 Genetic Inheritance

If both parents are carriers of a recessive condition, each child has a 25% chance of inheriting two copies of the gene and therefore the condition. If they have 4 children, what is the probability that exactly 1 child inherits the condition?

dbinom(1, 4, 0.25)
## [1] 0.421875
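
The full distribution of the number of affected children can be tabulated in one call, since `dbinom()` is vectorized over its first argument. This is a minimal sketch; the names `affected`, `probs`, and `at_least_one` are illustrative.

```r
affected <- 0:4                                   # possible numbers of affected children
probs <- dbinom(affected, size = 4, prob = 0.25)  # P(X = 0), ..., P(X = 4)

data.frame(affected, probability = probs)

# P(at least one child is affected) = 1 - P(X = 0)
at_least_one <- 1 - dbinom(0, size = 4, prob = 0.25)
at_least_one
```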

8. Summary Table

| Feature | Description |
|---|---|
| Parameters | \(n\) (Trials), \(p\) (Prob of Success) |
| Domain | \(k \in \{0, 1, 2, \ldots, n\}\) |
| R Function (Exact) | dbinom() |
| R Function (Cumulative) | pbinom() |
| Shape | Left-skewed if \(p > 0.5\), Right-skewed if \(p < 0.5\), Symmetric if \(p = 0.5\) |
