Binomial Experiment

A binomial experiment is any experiment that has

  • a fixed number of trials \(n\)
  • two possible outcomes, “success” and “failure”
  • constant probability of success on each trial \(p\)
  • independent trials

For example, tossing a coin ten times and recording the number of heads and tails is a binomial experiment. There are ten trials, each with two possible outcomes: heads or tails. Let’s suppose we are interested in heads so we will call that a success. The probability of success is the same for each trial. And the outcome of each trial is independent, i.e. the outcome of one coin toss does not affect the outcome of another coin toss.
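
If you want to try this on a computer (this snippet is not part of the original notes), base R’s rbinom function simulates a binomial experiment; here it runs the ten-toss coin experiment once and counts the heads. The count will vary from run to run, so no output is shown.

> rbinom(1, size = 10, prob = 0.5)   # simulate 10 fair coin tosses and count the heads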

Example 1: Which of the following is a binomial experiment?

  • A random sample of 15 college seniors is taken and the individuals selected are asked to state their ages.
  • Three cards are selected from a standard deck without replacement. The number of aces selected is recorded.
  • A basketball player who makes 80% of his free throws is asked to shoot until he misses. The number of attempts is recorded.
  • A poll of 1200 registered voters is conducted in which they are asked whether they think Congress should reform Social Security.

  • Not binomial: there are more than two possible outcomes (age is not a success/failure outcome)
  • Not binomial: the trials are dependent because the cards are drawn without replacement
  • Not binomial: the number of trials is not fixed in advance
  • Binomial

Binomial Random Variable

Let \(X\) be the number of successes in a binomial experiment with \(n\) trials and probability of success \(p\). Then we say that \(X\) is a binomial random variable and we write \[X \sim Bin(n,p)\]

Probability Distribution of Binomial

What is the probability distribution of a binomial random variable? Specifically, what values can a binomial random variable take on and how often will it take on those values?

It’s fairly easy to see that if we have \(n\) trials and \(X\) = the number of successes then the values that \(X\) can take on must be \(0, 1, 2, \cdots, n\). However, the probabilities are not as straightforward.

Let’s consider an example. Suppose you play an instant lottery three times where your chances of winning each time are 20%. Assuming that the outcomes are independent, the number of times you win is a binomial random variable with \(n=3\) and \(p=0.20\). What is the probability distribution of \(X\)?

We can start by making a tree diagram where each branch shows whether we win \(W\) or lose \(W^c\) for each time we play:

Outcome          \(X\) = # of wins    Probability
\(WWW\)          \(3\)                \(0.2 \cdot 0.2 \cdot 0.2 = 0.008\)
\(WWW^c\)        \(2\)                \(0.2 \cdot 0.2 \cdot 0.8 = 0.032\)
\(WW^cW\)        \(2\)                \(0.2 \cdot 0.8 \cdot 0.2 = 0.032\)
\(W^cWW\)        \(2\)                \(0.8 \cdot 0.2 \cdot 0.2 = 0.032\)
\(WW^cW^c\)      \(1\)                \(0.2 \cdot 0.8 \cdot 0.8 = 0.128\)
\(W^cWW^c\)      \(1\)                \(0.8 \cdot 0.2 \cdot 0.8 = 0.128\)
\(W^cW^cW\)      \(1\)                \(0.8 \cdot 0.8 \cdot 0.2 = 0.128\)
\(W^cW^cW^c\)    \(0\)                \(0.8 \cdot 0.8 \cdot 0.8 = 0.512\)

Combining the outcomes that give the same value of \(X\), we get the probability distribution of \(X\):

\(x\)       \(0\)                 \(1\)                       \(2\)                      \(3\)
\(p(x)\)    \((0.8)^3=0.512\)     \(3(0.8)^2(0.2)=0.384\)     \(3(0.8)(0.2)^2=0.096\)    \((0.2)^3=0.008\)
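
If you would like to verify this enumeration in R (this snippet is not part of the original notes; it just reproduces the two tables above), you can list all eight outcomes, compute the probability of each, and then add up the probabilities for each value of \(X\):

> outcomes <- expand.grid(play1 = c("W", "L"), play2 = c("W", "L"), play3 = c("W", "L"))
> wins <- rowSums(outcomes == "W")    # number of wins in each of the 8 outcomes
> prob <- 0.2^wins * 0.8^(3 - wins)   # probability of each outcome
> tapply(prob, wins, sum)             # total probability for each value of X

    0     1     2     3
0.512 0.384 0.096 0.008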

Notice the pattern:

\[p(x) = C_{x}^{3} \cdot 0.2^x \cdot (1- 0.2)^{3-x} \text{ for } x = 0, 1, 2, 3\]

So, in general, the probability distribution function for a binomial random variable with \(n\) trials and probability of success \(p\) is \[p(x) = C^{n}_{x} p^x (1-p)^{n-x} \text{ for } x = 0,1,2, \cdots, n\]

In R, the dbinom function gives the probability distribution for a binomial random variable. You can get the whole probability distribution by specifying all the possible values that the random variable can take on:

> x <- c(0,1,2,3)
> dbinom(x,3,0.2)

[1] 0.512 0.384 0.096 0.008
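
As a sanity check (this line is not in the original notes), we can also code the formula directly using R’s choose function for \(C_{x}^{n}\), reusing the vector x defined above; it gives the same probabilities as dbinom:

> choose(3, x) * 0.2^x * (1 - 0.2)^(3 - x)

[1] 0.512 0.384 0.096 0.008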

or you can compute just one probability for a specific value:

> dbinom(1,3,0.2)

[1] 0.384

The cumulative distribution function of the binomial can be computed using the pbinom function in R. For example, the probability of winning the lottery at most once in three tries is \(P(X \le 1) = P(X=0)+P(X=1) = 0.512 + 0.384 = 0.896\):

> pbinom(1,3,0.2)

[1] 0.896
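
Equivalently (a quick check, not part of the original notes), the cumulative probability is just the sum of the individual dbinom probabilities for \(x=0\) and \(x=1\):

> sum(dbinom(0:1, 3, 0.2))

[1] 0.896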

Example 2: About 8% of males are colorblind. A researcher randomly selects 10 men and tests them for colorblindness. Find the probability that

  • exactly two of the men tested are colorblind
  • at least three of the men tested are colorblind
  • at most two of the men tested are colorblind
  • more than four of the men tested are colorblind
  • less than five of the men tested are colorblind

Let \(X=\) the number of the men tested who are colorblind. \(X\) is a binomial random variable with \(n=10\) and \(p=0.08\).

  • \(P(X=2) = C_{2}^{10}\cdot0.08^2\cdot(1-0.08)^{10-2}=\text{ dbinom}(2,10,0.08) =0.1478\)
  • \(P(X \ge 3) = \sum_{x=3}^{10} C_{x}^{10}\cdot0.08^x\cdot(1-0.08)^{10-x}=1-P(X \le 2) = 1 - \text{ pbinom}(2,10,0.08) = 0.0401\)
  • \(P(X \le 2)= \text{ pbinom}(2,10,0.08) = 0.9599\)
  • \(P(X>4)=1-P(X \le 4) = 1-\text{ pbinom}(4,10,0.08) = 0.0006\)
  • \(P(X < 5)= P(X \le 4) = \text{ pbinom}(4,10,0.08) = 0.9994\)
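
These values can be reproduced at the R console (this transcript is not part of the original notes; the results are rounded to four decimal places to match the answers above):

> round(dbinom(2, 10, 0.08), 4)       # P(X = 2)

[1] 0.1478

> round(1 - pbinom(2, 10, 0.08), 4)   # P(X >= 3)

[1] 0.0401

> round(pbinom(2, 10, 0.08), 4)       # P(X <= 2)

[1] 0.9599

> round(1 - pbinom(4, 10, 0.08), 4)   # P(X > 4)

[1] 0.0006

> round(pbinom(4, 10, 0.08), 4)       # P(X < 5) = P(X <= 4)

[1] 0.9994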

Mean of the Binomial

Let’s first consider what the mean of \(X\) would be for our example, where \(X\) is the number of times you win the lottery if you play 3 times.

\[\mu = E(X) = 0 \cdot 0.512 + 1 \cdot 0.384 + 2 \cdot 0.096 + 3 \cdot 0.008 = 0.6 = np = 3 \cdot 0.20\] It turns out that any binomial random variable \(X \sim Bin(n,p)\) has a mean of \(\mu = n \cdot p\).
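
We can verify this numerically in R (a quick check, not part of the original notes):

> x <- 0:3
> sum(x * dbinom(x, 3, 0.2))   # E(X) computed from the distribution

[1] 0.6

> 3 * 0.2                      # np

[1] 0.6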

Standard Deviation of the Binomial

Again, let’s start with our example, where \(X\) is the number of times you win the lottery if you play 3 times.

\[\begin{align} \sigma &= \sqrt{E(X-\mu)^2} \\ &= \sqrt{(0-0.6)^2 \cdot 0.512 + (1-0.6)^2 \cdot 0.384 + (2-0.6)^2 \cdot 0.096 + (3-0.6)^2 \cdot 0.008} \\ &=\sqrt{0.48} \\ &= \sqrt{np(1-p)} \\ &= \sqrt{3 \cdot 0.20 \cdot 0.80} \end{align}\]

It turns out that any binomial random variable \(X \sim Bin(n,p)\) has a standard deviation of \(\sigma = \sqrt{n p (1-p)}\).
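
Again, a quick numerical check in R (not part of the original notes), reusing the vector x from above:

> sqrt(sum((x - 0.6)^2 * dbinom(x, 3, 0.2)))   # sd computed from the distribution

[1] 0.6928203

> sqrt(3 * 0.2 * 0.8)                          # sqrt(np(1-p))

[1] 0.6928203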

Summary

If \(X\) is a binomial random variable with parameters \(n\) and \(p\) then

\[P(X=x) = C_{x}^{n}p^x(1-p)^{n-x} = \text{ dbinom}(x,n,p) \text{ for } x=0,1,\cdots,n\]

\[P(X \le x) = \sum_{k=0}^x C_{k}^{n}p^k(1-p)^{n-k} = \text{ pbinom}(x,n,p) \text{ for } x=0,1,\cdots,n\]

\[\mu = E(X) = np\]

\[\sigma = \sqrt{E(X-\mu)^2} = \sqrt{np(1-p)}\]
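
As a final check (this simulation is not part of the original notes), we can use base R’s rbinom function to simulate a large number of values of \(X \sim Bin(3, 0.2)\) from the lottery example; the sample mean and standard deviation should come out close to \(np = 0.6\) and \(\sqrt{np(1-p)} \approx 0.693\), though the exact values will change from run to run.

> sims <- rbinom(10000, 3, 0.2)   # 10,000 simulated values of X ~ Bin(3, 0.2)
> mean(sims)                      # should be close to np = 0.6
> sd(sims)                        # should be close to sqrt(np(1-p)), about 0.693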