Question

44 Suppose that in the hypergeometric distribution, we let N and k tend to infinity in such a way that the ratio k/N approaches a real number p between 0 and 1. Show that the hypergeometric distribution tends to the binomial distribution with parameters n and p.

Solution

Hypergeometric formula:

\[h(N, k, n, x) = \frac{\binom{k}{x} \binom{N-k}{n-x}}{\binom{N}{n}}\]

where N is the total population size, n is the sample size, k is the number of items in the population that we’re interested in (e.g., colored balls, face cards, ‘successes’), and x is the number of those items in the sample. Here x is the variable, and N, k, and n are parameters. It follows that (k C x) is the number of ways to choose x successes from the k in the population, (N - k) is the number of ‘failures’ in the population, (n - x) is the number of ‘failures’ in the sample, and [(N - k) C (n - x)] is the number of ways to choose those failures. Finally, the denominator (N C n) counts all samples of size n taken from the population, i.e., the sample space.
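As a quick illustration of the formula, here is a minimal Python sketch. The function name `hypergeom_pmf` and the card-hand numbers (a 52-card deck with 12 face cards, 5 cards drawn) are assumptions chosen for the example, not part of the problem.

```python
from math import comb

def hypergeom_pmf(N, k, n, x):
    """h(N, k, n, x): probability of x successes in a sample of size n
    drawn without replacement from N items, k of which are successes."""
    return comb(k, x) * comb(N - k, n - x) / comb(N, n)

# Assumed example: number of face cards in a 5-card hand.
probs = [hypergeom_pmf(52, 12, 5, x) for x in range(6)]
for x, pr in enumerate(probs):
    print(x, pr)

# The probabilities over all possible x should add up to 1 (up to rounding).
print(sum(probs))
```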

The binomial distribution by contrast is:

\[b(n, p, k) = \binom{n}{k} p^k q^{n-k}\]

where p is the probability of a success, q = 1 - p is the probability of a failure, n is the number of independent trials, and k is the number of successes. (n C k) is the number of ways to arrange k successes among the n trials. Note that ‘k’ in the binomial formula corresponds to ‘x’ in the hypergeometric formula, because both are the observed number of successes; the hypergeometric k is a different quantity. Expanding the binomial coefficients in the hypergeometric formula gives:

\[\frac{k!}{x! (k-x)!} \cdot \frac{(N-k)!}{(n-x)! [(N-k) - (n-x)]!} \cdot \frac{n! (N-n)!}{N!}\]

Rearranging the terms in the numerator and denominator:

\[\frac{n!}{x! (n-x)!} \cdot \frac{(N-k)!}{[(N-k) - (n-x)]!} \cdot \frac{k!}{(k-x)!} \cdot \frac{(N-n)!}{N!}\]
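The rearrangement only reorders factors, so both forms must agree exactly. A small sketch that checks this with exact rational arithmetic is given here; the function names and the test values N = 20, k = 8, n = 5 are assumptions for illustration.

```python
from fractions import Fraction
from math import comb, factorial as f

def original_form(N, k, n, x):
    # Hypergeometric probability written with binomial coefficients.
    return Fraction(comb(k, x) * comb(N - k, n - x), comb(N, n))

def rearranged_form(N, k, n, x):
    # The four factors of the rearranged expression, in the same order.
    return (Fraction(f(n), f(x) * f(n - x))
            * Fraction(f(N - k), f((N - k) - (n - x)))
            * Fraction(f(k), f(k - x))
            * Fraction(f(N - n), f(N)))

# Assumed test values; any N, k, n with x <= k and n - x <= N - k would do.
for x in range(6):
    assert original_form(20, 8, 5, x) == rearranged_form(20, 8, 5, x)
print("both forms agree")
```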

Note that:

\[\frac {n!}{x! (n-x)!} = \binom {n}{x};\]

If our sample is small enough relative to the population that we can ignore ‘without replacement’, then N is large enough that N - n ≈ N. The last factor is the reciprocal of a product of n terms, each approximately equal to N:

\[\frac{(N-n)!}{N!} = \frac{1}{N(N-1) \cdots (N-n+1)} \approx \frac{1}{N^{n}}\]

We can simplify the other two terms in the same way by noting that as N and k approach infinity, k >> x and N - k >> n - x:

\[\frac{k!}{(k-x)!} = k(k-1) \cdots (k-x+1) \approx k^{x}\]

\[\frac{(N-k)!}{[(N-k) - (n-x)]!} = (N-k)(N-k-1) \cdots (N-k-n+x+1) \approx (N-k)^{n-x}\]

We’re left with:

\[\binom{n}{x} \frac{k^x (N-k)^{n-x}}{N^n} = \binom{n}{x} \left(\frac{k}{N-k}\right)^x \frac{(N-k)^{n}}{N^{n}}\]

Divide the numerator and denominator of each fraction by N:

\[\binom{n}{x} \left(\frac{k/N}{1-k/N}\right)^x (1 - k/N)^{n}\]

Letting k/N approach p:

\[\binom{n}{x} \left(\frac{p}{1-p}\right)^x (1 - p)^{n}\]

which simplifies to:

\[\binom{n}{x} p^x (1 - p)^{n-x}\]
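For readers who want the limit without approximation signs, the same three factorial ratios can be written as an exact product. This is a sketch of one standard way to organize the argument, with the index names i and j introduced here for illustration:

\[h(N, k, n, x) = \binom{n}{x} \prod_{i=0}^{x-1} \frac{k-i}{N-i} \prod_{j=0}^{n-x-1} \frac{N-k-j}{N-x-j}\]

Since n and x are fixed, each factor in the first product tends to p and each factor in the second tends to 1 - p as N and k tend to infinity with k/N approaching p, so:

\[\lim h(N, k, n, x) = \binom{n}{x} p^x (1 - p)^{n-x}\]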

As mentioned earlier, x plays the role of k in the binomial formula, so the limit is exactly b(n, p, x). In other words, as N and k approach infinity with k/N approaching p, the hypergeometric distribution tends to the binomial distribution with parameters n and p. In practice, the binomial distribution is a good approximation of the hypergeometric distribution when N >> n (usually n/N < 0.05).
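The convergence can also be observed numerically. The sketch below is only an illustration under assumed values (n = 10, p = 0.3, and a handful of population sizes); the helper names are hypothetical. It holds k/N at p while N grows and reports the largest gap between the two distributions over all x.

```python
from math import comb

def hypergeom_pmf(N, k, n, x):
    # Probability of x successes when sampling n items without replacement.
    return comb(k, x) * comb(N - k, n - x) / comb(N, n)

def binom_pmf(n, p, x):
    # Probability of x successes in n independent trials.
    return comb(n, x) * p**x * (1 - p)**(n - x)

def max_gap(N, n, p):
    # Largest pointwise difference between the two distributions,
    # with k chosen so that k/N is (approximately) p.
    k = round(p * N)
    return max(abs(hypergeom_pmf(N, k, n, x) - binom_pmf(n, p, x))
               for x in range(n + 1))

n, p = 10, 0.3  # assumed values for the demonstration
for N in (50, 500, 5000, 50000):
    print(N, max_gap(N, n, p))
```

The printed gap should shrink as N grows, which matches the rule of thumb that the approximation is reasonable once n/N is small.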