The Negative Binomial Distribution is the Poisson-Gamma Mixture
Read in about 5 min
The negative binomial distribution can be thought of as a compound distribution of a gamma and a poisson. Generally speaking, you can view it as a Poisson distribution with a gamma prior on the rate parameter
Derivation
Consider a Poisson model for count data, \[ y \sim \operatorname{Poi}(\theta), \quad \theta \geq 0. \]
The parameter \(\theta\) may be interpreted as the rate of arrivals. More importantly, the poisson distribution for the data generating process implies the “equidispersion” amongst data, i.e., \(\mathbb{E}[y] = \operatorname{Var}[y] = \theta\). Nonetheless, this property of the Poisson model is uncapable of handling overdispersed data or data in which the variance is much larger than the mean. The reason is down to one free parameter in the probability mass function. However, if we impose a Gamma prior on \(\theta\), \[\begin{aligned} y & \sim \operatorname{Po}(\theta) \\ \theta & \sim \operatorname{Ga}\left(r, \frac{p}{(1-p)}\right), \end{aligned} \]
Marginalizing \(\theta\) will result in a Negative Binomial (NB) distribution, which offers a flexibility to tackle to issue of overdispersion. The reason is that the NB distribution has a useful property that its variance can be greater than its mean.
Mathematically, the derivation are straightforward
\[ \begin{aligned} p(y) & = \int_{0}^{\infty} p(y \mid \theta) p(\theta) \, \operatorname{d}\theta \\ & = \int_{0}^{\infty} \left(\frac{\theta^{y} \exp^{\theta}}{y !}\right) \left(\theta^{r-1} \exp^{-\theta(1-p)/p} \frac{1}{\Gamma(r) (\frac{p}{1-p})^{r}}\right) \, \operatorname{d} \theta\\ & = \left(\frac{(1-p)^{r} p^{-r}}{y! \Gamma(r)} \right)\int_{0}^{\infty} \theta^{r + y -1} \exp{-\theta/p}\, \operatorname{d}\theta \\ & = \left(\frac{(1-p)^{r} p^{-r}}{y! \Gamma(r)} \right) p^{r+y} \Gamma(r+y) \\ & = \frac{\Gamma(r+y)}{\Gamma(r) y!} p^{y}(1-p)^{r}\\ & = \frac{(r+ y -1)!}{(r-1)! y!} p^{y}(1-p)^{r} \\ & = \begin{pmatrix} y+ r -1 \\ y \end{pmatrix} p^{y}(1-p)^{r} \\ & = \operatorname{NB}(r, p). \end{aligned} \]
where we have used the fact that \[ \int_{0}^{\infty} x^{b} \exp^{-ax} \, \operatorname{d}x = \frac{\Gamma(b+1)}{a^{b+1}}.\] and \[ \Gamma(x) = (x-1)! .\]
Simulating in R
library(tidyverse)
r = 2
p = 0.9
n = 1e5
lambda <- rgamma(n, shape = r, rate = p / (1-p))
y_mix <- rpois(length(lambda), lambda = lambda)
d1 <- tibble(y = y_mix, dist = 'mixture')
y_nb <- rnbinom(n, r, p)
d2 <- tibble(y = y_nb, dist = 'nbinom') %>% bind_rows(d1)
d2 %>% count(dist , y) %>%
group_by(dist) %>%
mutate(prop = prop.table(n)) %>%
ungroup %>%
select(-n) %>%
pivot_wider(names_from = dist, values_from = prop)## # A tibble: 6 x 3
## y mixture nbinom
## <int> <dbl> <dbl>
## 1 0 0.810 0.813
## 2 1 0.162 0.159
## 3 2 0.0248 0.0248
## 4 3 0.0033 0.003
## 5 4 0.00033 0.00042
## 6 5 0.00007 0.00002
ggplot(d2, aes(x = y), color = dist, fill = dist) + geom_density(alpha = 0.5)Example
See Figure above for simple setting.