The Negative Binomial Distribution is the Poisson-Gamma Mixture

Read in about 5 min

The negative binomial distribution can be thought of as a compound distribution of a gamma and a poisson. Generally speaking, you can view it as a Poisson distribution with a gamma prior on the rate parameter

Derivation

Consider a Poisson model for count data, \[ y \sim \operatorname{Poi}(\theta), \quad \theta \geq 0. \]

The parameter $\theta$ may be interpreted as the rate of arrivals. More importantly, the poisson distribution for the data generating process implies the “equidispersion” amongst data, i.e., $\mathbb{E}[y] = \operatorname{Var}[y] = \theta$. Nonetheless, this property of the Poisson model is uncapable of handling overdispersed data or data in which the variance is much larger than the mean. The reason is down to one free parameter in the probability mass function. However, if we impose a Gamma prior on $\theta$, \[\begin{aligned} y & \sim \operatorname{Po}(\theta) \\ \theta & \sim \operatorname{Ga}\left(r, \frac{p}{(1-p)}\right), \end{aligned} \]

Marginalizing $\theta$ will result in a Negative Binomial (NB) distribution, which offers a flexibility to tackle to issue of overdispersion. The reason is that the NB distribution has a useful property that its variance can be greater than its mean.

Mathematically, the derivation are straightforward

\[ \begin{aligned} p(y) & = \int_{0}^{\infty} p(y \mid \theta) p(\theta) \, \operatorname{d}\theta \\ & = \int_{0}^{\infty} \left(\frac{\theta^{y} \exp^{\theta}}{y !}\right) \left(\theta^{r-1} \exp^{-\theta(1-p)/p} \frac{1}{\Gamma(r) (\frac{p}{1-p})^{r}}\right) \, \operatorname{d} \theta\\ & = \left(\frac{(1-p)^{r} p^{-r}}{y! \Gamma(r)} \right)\int_{0}^{\infty} \theta^{r + y -1} \exp{-\theta/p}\, \operatorname{d}\theta \\ & = \left(\frac{(1-p)^{r} p^{-r}}{y! \Gamma(r)} \right) p^{r+y} \Gamma(r+y) \\ & = \frac{\Gamma(r+y)}{\Gamma(r) y!} p^{y}(1-p)^{r}\\ & = \frac{(r+ y -1)!}{(r-1)! y!} p^{y}(1-p)^{r} \\ & = \begin{pmatrix} y+ r -1 \\ y \end{pmatrix} p^{y}(1-p)^{r} \\ & = \operatorname{NB}(r, p). \end{aligned} \]

where we have used the fact that \[ \int_{0}^{\infty} x^{b} \exp^{-ax} \, \operatorname{d}x = \frac{\Gamma(b+1)}{a^{b+1}}.\] and \[ \Gamma(x) = (x-1)! .\]

Simulating in R

library(tidyverse)

r = 2
p = 0.9
n = 1e5

lambda <- rgamma(n, shape = r, rate = p / (1-p))
y_mix <- rpois(length(lambda), lambda = lambda)

d1 <- tibble(y = y_mix, dist = 'mixture')

y_nb <- rnbinom(n, r, p)
d2 <- tibble(y = y_nb, dist = 'nbinom') %>% bind_rows(d1)

d2 %>% count(dist , y) %>%
  group_by(dist) %>%
  mutate(prop = prop.table(n)) %>%
  ungroup %>%
  select(-n) %>%
  pivot_wider(names_from = dist, values_from = prop)

## # A tibble: 6 x 3
##       y mixture  nbinom
##   <int>   <dbl>   <dbl>
## 1     0 0.810   0.813  
## 2     1 0.162   0.159  
## 3     2 0.0248  0.0248 
## 4     3 0.0033  0.003  
## 5     4 0.00033 0.00042
## 6     5 0.00007 0.00002

ggplot(d2, aes(x = y), color = dist, fill = dist) + geom_density(alpha = 0.5)

$\label{fig:plot}Example$

Example

See Figure above for simple setting.