Binomial and Poisson

Jake Warby

02/04/2022

Binomial Model

Binomial without Censoring

  • We observe \(N_x\) independent lives exactly aged \(x\) at the beginning of the year for one whole year
  • We observe \(d_x\) deaths
  • Each life has a probability \(q_x\) of death over that year (the initial rate of mortality)
  • Then the random variable \(D_x\), the number of deaths, is binomially distributed:

\[ D_x\sim Bin(N_x, q_x)\]

\[ P[D_x = d_x] = \left(\begin{matrix}N_x\\d_x\end{matrix}\right)q_x^{d_x}(1-q_x)^{N_x-d_x}\]

\[ \mathbb{E}[D_x] = N_xq_x,\quad Var(D_x) = N_xq_x(1-q_x)\]

  • The maximum likelihood estimate of \(q_x\) is therefore:

\[ \hat{q}_x = \frac{d_x}{N_x}\]
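The estimate above follows from maximising the binomial log-likelihood. A brief sketch, writing \(\ell(q_x)\) for the log-likelihood (a symbol assumed here, not used elsewhere in these notes):

\[ \ell(q_x) = \log\left(\begin{matrix}N_x\\d_x\end{matrix}\right) + d_x\log q_x + (N_x-d_x)\log(1-q_x)\]

\[ \frac{\partial \ell}{\partial q_x} = \frac{d_x}{q_x} - \frac{N_x-d_x}{1-q_x} = 0 \quad\Rightarrow\quad \hat{q}_x = \frac{d_x}{N_x}\]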

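  • For large \(N_x\), the estimator is approximately normally distributed:

\[ \hat{q}_x\sim N\left(q_x, \frac{q_x(1-q_x)}{N_x}\right)\]

  • Therefore an approximate confidence interval for \(q_x\), replacing the unknown \(q_x\) in the variance by its estimate, is:

\[ \hat{q}_x\pm z_{1-\alpha/2}\sqrt{\frac{\hat{q}_x(1-\hat{q}_x)}{N_x}}\]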
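The estimate and interval for the uncensored binomial model can be sketched numerically as below. This is a minimal illustration, not a definitive implementation: the function name `binomial_q_estimate` and the figures (12 deaths among 1000 lives) are hypothetical, and the standard error uses the plug-in value \(\hat{q}_x\) in place of the unknown \(q_x\).

```python
from math import sqrt
from statistics import NormalDist


def binomial_q_estimate(deaths, lives, alpha=0.05):
    """Return (q_hat, lower, upper) using the normal approximation."""
    q_hat = deaths / lives
    # Standard normal quantile z_{1 - alpha/2}.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    # Plug-in standard error: q_x is replaced by its estimate (assumption).
    se = sqrt(q_hat * (1 - q_hat) / lives)
    return q_hat, q_hat - z * se, q_hat + z * se


# Hypothetical data: 12 deaths observed among 1000 lives.
q_hat, lower, upper = binomial_q_estimate(deaths=12, lives=1000)
print(f"q_hat = {q_hat:.4f}, 95% CI = ({lower:.4f}, {upper:.4f})")
```

The normal approximation can give a lower limit below zero when \(d_x\) is very small, so the interval is best read as a large-sample approximation.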

Binomial with Censoring

  • Now not all lives are necessarily observed over the complete year of age \(x\) to \(x+1\): there may be decrements other than death (right censoring) or late entry (left truncation)
    • Therefore we observe life \(i\) from age \(x+a_i\) to age \(x+b_i\), where \(0\le a_i<b_i\le 1\)

  • Therefore the probabilities of death become:

  • Note that since we are considering probabilities of death over parts of the year, we need either a continuous parametric model for the probabilities of death or an assumption about mortality at non-integer ages.

  • We consider the actuarial estimate for the expected number of deaths:

  • Note that we can simplify the death probability:

  • The Balducci assumption is used in conjunction with this note to simplify the calculation. We also replace the expectations by the observed values, \(\mathbb{E}[D_i] = d_i\) and \(\mathbb{E}[D] = d\), where \(d = \sum d_i\)
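The steps above are usually written as follows. This is a sketch in standard actuarial notation, which is assumed here: \({}_tq_y\) is the probability that a life aged \(y\) dies within time \(t\), \({}_tp_y = 1-{}_tq_y\), and \(D_i\) is the death indicator for life \(i\).

\[ P[D_i = 1] = {}_{b_i-a_i}q_{x+a_i},\qquad \mathbb{E}[D] = \sum^N_{i=1}{}_{b_i-a_i}q_{x+a_i}\]

Splitting death before age \(x+1\) into death before or after age \(x+b_i\) gives:

\[ {}_{1-a_i}q_{x+a_i} = {}_{b_i-a_i}q_{x+a_i} + {}_{b_i-a_i}p_{x+a_i}\,{}_{1-b_i}q_{x+b_i}\]

Under the Balducci assumption, \({}_{1-t}q_{x+t} = (1-t)q_x\) for \(0\le t\le 1\), so that:

\[ \mathbb{E}[D] = q_x\sum^N_{i=1}(1-a_i) - q_x\sum^N_{i=1}{}_{b_i-a_i}p_{x+a_i}(1-b_i)\]

Replacing \(\mathbb{E}[D]\) by \(d\), and each survival probability \({}_{b_i-a_i}p_{x+a_i}\) by its observed value (1 if life \(i\) survives, 0 if it dies), leaves only the survivors in the second sum:

\[ d = \hat{q}_x\left(\sum^N_{i=1}(1-a_i) - \sum_{i;D_i=0}(1-b_i)\right)\]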

Initial Exposed to Risk

  • The actuarial estimate \(\hat{q}_x\) can be rewritten in terms of the initial exposed to risk:

\[ \hat{q}_x=\frac{d}{\sum^N_{i=1}(1-a_i)-\sum_{i;D_i=0}(1-b_i)} = \frac{d_x}{E_x}\]

  • Where the initial exposed to risk \(E_x\) is defined as:

\[ E_x = \sum^N_{i=1}(1-a_i)-\sum_{i;D_i=0}(1-b_i) = \underbrace{\sum_{i;D_i=1}(1-a_i)}_{\text{Death Observations}} + \underbrace{\sum_{i;D_i=0}(b_i-a_i)}_{\text{Survivor Observations}}\]

  • As we can see:
    • Deaths contribute the period of length \((1-a_i)\) from age \(x+a_i\) to \(x+1\)
      • Even if the life was due to be censored at age \(x+b_i\), a death before then means \(b_i\) plays no part in the contribution
    • Survivors contribute the period of length \((b_i-a_i)\) from \(x+a_i\) to \(x+b_i\)
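The two kinds of contribution to \(E_x\) can be illustrated with a short sketch. The function name `initial_exposed_to_risk`, the tuple layout `(a, b, died)` and the four lives below are all hypothetical choices made for illustration.

```python
def initial_exposed_to_risk(lives):
    """Sum initial exposure over (a, b, died) tuples with 0 <= a < b <= 1."""
    exposure = 0.0
    for a, b, died in lives:
        if died:
            # Deaths count from entry to the end of the year of age.
            exposure += 1 - a
        else:
            # Survivors count only for the period actually observed.
            exposure += b - a
    return exposure


# Hypothetical observations: (entry a_i, planned exit b_i, died?).
lives = [
    (0.0, 1.0, False),   # observed for the whole year
    (0.25, 1.0, False),  # late entrant
    (0.0, 0.5, False),   # censored halfway through the year
    (0.5, 1.0, True),    # late entrant who dies
]

E_x = initial_exposed_to_risk(lives)
d = sum(died for _, _, died in lives)
q_hat = d / E_x
print(E_x, q_hat)
```

With these inputs the contributions are 1, 0.75, 0.5 and 0.5, which agrees with the first form of the definition: \(\sum(1-a_i) = 3.25\) less \(\sum_{i;D_i=0}(1-b_i) = 0.5\).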

Central Exposed to Risk

  • If we know when deaths occur, e.g. at age \(x+t_i\), we can modify the initial exposed to risk to obtain the central exposed to risk:

\[ E_x^c = \sum^N_{i=1}(b_i-a_i)(1-d_i) + \sum^N_{i=1}(t_i-a_i)d_i\]

  • As we can see:
    • Deaths contribute the period of length \((t_i-a_i)\) from \(x+a_i\) to \(x+t_i\)
    • Survivors contribute the period of length \((b_i-a_i)\) from \(x+a_i\) to \(x+b_i\)
  • When the exact times of death are not available but \(E_x^c\) is, the usual approach is to assume that deaths occur on average at age \(x+\frac{1}{2}\), so that the actuarial estimate becomes:

\[ \hat{q}_x = \frac{d}{E^c_x+\frac{d}{2}}\]

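  • This follows because, under this assumption, each death adds on average a further half year to the initial exposed to risk: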

\[ E_x = E^c_x + \frac{d}{2}\]

  • If deaths are instead assumed to occur on average at age \(x + y\), then:

\[ E_x = E^c_x + d(1-y)\]
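A small sketch of the central exposed to risk and the resulting actuarial estimate under the mid-year assumption. The function name `central_exposed_to_risk`, the tuple layout `(a, b, t)` with `t` set to `None` for survivors, and the data are hypothetical.

```python
def central_exposed_to_risk(lives):
    """Sum observed time over (a, b, t) tuples; t is None for survivors."""
    total = 0.0
    for a, b, t in lives:
        # Observation ends at death (t) if it occurs, otherwise at exit (b).
        end = b if t is None else t
        total += end - a
    return total


# Hypothetical observations: (entry a_i, planned exit b_i, death time t_i).
lives = [
    (0.0, 1.0, None),
    (0.25, 1.0, None),
    (0.0, 0.5, None),
    (0.5, 1.0, 0.75),  # dies at age x + 0.75
]

E_c = central_exposed_to_risk(lives)
d = sum(t is not None for _, _, t in lives)
# Actuarial estimate assuming deaths occur on average at age x + 1/2.
q_hat = d / (E_c + d / 2)
print(E_c, q_hat)
```

The denominator \(E^c_x + \frac{d}{2}\) is only an approximation to \(E_x\) for any single data set, since it relies on deaths falling at mid-year on average.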

Poisson Model

  • We can use a Poisson model for the number of deaths
    • We observe \(N_x\) individuals over a year of age, starting from exactly age \(x\)
    • We assume a constant force of mortality, \(\mu_x\), over the observed period for each individual in the age interval \((x,x+1)\)
    • The sum of all of those observed periods is the central exposed to risk \(E^c_x\), also called the observed waiting time \(v\)
    • This means that the time of death of each individual (within the year) is exponentially distributed, and hence we can use a Poisson model for the number of deaths within the year
      • Note that this is an approximation: the Poisson model in effect allows replacement, so it assigns positive probability to more than \(N_x\) deaths.
  • The PMF for number of deaths is:

\[ P[D_x=d_x] = \frac{e^{-\mu_x E^c_x}(\mu_x E^c_x)^{d_x}}{d_x!}\]

\[ \mathbb{E}[D_x] = Var(D_x) = \mu_x E^c_x\]

  • The maximum likelihood estimator is therefore:

\[ \hat{\mu}_x = \frac{D_x}{E^c_x} \]
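The estimator above follows from maximising the Poisson log-likelihood. A brief sketch, writing \(\ell(\mu_x)\) for the log-likelihood (a symbol assumed here) and treating \(E^c_x\) as fixed:

\[ \ell(\mu_x) = -\mu_x E^c_x + d_x\log(\mu_x E^c_x) - \log(d_x!)\]

\[ \frac{\partial \ell}{\partial \mu_x} = -E^c_x + \frac{d_x}{\mu_x} = 0 \quad\Rightarrow\quad \hat{\mu}_x = \frac{d_x}{E^c_x}\]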

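  • For a large expected number of deaths, the estimator is approximately normally distributed:

\[ \hat{\mu}_x\sim N\left(\mu_x,\frac{\mu_x}{E_x^c}\right)\]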

  • Therefore an approximate confidence interval for \(\mu_x\) is:

\[ \hat{\mu}_x \pm z_{1-\alpha/2}\sqrt{\frac{\hat{\mu}_x}{E^c_x}}\]
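The Poisson estimate and its interval can be sketched in the same way as for the binomial model. The function name `poisson_mu_estimate` and the figures (12 deaths over 980 life-years of central exposure) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def poisson_mu_estimate(deaths, central_exposure, alpha=0.05):
    """Return (mu_hat, lower, upper) using the normal approximation."""
    mu_hat = deaths / central_exposure
    # Standard normal quantile z_{1 - alpha/2}.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    # Plug-in standard error: mu_x is replaced by its estimate.
    se = sqrt(mu_hat / central_exposure)
    return mu_hat, mu_hat - z * se, mu_hat + z * se


# Hypothetical data: 12 deaths over 980 life-years of central exposure.
mu_hat, lower, upper = poisson_mu_estimate(deaths=12, central_exposure=980.0)
print(f"mu_hat = {mu_hat:.5f}, 95% CI = ({lower:.5f}, {upper:.5f})")
```

Since \(\sqrt{\hat{\mu}_x/E^c_x} = \sqrt{d_x}/E^c_x\), the width of the interval relative to \(\hat{\mu}_x\) depends only on the number of deaths observed.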