Estimation of the offspring mean

The problems of mathematical statistics are the inverse of those of probability theory. So far, for given individual (offspring) distributions, we have studied the global behavior of branching processes. In this lecture we go in the other direction and estimate the individual characteristics of the process from observations (data) collected over its realization.

Figure 1 Tree Presentation with Poisson offspring and \(\mu=0.8\).

Assume that we can observe the trajectory (entire family tree) of a Galton-Watson process up to a certain time \(t\) (see Figure 1). This is the complete information we can have for a branching process. Given this tree, we can easily calculate the statistics \(X_j(k)\), the number of individuals in the \(j\)th generation with exactly \(k\) offspring in the next generation, where \(j=0,1,\ldots, t-1\) and \(k=0,1,\ldots\). Therefore, for the likelihood function, using the independence of the individual evolutions, we have \[ L_t(p_0, p_1, \ldots)=\prod_{k=0}^\infty p_k^{\sum_{j=0}^{t-1} X_j(k)}, \qquad (1) \] where \(p=\{ p_0, p_1, \ldots\}\) is the offspring distribution.
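The counts \(X_j(k)\) and the logarithm of the likelihood (1) can be tabulated mechanically from a recorded tree. The following is a minimal sketch in Python, assuming the tree is stored as a list of generations, each holding the offspring count of every individual in that generation. This storage format and the names `tabulate_counts` and `log_likelihood` are illustrative assumptions, not part of the lecture.

```python
import math
from collections import Counter

def tabulate_counts(tree):
    """Return a list whose j-th entry maps k -> X_j(k).

    `tree[j]` is assumed to list the offspring count of each individual
    in generation j (a hypothetical storage format for the family tree).
    """
    return [Counter(generation) for generation in tree]

def log_likelihood(tree, p):
    """Log of the likelihood (1) for a candidate offspring distribution.

    `p` maps k -> p_k; offspring counts missing from `p` have probability 0.
    """
    total = 0.0
    for counts in tabulate_counts(tree):
        for k, x_jk in counts.items():
            if p.get(k, 0.0) == 0.0:
                return -math.inf  # an observed count is impossible under p
            total += x_jk * math.log(p[k])
    return total

# Toy tree observed up to t = 3: generations 0, 1, 2 are the parents.
tree = [[2], [1, 2], [0, 1, 0]]
counts = tabulate_counts(tree)
print(counts[1][2])  # X_1(2): individuals in generation 1 with 2 offspring
print(log_likelihood(tree, {0: 0.3, 1: 0.4, 2: 0.3}))
```

Only the parents' offspring counts enter the likelihood, so generation \(t\) itself appears only through the offspring counts recorded for generation \(t-1\).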

Using Lagrange’s method, we will obtain the maximum likelihood estimator (MLE) of \(p\). Denote the total number of individuals from generations \(0, 1, \ldots, t-1\) having exactly \(k\) offspring by \[ U_t(k):=\sum_{j=0}^{t-1} X_j(k) \] and the total number of individuals up to the \((t-1)\)st generation by \[ U_t :=\sum_{k=0}^\infty U_t(k)=\sum_{j=0}^{t-1} X_j, \] where \(X_j=\sum_{k=0}^\infty X_j(k)\) is the size of the \(j\)th generation. Then (1) implies \[ \log L_t(p)=\sum_{k=0}^\infty U_t(k)\log p_k, \qquad (2) \] where \(\sum_{k=0}^\infty p_k=1\). To maximize (2) under this constraint, according to Lagrange’s method, we consider the function \[ \Phi_t(p):=\sum_{k=0}^\infty U_t(k)\log p_k +\lambda \Bigl(1-\sum_{k=0}^\infty p_k\Bigr) \] and solve the equations \[ \frac{\partial \Phi_t(p)}{\partial p_k}=\frac{U_t(k)}{p_k}-\lambda=0, \qquad k=0,1,\ldots, \qquad (3) \] where \(\lambda\) is the Lagrange multiplier. It follows from (3) that \(\hat{p}_k(t)=U_t(k)/\lambda\), and then \(\sum_{k=0}^\infty \hat{p}_k(t)=1\) yields \(\lambda=\sum_{k=0}^\infty U_t(k)=U_t\). Thus, the MLEs of the offspring probabilities \(p_k\), \(k=0,1,\ldots\), are given by \[ \hat{p}_k(t)=\frac{U_t(k)}{U_t}. \] Using these estimators together with the identity \(\sum_{k=0}^\infty kX_j(k)=X_{j+1}\) (the offspring of generation \(j\) make up generation \(j+1\)), for the MLE \(\hat{\mu}_t\) of the offspring mean \(\mu\) we obtain \[ \hat{\mu}_t=\sum_{k=0}^\infty k\hat{p}_k(t)=\sum_{k=0}^\infty k\frac{U_t(k)}{U_t}=\frac{1}{U_t}\sum_{k=0}^\infty k\sum_{j=0}^{t-1} X_j(k) \] \[ =\frac{1}{U_t}\sum_{j=0}^{t-1} X_{j+1}=\frac{U_{t+1}-X_0}{U_t} \] \[ = \frac{X_1+X_2+\ldots +X_t}{X_0+X_1+\ldots +X_{t-1}}. \]
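As a numerical illustration of the derivation, the sketch below computes \(\hat{p}_k(t)=U_t(k)/U_t\) and \(\hat{\mu}_t\) from the offspring counts of all observed parents and compares \(\hat{\mu}_t\) with the ratio of generation sizes. The function name `offspring_mle` and the toy data are assumptions chosen for the example.

```python
from collections import Counter

def offspring_mle(parent_offspring):
    """MLE of p_k and mu from the offspring counts of all observed parents.

    `parent_offspring` is assumed to list the number of offspring of every
    individual in generations 0, ..., t-1, so its length is U_t.
    """
    u_tk = Counter(parent_offspring)          # U_t(k)
    u_t = len(parent_offspring)               # U_t, the number of parents
    p_hat = {k: n / u_t for k, n in sorted(u_tk.items())}
    mu_hat = sum(k * pk for k, pk in p_hat.items())
    return p_hat, mu_hat

# Toy data with X_0 = 1, X_1 = 2, X_2 = 3, X_3 = 1 (t = 3).
parents = [2, 1, 2, 0, 1, 0]
p_hat, mu_hat = offspring_mle(parents)
ratio = (2 + 3 + 1) / (1 + 2 + 3)   # (X_1 + X_2 + X_3) / (X_0 + X_1 + X_2)
print(p_hat, mu_hat, ratio)
```

The two values of the mean agree because summing \(k\,U_t(k)\) over \(k\) simply counts all daughters.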

Remarks.

  1. Note that the MLE \(\hat{\mu}_t\) equals the ratio of the number of all daughters to the number of all mothers and depends only on the generation sizes \(X_0, X_1, \ldots, X_t\). That is, to estimate the offspring mean \(\mu\) we do not need to observe the complete family tree; it suffices to know the sizes of the consecutive generations \(0, 1, \ldots, t\) (see the stepwise graph in Figure 2).

  2. An intuitive way to interpret \(\hat{\mu}_t\) is to note that we have \(U_t\) independent and identically distributed random variables (the individual offspring numbers), which sum to \(U_{t+1}-X_0\). So we just compute the average number of offspring per parent.
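Since the estimator needs only the generation sizes, it can be tried on simulated data. The sketch below is a hedged illustration in Python: it simulates generation sizes with Poisson offspring (using Knuth's multiplication method for the Poisson draws) and applies the daughters-to-mothers ratio. The helper names, the seed, and the parameter values \(\mu=0.8\), \(X_0=50\), \(t=10\) are assumptions made for the example.

```python
import math
import random

def poisson(mu, rng):
    """Draw one Poisson(mu) variate (Knuth's multiplication method)."""
    threshold = math.exp(-mu)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k

def simulate_sizes(mu, x0, t, rng):
    """Generation sizes X_0, ..., X_t of a Galton-Watson process."""
    sizes = [x0]
    for _ in range(t):
        sizes.append(sum(poisson(mu, rng) for _ in range(sizes[-1])))
    return sizes

def mu_hat(sizes):
    """Ratio of all daughters to all mothers; needs only X_0, ..., X_t."""
    mothers = sum(sizes[:-1])
    if mothers == 0:
        raise ValueError("no parents observed")
    return sum(sizes[1:]) / mothers

rng = random.Random(1)          # fixed seed, only for reproducibility
sizes = simulate_sizes(mu=0.8, x0=50, t=10, rng=rng)
print(sizes, mu_hat(sizes))
```

Starting from a larger \(X_0\) gives more parents and hence a less variable estimate, which is why the example does not start from a single ancestor.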

Figure 2 Stepwise Presentation. The generation size is shown on the vertical axis and time on the horizontal axis.

Example: The Modified Geometric Offspring Distribution

Suppose that we observe the complete family tree with a (zero-)modified geometric offspring distribution given by \[ p_k= \left\{ \begin{array}{ll} p_0, & \mbox{if} \quad k=0,\\ (1-p_0)(1-c)c^{k-1}, & \mbox{if}\quad k\ge 1. \end{array} \right. \] It is not difficult to see that the likelihood, for observation up to the \(t\)th generation, becomes \[ L(p_0,c)=p_0^{U_t(0)}[(1-p_0)(1-c)]^{U_t-U_t(0)}\cdot c^{U_{t+1}-X_0-U_t+U_t(0)}, \] where, as before, \(U_t(0)\) is the total number of individuals with zero offspring, \(U_t\) is the total number of parents, and \(U_{t+1}-X_0=X_1+\ldots+X_t\) is the total number of offspring. The exponent of \(c\) is \(\sum_{k\ge 1}(k-1)U_t(k)\). Maximizing the likelihood, for the MLEs of \(p_0\) and \(c\) we obtain \[ \hat{p}_0=\frac{U_t(0)}{U_t}\qquad \mbox{and}\qquad \hat{c}=1-\frac{U_t-U_t(0)}{U_{t+1}-X_0}, \] respectively. Therefore, \[ \hat{\mu}_t=\frac{1-\hat{p}_0}{1-\hat{c}}=\frac{U_{t+1}-X_0}{U_t}, \] which is the same formula for \(\hat{\mu}_t\) that we obtained earlier using Lagrange’s method.
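The identity \(\hat{\mu}_t=(1-\hat{p}_0)/(1-\hat{c})\) can be checked numerically. The sketch below is a hedged Python illustration: it takes a hypothetical list of per-parent offspring counts, maximizes the two factors of the likelihood in closed form (the zero/non-zero split for \(p_0\), and the geometric part for \(c\) among parents with at least one offspring), and compares the result with the daughters-to-mothers ratio. The function name and the toy data are assumptions made for illustration.

```python
def modified_geometric_mle(parent_offspring):
    """MLEs (p0_hat, c_hat, mu_hat) for the zero-modified geometric law.

    `parent_offspring` is assumed to hold the offspring count of every
    observed parent (generations 0, ..., t-1).
    """
    mothers = len(parent_offspring)                    # number of parents
    childless = sum(1 for k in parent_offspring if k == 0)
    daughters = sum(parent_offspring)                  # X_1 + ... + X_t
    fertile = mothers - childless
    if fertile == 0:
        raise ValueError("c is not identifiable: no parent has offspring")
    p0_hat = childless / mothers
    # Among fertile parents, k - 1 is geometric, so the log-likelihood in c is
    # fertile * log(1 - c) + (daughters - fertile) * log(c).
    c_hat = (daughters - fertile) / daughters
    mu_hat = (1 - p0_hat) / (1 - c_hat)
    return p0_hat, c_hat, mu_hat

# Toy offspring counts for eight parents (illustrative only).
parents = [2, 1, 2, 0, 1, 0, 3, 0]
p0_hat, c_hat, mu_hat = modified_geometric_mle(parents)
print(p0_hat, c_hat, mu_hat, sum(parents) / len(parents))
```

The parametric estimate of the mean coincides with the nonparametric one because both reduce to total offspring divided by total parents.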

Application: The Whooping Crane Population of North America

The whooping crane is a very rare migratory bird with breeding grounds in Canada’s Northwest Territories and wintering grounds in Texas. Miller et al. (1974) give annual counts from 1938 to 1972 of whooping cranes arriving in Texas in the fall. Figure 3 shows the data.

Figure 3 Whooping Cranes Data

Let us use a branching process with modified geometric offspring distribution to describe this population. Since the data are total counts, we do not observe the statistics \((U_t(0), U_t, U_{t+1})\). The contribution to the likelihood from each generation is therefore an average over the possible values of the number of individuals with zero offspring in the previous generation.

It is possible to get a rough idea of \(U_t(0)\), since the data contain separate counts of young birds, which have a different plumage, and adult birds. This gives an estimate of the number of adult birds that die between seasons. Assuming that those birds had no offspring, that the young birds had no offspring, and that all other birds had offspring, we can estimate \(U_t(0)\). The resulting estimate for \(p_0\) is 0.262, which (together with a corresponding estimate for \(c\)) yields \(\hat{\mu}=1.035\) (see Guttorp (1991), p. 47).

Example: Estimating the Fraction of Community that must be Vaccinated

The problem of determining the fraction of a community that must be vaccinated in order to prevent major epidemics of a communicable disease is a crucial public health problem. In order to describe an epidemic, the population is divided into three possible health states. An individual can be susceptible to infection by a given disease agent, he may have been infected by the agent and be infectious (possibly after an incubation period), or he is removed from the epidemic by death, by isolation, or by immunity or other natural loss of infectiousness. Initially all members of the population are susceptible to infection. The epidemic starts when one or many infectious individuals enter the population and come into contact with its members. A susceptible person is infected if he has adequate contact with an infectious individual. Following Bartoszyński (1967) we will see how a branching process can be used to model an epidemic process.

An infected individual contacts a certain number of non-infected members of the population each day. These numbers are iid random variables, which are the pool of “offspring” for that individual. Each contact with an infectious individual may yield an infection independently of the results of other contacts. All individuals act independently, and independent of the history of the process.

The main problem with the Bartoszyński model is that it assumes an infinite pool of susceptibles. A Galton-Watson process can therefore only be used to approximate the infectious population during the early stages of an epidemic. Since the number of susceptible individuals decreases as the epidemic progresses, it may be unreasonable to assume that the offspring distribution is the same from generation to generation. However, for the early stages of epidemics in large populations the assumptions underlying the Galton-Watson model are not bad.

In order for an epidemic to become major, a large buildup of cases is needed in the early stage. Consequently we can call an epidemic major if the offspring mean is \(>1\), and minor if it is \(\le 1\) (so that the extinction probability is 1). In order to prevent major epidemics it is necessary to ensure adequate vaccination in the community so as to make the offspring mean less than one. Suppose that we select a proportion \(\theta\) of the population at random for vaccination. If the vaccination is effective, the offspring distribution changes to \[ p^\ast_k= \left\{ \begin{array}{ll} \theta+(1-\theta)p_0, & \mbox{if} \quad k=0,\\ (1-\theta)p_k, & \mbox{if}\quad k\ge 1 \end{array} \right. \] with mean \(\mu^\ast=(1-\theta)\mu\), so that \(\mu^\ast<1\) if and only if \[ \theta>1-\frac{1}{\mu}. \] For numerical illustration, assume that the offspring distribution is Poisson. Then the probability \(1-p_e\) of a major epidemic (non-extinction) is a function of \(\mu\):

Table 17.1 Proportion of vaccination \(\theta\) needed

Offspring mean \(\mu\)   1      1.05   1.1    1.4    1.8
\(1-p_e\)                0      0.09   0.18   0.51   0.73
\(\theta\)               0      0.05   0.09   0.29   0.44

The third row of the table contains the proportion of vaccination needed to bring the new mean below 1. If one accepts this model, it becomes of considerable importance to be able to estimate \(\mu\) as accurately as possible.
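The entries of the table can be recomputed with a few lines of code. The sketch below is a hedged illustration in Python: for Poisson offspring the extinction probability \(p_e\) is the smallest root of \(q=e^{\mu(q-1)}\), which is approximated here by iterating the generating function from \(q=0\), and the vaccination threshold is \(1-1/\mu\). The function names, tolerance, and iteration cap are assumptions made for the example.

```python
import math

def extinction_probability(mu, tol=1e-12, max_iter=1_000_000):
    """Smallest root of q = exp(mu * (q - 1)) for Poisson(mu) offspring."""
    if mu <= 1.0:
        return 1.0                      # (sub)critical case: certain extinction
    q = 0.0
    for _ in range(max_iter):
        q_next = math.exp(mu * (q - 1.0))   # Poisson generating function
        if abs(q_next - q) < tol:
            return q_next
        q = q_next
    return q

def critical_vaccination_fraction(mu):
    """Threshold 1 - 1/mu; vaccinating more than this gives mu* < 1."""
    return max(0.0, 1.0 - 1.0 / mu)

for mu in (1.0, 1.05, 1.1, 1.4, 1.8):
    p_major = 1.0 - extinction_probability(mu)
    print(f"{mu:<5} {p_major:.2f} {critical_vaccination_fraction(mu):.2f}")
```

The iteration converges slowly when \(\mu\) is close to 1, which is why the critical and subcritical cases are handled separately instead of being iterated.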

Exercise 17.1 (Smallpox Epidemic in Abakaliki, Nigeria)

A total of 30 cases were observed in a population of 120 individuals at risk (Becker, 1976). The removal times, in days from the first removal, were

0

13

20 22 25 25 25 26 30

35 38 40 40 42 42

47 50 51

55 55 56 57 58 60 60 61

66 66 71 76

In the following table we have divided the removal times into clusters around multiples of 12 days, the average length of the infectious period, yielding the (approximate) generation sizes.

Generation   0   1   2   3   4   5   6   7
Size         1   1   7   6   3   8   4   0

Using the Nigerian smallpox epidemic data above, estimate the offspring mean using a Poisson offspring distribution. Use this to assess the proportion that needs vaccination in order to prevent another outbreak. Discuss the assumptions you are making.