Recall the definition of a Galton-Watson (GW) branching process from Lecture 6: \[ X_n = Y_{1,n-1} + Y_{2,n-1} + \cdots + Y_{X_{n-1},n-1} = \sum_{k=1}^{X_{n-1}} Y_{k,n-1}, \]
where \(X_n\) is the size of the population at time \(n\) and \(Y_{i,n-1}\) gives the number of offspring produced by the \(i^{th}\) member of the \((n-1)\)st generation (naturally, the total sum of these offspring constitutes the population at time (generation) \(n\)). There are \(X_{n-1}\) of these \(Y\) terms because, well, there are \(X_{n-1}\) individuals in the population at time \(n-1\)! Here we are concerned with the quantity of information contained in a sample \(\{ X_0=1, X_1, \ldots, X_n\}\) from a GW process.
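To make the definition concrete, here is a minimal Python simulation sketch (not part of the lecture; the Poisson offspring law and the name `simulate_gw` are illustrative assumptions). It records both the generation sizes \(X_n\) and the individual offspring counts \(Y_{k,n}\):

```python
import numpy as np

def simulate_gw(mu=1.5, n_generations=10, rng=None):
    """Simulate a GW process: returns generation sizes X and offspring counts Y.

    A Poisson(mu) offspring law is an illustrative choice, not one made in the lecture.
    """
    rng = np.random.default_rng(rng)
    X = [1]      # X_0 = 1 ancestor
    Y = []       # Y[n] holds the offspring counts Y_{1,n}, ..., Y_{X_n,n}
    for n in range(n_generations):
        counts = rng.poisson(mu, size=X[-1])   # one draw per individual
        Y.append(counts)
        X.append(int(counts.sum()))            # X_{n+1} = sum_k Y_{k,n}
        if X[-1] == 0:                         # extinction is absorbing
            break
    return X, Y
```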
In this section we summarize and discuss some results on the estimation of the mean of the offspring distribution. The estimators depend on the observation scheme. The following observation schemes will be considered:
(OS1) The entire tree up to the moment \(t\): \(Y_{1,0}, Y_{1,1}, \ldots , Y_{X_1,1}, Y_{1,2}, \ldots, Y_{X_2,2}, \ldots, Y_{1,t-1}, \ldots, Y_{X_{t-1},t-1}\);
(OS2) The successive generations: \(X_0, X_1, \ldots, X_t\);
(OS3) Two successive generations: \(X_t, X_{t+1}\);
(OS4) The initial and another generation: \(X_0, X_t\);
(OS5) Left censored observations: \(X_t, X_{t+1}, \ldots, X_{t+T}\) for \(T>0\).
Assume now the observation scheme (OS1). This is the full information about the process’ tree. As we have already seen in Lecture 17, the MLEs for the offspring probabilities \(p_k\), \(k=0,1,\ldots\), are given by \[ \hat{p}_k(t)=\frac{U_t(k)}{U_t}, \] where \(U_t\) is the total number of individuals up to the \((t-1)\)st generation and \(U_t(k)\) denotes the total number of individuals from the \(0, 1, \ldots, (t-1)\)st generations having exactly \(k\) offspring. Making use of the above estimators, for the MLE \(\hat{\mu}_t\) of the offspring mean \(\mu\) we obtain \[ \hat{\mu}_t=\sum_{k=0}^\infty k\hat{p}_k(t) = \frac{X_1+X_2+\ldots +X_t}{X_0+X_1+\ldots +X_{t-1}}. \]
It is a pleasant surprise that \(\hat{\mu}_t\) depends only on the successive generation sizes (OS2).
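In code, these MLEs can be computed from a simulated tree as follows (a sketch using the output of the hypothetical `simulate_gw` above): \(U_t\) is the sum of the generation sizes \(X_0,\ldots,X_{t-1}\), and \(U_t(k)\) is obtained by tallying the individual offspring counts.

```python
from collections import Counter

def mle_offspring(X, Y):
    """MLEs under (OS1): offspring probabilities p_k and offspring mean mu."""
    U_t = sum(X[:-1])                                   # U_t = X_0 + ... + X_{t-1}
    tally = Counter(int(k) for gen in Y for k in gen)   # U_t(k) for each observed k
    p_hat = {k: c / U_t for k, c in tally.items()}      # \hat p_k(t) = U_t(k) / U_t
    mu_hat = sum(X[1:]) / U_t                           # (X_1 + ... + X_t) / U_t
    return p_hat, mu_hat
```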
As we established in Lecture 17, \(\hat{\mu}_t\) is the maximum likelihood estimator for \(\mu\). It can also be proved that, on the explosion set, it is strongly consistent and asymptotically normal. More precisely, the following theorem holds.
Theorem 18.1 Let \(\mu>1\) and the offspring variance \(\sigma^2<\infty\). Then on the explosion set \(\{X_t\to \infty\}\):
\(\hat{\mu}_t\to \mu \qquad \mbox{a.s.}\);
\(\sqrt{\frac{U_t}{\sigma^2}} (\hat{\mu}_t -\mu) \to N(0,1)\),
where \(N(0,1)\) is a standard normal random variable and the convergence is in distribution.
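Theorem 18.1 also yields an approximate confidence interval for \(\mu\). A sketch, under the assumption (made here, not in the lecture) that \(\sigma^2\) is replaced by the empirical variance of the observed offspring counts:

```python
import numpy as np

def mu_confidence_interval(X, Y, z=1.96):
    """Approximate 95% CI for mu via Theorem 18.1 with a plug-in sigma^2."""
    U_t = sum(X[:-1])
    mu_hat = sum(X[1:]) / U_t
    offspring = np.concatenate([np.asarray(g) for g in Y])
    sigma2_hat = offspring.var()           # empirical plug-in for sigma^2
    half = z * np.sqrt(sigma2_hat / U_t)   # invert sqrt(U_t/sigma^2)(mu_hat - mu) ~ N(0,1)
    return mu_hat - half, mu_hat + half
```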
A simple method of moments estimator \(\mu^\ast_t\) can be obtained by equating the current generation size to its expected value, \[ X_t=E(X_t)=\mu^t, \] so that \[ \mu^\ast_t=X_t^{1/t}. \] It is based on one generation size only (OS4). It follows from Jensen's inequality that \[ E(\mu^\ast_t)=E(X_t^{1/t})<\left(E(X_t)\right)^{1/t}=\left(\mu^t\right)^{1/t}=\mu. \] Therefore, \(\mu^\ast_t\) will underestimate \(\mu\) on average. Yet, since \(X_t/\mu^t\to M_\infty\) a.s. (see Lecture 9) and \(M_\infty>0\) on the explosion set, there we have \[ \frac{\mu^\ast_t}{\mu}=\left(\frac{X_t}{\mu^t}\right)^{1/t}\sim M_\infty^{1/t}\to 1\qquad \mbox{a.s. as } t\to \infty. \]
The estimator \(\mu^\ast_t\) is consistent but not asymptotically normally distributed.
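In code, this estimator is a one-liner; a small sketch (the names are illustrative):

```python
def mom_estimator(X_t, t):
    """Method-of-moments estimator under (OS4): mu*_t = X_t^(1/t)."""
    return X_t ** (1.0 / t)
```

For example, applied to the final generation of a simulated process, `mom_estimator(X[t], t)` tends to fall below the true \(\mu\), in line with the Jensen bound above.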
Assume now the observation scheme (OS3). We may also apply the method of moments to the conditional distribution of \(X_t\), given the previous generation sizes. Since (using the Markov property) \[ E(X_t\ |\ X_0, X_1, \ldots, X_{t-1})=E(X_t\ |\ X_{t-1})=X_{t-1}\mu, \] we get the following Lotka-Nagaev estimator \(\bar{\mu}_t\) for \(\mu\): \[ \bar{\mu}_t= \left\{ \begin{array}{ll} \frac{X_t}{X_{t-1}}, & \mbox{if} \quad X_{t-1}>0,\\ 1, & \mbox{otherwise}. \end{array} \right. \] This estimator has the following properties.
Theorem 18.2 Let \(\mu>1\) and the offspring variance \(\sigma^2<\infty\). Then on the explosion set \(\{X_t\to \infty\}\):
\(\bar{\mu}_t \to \mu \qquad \mbox{a.s.}\);
\(\sqrt{\frac{X_{t-1}}{\sigma^2}} (\bar{\mu}_t -\mu) \to N(0,1)\),
where \(N(0,1)\) is a standard normal random variable and the convergence is in distribution.
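A sketch of the Lotka-Nagaev estimator in code, following the convention in the display above of returning 1 when \(X_{t-1}=0\):

```python
def lotka_nagaev(X_prev, X_curr):
    """Lotka-Nagaev estimator under (OS3): X_t / X_{t-1}, or 1 if X_{t-1} = 0."""
    return X_curr / X_prev if X_prev > 0 else 1.0
```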
A fourth, censored, estimator \(\tilde{\mu}_{t,T}\) for \(\mu\), based on the observation scheme (OS5), is defined as follows: \[ \tilde{\mu}_{t,T}=\frac{X_{t+1}+X_{t+2}+\ldots +X_{t+T}}{X_t+X_{t+1}+\ldots +X_{t+T-1}}. \] It can be considered as a version of \(\hat{\mu}_t\) (for \(t=0\)) or of the Lotka-Nagaev estimator (for \(T=1\)). On the explosion set, the estimator \(\tilde{\mu}_{t,T}\) is consistent and asymptotically unbiased as \(t\to \infty\) for fixed \(T\) and as \(T\to \infty\) for fixed \(t\).
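As a sketch, for an observed window \((X_t, X_{t+1}, \ldots, X_{t+T})\) (the name `window` is introduced here for illustration):

```python
def censored_estimator(window):
    """Censored estimator under (OS5): ratio of shifted partial sums."""
    assert len(window) >= 2, "need at least two successive generation sizes"
    return sum(window[1:]) / sum(window[:-1])
```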
As previously noted, if \(\mu\le 1\) then \(X_n\to 0\) almost surely; so, as \(n\to \infty\), we are left with only a finite number of observations. It should be clear, then, that no consistent estimator for any unknown parameter can exist. In fact, even when \(\mu>1\), if \(p_0>0\) there is a positive probability of extinction, and once again consistent estimators cannot exist on the whole space.
However, consistency results are possible in any of the following situations:
if the sample is conditioned upon non-extinction or upon \(X_n>0\);
if the sample is observed conditional on the total progeny tending to infinity;
if the initial number of ancestors tends to infinity;
if immigration is added, so that 0 is not an absorbing state.
These four situations will not be considered here.