This notebook complements the code which follows from the derivation in ‘intuition.pdf’. We want to check out the graphs of \(f(N|A,B)\) for \(f(N)\) a discrete uniform distribution with lower limit \(a = 10\) and upper limit \(b=750\). We follow the diagram in the PDF, setting \(A = 50\), \(B = 150\) (computational issues restrict the upper limits of these values). To make this easier to follow, we include the referenced PDF diagram here also:

Moving on to the plots of \(f(N|A,B)\), the first plot shows the case where both distributions of \(p,q\) are uniform, so that we set the Beta parameters \(\alpha_p=\beta_p=\alpha_q=\beta_q=1\). The mode in this case looks to be just below \(N=300\).

uniform_plot()

Is this what we expected? For \(f(p),f(q)\) uniform, we sample and average 1000 values from \(f(q)\) and use this as \(\hat{p}\) (to check that \(\hat{N} \approx \frac{A}{1-\hat{p}}\)).

phat <- pqhat(1,1)
print(phat)
## [1] 0.5017385

Now we calculate \(\frac{A}{1-\hat{p}}\):

A <- 50
A/(1-phat)
## [1] 100.3489

This is as expected since \(\hat{p} \approx 0.5\) for uniform \(f(p)\) so that \(\frac{A}{1-\hat{p}} \approx 2A = 100\). However, in this case, this does not match the mode of the above distribution, which was near \(N=300\).

Next, we have a peaked distribution for \(q\), with weight near 1, keeping a uniform distribution of \(p\).

peakedq1_plot()

Now is this what we expected? As \(f(q)\) approaches a point mass about 1, \(\hat{N} \to A + \frac{B}{\hat{q}}= A + B\). Since \(A + B = 200\), this appears to coincide in this case.

The following plot is again for a peaked \(q\) distribution, but this time with point mass towards 0. \(p\) is again kept uniform.

peakedq0_plot()

Here, we expect \(\hat{N} \to A + \frac{B}{\hat{q}}\) to mean \(\hat{N} \to 750\), the upper endpoint of our range, since \(\hat{q} \to 0\) means \(\hat{N}\) blows up. Again, this seems to be in line with our expectations, since if \(q\) is near 0 but \(B=150\), this suggests most individuals at the parent of \(B\) follow the other binary path, so that the sister node and parent of \(B\) are likely to have a much higher population than \(B\), in turn resulting in a high value of \(N\).

In conclusion, so long as \(p\) is uniform and \(q\) is “peaky”, the distribution of \(f(N|A,B)\) is as intuitively predicted. However, when \(f(q)\) is also uniform, it is not the case that the mode \(\hat{N} \approx 2A\), and information from \(B\) seems to be skewing the mode upwards.

We test this situation by next setting \(q\) back to uniform, and playing with the distribution of \(p\). We try the same settings for \(\alpha_p,\beta_p\) as we did for \(\alpha_q, \beta_q\), beginning with concentrating weight near 1:

peakedp1_plot()

When \(f(p)\) is close to a point mass near 1, most of the population at \(N\) will move through the parent of \(B\). When \(q\) is kept uniform, it is not providing much information about the other endpoint at the level of node \(B\), and this plot is similar to the case where the distribution of \(q\) is peaked around 0. The most likely value of \(N\) again falls at the highest endpoint of the range, but with increased density at lower values also, since \(q\) is uniform and the population at \(B\) represents only half of the parent population on average.

Again, keeping \(q\) uniform, we shift the weight of \(f(p)\) near 0 instead:

peakedp0_plot()

We again observe similar behavior to the case where \(q\) is peaked near 1, however, in this case, this does not match my intuitive expectations. If \(f(p)\) is concentrated near 0, most of the population from \(N\) will go to \(A\), where the population of \(A = 50\), yet we still see a peak near \(A+B\) instead, so that the information from node \(B\) is still “known” and is affecting the mode.

Now we check out what happens if both \(p,q\) have peaked distributions, starting with \(p\) having weight near 1, and \(q\) having weight near 0:

peakedp1q0_plot()

Again, this performs as expected. If most of the population at \(N\) moves through branch \(p\), and \(B\) represents only a small fraction of those individuals, we expect the density to be heavily weighted at the high end of \(N\), which it is.

Lastly, we observe what happens when \(f(p)\) is as above but \(f(q)\) has most mass near 1 also:

peakedp1q1_plot()

In this case, the population at \(N\) should be close to the population at \(B=150\), which is again not reflected in the mode of this distribution, which is heavily weighted on \(A+B\), suggesting the information at \(A\) is still “seen” and is not allowing a mode as low as 150. Food for thought.

Log density

Now we try the same as above with the log density, as this allows us to take advantage of the built-in lgamma function in R.

loguniform_plot()

We try a similar exercise for different values of \(N, A, B\). Here we take \(A = 5000\), \(B = 15000\), so that \(N \geq 20000\). We also let \(a = 15000\) and \(b = 60000\) be the parameters of discrete uniform \(f(N)\).

logdens_plot(1,1,1,1,5000,15000,15000,75000)

We can compare the outcome in the peaked cases also. For mass on 1:

logdens_plot(1,1,150,1,5000,15000,15000,75000)

And for point mass on 0:

logdens_plot(1,1,1,150,5000,15000,15000,75000)

We previously tried similar values one one-hundredth of those above, and produced a graph with similarly placed mode.

Middle Case Lastly, we revert back to smaller values of \(N\) and try a middle case - \(q \sim Beta(15,15)\).

middlecase_plot()

We use this case in addition to the edge cases to compare with the MM and Bayesian models, which we now compare to.

Multiplier Method Comparison to MM weighted estimate continues in the file intuition_notebook2.Rmd.

Bayesian Model Comparison to a Bayesian model continues in the file intuition_notebook3.Rmd.