9 (IP, Distributions and Densities, pg 198)

A worker for the Department of Fish and Game is assigned the job of estimating the number of trout in a certain lake of modest size. She proceeds as follows: She catches 100 trout, tags each of them, and puts them back in the lake. One month later, she catches 100 more trout, and notes that 10 of them have tags.

Ecologists estimate population size with a family of methods called “mark and recapture,” which offers several different estimators.

The Lincoln-Petersen method needs only two visits, provided they are close enough in time that the population can be treated as closed and stable between them (no immigration, emigration, or die-offs) and that all individuals are equally likely to be captured. The formula is:

\(\hat{N} = \frac{Kn}{k}\)

where \(\hat{N}\) is the total population estimate, \(n\) is the number of animals captured and marked on the first visit, \(K\) is the number of animals captured on the second visit, and \(k\) is the number of marked animals recaptured on the second visit.

The Lincoln-Petersen method is biased at small sample sizes, and the Chapman estimator can be used as an alternative:

\(\hat{N} = \frac{(K+1)(n+1)}{(k+1)}-1\)
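The small-sample bias can be illustrated with a quick simulation. This is only a sketch under stated assumptions: the true population `N_true`, the number of simulated surveys, and the seed are all arbitrary choices made for illustration, and draws with zero recaptures (where Lincoln-Petersen is undefined) are simply dropped from its average.

```r
set.seed(1)  # arbitrary seed, for reproducibility only

N_true <- 1000  # assumed true population (hypothetical, for the simulation)
n <- 100        # animals tagged on the first visit
K <- 100        # animals caught on the second visit

# Simulated recapture counts: tagged animals among K drawn without replacement
k_sim <- rhyper(20000, n, N_true - n, K)

lp      <- K * n / k_sim                        # Lincoln-Petersen (Inf when k = 0)
chapman <- (K + 1) * (n + 1) / (k_sim + 1) - 1  # Chapman

mean(lp[is.finite(lp)])  # average Lincoln-Petersen estimate, dropping k = 0 draws
mean(chapman)            # average Chapman estimate
```

Under these assumptions, the averaged Chapman estimate should sit close to `N_true`, while the averaged Lincoln-Petersen estimate should land above it.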

(a) Without doing any fancy calculations, give a rough estimate of the number of trout in the lake.
# Using the simpler Lincoln-Petersen method:

n = 100  # 100 trout caught and tagged first month 
K = 100  # 100 trout caught the second month
k = 10   # 10 trout recaptured second month

N = K * n / k
N
## [1] 1000
# Using the alternative Chapman estimator:

N_alt = (K + 1) * (n + 1) / (k + 1) - 1
round(N_alt, digits = 0)
## [1] 926

A rough calculation using the Lincoln-Petersen method estimates that there are 1000 trout in the lake, while the Chapman estimator puts the number at 926.


(b) Let N be the number of trout in the lake. Find an expression, in terms of N, for the probability that the worker would catch 10 tagged trout out of the 100 trout that she caught the second time.

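Given \(N\), \(n\), and \(K\), the number of tagged trout in the second catch follows a hypergeometric distribution, so the probability of observing \(k\) recaptures, viewed as a function of \(N\), is: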

\(P(N) = \frac{\binom{K}{k}\binom{N-K}{n-k}}{\binom{N}{n}}\)

Here \(N\) is the hypothesized total population, \(n\) is the number of animals captured and marked on the first visit, \(K\) is the number of animals captured on the second visit, and \(k\) is the number of marked animals recaptured on the second visit.

N = 1000  # based on Lincoln-Petersen estimate
n = 100  # 100 trout caught and tagged first month 
K = 100  # 100 trout caught the second month
k = 10   # 10 trout recaptured second month

# Binomial coefficients can be calculated using the base R choose function.  Coded as custom probability of recapture function for reuse.
prob_recapt <- function (N, n, K, k) {
  choose(K, k) * choose((N - K), (n - k)) / choose(N, n)
}
prob_recapt(N, n, K, k)
## [1] 0.1389853

Assuming \(N = 1000\), the probability that exactly 10 of the 100 trout caught on the second visit are tagged is about 13.9%.
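As a cross-check, the same probability can be obtained from base R's hypergeometric density, `dhyper`. The sketch below restates the formula from part (b) inline so that it runs on its own; `p_formula` and `p_dhyper` are names introduced here for illustration.

```r
# Parameters from the problem; N = 1000 is the assumed population size
N <- 1000; n <- 100; K <- 100; k <- 10

# Formula from part (b), written out with binomial coefficients
p_formula <- choose(K, k) * choose(N - K, n - k) / choose(N, n)

# dhyper(x, m, n, k): x tagged fish in the catch, m tagged fish in the lake,
# n untagged fish in the lake, k fish caught
p_dhyper <- dhyper(k, n, N - n, K)

# Compare the two results up to floating-point tolerance
all.equal(p_formula, p_dhyper)
```

The roles of \(n\) and \(K\) are swapped between the two expressions, which is harmless because the hypergeometric probability is symmetric in the number tagged and the number caught.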


(c) Find the value of N which maximizes the expression in part (b). This value is called the maximum likelihood estimate for the unknown quantity N. Hint: Consider the ratio of the expressions for successive values of N.

The maximum likelihood estimate of \(N\) should coincide with the Lincoln-Petersen estimate \(\hat{N} = \frac{Kn}{k} = 1000\). We’ll test this by calculating \(P(N)\) for several smaller integer values of \(N\).

# Sequence of hypothetical N values less than estimated
N_hyp <- seq(from = 1000, to = 990, by = -1)

# Probability of recapture based on sequence of hypothetical N values
P_N <- prob_recapt(N_hyp, n, K, k)

data.frame(N_hyp, P_N)
##    N_hyp       P_N
## 1   1000 0.1389853
## 2    999 0.1389853
## 3    998 0.1389835
## 4    997 0.1389801
## 5    996 0.1389749
## 6    995 0.1389680
## 7    994 0.1389593
## 8    993 0.1389489
## 9    992 0.1389367
## 10   991 0.1389227
## 11   990 0.1389070

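\(N = 1000\) yields the highest probability in this range, but it ties with \(N = 999\) (both give 0.1389853), so the maximum likelihood estimate is not unique here. Following the hint, the ratio \(\frac{P(N)}{P(N-1)} = \frac{(N-K)(N-n)}{N(N-K-n+k)}\) is at least 1 exactly when \(N \le \frac{Kn}{k}\). With \(\frac{Kn}{k} = 1000\), \(P(N)\) therefore rises up to \(N = 999\), is unchanged at \(N = 1000\), and falls for every larger \(N\), which the table does not test directly.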
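A numerical check on both sides of 1000 can complement the table. The following is a minimal sketch that uses base R's `dhyper` in place of the custom function, with a grid (`N_grid`) and a tie tolerance of `1e-12` chosen arbitrarily for illustration.

```r
# Grid of candidate population sizes on both sides of 1000
N_grid <- 990:1010

# Likelihood of each candidate N: 10 tagged among 100 caught,
# with 100 tagged and N - 100 untagged trout in the lake
lik <- dhyper(10, 100, N_grid - 100, 100)

# Candidates whose likelihood is within rounding error of the maximum
N_grid[abs(lik - max(lik)) < 1e-12]

# Ratio of successive likelihoods, P(N) / P(N - 1), for N = 991, ..., 1010
ratio <- lik[-1] / lik[-length(lik)]
data.frame(N = N_grid[-1], ratio = round(ratio, 7))
```

Ratios above 1 mean the likelihood is still climbing, and ratios below 1 mean it has started to fall.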


References