There are \(n_{0}\) fish in a lake. A random sample of m of these fish is taken. The fish in this sample are tagged and released unharmed back into the lake. After a suitable interval, a second random sample of size n is taken. The random variable R is the number of fish in this second sample that are found to have been tagged. Assuming that the probability that a fish is captured is independent of whether it has been tagged or not, and that \(n_{0}\) is sufficiently large for a binomial approximation to be used, obtain the expectation of R in terms of m, n and \(n_{0}\). Suppose that m=100, n=4000 and the observed value of R is 20. Obtain an approximate symmetric 98% confidence interval for the proportion of fish in the lake which are tagged. Deduce an approximate 98% confidence interval for \(n_{0}\).
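First, write down the formula connecting R to m, n and \(n_{0}\). Under the binomial approximation each fish in the second sample is tagged with probability \(m/n_{0}\), so \(R \sim B(n, m/n_{0})\) and \(E(R)=\dfrac{mn}{n_{0}}\). Equating the observed R with its expectation gives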
\(\dfrac{R}{m}=\dfrac{n}{n_{0}}\)
\(R=\dfrac{m \times n}{n_{0}}\)
\(\dfrac{R}{n}=\dfrac{m}{n_{0}}\)
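As a quick numerical check of this formula, the sketch below simulates the second sample under the binomial approximation. The lake size `n0 = 20000` is an assumed value chosen purely for illustration (in the problem it is unknown), and the variable names are not from the original working.

```r
# Hypothetical lake size, assumed only for this check
n0 <- 20000
m <- 100    # fish tagged in the first sample
n <- 4000   # size of the second sample

# Expected number of tagged fish in the second sample
expected_R <- m * n / n0

# Simulate many second samples: each fish caught is tagged with probability m/n0
set.seed(1)
sim_R <- rbinom(100000, size = n, prob = m / n0)
mean_R <- mean(sim_R)

expected_R
mean_R
```

The simulated mean should sit close to `expected_R`, which is what the formula predicts.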
The sample proportion is
\(\dfrac{R}{n} = \dfrac{20}{4000} = 0.005\)
The critical value for a 98% confidence interval is 2.326, so the confidence interval for the proportion p of tagged fish in the lake, based on the sample proportion \(p_{s}\), is
\(p_{s}-2.326 \sqrt{\dfrac{p_{s}(1-p_{s})}{n}}< p <p_{s}+2.326 \sqrt{\dfrac{p_{s}(1-p_{s})}{n}}\)
where \(p_{s}=0.005\) and \(n=4000\)
a <- 0.005*(1-0.005)
b <- a/4000
c <- sqrt(b)
d <- 2.326*c
lower <- 0.005 - d
upper <- 0.005 + d
lower
[1] 0.002405962
upper
[1] 0.007594038
So that
\(0.00241<p< 0.00759\)
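The same calculation can be wrapped in a small function. This is a sketch only: the function name `prop_ci` and its arguments are made up for illustration, and it uses `qnorm` to obtain the critical value instead of the rounded table value 2.326, so the limits differ very slightly from those quoted.

```r
# Normal-approximation confidence interval for a proportion.
# r: number of tagged fish in the sample, n: sample size,
# level: confidence level (0.98 for a 98% interval)
prop_ci <- function(r, n, level = 0.98) {
  p_s <- r / n                           # sample proportion
  z <- qnorm(1 - (1 - level) / 2)        # two-sided critical value
  half_width <- z * sqrt(p_s * (1 - p_s) / n)
  c(lower = p_s - half_width, upper = p_s + half_width)
}

ci <- prop_ci(20, 4000)
ci
```

Changing `level` gives other confidence levels without looking up a new critical value.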
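The proportion p of tagged fish in the lake multiplied by \(n_{0}\) gives m, so \(n_{0}=\dfrac{m}{p}\). Dividing m by the lower limit for p therefore gives the upper limit for \(n_{0}\), and dividing by the upper limit for p gives the lower limit.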
n1 <- 100/lower
n2 <- 100/upper
n1
[1] 41563.41
n2
[1] 13168.23
This gives an approximate 98% confidence interval for \(n_{0}\) of 13168 to 41563.
This is important because inverting the confidence limits for the proportion gives a confidence interval for \(n_{0}\) that is not centred on 20,000, the point estimate you get by using the sample proportion as the population proportion (100/0.005 = 20,000).
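To see the asymmetry numerically, the sketch below recomputes the limits for the proportion, inverts them, and compares the distance of each limit for \(n_{0}\) from the point estimate. The variable names are illustrative and not part of the original working.

```r
m <- 100
n <- 4000
p_s <- 20 / n

# Half-width of the 98% interval for the proportion (critical value 2.326)
d <- 2.326 * sqrt(p_s * (1 - p_s) / n)

n0_hat <- m / p_s             # point estimate of the lake size
n0_lower <- m / (p_s + d)     # upper limit for p gives lower limit for n0
n0_upper <- m / (p_s - d)     # lower limit for p gives upper limit for n0

# Distance of each limit from the point estimate
gaps <- c(below = n0_hat - n0_lower, above = n0_upper - n0_hat)
gaps
```

The upper limit lies much further from the point estimate than the lower one, because \(m/p\) is not a linear function of p.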
Capture and recapture experiments are a lot more complex than the versions given at GCSE. Even in a simulation with a known value of \(n_{0}\) and a specified probability of capture, a single sample can give an estimate that is very different from the true value.
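The variability can be illustrated with a small simulation. This is a sketch under stated assumptions, not a reproduction of any particular simulation: the true lake size `n0_true = 20000` is assumed, and the second sample is drawn without replacement using `rhyper` instead of giving each fish a separate capture probability.

```r
set.seed(42)
n0_true <- 20000   # assumed true number of fish (illustrative only)
m <- 100           # tagged fish in the lake
n <- 4000          # size of the second sample

# Number of tagged fish in each of 1000 simulated second samples.
# rhyper(nn, m, n, k): m = tagged fish, n = untagged fish, k = fish drawn
R <- rhyper(1000, m, n0_true - m, n)

# Estimate of n0 from each single sample.
# R = 0 is extremely unlikely here, but guard against division by zero anyway.
n0_est <- ifelse(R > 0, m * n / R, NA)

range(n0_est, na.rm = TRUE)
quantile(n0_est, c(0.05, 0.5, 0.95), na.rm = TRUE)
```

Each element of `n0_est` is what one experimenter would report from one recapture sample, so the spread of these values shows how far a single estimate can fall from the assumed true size.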