Why fisheries collapse, and why they don’t need to

Minimum background preparation: Stevens (Ch. 5, 2021)

Based on: Roughgarden and Smith (1996, Why fisheries collapse and what to do about it. PNAS USA 93:5078-5083)

Goals

In this lab, you should develop a deeper understanding of, and explain

observation error versus process error or variation.
the benefit of selecting particular probability distributions to represent different types of unknown quantities,
maximum sustained yield (MSY), and problems associated with the concept,
the effect of environmental stochasticity of the outcome of MSY harvesting strategies, and
how models can help clarify and specify conceptual hypotheses.

Open R using RStudio and specify the working directory. We also load primer, tidyverse, and mvtnorm.

Background

Harvesting as a source of mortality

Consult Stevens (2021), section 5.5, and redraw by hand Figs. 5.16 using the same x-axis and y-axis scales on both graphs. Both $y$-axes should range from 0–15, and both $x$-axes should range from 0–100. Compare your figures with those of the book and those drawn by a classmate. Identify and explain all differences that you find.

Difficulties in managing a population

When we manage a population, we use our estimates of population size and growth rate to select a harvest rate that maximizes the population growth rate. If we can maximize the population growth rate, then we can maximize our harvests each year. However, at least two important difficulties arise:

We don’t know the actual size of the population, because we estimate the size of the population imperfectly.
We can’t predict future environment or the future population growth rate.

These two problems interact synergisticly to make decisions about harvesting very difficult.

These two sources of uncertainty are what ecologists and statisticians call observation error and process error respectively.

Observation and process error

Observation error is fairly simple. It is simly the difference between the observed and actual value of something. For instance, we assume that there is some constant deterministic process, such as population growth, and when we attempt to observe the population, we do so imperfectly. Sometimes we overestimate and sometimes we underestimate the size of the population. Nonetheless, the population process is constant, and our count does not influence directly the population.

Process error is variation in the underlying deterministic process for which we have not accounted. For instance, this might be variation in birth or death rates, or unexpected changes in the carrying capacity. In an ongoing dynamic process, these variations multiply over time to create a much more variable and unpredictable dynamic.

Let’s illustrate these two processes using discrete geometric growth.

Observation error is simply population projection plus added noise.

Actual population growth process: \[N_t = \lambda N_{t-1}\] Observed population size: \[N_{t,o} \sim \mathrm{Poisson}\left(N_t\right)\] where we pretend our samples are a random draw from the population at time $t$, which follows the Poisson distribution. The Poisson distribution describes “the probability of a given number of events occurring in a fixed interval of time or space if these events occur with a known constant mean rate and independently of the time since the last event” (Wikipedia).

# Total amount of time (n generations)
Tt <- 20

## data frame to hold results
obs.err <- data.frame(
  N.true = numeric(Tt), 
  N.obs = numeric(Tt)
)

## finite rate of increase
lambda <- 1.1

## Initial states
## True N at t=0
obs.err$N.true[1] <- 10
## Observed N at t=0
obs.err$N.obs <- rpois(1, obs.err$N.true[1])

for(t in 2:Tt){
  
  ## true population growth
  ## note this is completely deterministic
  obs.err$N.true[t] <- lambda * obs.err$N.true[t-1]
  
  ## observed population size
  ## only our observation has uncertainty
  obs.err$N.obs[t] <- rpois(1, obs.err$N.true[t])
}
obs.err[1:4,]

##   N.true N.obs
## 1  10.00     9
## 2  11.00    12
## 3  12.10    15
## 4  13.31    18

Pure process error is more interesting. Here the noise in the system emerges in the biological mechanistic process. For instance, our growth rate $\lambda$ might vary each year. Here we just let some unknown random process (migration? birth or death rates?) add noise to the actual true $N$, while we observe it perfectly, without error.

Actual population growth with process error: \[N_t \sim \mathrm{Poisson}\left(\lambda N_{t-1}\right)\] where we pretend the actual population size is determined by a stochastic process which follows the Poisson distribution with mean of $\lambda N_{t-1}$.

Observed population size (without error): \[N_{t,o} = N_t\] Note that process error is not “error” as we typically construe it. It is a stochastic process which our simple model does not completely account for.

## data frame to hold results
proc.err <- data.frame(
  N.true = numeric(Tt), 
  N.obs = numeric(Tt)
)

## finite rate of increase
lambda <- 1.1

## Initial states
## True N at t=0
proc.err$N.true[1] <- 10
## Observed N at t=0
proc.err$N.obs <- obs.err$N.true[1]

set.seed(2)
for(t in 2:Tt){
  
  ## true population growth
  ## note that we introduce stochasticity into the biology here
  proc.err$N.true[t] <- rpois(1, lambda * proc.err$N.true[t-1])
  
  ## observed population size
  ## we observe the population without error
  proc.err$N.obs[t] <- proc.err$N.true[t]
}
proc.err[1:4,]

##   N.true N.obs
## 1     10    10
## 2      8     8
## 3      6     6
## 4     11    11

ggplot(data=obs.err, aes(x=0:(Tt-1), y=N.true)) + 
  geom_point() + geom_line() + 
  geom_point(aes(x=0:(Tt-1), y=N.obs)) + 
  geom_line(aes(x=0:(Tt-1), y=N.obs), linetype=2) + 
  geom_point(data=proc.err, aes(x=0:(Tt-1), y=N.true), color=2) +
  geom_line(data=proc.err, aes(x=0:(Tt-1), y=N.true), color=2)

Pure observation error is relative constant (or often covaries with population size) over time, whereas process error compounds on itself.

Of course, in real ecological systems we have process noise or error AND we sample imperfectly.

Modeling a harvested population with observation error and two types of process error

$*Our model of a salmon fishery. N1-N3 are determined by environmentally driven $r$ and $K$ and by harvest ($H$). The hrvest and other data are used to estimate N ($N_{est}$) which is used to calculate total allowable catch ($T$). Total allowable catch is supposed to set harvest quotas, but in practice is not typically achieved.*$

Our model of a salmon fishery. N1-N3 are determined by environmentally driven $r$ and $K$ and by harvest ($H$). The hrvest and other data are used to estimate N ($N_{est}$) which is used to calculate total allowable catch ($T$). Total allowable catch is supposed to set harvest quotas, but in practice is not typically achieved.

We will use simple discrete logistic growth to illustrate a harvested population, \[N_{t} = N_{t-1} + r_dN_{t-1} \left( 1-\alpha_{t-1} N_{t-1} \right) - FN_{\mathrm{est},\, t-1}\] where $N$ is the true, actual population size at time $t$, something that we cannot observe directly. $N_{\mathrm{est}}$ is the observed estimate of $N$. The per capita intrinsic rate of increase, $r_d$ and negative density dependence $\alpha$ both vary annually with the environment. $F$ is fishing mortality rate, which is the fraction of the population we choose to harvest.

Maximum sustainable yield occurs at $K/2$ because that is when the population growth rate is greatest. That will occur when fishing mortality is half of the intrinsic rate of increase, $F=r_d/2$. However, the economic optimum is different because it factors in investment in financial stocks occurs at $F=(r_d + \rho)/2$ where $\rho$ is the return on investment, or interest rate. Roughgarden and Smith (1996) argued that once you factor in environmental risk, the long term optimum is much closer to $F=r_d/4$, which would keep the population closer to $(3/4)K$.

In this model, we include

annual variation in $r_d$ and negative density dependence (i.e., carrying capacity) due unknown biotic or abiotic factors,
sampling uncertainty leading to inaccurate population estimates,
error determining the total allowable catch or optimal harvest (due to inaccurate population estimates), and
a stochastic difference between the total allowable catch and actual harvest due to practical difficulties in fishing and managing fisheries.

Once we understand our model, we will then try out a variety of harvesting strategies in order to maximize long-term harvest and also never drive the population extinct.

Recall that if we want to harvest the entire population, and invest the money into something else, we could do that. Alternatively, we could hope to maintain the fishing industry, as well as the local economies that depend on fishing, and the cultural way of life. To do that, we need the population to persist.

## The population:
## How many years?
Y=50
## discrete per capita growth rate
rd <- 0.1
## average carrying capacity
K <- 1e4
## average negative density dependence
alpha <- 1/K

## set harvest rate
## Harvest rate units are inds harvested per 
## individuals in the population.
## It is a dimensionless fraction. 
## to start, try 
harvest <- 0.05

Envrionmental variability

Here we model the environmental variability in $r_d$ and $\alpha$, and the estimation uncertainty in sampling accurately. When we do the simulations, we will allow $r_d$ and $\alpha$ to differ each. Negative density dependence, $alpha$ is log-normally distributed with a mean of 0.0001, while $r_d$ is normally distributed with a mean of 0.1. In good years $r_d>0.1$ and $alpha < 0.0001$. In bad years $r_d < 0.1$ and $alpha > 0.0001$.

In the code below, env.var is a multiplier of the magnitude of the environmental variation. When env.var=1, the standard deviation $r_d$ and $alpha$ are equal to $r_d$ and $alpha$. That is, if we let env.var = v, then \[\sigma_{rd} = vr_d \quad;\quad \sigma_{\alpha} = v \alpha\] so when env.var=3 that would mean that the $\sigma_{rd} = 3r_d$. The purpose of the histograms below is for you to see and appreciate the distribution of the yearly, random $r_d$ and $alpha$.

## Select a random seed.
 my.seed <- 3

## environmental variation that causes variation in alpha and r
## let this vary between
env.var = 1

## standard deviation of negative density dependence (alpha) 
sigma.a = alpha*env.var

## method of moments to get parameters of the log-normal distribution
## X ~ exp(mu + sigma Z); Z ~ normal(0, 1)
mu <- log( alpha^2/sqrt(sigma.a^2 + alpha^2) )
Var <- log(sigma.a^2/alpha^2 + 1) 


## Standard deviation of r 
sigma.r = rd * env.var

## r and K are positively correlated, so
## alpha and r are negatively correlated
## Variance - covariance matrix
varcov <-  matrix(c(sigma.r^2, -sigma.r*sqrt(Var),
                    -sigma.r*sqrt(Var),Var), nrow=2)

## random negatively correlated r and alpha
set.seed(my.seed)
rn <- rmvnorm(Y, mean=c(rd, mu), sigma=varcov ) 

## rd fussing
rd.t <- rn[,1] - mean(rn[,1]) + rd
## alpha fussing
a1 <- exp( rn[,2] ) 
a2 <- a1 - mean(a1) + alpha
alpha.t <- ifelse(a2 < 0, 0, a2)

plot(rd.t, alpha.t)
hist(alpha.t); abline(v=c(alpha, mean(alpha.t)), lty=1:2)
hist(rd.t); abline(v=c(rd, mean(rd.t)), lty=1:2)

Here we select an amount of estimation uncertainty. We assume that there is error in determining the total allowable catch or optimal harvest (due to inaccurate population estimates), and between the total allowable catch and actual harvest. We assume that this error can be described by the negative binomial distribution. This distribution is like the Poisson distribution but can be much more variable. One way to think of it is that it is a Poisson distribution, but one in which the expected mean varies from observation to observation.

\[y_i \sim \mathrm{Poisson}\left(\mu_i\right)\] \[\mu_i \sim \mathrm{gamma}\left(\alpha, \beta\right)\] where the gamma distribution can be any number greater than zero. Don’t sweat these details. Just remember that we can

select different probability distributions to represent different types of unknowns, and
the Poisson distribution is good for independent counts, and the negative binomial is like the Poisson but can be much more variable.

The negative binomial distribution can be parameterized by it’s mean $\mu$ and $k$ a measure of dispersion. When $k$ is large, the variance is smaller; when $k$ is less than 10, the variance and skew increases dramatically.

## Estimation variability (large is less variable)
k = 10000
## This will be like the Poisson distribution

Next, we do our simulation, starting with preparation.

## Create a dataframe to hold all the results
salmon <- data.frame(
  rd.t = rd.t,
  alpha.t = alpha.t,
  N.true = numeric(Y), 
  N.est = numeric(Y),
  TAC = numeric(Y),
  H.total = numeric(Y),
  Y = 0:(Y-1)
)

## Set initial states
## True N at t=0
salmon$N.true[1] <- K/2
## Estimated N at t=0
salmon$N.est[1] <- rnbinom(1, size=k, mu = salmon$N.true[1])
salmon$TAC[1] <- harvest * salmon$N.est[1]
salmon$H.total[1] <- rnbinom(1, size=k, mu=salmon$TAC[1])
head(salmon)

##          rd.t      alpha.t N.true N.est    TAC H.total Y
## 1  0.10948207 7.658721e-05   5000  5001 250.05     265 0
## 2  0.20938697 4.209028e-05      0     0   0.00       0 1
## 3  0.09125372 8.659826e-05      0     0   0.00       0 2
## 4 -0.01793572 1.919649e-04      0     0   0.00       0 3
## 5 -0.04845813 2.430198e-04      0     0   0.00       0 4
## 6  0.19534248 4.538846e-05      0     0   0.00       0 5

Now we project the population and the harvests.

set.seed(my.seed)
for(t in 2:Y){
 ## Defining variables
  ## previous year
  N.true = salmon$N.true[t-1]; 
  N.est = salmon$N.est[t-1];
  TAC = salmon$TAC[t-1]
  ## upcoming/current year
  alpha.t <- salmon$alpha.t[t]
  rd.t <- salmon$rd.t[t]
  
  ## Projected true N
  ## True N, following harvest
  ## Contains process error 
  N.growth <- N.true + rd * N.true*(1 - alpha.t*N.true) 
  ## Last year's observation error causes process error
  ## Harvest, based on uncertain estimate of N in previous year
  H.total2 <- rnbinom(1, size=k, mu=TAC)
  ## true N in the next year, including growth and actual harvest
  N.true2 <- N.growth - H.total2
  
  ## check model boundary conditions
  if(N.growth <= 0) N.true2 <- 0
  
  ## Observation process
  ## Estimate of N.true2
  N.est2 <- rnbinom(1, size=k, mu=N.true2)
  ## calculate Total allowable catch for next year based on estimated N
  TAC2 <- harvest * N.est2
  
  ## Collect results into data frame
  salmon[t,3:6] <- c(N.true2, N.est2, TAC2, H.total2)
}
tail(salmon)

##           rd.t      alpha.t   N.true N.est    TAC H.total  Y
## 45  0.18516058 4.803257e-05 4895.590  4804 240.20     231 44
## 46  0.18002052 4.945496e-05 4999.621  5034 251.70     267 45
## 47 -0.02049478 1.957649e-04 4770.245  4742 237.10     240 46
## 48  0.09425665 8.484277e-05 4802.208  4695 234.75     252 47
## 49  0.03269480 1.312689e-04 4772.706  4758 237.90     207 48
## 50  0.12379811 6.972349e-05 4851.156  4926 246.30     240 49

Plotting results

Next we plot some of the results.

## Plot pop size
## rearrange data into long format
N.data <- salmon %>% select(N.true, N.est, Y) %>% pivot_longer(cols=-Y, names_to="type", values_to="N")
## plot N
ggplot(data=N.data, aes(Y, N, colour=type)) +
  geom_point() + geom_line()

c(Y=Y,k=k,env.var=env.var, rd=rd, alpha=alpha, my.seed=my.seed)

##       Y       k env.var      rd   alpha my.seed 
##   5e+01   1e+04   1e+00   1e-01   1e-04   3e+00

Plot catch

## plot catch
TACt <- salmon$TAC[-Y]
Htplus1 <- salmon$H.total[-1]

catch.data <- data.frame(Ytplus1 = c(1:(Y-1),1:(Y-1)),
                         type = c(rep("TACt",(Y-1)), rep("Htplus1",(Y-1))),
                         catch = c(TACt, Htplus1))
catch.data.w <- data.frame(Ytplus1 = 1:(Y-1), TACt=TACt, Htplus1=Htplus1)
ggplot(catch.data.w, aes(TACt, Htplus1)) + geom_point() + labs(title="Harvest in current year vs. TAC from previous year")

catch.data.l <- catch.data.w %>% pivot_longer(cols=-Ytplus1, names_to="type", values_to="catch")

ggplot(data=catch.data.l, aes(Ytplus1, catch, colour=type)) +
  geom_point() + geom_line() + labs(title="Harvest in current year (TAC from previous year)")

c(Y=Y,k=k,env.var=env.var, rd=rd, alpha=alpha, my.seed=my.seed)

##       Y       k env.var      rd   alpha my.seed 
##   5e+01   1e+04   1e+00   1e-01   1e-04   3e+00

summary(salmon)

##       rd.t             alpha.t              N.true         N.est     
##  Min.   :-0.05568   Min.   :2.600e-05   Min.   :4574   Min.   :4506  
##  1st Qu.: 0.02220   1st Qu.:5.007e-05   1st Qu.:4771   1st Qu.:4748  
##  Median : 0.09886   Median :8.226e-05   Median :4913   Median :4924  
##  Mean   : 0.10000   Mean   :1.000e-04   Mean   :4930   Mean   :4929  
##  3rd Qu.: 0.17786   3rd Qu.:1.419e-04   3rd Qu.:5060   3rd Qu.:5085  
##  Max.   : 0.32099   Max.   :2.571e-04   Max.   :5468   Max.   :5454  
##       TAC           H.total            Y        
##  Min.   :225.3   Min.   :207.0   Min.   : 0.00  
##  1st Qu.:237.4   1st Qu.:233.0   1st Qu.:12.25  
##  Median :246.2   Median :246.0   Median :24.50  
##  Mean   :246.5   Mean   :247.9   Mean   :24.50  
##  3rd Qu.:254.2   3rd Qu.:260.5   3rd Qu.:36.75  
##  Max.   :272.7   Max.   :292.0   Max.   :49.00

c(Y=Y,k=k,env.var=env.var, rd=rd, alpha=alpha, my.seed=my.seed)

##       Y       k env.var      rd   alpha my.seed 
##   5e+01   1e+04   1e+00   1e-01   1e-04   3e+00

Questions

What happens when you

Increase sampling uncertainty?
Increase environmental variability?

Advanced: Typically environments show environmental autocorrelation or reddened environmental noise. See 1/f noise in chapter 4 of Stevens (2021). What happens to fisheries dynamics under these conditions?