Biomass–dynamic models such as SPiCT represent uncertainty in population dynamics through a process–error term. If the production function and state equation are correctly specified, and all relevant covariates are included, these process residuals should be independent and identically distributed around zero (white noise), with approximately constant variance. In practice, however, the process–error term often compensates for structural misspecification and non‑stationarity, for example time‑varying productivity, changing selectivity or catchability, or changes in observation quality, especially in stocks that are not the main target.
Following Walters et al. (2008), empirical surplus production is defined directly from biomass change and catch as
\[ SP_t = B_{t+1} - B_t + C_t, \]
i.e. the net biomass gain that would have occurred in the absence of harvesting.
The surplus production function \(P(B)\) (e.g. Schaefer, Pella–Tomlinson, Fletcher) is the expected value of \(SP_t = B_{t+1} - B_t + C_t\) as a function of \(B_t\) under equilibrium assumptions. The realised surplus production \(SP_t\) in year \(t\) is plotted against \(B_t\) and overlaid on the equilibrium production curve. The residual \(SP_t - P(B_t)\) is the process error on the biomass scale, assuming observation error in biomass is negligible or has been accounted for.
Here we propose a set of diagnostics for process error using an example data set. The aim is not to formally test hypotheses, but to provide a visual summary that highlights when a model is explaining non‑stationary or mis‑specified dynamics via the process‑error term. We focus on four aspects:
We use standard diagnostics for model fits—runs tests, residual plots, autocorrelation functions—to identify non‑stationarity, the impact of environmental drivers, and the balance between genuine stochasticity and model misspecification. If process residuals appear random with approximately constant variance, this suggests a correctly specified production function with genuine stochastic variation, conditional on the chosen model structure and fixed parameters. Positive or negative trends in residuals may indicate production‑curve misspecification or structural changes, for example an incorrect shape parameter, time‑varying productivity, changes in discarding or landings, or regime shifts. Consistently large residuals relative to the assumed process variance may reflect high biological variability (e.g. recruitment‑driven dynamics) or a mis‑specified variance structure. Significant temporal autocorrelation points to environmental forcing, omitted covariates, or serially correlated implementation error rather than white‑noise process error; in Walters’ language, many marine systems exhibit “red noise” rather than white.
The following R code loads pre‑computed process‑error output (e.g. from SPiCT, JABBA or SS3) and constructs a set of diagnostic plots for a selected stock.
library(spict)
library(ggplot2)
library(dplyr)
library(tidyr)
library(patchwork)
load("P:/rfmo/ices/WKBMSYSPiCT4/fle.27,21-23/finalFit.RData")
fit = calc.process.resid(fit)
getSpictPrd = function(fit, xMax = 1.3, nX = 200) {
## -- Parameters for Pella–Tomlinson curve --
n = get.par("logn", fit, exp = TRUE)[1, "est"]
msy = get.par("MSY", fit, exp =!TRUE)[1, "est"] # MSY already on natural scale
K = get.par("logK", fit, exp = TRUE)[1, "est"]
# Relative biomass grid B/K
x = seq(0.01, xMax, length.out = nX)
# Pella–Tomlinson parameterised by MSY and n (Walters et al. style P(B))
gamma = (1 / (n - 1)) * n^(n / (n - 1))
Pfun = gamma * msy * (x - x^n)
prodFun = data.frame(
bRel = x,
b = x * K,
sp = Pfun
)
## -- Realised surplus production: SP_t = B_{t+1} - B_t + C_t --
# Biomass on SPiCT grid
Bfull = get.par("logB", fit, exp = TRUE)[, "est"]
tFull = fit$inp$time
# Keep integer years
sel = (tFull - floor(tFull)) == 0
B = Bfull[sel]
t = tFull[sel]
# Catches aligned to [t, t+1]
C = fit$inp$obsC
C = C[seq_len(length(B) - 1)]
SP = B[-1] - B[-length(B)] + C
prodTraj = data.frame(
year = tail(t, -1), # end of interval
sp = SP,
bRel = B[-length(B)] / K, # starting biomass of interval, relative to K
b = B[-length(B)]
)
# Label process residuals returned by calc.process.resid
# (assumes columns are time, biomass residual, F residual)
names(fit$process.resid) = c("year", "bRsdl", "fRsdl")
list(
pf = prodFun,
traj = merge(prodTraj, fit$process.resid, by = "year")
)
}
pf = getSpictPrd(fit)
ggplot(pf$traj) +
geom_line(aes(b, sp), data = pf$pf) +
geom_path(aes(b, sp)) +
geom_point(aes(b, sp, col = year)) +
geom_text(aes(label = year, x = b, y = sp),
size = 3,
data = subset(pf$traj, year == min(year))) +
geom_point(aes(x = b, y = sp),
data = subset(pf$traj, year == min(year))) +
geom_text(aes(label = year, x = b, y = sp),
col = "blue",
size = 3,
data = subset(pf$traj, year == max(year))) +
geom_point(aes(x = b, y = sp),
col = "blue",
data = subset(pf$traj, year == max(year)))
library(grid)
library(plyr)
windowOverlap = function(dat, yearCol = "year", step = 10, overlap = 5) {
years = dat[[yearCol]]
yMin = min(years, na.rm = TRUE)
yMax = max(years, na.rm = TRUE)
centres = seq(
from = ceiling(yMin / step) * step,
to = floor(yMax / step) * step,
by = step
)
outList = lapply(centres, function(y0) {
from = y0 - overlap
to = y0 + 9 + overlap # decade (10 yrs) plus overlap on both sides
sel = years >= from & years <= to
w = dat[sel, ]
w$centre = y0
w$from = from
w$to = to
w
})
names(outList) = as.character(centres)
ldply(outList, .id = ".id")
}
trajWin = windowOverlap(pf$traj, overlap = 2)
trajArrows =
trajWin %>%
arrange(.id, year) %>%
group_by(.id) %>%
mutate(
bEnd = lead(b),
spEnd = lead(sp)
) %>%
ungroup() %>%
filter(!is.na(bEnd), !is.na(spEnd))
ggplot(trajWin) +
facet_wrap(~ .id) +
geom_line(data = pf$pf, aes(x = b, y = sp)) +
geom_segment(data = trajArrows,
aes(x = b, y = sp, col = year - from,
xend = bEnd, yend = spEnd),
linewidth = 0.2,
arrow = arrow(length = unit(0.15, "cm"))) +
geom_text(aes(label = year, x = b, y = sp),
size = 3,
data = subset(trajWin, year == from)) +
geom_point(aes(x = b, y = sp),
data = subset(trajWin, year == from)) +
geom_text(aes(label = year, x = b, y = sp),
col = "blue",
size = 3,
data = subset(trajWin, year == to)) +
geom_point(aes(x = b, y = sp),
col = "blue",
data = subset(trajWin, year == to))
This follows Walters et al. (2008), who treat empirical \(SP_t\) vs \(B_t\) plots as signatures of how production depends on stock size and of whether there is non‑stationarity in productivity.
These patterns correspond directly to Walters et al.’s empirical findings across 110 stocks, where non‑stationarity and production‑driven changes in stock size are common and simple, repeatable SP–B curves are rare.
If the \(B_{t+1} - B_t + C_t\) vs \(B_t\) phase plot shows early‑period points centred on the curve, but recent years consistently below the curve at similar biomass levels, with a counter‑clockwise loop, then this may be evidence that productivity has declined in recent decades and that recoveries are occurring under lower surplus production than declines at comparable biomass; this could argue for time‑varying productivity or for more conservative advice.
A single figure with four panels summarises process error: (1) residuals vs time, (2) residuals vs fitted biomass, (3) autocorrelation function (ACF), and (4) histogram with normal curve.
p_time = ggplot(pf$traj, aes(x = year, y = bRsdl)) +
geom_hline(yintercept = 0, colour = "grey60") +
geom_line(colour = "steelblue") +
geom_point(colour = "steelblue") +
labs(x = "Year", y = "Process residual",
title = "Over time") +
theme_bw()
p_fit = ggplot(pf$traj, aes(x = b, y = bRsdl)) +
geom_hline(yintercept = 0, colour = "grey60") +
geom_point(alpha = 0.6, colour = "steelblue") +
labs(x = "Fitted biomass (or state)", y = "Process residual",
title = "Residuals vs Biomass") +
theme_bw()
acf_obj = acf(pf$traj$bRsdl, plot = FALSE)
acf_df = with(acf_obj, data.frame(lag = lag, acf = acf))[-1, ]
crit = qnorm(0.975) / sqrt(nrow(pf$traj))
p_acf = ggplot(acf_df, aes(x = lag, y = acf)) +
geom_hline(yintercept = 0, colour = "grey60") +
geom_hline(yintercept = c(-crit, crit),
linetype = "dashed",
colour = "grey60") +
geom_col(fill = "steelblue") +
labs(x = "Lag", y = "ACF",
title = "Autocorrelation") +
theme_bw()
p_hist = ggplot(pf$traj, aes(x = bRsdl)) +
geom_histogram(aes(y = ..density..),
bins = 20,
fill = "steelblue",
colour = "white",
alpha = 0.7) +
stat_function(fun = dnorm,
args = list(mean = mean(pf$traj$bRsdl),
sd = sd(pf$traj$bRsdl)),
colour = "red",
linewidth = 0.8) +
labs(x = "Residual", y = "Density",
title = "Distribution") +
theme_bw()
(p_time | p_fit) /
(p_acf | p_hist)
The four panels jointly diagnose whether the process–error term is behaving like the stationary, mean‑zero noise assumed in simple biomass–dynamic model.
The top-left panel (“Over time”) shows process residuals plotted against year. For a well-specified model, residuals would be expected to fluctuate around zero with no obvious trend and roughly constant variance. However, residuals are mostly positive after about 2000, with an upward shift in the mean. The spread also appears slightly larger in recent years. Together this suggests that the process-error term is not stationary and probably reflects systematic model bias rather than purely stochastic fluctuations, a pattern Walters et al. describe as “production-driven change” where shifts in productivity appear independent of stock size and drive biomass up or down.
The time-series panel, together with the top-right panel (“Residuals vs Biomass”), identifies trends and shifts. Before 2000, residuals randomly vary about zero; afterwards they are predominantly positive. This step-change implies that the model starts to under-predict the biomass once the stock moves into a higher biomass regime. This is consistent with a production function that is mis-specified for the more recent period or with time-varying productivity or selectivity, echoing Walters et al.’s findings that SP–B relationships often exhibit biomass‑dependent bias or downward hooks during recoveries reflecting reduced mean fecundity.
The bottom-right panel (“Distribution”) shows the empirical distribution of residuals with a normal density overlay. The shape is roughly Gaussian but centred above zero. Residual magnitudes are not particularly extreme relative to the assumed process standard deviation, but the mean of ~0.2 points to bias in the production curve or to a missing process, rather than to occasional large shocks superimposed on an otherwise correctly specified model. This matches Walters et al.’s classification: non‑stationary productivity is common, simple dome‑shaped curves rare.
The bottom-left panel (“Autocorrelation”) is the ACF of the process residuals. Several low-lag bars exceed the approximate 95% bounds, and there is a pattern of positive autocorrelation at short lags followed by negative values at longer lags. This is inconsistent with a white-noise process and suggests slow, temporally structured deviations from the model, for example an environmental regime, time-varying productivity, or changes in catchability. Walters et al. emphasise that marine systems often show “red noise” rather than white, a feature captured here by the significant ACF structure.
Together, these plots suggest that the process-error term in this example is accounting for model misspecification or non-stationarity rather than representing pure stochastic noise. This is because:
Process-error appears not to be purely stochastic and so confidence intervals or fractiles from the fitted model will be biased.
Following Walters et al. (2008), possible next steps are to:
The same diagnostic framework can be applied systematically across stocks to identify cases where process error behaves as desired and those where it is clearly signalling deeper structural issues. These diagnostics can be applied to any model with an implicit or explicit production function.
Walters, C.J., Hilborn, R., and Christensen, V. 2008. Surplus production dynamics in declining and recovering fish populations. Can. J. Fish. Aquat. Sci. 65: 2536–2551. doi:10.1139/F08-170.