Introduction

Biomass-dynamic models such as SPiCT represent uncertainty in population dynamics through a process-error term. If the production function and state equation are correctly specified, and all relevant covariates are included, these process residuals should be independent and identically distributed around zero (i.e. white noise), with approximately constant variance. In practice, however, the process-error term often compensates for structural misspecification and non-stationarity, for example time-varying productivity, changing selectivity or catchability, e.g. in stocks which are not the main target.

Here we propose a set of diagnostics for process error using an example data set. The aim is not to formally test ypotheses, but to provide a visual summary to highlights when a model is explain model misspecification via process-error focussing on four aspects namely:

Randomness and variance over time
Trends and shifts including potential bias at high or low biomass.
Magnitude and distribution of residuals.
Autocorrelation temporal structure beyond white noise.

Framework

Use is made of existing diagnostics used for model fits, such as runs tests and residual diagnostics to help identify non-stationarity, the impact of environmental drivers, and the balance between stochasticity and model misspecification. If process residuals appear random with approximately constant variance, this suggests a correctly specified production function with genuine stochastic variation, conditional on the chosen model structure and fixed parameters. Positive or negative trends in residuals may indicate production-curve misspecification or structural changes, for example an incorrect shape parameter, time-varying productivity, changes in discarding or landings, or regime shifts. Consistently large residuals relative to the assumed process variance may reflect high biological variability, such as recruitment-driven dynamics, but may also signal structural model inadequacy or a mis-specified variance structure. Significant temporal autocorrelation may point to environmental forcing, omitted covariates, or serially correlated implementation error rather than white-noise process error. But we live in a world full of red noise.

Data and R Implementation

The following R code loads precomputed process-error, e.g. output from SPiCT, JABBA or SS3, and runs the set of diagnostic plots for a selected stock.

library(ggplot2)
library(dplyr)
library(tidyr)
library(patchwork)

load("C:/active/flrpapers/rebuild/papers/brebuild/data/age/pe.RData")

pe = subset(
  pe,
  .id == "ank.27.78abd" &
  Scenario == 1 &
  SRR == "Beverton & Holt" &
  !is.na(pe))

The data frame pe contains:

year - assessment year
pe - process residual on the biomass state
biomass - biomass estimates

Four-Panel Diagnostic Plot

A single figure with four panels is used to summarise process error: i.e. 1) residuals vs time, 2) residuals vs fitted biomass, 3) autocorrelation function (ACF), and 4) a histogram with normal curve for comparison.

1. Residuals vs time

p_time = ggplot(pe, aes(x = year, y = pe)) +
  geom_hline(yintercept = 0, colour = "grey60") +
  geom_line(colour = "steelblue") +
  geom_point(colour = "steelblue") +
  labs(x = "Year", y = "Process residual",
       title = "Over time") +
  theme_bw()

2. Residuals vs fitted biomass

p_fit = ggplot(pe, aes(x = ssb, y = pe)) +
  geom_hline(yintercept = 0, colour = "grey60") +
  geom_point(alpha = 0.6, colour = "steelblue") +
  labs(x = "Fitted biomass (or state)", y = "Process residual",
       title = "Residuals vs Biomass") +
  theme_bw()

3. ACF (pre-compute then plot with ggplot)

acf_obj = acf(pe$pe, plot = FALSE)
acf_df = with(acf_obj, data.frame(lag = lag, acf = acf))[-1, ]
crit = qnorm(0.975) / sqrt(nrow(pe))
p_acf = ggplot(acf_df, aes(x = lag, y = acf)) +
  geom_hline(yintercept = 0, colour = "grey60") +
  geom_hline(yintercept = c(-crit, crit), linetype = "dashed",
             colour = "grey60") +
  geom_col(fill = "steelblue") +
  labs(x = "Lag", y = "ACF",
       title = "Autocorrelation") +
  theme_bw()

4. Histogram + normal overlay

p_hist = ggplot(pe, aes(x = pe)) +
  geom_histogram(aes(y = ..density..), bins = 20,
                 fill = "steelblue", colour = "white", alpha = 0.7) +
  stat_function(fun = dnorm,
                args = list(mean = mean(pe$pe), sd = sd(pe$pe)),
                colour = "red", linewidth = 0.8) +
  labs(x = "Residual", y = "Density",
       title = "Distribution") +
  theme_bw()

Combined

(p_time | p_fit) /
  (p_acf | p_hist)

Interpretation

Randomness and Variance

The top-left panel (“Over time”) shows process residuals plotted against year. For a well-specified residuals would be expected to fluctuate around zero with no obvious trend and roughly constant variance. However, residuals are mostly positive after about 2000, with an upward shift in the mean. The spread also appears slightly larger in recent years. Together this suggest that the process-error term is not stationary and probably reflects systematic model bias rather than purely stochastic fluctuations.

Trends and Shifts

The time-series panel, together with the top-right panel (“Residuals vs Biomass”), identifies trends and shifts. Before 2000, residuals randomly vary about zero; afterwards this they are predominantly positive. This step-change implies that the model starts to under-predict the biomass once the stock moves into a higher biomass regime. This is consistent with a production function that is mis-specified for the more recent period or with time-varying productivity or selectivity.

Magnitude and Distribution

The bottom-right panel (“Distribution”) shows the empirical distribution of residuals with a normal density overlay. The shape is roughly Gaussian but centred above zero. Residual magnitudes are not particulary extreme, i.e. relative to the assumed process standard deviation, but the mean of 0.2 points to bias in the production curve or to a missing process, rather than to occasional large shocks superimposed on an otherwise correct specified model.

Autocorrelation

The bottom-left panel (“Autocorrelation”) is the ACF of the process residuals. Several low-lag bars exceed the approximate 95% bounds, and there is a pattern of positive autocorrelation at short lags followed by negative values at longer lags. This is inconsistent with a white-noise process and suggests slow, temporally structured deviations from the model, for example an environmental regime, time-varying productivity, or changes in catchability.

Summary

Together, these plots suggest that the process-error term in this example is accounting for model misspecification or non-stationarity rather than representing pure stochastic noise. This is because

Residuals exhibit a shift in mean and variance over time.
The model tends to under-predict biomass at higher levels, consistent with a biased production curve.
The residual distribution is centred above zero, even though individual deviations are not huge.
Autocorrelation is present, indicating red rather than the white noise assumed.

Process-error appears not to be purely stochastic and so confidence intervals or fractiles from the fitted model will be biased.

Possible next steps are to

Explore time-varying productivity or selectivity, or alternative production-function shapes.
Consider including covariates or explicit regime-shift structures if there is supporting evidence.
Evaluate HCR performance in an operating-model/MSE framework or an ensemble that reproduces the process-error properties, rather than basing advice solely on a singlr fitted model.
Give up

The same diagnostic framework can be applied systematically across stocks to identify cases where process error behaves as desired and those where it is clearly signalling deeper structural issues. These diagnostics can be applied to any model with an implicit or explicit production function.

Process Error Diagnostics

Dr. Laurence T. Kell

2026-02-04