mvgam case study 3: distributed lag models

Distributed lag nonlinear models

Here we will use the mvgam package (DGAMs; Clark & Wells, 2022), which fits dynamic GAMs using MCMC sampling via either the JAGS software (installation links are found here) or via the Stan software (installation links are found here), to estimate paramaters of a Bayesian distributed lag model. These models are used to describe simultaneously non-linear and delayed functional relationships between a covariate and a response, and are sometimes referred to as exposure-lag-response models. If we assume \(\tilde{\boldsymbol{y}}_{t}\) is the conditional expectation of a discrete response variable \(\boldsymbol{y}\) at time \(\boldsymbol{t}\), the linear predictor for a dynamic distributed lag GAM with one lagged covariate is written as:

\[log(\tilde{\boldsymbol{y}}_{t})={\boldsymbol{B}}_0+\sum\limits_{k=1}^K{f}(\boldsymbol{b}_{k,t}\boldsymbol{x}_{k,t})+\boldsymbol{z}_{t}\,,\] where \(\boldsymbol{B}_{0}\) is the unknown intercept, the \(\boldsymbol{b}\)’s are unknown spline coefficients estimating how the functional effect of covariate (\(\boldsymbol{x}\)) on \(log(\tilde{\boldsymbol{y}}_{t})\) changes over increasing lags (up to a maximum lag of (\(\boldsymbol{K}\))) and \(\boldsymbol{z}\) is a dynamic latent trend component.

To demonstrate how these models are estimated in mvgam, first we load a subset of the Portal rodents capture data, which is available from the mvgam package

#devtools::install_github("nicholasjclark/mvgam")
library(mvgam)
library(dplyr)
library(mgcv)
data('portal_data')

## Welcome to mvgam. Please cite as: Clark, NJ, and Wells, K. 2022. Dynamic Generalized Additive Models (DGAMs) for forecasting discrete ecological time series. Methods in Ecology and Evolution, 2022, https://doi.org/10.1111/2041-210X.13974

## 
## Attaching package: 'dplyr'

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

## Warning: package 'mgcv' was built under R version 4.2.2

## Loading required package: nlme

## 
## Attaching package: 'nlme'

## The following object is masked from 'package:dplyr':
## 
##     collapse

## This is mgcv 1.8-41. For overview type 'help("mgcv-package")'.

These data have been filtered from year 2004 onwards to make the model quicker to estimate for this simple example

dplyr::glimpse(portal_data)

## Rows: 199
## Columns: 10
## $ moon          <int> 329, 330, 331, 332, 333, 334, 335, 336, 337, 338, 339, 3…
## $ DM            <int> 10, 14, 9, NA, 15, NA, NA, 9, 5, 8, NA, 14, 7, NA, NA, 9…
## $ DO            <int> 6, 8, 1, NA, 8, NA, NA, 3, 3, 4, NA, 3, 8, NA, NA, 3, NA…
## $ PP            <int> 0, 1, 2, NA, 10, NA, NA, 16, 18, 12, NA, 3, 2, NA, NA, 1…
## $ OT            <int> 2, 0, 1, NA, 1, NA, NA, 1, 0, 0, NA, 2, 1, NA, NA, 1, NA…
## $ year          <int> 2004, 2004, 2004, 2004, 2004, 2004, 2004, 2004, 2004, 20…
## $ month         <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6,…
## $ mintemp       <dbl> -9.710, -5.924, -0.220, 1.931, 6.568, 11.590, 14.370, 16…
## $ precipitation <dbl> 37.8, 8.7, 43.5, 23.9, 0.9, 1.4, 20.3, 91.0, 60.5, 25.2,…
## $ ndvi          <dbl> 1.4658889, 1.5585069, 1.3378172, 1.6589129, 1.8536561, 1…

Setting up distributed lag data matrices

Below is an exact reproduction of Simon Wood’s lag matrix function (which he uses in his distributed lag example from his book Generalized Additive Models - An Introduction with R 2nd edition). Here we supply a vector and specify the maximum lag that we want, and it will return a matrix of dimension length(x) * lag. Note that NAs are used for the missing lag values at the beginning of the matrix. In essence, the matrix objects represent exposure histories, where each row represents the lagged values of the predictor that correspond to each observation in y

lagard <- function(x, n.lag = 6) {
  n <- length(x); X <- matrix(NA, n, n.lag)
  for (i in 1:n.lag) X[i:n, i] <- x[i:n - i + 1]
  X
}

Organise all data needed for modelling into a list. We will focus only on the species Chaetodipus penicillatus (labelled as PP), which shows reasonable seasonality in its captures over time

data_all <- list(lag=matrix(0:5,nrow(portal_data),6,byrow=TRUE),
            y = portal_data$PP,
            season = portal_data$month,
            year = portal_data$year,
            series = rep(as.factor('series1'), NROW(portal_data)),
            time = 1:NROW(portal_data))
data_all$precip <- lagard(portal_data$precipitation)
data_all$mintemp <- lagard(portal_data$mintemp)

The exposure history matrix elements of the data list look as follows:

head(data_all$lag, 5)

##      [,1] [,2] [,3] [,4] [,5] [,6]
## [1,]    0    1    2    3    4    5
## [2,]    0    1    2    3    4    5
## [3,]    0    1    2    3    4    5
## [4,]    0    1    2    3    4    5
## [5,]    0    1    2    3    4    5

head(data_all$precip, 5)

##      [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 37.8   NA   NA   NA   NA   NA
## [2,]  8.7 37.8   NA   NA   NA   NA
## [3,] 43.5  8.7 37.8   NA   NA   NA
## [4,] 23.9 43.5  8.7 37.8   NA   NA
## [5,]  0.9 23.9 43.5  8.7 37.8   NA

head(data_all$mintemp, 5)

##        [,1]   [,2]   [,3]   [,4]  [,5] [,6]
## [1,] -9.710     NA     NA     NA    NA   NA
## [2,] -5.924 -9.710     NA     NA    NA   NA
## [3,] -0.220 -5.924 -9.710     NA    NA   NA
## [4,]  1.931 -0.220 -5.924 -9.710    NA   NA
## [5,]  6.568  1.931 -0.220 -5.924 -9.71   NA

All other elements of the data list are in the usual vector format

head(data_all$y, 5)

## [1]  0  1  2 NA 10

head(data_all$series, 5)

## [1] series1 series1 series1 series1 series1
## Levels: series1

head(data_all$year, 5)

## [1] 2004 2004 2004 2004 2004

head(data_all$time, 5)

## [1] 1 2 3 4 5

View the raw series. There is a clear seasonal pattern to the data, and there are missing values scattered throughout

plot_mvgam_series(data = data_all)

Create training and testing sets; start at observation 7 so that the NA values at the beginning of the covariate lag matrices are not included. Currently there is no option for on-the-fly imputation of missing covariate values in mvgam models, though this can potentially be done in future versions

data_train <- list(lag = data_all$lag[7:184,],
                   y = data_all$y[7:184],
                   series = data_all$series[7:184],
                   season = data_all$season[7:184],
                   year = data_all$year[7:184],
                   time = 7:184,
                   precip = data_all$precip[7:184,],
                   mintemp = data_all$mintemp[7:184,])
data_test <- list(lag = data_all$lag[185:length(data_all$y),],
                   y = data_all$y[185:length(data_all$y)],
                   series = data_all$series[185:length(data_all$y)],
                   season = data_all$season[185:length(data_all$y)],
                   year = data_all$year[185:length(data_all$y)],
                   time = 185:length(data_all$y),
                   precip = data_all$precip[185:length(data_all$y),],
                  mintemp = data_all$mintemp[185:length(data_all$y),])

Fitting distributed lag models in `mvgam`

Now we can fit a Bayesian GAM with distributed lag terms for precipitation and minimum temperature. The distributed lags are set up as tensor product smooth functions (see help(te) for an explanation of tensor product smooth constructions in the mgcv package) between lag and each covariate. We will start simply by assuming our data follow a Poisson observation process. Note that the Gibbs samplers used by JAGS are not efficient at exploring high-dimensional posteriors, so we cannot expect reasonable performance without running the chains for many iterations. For this example, I only use a burnin of 4000

mod1 <- mvgam(formula =  y ~ te(mintemp, lag, k = c(6, 6)) +
                 te(precip, lag, k = c(6, 6)),
                data = data_train,
                newdata = data_test,
                family = poisson(),
              use_stan = FALSE,
                burnin = 4000,
                trend_model = 'None')

Posterior predictive rootograms are a useful way to explore whether a discrete model is able to capture relevant dispersion in the observed data. This plot compares the frequencies of observed vs predicted values for each bin, which can help to identify aspects of poor model fit. For example, if the gray bars (representing observed frequencies) tend to stretch below zero, this suggests the model’s simulations predict the values in that particular bin less frequently than they are observed in the data. A well-fitting model that can generate realistic simulated data will provide a rootogram in which the lower boundaries of the grey bars are generally near zero

ppc(mod1, type = 'rootogram')

The Poisson model is not doing a great job of capturing dispersion, underpredicting the zeros in the data and overpredicting some of the medium-range values (counts of ~5-30). The residual Q-Q plot confirms that the Poisson is not an appropriate distribution for these data

plot(mod1, type = 'residuals')

Given the overdispersion present in the data, we will now assume a Geometric-Poisson observation model, which can be more flexible than the Negative binomial for modelling overdispersed count data. In mvgam the Geometric-Poisson is estimated as a Tweedie-Poisson model with the power parameter p fixed at 1.5

mod2 <- mvgam(formula =  y ~ te(mintemp, lag, k = c(6, 6)) +
                 te(precip, lag, k = c(6, 6)),
                data = data_train,
                newdata = data_test,
                family = 'tweedie',
                use_stan = FALSE,
                burnin = 4000,
                trend_model = 'None')

The rootogram for this model looks better, though of course there is still some overprediction of medium-range values

ppc(mod2, type = 'rootogram')

However, the residual plot looks much better for this model

plot(mod2, type = 'residuals')

The summary of the model provides useful information on convergence for unobserved parameters. Notice how strongly positive the overdispersion parameter is estimated to be, providing further evidence that this overdispersion is important to capture for these data. However, the Gibbs samplers used by JAGS have not come close to converging and we really shouldn’t put much faith in our estimates of parameter uncertainty

summary(mod2)

## GAM formula:

## y ~ te(mintemp, lag, k = c(6, 6)) + te(precip, lag, k = c(6, 
##     6))

##

## Family:

## tweedie

##

## Link function:

## log

##

## Trend model:

## None

##

## N series:

## 1

##

## N timepoints:

## 184

##

## Status:

## Fitted using JAGS

##

## Dispersion parameter estimates:

##         2.5%      50%    97.5% Rhat n.eff
## phi 1.270338 1.820059 2.470019 1.31    38

##

## GAM coefficient (beta) estimates:

##                           2.5%           50%         97.5% Rhat n.eff
## (Intercept)         2.30183297  2.4194617383  2.5350142576 1.07    97
## te(mintemp,lag).1  -0.66926024 -0.5151696415  0.1470709636 6.65    20
## te(mintemp,lag).2  -0.74301562 -0.4007704711  0.1563759030 4.24    20
## te(mintemp,lag).3  -0.66976529 -0.3530384829  0.3671666975 5.01    12
## te(mintemp,lag).4  -0.63799709 -0.3536085824  0.8413451131 5.28    21
## te(mintemp,lag).5  -0.75255143 -0.1187258932  0.5759121994 4.40    12
## te(mintemp,lag).6  -0.47640972 -0.3613702187 -0.1274790500 2.18    30
## te(mintemp,lag).7  -0.37172322 -0.1601459646 -0.0405874181 1.18    11
## te(mintemp,lag).8  -0.24774355 -0.0326247211  0.2081783708 2.61    17
## te(mintemp,lag).9  -0.10974094  0.3144176292  0.6240691481 6.70    16
## te(mintemp,lag).10  0.03555117  0.5737193898  0.9038599642 5.22    15
## te(mintemp,lag).11  0.59145030  0.8512801563  1.0619611460 1.93    13
## te(mintemp,lag).12 -0.38141095 -0.1767804388 -0.0209160350 2.06    19
## te(mintemp,lag).13 -0.42958716 -0.1884199592  0.0296703743 3.40    18
## te(mintemp,lag).14 -0.36174933 -0.2597414759 -0.0461747560 1.48    18
## te(mintemp,lag).15 -0.36566429 -0.1158108545  0.0826439210 2.35     9
## te(mintemp,lag).16 -0.05953438  0.1069821802  0.3911058425 3.76    19
## te(mintemp,lag).17  0.36447820  0.5358530460  0.7707532669 2.51    16
## te(mintemp,lag).18 -0.24267772 -0.0339280548  0.1712677832 2.32    17
## te(mintemp,lag).19  0.01358350  0.1005844763  0.2843888934 1.37    25
## te(mintemp,lag).20 -0.20504834 -0.0034614445  0.2775122530 3.04    16
## te(mintemp,lag).21 -0.30551114 -0.0636920216  0.1488050554 2.68    11
## te(mintemp,lag).22 -0.27405119 -0.0435884997  0.3160940089 3.74    16
## te(mintemp,lag).23  0.10932880  0.4179756039  0.5710100539 2.94    21
## te(mintemp,lag).24  0.07647209  0.3259171347  0.6468807504 4.49    32
## te(mintemp,lag).25 -0.05272535  0.0921670711  0.3475320088 2.65    23
## te(mintemp,lag).26 -0.21377648  0.0307540557  0.2467449900 5.19    24
## te(mintemp,lag).27 -0.17402948 -0.0180549420  0.2476613630 2.51    23
## te(mintemp,lag).28 -0.29209975 -0.1397504658  0.1069953654 1.72    22
## te(mintemp,lag).29 -0.72804887 -0.2807874158 -0.0010884273 5.21    15
## te(mintemp,lag).30  0.50161982  0.7790041855  1.0012029288 2.29    14
## te(mintemp,lag).31 -0.13528660  0.4294638324  0.9017452800 8.81    26
## te(mintemp,lag).32 -0.30617164  0.1932002253  0.6388961322 7.60    14
## te(mintemp,lag).33 -0.19931306  0.0232100078  0.3151622738 3.64    19
## te(mintemp,lag).34 -0.42257184 -0.2186267715 -0.0658137708 1.32    16
## te(mintemp,lag).35 -1.41150012 -0.7679187413 -0.3643751775 6.02    22
## te(precip,lag).1   -0.11949168 -0.0280111773  0.0444908234 1.72    18
## te(precip,lag).2   -0.16355151 -0.0489954958  0.0158605351 2.01    26
## te(precip,lag).3   -0.13161910 -0.0413345627  0.0180416526 2.39    28
## te(precip,lag).4   -0.15081112 -0.0396877467  0.0369169995 3.46    24
## te(precip,lag).5   -0.23197847 -0.0408845682  0.1295944136 2.37     8
## te(precip,lag).6   -0.12136183  0.0305024232  0.1269361051 3.28    16
## te(precip,lag).7   -0.09625747 -0.0038714183  0.0435033507 2.27    37
## te(precip,lag).8   -0.09394801 -0.0350264450  0.0103852451 1.29    26
## te(precip,lag).9   -0.12153210 -0.0559158116  0.0025692479 1.80    28
## te(precip,lag).10  -0.14566991 -0.0923747083  0.0096235069 2.52    30
## te(precip,lag).11  -0.19408372 -0.0612557248  0.0289807361 3.79    20
## te(precip,lag).12  -0.14610047 -0.0333381763  0.0318598222 1.35    17
## te(precip,lag).13  -0.12999340 -0.0234231202  0.0361997048 1.38    20
## te(precip,lag).14  -0.13632827 -0.0523546929  0.0525802394 1.19    12
## te(precip,lag).15  -0.19258113 -0.0839466620 -0.0004863832 2.07    14
## te(precip,lag).16  -0.24574108 -0.1181862160 -0.0024902255 2.41    12
## te(precip,lag).17  -0.17885560 -0.1271965374 -0.0099775003 1.40    20
## te(precip,lag).18  -0.17448424 -0.0331055726  0.0472535286 4.09    30
## te(precip,lag).19  -0.11298474 -0.0155996520  0.0737793548 1.93    20
## te(precip,lag).20  -0.10889753 -0.0448366117 -0.0025813010 1.21    25
## te(precip,lag).21  -0.15281188 -0.0983397711 -0.0375898259 1.22    19
## te(precip,lag).22  -0.23446868 -0.1347500512  0.0110038776 1.91    22
## te(precip,lag).23  -0.30958276 -0.1666796165  0.0545046406 4.25    29
## te(precip,lag).24  -0.44982707 -0.1350358977  0.1430531764 7.46    15
## te(precip,lag).25  -0.13709230 -0.0005112134  0.2047438214 6.04    26
## te(precip,lag).26   0.03798553  0.0946476547  0.2122946166 2.06    17
## te(precip,lag).27   0.04916108  0.1265026736  0.1912003633 1.37    18
## te(precip,lag).28  -0.01225446  0.1040817307  0.2049093905 2.16    21
## te(precip,lag).29  -0.08888208  0.1028564629  0.2106125195 3.24    19
## te(precip,lag).30  -0.09562143 -0.0017552572  0.1123088690 2.18    18

##

## GAM smoothing parameter (rho) estimates:

##                       2.5%       50%    97.5% Rhat n.eff
## te(mintemp,lag)   1.117711 2.6251471 4.256999 2.60   205
## te(mintemp,lag)2  1.894227 3.4213375 4.579725 2.39   204
## te(mintemp,lag)3 -1.115421 0.6326806 2.803879 1.80   919
## te(precip,lag)    2.421800 3.6949771 4.660730 1.02   164
## te(precip,lag)2   2.673085 3.8255027 4.644388 1.11    77
## te(precip,lag)3   1.473711 3.4784714 4.713150 1.01   987

##

## JAGS MCMC diagnostics

## Rhats above 1.05 found for 276 parameters
## *Diagnose further to investigate why the chains have not mixed

##

Updating priors in `mvgam`

As this is a timeseries and the residual plot hints at some autocorrelation remaining in the short-term lags, lets check if an AR latent trend process improves forecasts compared to the no-trend model. An important note here is the choice of prior for the overdispersion parameter twdis. This parameter and the latent trend variance can interact strongly, particularly when overdispersion is in the data high. This is because at high values of the dispersion parameter, there is less need for a latent trend to be able to capture any outliers and so the latent trend precision can go up toward infinity, approaching a space of very diffuse likelihood that forces the Gibbs samplers to take on frustratingly small step sizes. Likewise when there is not much need for overdispersion, the dispersion parameter can approach zero and move around in an equally uninformative parameter space. The latent trend operates on the log scale, so really we should not expect autocorrelated jumps in trappings of more than 6-8 from timepoint to timepoint (any larger and the trend will compete strongly with the overdispersion parameter, making it difficult for us to model the inherent overdispersion process and instead assuming it is all autocorrelation). A containment prior on the latent trend sigma will help achieve this, but first we need to see the structure of the priors argument for making alterations to defaults

priors <- get_mvgam_priors(formula =  y ~ te(mintemp, lag, k = c(6, 6)) +
                 te(precip, lag, k = c(6, 6)),
                data = data_train,
                family = 'tweedie',
                trend_model = 'AR3')

## Warning in get_mvgam_priors(formula = y ~ te(mintemp, lag, k = c(6, 6)) + :
## Tweedie family not yet supported for stan; reverting to JAGS

priors

The 5th row of this data.frame contains information on the prior for the latent trend variance component. It is the entry for the prior column of this row that needs to be modified before supplying the argument for priors to the mvgam function:

priors$prior[4] <- "dexp(2.5)T(0.15, 2)"
mod3 <- mvgam(formula =  y ~ te(mintemp, lag, k = c(6, 6)) +
                 te(precip, lag, k = c(6, 6)),
                data = data_train,
                newdata = data_test,
                family = 'tweedie',
              use_stan = FALSE,
                priors = priors,
                burnin = 4000,
                trend_model = 'AR3')
summary(mod3)

## y ~ te(mintemp, lag, k = c(6, 6)) + te(precip, lag, k = c(6, 
##     6))
## tweedie
## log
## AR3
## 1 
## 184 
## Fitted using JAGS 
##          2.5%       50%     97.5% Rhat n.eff
## phi 0.2029026 0.2884058 0.4388165 3.01    14
##                            2.5%          50%        97.5%  Rhat n.eff
## (Intercept)         2.185181991  2.335622051  2.613573364  1.37    30
## te(mintemp,lag).1  -0.489566532  0.149702055  0.418241616 10.28    23
## te(mintemp,lag).2  -0.433078661  0.157546221  0.452235009  8.79    19
## te(mintemp,lag).3  -0.468680795  0.038037510  0.602248136  7.88     8
## te(mintemp,lag).4  -0.533454170 -0.095714393  0.273249496  7.60    12
## te(mintemp,lag).5  -0.918114914 -0.315607000  0.344514452  9.11    19
## te(mintemp,lag).6  -0.697392897 -0.392294506 -0.020315683  4.04    16
## te(mintemp,lag).7  -0.369865017 -0.242910508 -0.161440889  2.04    28
## te(mintemp,lag).8  -0.213054129 -0.125587156  0.226836885  3.83    23
## te(mintemp,lag).9  -0.095729391  0.092329266  0.438745725  3.92    13
## te(mintemp,lag).10  0.044658935  0.271792411  0.545981340  4.63    21
## te(mintemp,lag).11  0.353821090  0.524459041  0.676246627  1.72    12
## te(mintemp,lag).12 -0.412952702 -0.181301165  0.043755404  5.48    19
## te(mintemp,lag).13 -0.145604613 -0.064607400  0.039984998  1.26    15
## te(mintemp,lag).14 -0.257107954 -0.105645283  0.036120732  2.37    17
## te(mintemp,lag).15 -0.211219983 -0.124608756  0.125487273  3.10    15
## te(mintemp,lag).16 -0.004763215  0.106924711  0.205569828  1.35    12
## te(mintemp,lag).17  0.170816808  0.295769857  0.449059202  1.93    15
## te(mintemp,lag).18 -0.381242629 -0.053881455  0.181950852  4.38    12
## te(mintemp,lag).19 -0.110890369  0.031770271  0.133181651  2.27    21
## te(mintemp,lag).20 -0.236503548 -0.063191252  0.103224720  3.13    19
## te(mintemp,lag).21 -0.330751024 -0.195762833  0.072223494  3.88    17
## te(mintemp,lag).22 -0.292162503 -0.145219051 -0.023012889  1.78    16
## te(mintemp,lag).23 -0.120958394  0.079402801  0.290028406  3.23    12
## te(mintemp,lag).24  0.087966157  0.271426741  0.584861050  3.73    20
## te(mintemp,lag).25 -0.145119238  0.229731885  0.394138638  4.78    19
## te(mintemp,lag).26 -0.177655008  0.077755904  0.327234907  3.61    15
## te(mintemp,lag).27 -0.272076780 -0.158306272  0.118201451  3.66    16
## te(mintemp,lag).28 -0.442974940 -0.242179707 -0.058756897  2.63    14
## te(mintemp,lag).29 -0.496996585 -0.336079283 -0.129552368  1.48    24
## te(mintemp,lag).30  0.390282186  0.546112964  0.672424683  1.52    27
## te(mintemp,lag).31 -0.090172679  0.490533660  0.725422932  4.79     8
## te(mintemp,lag).32 -0.049018483  0.298779039  0.599778980  4.12    11
## te(mintemp,lag).33 -0.166829507 -0.065765334  0.101892105  2.85    23
## te(mintemp,lag).34 -0.542388889 -0.361455514 -0.216153963  2.20    15
## te(mintemp,lag).35 -0.860538849 -0.695966157 -0.559096212  1.47    12
## te(precip,lag).1   -0.074149359  0.018912999  0.134319108  1.98    11
## te(precip,lag).2   -0.233839557 -0.062535854 -0.008254045  4.15    25
## te(precip,lag).3   -0.120069640 -0.050371177  0.037702349  2.04    22
## te(precip,lag).4   -0.099922278  0.009060749  0.088401231  1.60    11
## te(precip,lag).5   -0.097745717  0.035305554  0.163556041  2.75    15
## te(precip,lag).6    0.071560890  0.152590795  0.267133683  3.03    22
## te(precip,lag).7   -0.083912221  0.009251725  0.076792133  2.26    22
## te(precip,lag).8   -0.180996950 -0.071393004  0.006509016  2.62    16
## te(precip,lag).9   -0.147131191 -0.048821551  0.018651506  2.77    24
## te(precip,lag).10  -0.048968351 -0.007174150  0.040370247  1.54    32
## te(precip,lag).11  -0.104064177 -0.006225787  0.157496538  5.68    23
## te(precip,lag).12  -0.168643971  0.019737218  0.105814914  2.61    17
## te(precip,lag).13  -0.172524845 -0.030901172  0.050911662  3.09    17
## te(precip,lag).14  -0.184084976 -0.104629391 -0.028170979  2.77    38
## te(precip,lag).15  -0.232830933 -0.106530947 -0.020511567  2.87    15
## te(precip,lag).16  -0.170366087 -0.099754150 -0.020081579  1.37    17
## te(precip,lag).17  -0.214030630 -0.044686545  0.117253510  4.63    15
## te(precip,lag).18  -0.017505640  0.056515070  0.236230448  4.00    17
## te(precip,lag).19  -0.125153338  0.018852186  0.109826556  4.47    22
## te(precip,lag).20  -0.154505115 -0.059714082  0.038154962  2.56    14
## te(precip,lag).21  -0.191131736 -0.091641485  0.003142930  3.94    26
## te(precip,lag).22  -0.210450212 -0.082505927 -0.007768921  1.40    14
## te(precip,lag).23  -0.102406220  0.018897532  0.196081351  3.22    13
## te(precip,lag).24  -0.210833877 -0.112945197 -0.006904847  1.18    12
## te(precip,lag).25  -0.172684394 -0.086591320  0.039223359  2.03    16
## te(precip,lag).26  -0.227636976 -0.073206981  0.046740339  3.25    16
## te(precip,lag).27  -0.286950278 -0.083726036  0.032506094  3.92    19
## te(precip,lag).28  -0.262262627 -0.094193968 -0.011139450  4.56    21
## te(precip,lag).29  -0.219652616 -0.047886299  0.091089487  5.12    21
## te(precip,lag).30  -0.086803539  0.071203029  0.247638618  5.15    19
##                        2.5%      50%    97.5% Rhat n.eff
## te(mintemp,lag)   2.4905928 3.546675 4.391608 1.13   237
## te(mintemp,lag)2  2.6371177 3.692525 4.525741 1.13   181
## te(mintemp,lag)3 -0.3526018 1.330835 3.592959 1.75   882
## te(precip,lag)    2.2980497 3.846671 4.855390 1.35   172
## te(precip,lag)2   1.7906079 3.155542 4.591898 2.04   282
## te(precip,lag)3   1.1284969 3.362199 4.539355 1.03  1168
##             2.5%        50%     97.5% Rhat n.eff
## ar1    0.6031134  0.8917930 1.1558248 1.05    56
## ar2   -0.2504091  0.1401597 0.5001433 1.43    47
## ar3   -0.4135877 -0.1102343 0.1367329 1.40   102
## sigma  0.2050275  0.3105242 0.4739982 1.53    64
## Rhats above 1.05 found for 331 parameters
## *Diagnose further to investigate why the chains have not mixed

A pairs plot of the logged versions of the latent trend precision and the overdispersion parameter can give insight into whether there is still any strange behaviour or problematic parameter spaces that we must think more deeply about

plot(log(MCMCvis::MCMCchains(mod3$model_output, 'phi')),
     log(MCMCvis::MCMCchains(mod3$model_output, 'tau')),
     ylab = 'log(tau)', xlab = 'log(phi)', pch = 16, col = "#8F272740")

Our rootogram has not improved much with the addition of the latent trend

ppc(mod3, type = 'rootogram')

But the residual distributions look better

plot(mod3, type = 'residuals')

Using `Stan` for DGAMs with latent Gaussian Process trends

We can also demonstrate another feature of mvgam, which is the ability to use Hamiltonian Monte Carlo for parameter estimation via the software Stan (using the rstan or cmdstanr interfaces). Also note that currently there is no support for fitting Tweedie responses in Stan. We will therefore stick with a Negative Binomial observation process for the Stan version. However there are great advantages when using Stan, which includes the option to estimate smooth latent trends via Hilbert space approximate Gaussian Processes. This often makes sense for ecological series, which we expect to change smoothly over time. As expected when compared to the inferior Gibbs samplers in JAGS, the Stan version converges very nicely

mod4 <- mvgam(formula =  y ~ te(mintemp, lag, k = c(6, 6)) +
                 te(precip, lag, k = c(6, 6)),
                data = data_train,
                newdata = data_test,
                family = nb(),
                trend_model = 'GP',
              use_stan = TRUE)

## Running MCMC with 4 parallel chains...
## 
## Chain 1 Iteration:   1 / 1000 [  0%]  (Warmup) 
## Chain 2 Iteration:   1 / 1000 [  0%]  (Warmup) 
## Chain 3 Iteration:   1 / 1000 [  0%]  (Warmup) 
## Chain 4 Iteration:   1 / 1000 [  0%]  (Warmup) 
## Chain 2 Iteration: 100 / 1000 [ 10%]  (Warmup) 
## Chain 4 Iteration: 100 / 1000 [ 10%]  (Warmup) 
## Chain 3 Iteration: 100 / 1000 [ 10%]  (Warmup) 
## Chain 1 Iteration: 100 / 1000 [ 10%]  (Warmup) 
## Chain 2 Iteration: 200 / 1000 [ 20%]  (Warmup) 
## Chain 3 Iteration: 200 / 1000 [ 20%]  (Warmup) 
## Chain 4 Iteration: 200 / 1000 [ 20%]  (Warmup) 
## Chain 1 Iteration: 200 / 1000 [ 20%]  (Warmup) 
## Chain 2 Iteration: 300 / 1000 [ 30%]  (Warmup) 
## Chain 3 Iteration: 300 / 1000 [ 30%]  (Warmup) 
## Chain 1 Iteration: 300 / 1000 [ 30%]  (Warmup) 
## Chain 2 Iteration: 400 / 1000 [ 40%]  (Warmup) 
## Chain 4 Iteration: 300 / 1000 [ 30%]  (Warmup) 
## Chain 3 Iteration: 400 / 1000 [ 40%]  (Warmup) 
## Chain 1 Iteration: 400 / 1000 [ 40%]  (Warmup) 
## Chain 2 Iteration: 500 / 1000 [ 50%]  (Warmup) 
## Chain 2 Iteration: 501 / 1000 [ 50%]  (Sampling) 
## Chain 4 Iteration: 400 / 1000 [ 40%]  (Warmup) 
## Chain 3 Iteration: 500 / 1000 [ 50%]  (Warmup) 
## Chain 3 Iteration: 501 / 1000 [ 50%]  (Sampling) 
## Chain 1 Iteration: 500 / 1000 [ 50%]  (Warmup) 
## Chain 1 Iteration: 501 / 1000 [ 50%]  (Sampling) 
## Chain 4 Iteration: 500 / 1000 [ 50%]  (Warmup) 
## Chain 2 Iteration: 600 / 1000 [ 60%]  (Sampling) 
## Chain 4 Iteration: 501 / 1000 [ 50%]  (Sampling) 
## Chain 1 Iteration: 600 / 1000 [ 60%]  (Sampling) 
## Chain 3 Iteration: 600 / 1000 [ 60%]  (Sampling) 
## Chain 4 Iteration: 600 / 1000 [ 60%]  (Sampling) 
## Chain 2 Iteration: 700 / 1000 [ 70%]  (Sampling) 
## Chain 1 Iteration: 700 / 1000 [ 70%]  (Sampling) 
## Chain 3 Iteration: 700 / 1000 [ 70%]  (Sampling) 
## Chain 4 Iteration: 700 / 1000 [ 70%]  (Sampling) 
## Chain 2 Iteration: 800 / 1000 [ 80%]  (Sampling) 
## Chain 1 Iteration: 800 / 1000 [ 80%]  (Sampling) 
## Chain 3 Iteration: 800 / 1000 [ 80%]  (Sampling) 
## Chain 4 Iteration: 800 / 1000 [ 80%]  (Sampling) 
## Chain 2 Iteration: 900 / 1000 [ 90%]  (Sampling) 
## Chain 1 Iteration: 900 / 1000 [ 90%]  (Sampling) 
## Chain 3 Iteration: 900 / 1000 [ 90%]  (Sampling) 
## Chain 4 Iteration: 900 / 1000 [ 90%]  (Sampling) 
## Chain 2 Iteration: 1000 / 1000 [100%]  (Sampling) 
## Chain 2 finished in 27.2 seconds.
## Chain 1 Iteration: 1000 / 1000 [100%]  (Sampling) 
## Chain 1 finished in 27.9 seconds.
## Chain 3 Iteration: 1000 / 1000 [100%]  (Sampling) 
## Chain 3 finished in 28.2 seconds.
## Chain 4 Iteration: 1000 / 1000 [100%]  (Sampling) 
## Chain 4 finished in 28.4 seconds.
## 
## All 4 chains finished successfully.
## Mean chain execution time: 27.9 seconds.
## Total execution time: 28.6 seconds.

summary(mod4)

## y ~ te(mintemp, lag, k = c(6, 6)) + te(precip, lag, k = c(6, 
##     6))
## negative binomial
## log
## GP
## 1 
## 184 
## Fitted using Stan 
##            2.5%      50%    97.5% Rhat n.eff
## phi[1] 2.881133 4.132835 5.983513    1  1752
##                           2.5%         50%       97.5% Rhat n.eff
## (Intercept)         2.09556150  2.35250500 2.593088750 1.00  3431
## te(mintemp,lag).1  -1.37725650 -0.39416500 0.396096975 1.01   552
## te(mintemp,lag).2  -1.02177100 -0.21080750 0.507214300 1.00  1081
## te(mintemp,lag).3  -0.91837940 -0.18118300 0.517734175 1.00  1427
## te(mintemp,lag).4  -1.00858425 -0.34373150 0.423574200 1.00  1603
## te(mintemp,lag).5  -1.52026625 -0.56464500 0.566515850 1.00  1298
## te(mintemp,lag).6  -0.99490130 -0.32208150 0.539031800 1.00   615
## te(mintemp,lag).7  -0.72488388 -0.18687300 0.404401975 1.00   566
## te(mintemp,lag).8  -0.53592625  0.04698050 0.696713500 1.00   555
## te(mintemp,lag).9  -0.17545363  0.36622400 1.028240250 1.00   566
## te(mintemp,lag).10  0.13923620  0.59739300 1.259377000 1.01   508
## te(mintemp,lag).11  0.08364091  0.79361700 1.656776500 1.01   607
## te(mintemp,lag).12 -0.66840937 -0.11579050 0.599631825 1.00   601
## te(mintemp,lag).13 -0.41930430 -0.08333770 0.369463575 1.00   568
## te(mintemp,lag).14 -0.45096550 -0.06632260 0.352985450 1.00   615
## te(mintemp,lag).15 -0.34529870  0.05240410 0.474380325 1.00   686
## te(mintemp,lag).16 -0.05141427  0.33289550 0.753164900 1.00   533
## te(mintemp,lag).17 -0.07738934  0.57772400 1.253270500 1.01   543
## te(mintemp,lag).18 -0.43718820  0.12361000 0.866091025 1.00   571
## te(mintemp,lag).19 -0.24139033  0.12892350 0.613180000 1.00   577
## te(mintemp,lag).20 -0.37554130  0.06518235 0.509747950 1.00   620
## te(mintemp,lag).21 -0.43904505  0.00761062 0.482686425 1.00   675
## te(mintemp,lag).22 -0.30901025  0.09906420 0.595758950 1.00   591
## te(mintemp,lag).23 -0.30294003  0.34611300 1.103164000 1.01   543
## te(mintemp,lag).24 -0.16854653  0.42221850 1.227761750 1.00   597
## te(mintemp,lag).25 -0.13598647  0.26619400 0.810101500 1.00   530
## te(mintemp,lag).26 -0.28360978  0.13741850 0.634467175 1.00   560
## te(mintemp,lag).27 -0.47103077 -0.01317475 0.451261700 1.00   599
## te(mintemp,lag).28 -0.58903065 -0.18952800 0.339721550 1.00   543
## te(mintemp,lag).29 -0.91715157 -0.28081800 0.503343700 1.01   543
## te(mintemp,lag).30 -0.07119840  0.61902800 1.520616500 1.00   647
## te(mintemp,lag).31 -0.05126970  0.37922350 0.938163450 1.00   633
## te(mintemp,lag).32 -0.27703360  0.15626650 0.686810700 1.00   812
## te(mintemp,lag).33 -0.54496550 -0.10932950 0.362176900 1.00   917
## te(mintemp,lag).34 -0.87101758 -0.41641000 0.084357977 1.00   780
## te(mintemp,lag).35 -1.57654275 -0.78503600 0.009086367 1.01   684
## te(precip,lag).1   -0.16217245 -0.01486675 0.143194800 1.00   971
## te(precip,lag).2   -0.27096865 -0.08014105 0.080409792 1.00   892
## te(precip,lag).3   -0.27477330 -0.08442045 0.098376805 1.00   946
## te(precip,lag).4   -0.25840100 -0.06782490 0.096830047 1.00   851
## te(precip,lag).5   -0.33868042 -0.06309145 0.185581125 1.00   734
## te(precip,lag).6   -0.14319105  0.04144290 0.256698650 1.01   624
## te(precip,lag).7   -0.14696977 -0.02673155 0.092198902 1.00   668
## te(precip,lag).8   -0.22086702 -0.06225420 0.079826025 1.00   803
## te(precip,lag).9   -0.24756212 -0.07029200 0.080876047 1.00   724
## te(precip,lag).10  -0.23219705 -0.06378475 0.072724755 1.01   589
## te(precip,lag).11  -0.32112658 -0.06802675 0.147496875 1.00   606
## te(precip,lag).12  -0.28312775 -0.07930025 0.153434000 1.01   611
## te(precip,lag).13  -0.20411562 -0.06189010 0.085992967 1.00  1071
## te(precip,lag).14  -0.23903337 -0.06065880 0.116923975 1.00   937
## te(precip,lag).15  -0.27408000 -0.07268605 0.107949025 1.01   699
## te(precip,lag).16  -0.26619840 -0.08023205 0.087272590 1.01   564
## te(precip,lag).17  -0.34042205 -0.07845605 0.148602075 1.00   636
## te(precip,lag).18  -0.26942150 -0.03148435 0.227976825 1.01   520
## te(precip,lag).19  -0.18248185 -0.04765015 0.123922425 1.00   686
## te(precip,lag).20  -0.21296640 -0.07447375 0.086886822 1.00   971
## te(precip,lag).21  -0.25806818 -0.10230100 0.043393905 1.00  1022
## te(precip,lag).22  -0.26000350 -0.11325200 0.014511535 1.01   995
## te(precip,lag).23  -0.33709930 -0.10124650 0.130192150 1.00   856
## te(precip,lag).24  -0.44895575 -0.17933350 0.104187475 1.00   796
## te(precip,lag).25  -0.20456255 -0.02634530 0.152873025 1.00  1097
## te(precip,lag).26  -0.16324258  0.04159950 0.238608000 1.00   964
## te(precip,lag).27  -0.18071818  0.05203080 0.250757675 1.00   912
## te(precip,lag).28  -0.17687255  0.05394175 0.234758375 1.00   853
## te(precip,lag).29  -0.21174462  0.09316825 0.370009200 1.00   882
## te(precip,lag).30  -0.25389995 -0.01073295 0.200755600 1.00  2564
##                       2.5%      50%    97.5% Rhat n.eff
## te(mintemp,lag)   2.287283 3.396090 4.183368 1.00  1300
## te(mintemp,lag)2  1.800113 3.415265 4.261473 1.01   430
## te(mintemp,lag)3 -1.720837 0.654860 3.638840 1.00   475
## te(precip,lag)    1.859859 3.416360 4.254159 1.00  1121
## te(precip,lag)2   1.993303 3.474540 4.300673 1.01   485
## te(precip,lag)3   1.009984 3.362615 4.261303 1.01   516
##                  2.5%       50%     97.5% Rhat n.eff
## alpha_gp[1] 0.5042018 0.8391255  1.360626    1  1844
## rho_gp[1]   1.8950992 5.3760800 14.130147    1  2749
## n_eff / iter looks reasonable for all parameters
## Rhat looks reasonable for all parameters
## 0 of 2000 iterations ended with a divergence (0%)
## 0 of 2000 iterations saturated the maximum tree depth of 12 (0%)
## E-FMI indicated no pathological behavior

As with all other mvgam objects, we can create plots of the estimated forecast distribution

plot_mvgam_fc(mod4, series = 1, newdata = data_test, ylim = c(0, 100))

## Out of sample DRPS:

## [1] 23.21037

##

The trend now evolves smoothly via an infinite dimensional Gaussian Process

plot_mvgam_trend(mod4, series = 1)

Traceplots of smooth penalties indicate good mixing and convergence of the four MCMC chains

rstan::stan_trace(mod4$model_output, pars = 'rho')

We can also create quick plots of the estimated smooth tensor product interactions for the distributed lag terms, which basically follow mgcv’s two-dimensional plotting utility but uses the mvgam’s estimated coefficients

plot_mvgam_smooth(mod4, series = 1, smooth = 1)

plot_mvgam_smooth(mod4, series = 1, smooth = 2)

Plotting posterior distributed lag functions

If you are like me then you’ll find these plots rather difficult to interpret! The more intense yellow/white colours indicate higher predicted values, with the deeper red colours representing lower predicted values, but actually making sense of how the functional response is expected to change over different lags is not easy from these plots. HOwever, we can use the predict_mvgam function to generate much more interpretable plots. First we will focus on the effect of mintemp and generate a series of predictions to visualise how the estimated function changes over different lags. Set up prediction data by zeroing out all covariates apart from the covariate of interest

newdata <- data_test
newdata$year <- rep(0, length(newdata$year))
newdata$season <- rep(0, length(newdata$season))
newdata$precip <- matrix(0, ncol = ncol(newdata$precip),
                         nrow = nrow(newdata$precip))

Set up viridis plot colours and initiate the plot window to be centred around zero. We will then keep all mintemp values at zero apart from the particular lag being predicted so that we can visualise how the predicted function changes over lags of mintemp. Predictions are generated on the link scale in this case, though you could also use the response scale. Note that we need to first generate predictions with all covariates (including the mintemp covariate) zeroed out to find the ‘baseline’ prediction so that we can shift by this baseline for generating a zero-centred plot. That way our resulting plot will roughly follow the traditional mgcv partial effect plots

cols <- viridis::inferno(6)
plot(1, type = "n",
     xlab = 'Mintemp',
     ylab = 'Predicted response function',
     xlim = c(min(data_train$mintemp), max(data_train$mintemp)),
     ylim = c(-1.6, 1.6))

# Calculate predictions for when mintemp is all zeros to find the baseline
# value for centring the plot
newdata$mintemp <- matrix(0, ncol = ncol(newdata$mintemp),
                         nrow = nrow(newdata$mintemp))
preds <- predict(mod4, newdata = newdata, type = 'link')
offset <- mean(preds)

for(i in 1:6){
  # Set up prediction matrix for mintemp with lag i as the prediction sequence; 
  # use a sequence of mintemp values across the full range of observed values in the training data
  newdata$mintemp <- matrix(0, ncol = ncol(newdata$precip),
                            nrow = nrow(newdata$precip))
  newdata$mintemp[,i] <- seq(min(data_train$mintemp),
                             max(data_train$mintemp),
                             length.out = length(newdata$year))

  # Predict on the link scale and shift by the offset so that values are roughly centred at zero
  preds <- predict(mod4, newdata = newdata, type = 'link') - offset

  # Calculate empirical prediction quantiles
  probs = c(0.05, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.95)
  cred <- sapply(1:NCOL(preds),
                 function(n) quantile(preds[,n],
                                      probs = probs))

  # Plot expected function posterior intervals (40-60%) and medians in varying colours per lag
  pred_upper <- cred[4,]
  pred_lower <- cred[6,]
  pred_vals <- seq(min(data_train$mintemp),
                   max(data_train$mintemp),
                   length.out = length(newdata$year))
  polygon(c(pred_vals, rev(pred_vals)), c(pred_upper, rev(pred_lower)),
          col = scales::alpha(cols[i], 0.6), border = scales::alpha(cols[i], 0.7))
  lines(pred_vals, cred[5,],
        col = scales::alpha(cols[i], 0.8), lwd = 2.5)
}
abline(h = 0, lty = 'dashed')
legend('topleft', legend = paste0('lag', seq(0, 5)),
       bg = 'white', bty = 'n',
       col = cols, lty = 1, lwd = 6)

This plot demonstrates how the effect of mintemp is expected to change over different exposure lags, with the 3 - 5 month lags showing more of a cyclic seasonal pattern (catches expected to increase in the summer and autumn, roughly 3 - 5 months following cold minimum winter temperatures) while the recent lags (lags 0 and 1) demonstrate a more linear response function (catches broadly increasing as minimum temperature increases). This is hopefully a useful example for developing a better understanding of how a distributed lag model is attempting to recreate the data generating process. And here is the same plot for precipitation, which demonstrates how a u-shaped functional relationship diminishes toward a flat function at lags 2 - 5 (though this effect is clearly less important in the model than the mintemp * lag effect above)

newdata <- data_test
newdata$year <- rep(0, length(newdata$year))
newdata$season <- rep(0, length(newdata$season))
newdata$mintemp <- matrix(0, ncol = ncol(newdata$mintemp),
                         nrow = nrow(newdata$mintemp))
newdata$precip <- matrix(0, ncol = ncol(newdata$precip),
                         nrow = nrow(newdata$precip))
preds <- predict(mod4, newdata = newdata, type = 'link')
offset <- mean(preds)
plot(1, type = "n",
     xlab = 'Precipitation',
     ylab = 'Predicted response function',
     xlim = c(min(data_train$precip), max(data_train$precip)),
     ylim = c(-1.6, 1.6))

for(i in 1:6){
  newdata$precip <- matrix(0, ncol = ncol(newdata$precip),
                            nrow = nrow(newdata$precip))
  newdata$precip[,i] <- seq(min(data_train$precip),
                             max(data_train$precip),
                             length.out = length(newdata$year))
  preds <- predict(mod4, newdata = newdata, type = 'link') - offset
  probs = c(0.05, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.95)
  cred <- sapply(1:NCOL(preds),
                 function(n) quantile(preds[,n],
                                      probs = probs))
  pred_upper <- cred[4,]
  pred_lower <- cred[6,]
  pred_vals <- seq(min(data_train$precip),
                   max(data_train$precip),
                   length.out = length(newdata$year))
  polygon(c(pred_vals, rev(pred_vals)), c(pred_upper, rev(pred_lower)),
          col = scales::alpha(cols[i], 0.6), border = scales::alpha(cols[i], 0.7))
  lines(pred_vals, cred[5,],
        col = scales::alpha(cols[i], 0.8), lwd = 2.5)
}
abline(h = 0, lty = 'dashed')
legend('topleft', legend = paste0('lag', seq(0, 5)),
       bg = 'white', bty = 'n',
       col = cols, lty = 1, lwd = 6)

All of the usual functions in mvgam can also be used for list data objects and for models fitted with Stan, so long as they contain the necessary fields series, season and year. For example, posterior retrodictive checks for the in-sample training period:

ppc(mod4, series = 1, type = 'cdf')

and predictive checks for the out of sample forecast period (which demonstrates how the model tends to overpredict for the forecast period in this particular example):

ppc(mod4, newdata = data_test, series = 1, type = 'cdf')

Logical next steps for interrogating this model would be to trial different trend types (i.e. random walk), replace the distributed lag function for precip with a standard smooth function (that does not include lag interactions, as clearly the model above indicates that these are not supported) and inspect whether different covariates (such as ndvi or maxtemp) might play a role in modulating catches of PP. Finally, once we are satisfied that we have a well-performing model that we can understand and interrogate, we could expand up to a multivariate model by including other species as response variables. This would allow us to capture any possible unobserved dependencies in the catches of multiple co-occurring species in a single unified modelling framework

References

Clark, N.J. and Wells, K. (2022). Dynamic Generalized Additive Models (DGAMs) for forecasting discrete ecological time series. Methods in Ecology and Evolution. DOI: https://doi.org/10.1111/2041-210X.13974

mvgam case study 3: distributed lag models

Nicholas Clark (n.clark@uq.edu.au)