Here we will use the mvgam package (DGAMs; Clark &
Wells, 2022), which fits dynamic GAMs using MCMC sampling via either the
JAGS software (installation links are found here) or
via the Stan software (installation links are found here),
to estimate paramaters of a Bayesian distributed lag model. These models
are used to describe simultaneously non-linear and delayed functional
relationships between a covariate and a response, and are sometimes
referred to as exposure-lag-response models. If we assume \(\tilde{\boldsymbol{y}}_{t}\) is the
conditional expectation of a discrete response variable \(\boldsymbol{y}\) at time \(\boldsymbol{t}\), the linear predictor for
a dynamic distributed lag GAM with one lagged covariate is written
as:
\[log(\tilde{\boldsymbol{y}}_{t})={\boldsymbol{B}}_0+\sum\limits_{k=1}^K{f}(\boldsymbol{b}_{k,t}\boldsymbol{x}_{k,t})+\boldsymbol{z}_{t}\,,\] where \(\boldsymbol{B}_{0}\) is the unknown intercept, the \(\boldsymbol{b}\)’s are unknown spline coefficients estimating how the functional effect of covariate (\(\boldsymbol{x}\)) on \(log(\tilde{\boldsymbol{y}}_{t})\) changes over increasing lags (up to a maximum lag of (\(\boldsymbol{K}\))) and \(\boldsymbol{z}\) is a dynamic latent trend component.
To demonstrate how these models are estimated in mvgam,
first we load a subset of the Portal rodents capture data, which is
available from the mvgam package
#devtools::install_github("nicholasjclark/mvgam")
library(mvgam)
library(dplyr)
library(mgcv)
data('portal_data')
## Welcome to mvgam. Please cite as: Clark, NJ, and Wells, K. 2022. Dynamic Generalized Additive Models (DGAMs) for forecasting discrete ecological time series. Methods in Ecology and Evolution, 2022, https://doi.org/10.1111/2041-210X.13974
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
## Warning: package 'mgcv' was built under R version 4.2.2
## Loading required package: nlme
##
## Attaching package: 'nlme'
## The following object is masked from 'package:dplyr':
##
## collapse
## This is mgcv 1.8-41. For overview type 'help("mgcv-package")'.
These data have been filtered from year 2004 onwards to make the model quicker to estimate for this simple example
dplyr::glimpse(portal_data)
## Rows: 199
## Columns: 10
## $ moon <int> 329, 330, 331, 332, 333, 334, 335, 336, 337, 338, 339, 3…
## $ DM <int> 10, 14, 9, NA, 15, NA, NA, 9, 5, 8, NA, 14, 7, NA, NA, 9…
## $ DO <int> 6, 8, 1, NA, 8, NA, NA, 3, 3, 4, NA, 3, 8, NA, NA, 3, NA…
## $ PP <int> 0, 1, 2, NA, 10, NA, NA, 16, 18, 12, NA, 3, 2, NA, NA, 1…
## $ OT <int> 2, 0, 1, NA, 1, NA, NA, 1, 0, 0, NA, 2, 1, NA, NA, 1, NA…
## $ year <int> 2004, 2004, 2004, 2004, 2004, 2004, 2004, 2004, 2004, 20…
## $ month <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6,…
## $ mintemp <dbl> -9.710, -5.924, -0.220, 1.931, 6.568, 11.590, 14.370, 16…
## $ precipitation <dbl> 37.8, 8.7, 43.5, 23.9, 0.9, 1.4, 20.3, 91.0, 60.5, 25.2,…
## $ ndvi <dbl> 1.4658889, 1.5585069, 1.3378172, 1.6589129, 1.8536561, 1…
Below is an exact reproduction of Simon Wood’s lag matrix function
(which he uses in his distributed lag example from his book Generalized
Additive Models - An Introduction with R 2nd edition). Here we
supply a vector and specify the maximum lag that we want, and it will
return a matrix of dimension length(x) * lag. Note that
NAs are used for the missing lag values at the beginning of
the matrix. In essence, the matrix objects represent exposure histories,
where each row represents the lagged values of the predictor that
correspond to each observation in y
lagard <- function(x, n.lag = 6) {
n <- length(x); X <- matrix(NA, n, n.lag)
for (i in 1:n.lag) X[i:n, i] <- x[i:n - i + 1]
X
}
Organise all data needed for modelling into a list. We will focus
only on the species Chaetodipus penicillatus (labelled as
PP), which shows reasonable seasonality in its captures
over time
data_all <- list(lag=matrix(0:5,nrow(portal_data),6,byrow=TRUE),
y = portal_data$PP,
season = portal_data$month,
year = portal_data$year,
series = rep(as.factor('series1'), NROW(portal_data)),
time = 1:NROW(portal_data))
data_all$precip <- lagard(portal_data$precipitation)
data_all$mintemp <- lagard(portal_data$mintemp)
The exposure history matrix elements of the data list look as follows:
head(data_all$lag, 5)
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 0 1 2 3 4 5
## [2,] 0 1 2 3 4 5
## [3,] 0 1 2 3 4 5
## [4,] 0 1 2 3 4 5
## [5,] 0 1 2 3 4 5
head(data_all$precip, 5)
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 37.8 NA NA NA NA NA
## [2,] 8.7 37.8 NA NA NA NA
## [3,] 43.5 8.7 37.8 NA NA NA
## [4,] 23.9 43.5 8.7 37.8 NA NA
## [5,] 0.9 23.9 43.5 8.7 37.8 NA
head(data_all$mintemp, 5)
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] -9.710 NA NA NA NA NA
## [2,] -5.924 -9.710 NA NA NA NA
## [3,] -0.220 -5.924 -9.710 NA NA NA
## [4,] 1.931 -0.220 -5.924 -9.710 NA NA
## [5,] 6.568 1.931 -0.220 -5.924 -9.71 NA
All other elements of the data list are in the usual vector format
head(data_all$y, 5)
## [1] 0 1 2 NA 10
head(data_all$series, 5)
## [1] series1 series1 series1 series1 series1
## Levels: series1
head(data_all$year, 5)
## [1] 2004 2004 2004 2004 2004
head(data_all$time, 5)
## [1] 1 2 3 4 5
View the raw series. There is a clear seasonal pattern to the data, and there are missing values scattered throughout
plot_mvgam_series(data = data_all)
Create training and testing sets; start at observation 7 so that the
NA values at the beginning of the covariate lag matrices
are not included. Currently there is no option for on-the-fly imputation
of missing covariate values in mvgam models, though this
can potentially be done in future versions
data_train <- list(lag = data_all$lag[7:184,],
y = data_all$y[7:184],
series = data_all$series[7:184],
season = data_all$season[7:184],
year = data_all$year[7:184],
time = 7:184,
precip = data_all$precip[7:184,],
mintemp = data_all$mintemp[7:184,])
data_test <- list(lag = data_all$lag[185:length(data_all$y),],
y = data_all$y[185:length(data_all$y)],
series = data_all$series[185:length(data_all$y)],
season = data_all$season[185:length(data_all$y)],
year = data_all$year[185:length(data_all$y)],
time = 185:length(data_all$y),
precip = data_all$precip[185:length(data_all$y),],
mintemp = data_all$mintemp[185:length(data_all$y),])
mvgamNow we can fit a Bayesian GAM with distributed lag terms for
precipitation and minimum temperature. The distributed lags are set up
as tensor product smooth functions (see help(te) for an
explanation of tensor product smooth constructions in the
mgcv package) between lag and each covariate.
We will start simply by assuming our data follow a Poisson
observation process. Note that the Gibbs samplers used by
JAGS are not efficient at exploring high-dimensional
posteriors, so we cannot expect reasonable performance without running
the chains for many iterations. For this example, I only use a burnin of
4000
mod1 <- mvgam(formula = y ~ te(mintemp, lag, k = c(6, 6)) +
te(precip, lag, k = c(6, 6)),
data = data_train,
newdata = data_test,
family = poisson(),
use_stan = FALSE,
burnin = 4000,
trend_model = 'None')
Posterior predictive rootograms are a useful way to explore whether a discrete model is able to capture relevant dispersion in the observed data. This plot compares the frequencies of observed vs predicted values for each bin, which can help to identify aspects of poor model fit. For example, if the gray bars (representing observed frequencies) tend to stretch below zero, this suggests the model’s simulations predict the values in that particular bin less frequently than they are observed in the data. A well-fitting model that can generate realistic simulated data will provide a rootogram in which the lower boundaries of the grey bars are generally near zero
ppc(mod1, type = 'rootogram')
The Poisson model is not doing a great job of capturing
dispersion, underpredicting the zeros in the data and overpredicting
some of the medium-range values (counts of ~5-30). The
residual Q-Q plot confirms that the Poisson is
not an appropriate distribution for these data
plot(mod1, type = 'residuals')
Given the overdispersion present in the data, we will now assume a Geometric-Poisson
observation model, which can be more
flexible than the Negative binomial for modelling
overdispersed count data. In mvgam the
Geometric-Poisson is estimated as a
Tweedie-Poisson model with the power parameter
p fixed at 1.5
mod2 <- mvgam(formula = y ~ te(mintemp, lag, k = c(6, 6)) +
te(precip, lag, k = c(6, 6)),
data = data_train,
newdata = data_test,
family = 'tweedie',
use_stan = FALSE,
burnin = 4000,
trend_model = 'None')
The rootogram for this model looks better, though of course there is still some overprediction of medium-range values
ppc(mod2, type = 'rootogram')
However, the residual plot looks much better for this model
plot(mod2, type = 'residuals')
The summary of the model provides useful information on convergence
for unobserved parameters. Notice how strongly positive the
overdispersion parameter is estimated to be, providing further evidence
that this overdispersion is important to capture for these data.
However, the Gibbs samplers used by JAGS have not come
close to converging and we really shouldn’t put much faith in our
estimates of parameter uncertainty
summary(mod2)
## GAM formula:
## y ~ te(mintemp, lag, k = c(6, 6)) + te(precip, lag, k = c(6,
## 6))
##
## Family:
## tweedie
##
## Link function:
## log
##
## Trend model:
## None
##
## N series:
## 1
##
## N timepoints:
## 184
##
## Status:
## Fitted using JAGS
##
## Dispersion parameter estimates:
## 2.5% 50% 97.5% Rhat n.eff
## phi 1.270338 1.820059 2.470019 1.31 38
##
## GAM coefficient (beta) estimates:
## 2.5% 50% 97.5% Rhat n.eff
## (Intercept) 2.30183297 2.4194617383 2.5350142576 1.07 97
## te(mintemp,lag).1 -0.66926024 -0.5151696415 0.1470709636 6.65 20
## te(mintemp,lag).2 -0.74301562 -0.4007704711 0.1563759030 4.24 20
## te(mintemp,lag).3 -0.66976529 -0.3530384829 0.3671666975 5.01 12
## te(mintemp,lag).4 -0.63799709 -0.3536085824 0.8413451131 5.28 21
## te(mintemp,lag).5 -0.75255143 -0.1187258932 0.5759121994 4.40 12
## te(mintemp,lag).6 -0.47640972 -0.3613702187 -0.1274790500 2.18 30
## te(mintemp,lag).7 -0.37172322 -0.1601459646 -0.0405874181 1.18 11
## te(mintemp,lag).8 -0.24774355 -0.0326247211 0.2081783708 2.61 17
## te(mintemp,lag).9 -0.10974094 0.3144176292 0.6240691481 6.70 16
## te(mintemp,lag).10 0.03555117 0.5737193898 0.9038599642 5.22 15
## te(mintemp,lag).11 0.59145030 0.8512801563 1.0619611460 1.93 13
## te(mintemp,lag).12 -0.38141095 -0.1767804388 -0.0209160350 2.06 19
## te(mintemp,lag).13 -0.42958716 -0.1884199592 0.0296703743 3.40 18
## te(mintemp,lag).14 -0.36174933 -0.2597414759 -0.0461747560 1.48 18
## te(mintemp,lag).15 -0.36566429 -0.1158108545 0.0826439210 2.35 9
## te(mintemp,lag).16 -0.05953438 0.1069821802 0.3911058425 3.76 19
## te(mintemp,lag).17 0.36447820 0.5358530460 0.7707532669 2.51 16
## te(mintemp,lag).18 -0.24267772 -0.0339280548 0.1712677832 2.32 17
## te(mintemp,lag).19 0.01358350 0.1005844763 0.2843888934 1.37 25
## te(mintemp,lag).20 -0.20504834 -0.0034614445 0.2775122530 3.04 16
## te(mintemp,lag).21 -0.30551114 -0.0636920216 0.1488050554 2.68 11
## te(mintemp,lag).22 -0.27405119 -0.0435884997 0.3160940089 3.74 16
## te(mintemp,lag).23 0.10932880 0.4179756039 0.5710100539 2.94 21
## te(mintemp,lag).24 0.07647209 0.3259171347 0.6468807504 4.49 32
## te(mintemp,lag).25 -0.05272535 0.0921670711 0.3475320088 2.65 23
## te(mintemp,lag).26 -0.21377648 0.0307540557 0.2467449900 5.19 24
## te(mintemp,lag).27 -0.17402948 -0.0180549420 0.2476613630 2.51 23
## te(mintemp,lag).28 -0.29209975 -0.1397504658 0.1069953654 1.72 22
## te(mintemp,lag).29 -0.72804887 -0.2807874158 -0.0010884273 5.21 15
## te(mintemp,lag).30 0.50161982 0.7790041855 1.0012029288 2.29 14
## te(mintemp,lag).31 -0.13528660 0.4294638324 0.9017452800 8.81 26
## te(mintemp,lag).32 -0.30617164 0.1932002253 0.6388961322 7.60 14
## te(mintemp,lag).33 -0.19931306 0.0232100078 0.3151622738 3.64 19
## te(mintemp,lag).34 -0.42257184 -0.2186267715 -0.0658137708 1.32 16
## te(mintemp,lag).35 -1.41150012 -0.7679187413 -0.3643751775 6.02 22
## te(precip,lag).1 -0.11949168 -0.0280111773 0.0444908234 1.72 18
## te(precip,lag).2 -0.16355151 -0.0489954958 0.0158605351 2.01 26
## te(precip,lag).3 -0.13161910 -0.0413345627 0.0180416526 2.39 28
## te(precip,lag).4 -0.15081112 -0.0396877467 0.0369169995 3.46 24
## te(precip,lag).5 -0.23197847 -0.0408845682 0.1295944136 2.37 8
## te(precip,lag).6 -0.12136183 0.0305024232 0.1269361051 3.28 16
## te(precip,lag).7 -0.09625747 -0.0038714183 0.0435033507 2.27 37
## te(precip,lag).8 -0.09394801 -0.0350264450 0.0103852451 1.29 26
## te(precip,lag).9 -0.12153210 -0.0559158116 0.0025692479 1.80 28
## te(precip,lag).10 -0.14566991 -0.0923747083 0.0096235069 2.52 30
## te(precip,lag).11 -0.19408372 -0.0612557248 0.0289807361 3.79 20
## te(precip,lag).12 -0.14610047 -0.0333381763 0.0318598222 1.35 17
## te(precip,lag).13 -0.12999340 -0.0234231202 0.0361997048 1.38 20
## te(precip,lag).14 -0.13632827 -0.0523546929 0.0525802394 1.19 12
## te(precip,lag).15 -0.19258113 -0.0839466620 -0.0004863832 2.07 14
## te(precip,lag).16 -0.24574108 -0.1181862160 -0.0024902255 2.41 12
## te(precip,lag).17 -0.17885560 -0.1271965374 -0.0099775003 1.40 20
## te(precip,lag).18 -0.17448424 -0.0331055726 0.0472535286 4.09 30
## te(precip,lag).19 -0.11298474 -0.0155996520 0.0737793548 1.93 20
## te(precip,lag).20 -0.10889753 -0.0448366117 -0.0025813010 1.21 25
## te(precip,lag).21 -0.15281188 -0.0983397711 -0.0375898259 1.22 19
## te(precip,lag).22 -0.23446868 -0.1347500512 0.0110038776 1.91 22
## te(precip,lag).23 -0.30958276 -0.1666796165 0.0545046406 4.25 29
## te(precip,lag).24 -0.44982707 -0.1350358977 0.1430531764 7.46 15
## te(precip,lag).25 -0.13709230 -0.0005112134 0.2047438214 6.04 26
## te(precip,lag).26 0.03798553 0.0946476547 0.2122946166 2.06 17
## te(precip,lag).27 0.04916108 0.1265026736 0.1912003633 1.37 18
## te(precip,lag).28 -0.01225446 0.1040817307 0.2049093905 2.16 21
## te(precip,lag).29 -0.08888208 0.1028564629 0.2106125195 3.24 19
## te(precip,lag).30 -0.09562143 -0.0017552572 0.1123088690 2.18 18
##
## GAM smoothing parameter (rho) estimates:
## 2.5% 50% 97.5% Rhat n.eff
## te(mintemp,lag) 1.117711 2.6251471 4.256999 2.60 205
## te(mintemp,lag)2 1.894227 3.4213375 4.579725 2.39 204
## te(mintemp,lag)3 -1.115421 0.6326806 2.803879 1.80 919
## te(precip,lag) 2.421800 3.6949771 4.660730 1.02 164
## te(precip,lag)2 2.673085 3.8255027 4.644388 1.11 77
## te(precip,lag)3 1.473711 3.4784714 4.713150 1.01 987
##
## JAGS MCMC diagnostics
## Rhats above 1.05 found for 276 parameters
## *Diagnose further to investigate why the chains have not mixed
##
mvgamAs this is a timeseries and the residual plot hints at some
autocorrelation remaining in the short-term lags, lets check if an
AR latent trend process improves forecasts compared to the
no-trend model. An important note here is the choice of prior for the
overdispersion parameter twdis. This parameter and the
latent trend variance can interact strongly, particularly when
overdispersion is in the data high. This is because at high values of
the dispersion parameter, there is less need for a latent trend to be
able to capture any outliers and so the latent trend precision can go up
toward infinity, approaching a space of very diffuse likelihood that
forces the Gibbs samplers to take on frustratingly small step sizes.
Likewise when there is not much need for overdispersion, the dispersion
parameter can approach zero and move around in an equally uninformative
parameter space. The latent trend operates on the log scale, so really
we should not expect autocorrelated jumps in trappings of more than
6-8 from timepoint to timepoint (any larger and the trend
will compete strongly with the overdispersion parameter, making it
difficult for us to model the inherent overdispersion process and
instead assuming it is all autocorrelation). A containment prior on the
latent trend sigma will help achieve this, but first we
need to see the structure of the priors argument for making
alterations to defaults
priors <- get_mvgam_priors(formula = y ~ te(mintemp, lag, k = c(6, 6)) +
te(precip, lag, k = c(6, 6)),
data = data_train,
family = 'tweedie',
trend_model = 'AR3')
## Warning in get_mvgam_priors(formula = y ~ te(mintemp, lag, k = c(6, 6)) + :
## Tweedie family not yet supported for stan; reverting to JAGS
priors
The 5th row of this data.frame contains information on
the prior for the latent trend variance component. It is the entry for
the prior column of this row that needs to be modified
before supplying the argument for priors to the
mvgam function:
priors$prior[4] <- "dexp(2.5)T(0.15, 2)"
mod3 <- mvgam(formula = y ~ te(mintemp, lag, k = c(6, 6)) +
te(precip, lag, k = c(6, 6)),
data = data_train,
newdata = data_test,
family = 'tweedie',
use_stan = FALSE,
priors = priors,
burnin = 4000,
trend_model = 'AR3')
summary(mod3)
## y ~ te(mintemp, lag, k = c(6, 6)) + te(precip, lag, k = c(6,
## 6))
## tweedie
## log
## AR3
## 1
## 184
## Fitted using JAGS
## 2.5% 50% 97.5% Rhat n.eff
## phi 0.2029026 0.2884058 0.4388165 3.01 14
## 2.5% 50% 97.5% Rhat n.eff
## (Intercept) 2.185181991 2.335622051 2.613573364 1.37 30
## te(mintemp,lag).1 -0.489566532 0.149702055 0.418241616 10.28 23
## te(mintemp,lag).2 -0.433078661 0.157546221 0.452235009 8.79 19
## te(mintemp,lag).3 -0.468680795 0.038037510 0.602248136 7.88 8
## te(mintemp,lag).4 -0.533454170 -0.095714393 0.273249496 7.60 12
## te(mintemp,lag).5 -0.918114914 -0.315607000 0.344514452 9.11 19
## te(mintemp,lag).6 -0.697392897 -0.392294506 -0.020315683 4.04 16
## te(mintemp,lag).7 -0.369865017 -0.242910508 -0.161440889 2.04 28
## te(mintemp,lag).8 -0.213054129 -0.125587156 0.226836885 3.83 23
## te(mintemp,lag).9 -0.095729391 0.092329266 0.438745725 3.92 13
## te(mintemp,lag).10 0.044658935 0.271792411 0.545981340 4.63 21
## te(mintemp,lag).11 0.353821090 0.524459041 0.676246627 1.72 12
## te(mintemp,lag).12 -0.412952702 -0.181301165 0.043755404 5.48 19
## te(mintemp,lag).13 -0.145604613 -0.064607400 0.039984998 1.26 15
## te(mintemp,lag).14 -0.257107954 -0.105645283 0.036120732 2.37 17
## te(mintemp,lag).15 -0.211219983 -0.124608756 0.125487273 3.10 15
## te(mintemp,lag).16 -0.004763215 0.106924711 0.205569828 1.35 12
## te(mintemp,lag).17 0.170816808 0.295769857 0.449059202 1.93 15
## te(mintemp,lag).18 -0.381242629 -0.053881455 0.181950852 4.38 12
## te(mintemp,lag).19 -0.110890369 0.031770271 0.133181651 2.27 21
## te(mintemp,lag).20 -0.236503548 -0.063191252 0.103224720 3.13 19
## te(mintemp,lag).21 -0.330751024 -0.195762833 0.072223494 3.88 17
## te(mintemp,lag).22 -0.292162503 -0.145219051 -0.023012889 1.78 16
## te(mintemp,lag).23 -0.120958394 0.079402801 0.290028406 3.23 12
## te(mintemp,lag).24 0.087966157 0.271426741 0.584861050 3.73 20
## te(mintemp,lag).25 -0.145119238 0.229731885 0.394138638 4.78 19
## te(mintemp,lag).26 -0.177655008 0.077755904 0.327234907 3.61 15
## te(mintemp,lag).27 -0.272076780 -0.158306272 0.118201451 3.66 16
## te(mintemp,lag).28 -0.442974940 -0.242179707 -0.058756897 2.63 14
## te(mintemp,lag).29 -0.496996585 -0.336079283 -0.129552368 1.48 24
## te(mintemp,lag).30 0.390282186 0.546112964 0.672424683 1.52 27
## te(mintemp,lag).31 -0.090172679 0.490533660 0.725422932 4.79 8
## te(mintemp,lag).32 -0.049018483 0.298779039 0.599778980 4.12 11
## te(mintemp,lag).33 -0.166829507 -0.065765334 0.101892105 2.85 23
## te(mintemp,lag).34 -0.542388889 -0.361455514 -0.216153963 2.20 15
## te(mintemp,lag).35 -0.860538849 -0.695966157 -0.559096212 1.47 12
## te(precip,lag).1 -0.074149359 0.018912999 0.134319108 1.98 11
## te(precip,lag).2 -0.233839557 -0.062535854 -0.008254045 4.15 25
## te(precip,lag).3 -0.120069640 -0.050371177 0.037702349 2.04 22
## te(precip,lag).4 -0.099922278 0.009060749 0.088401231 1.60 11
## te(precip,lag).5 -0.097745717 0.035305554 0.163556041 2.75 15
## te(precip,lag).6 0.071560890 0.152590795 0.267133683 3.03 22
## te(precip,lag).7 -0.083912221 0.009251725 0.076792133 2.26 22
## te(precip,lag).8 -0.180996950 -0.071393004 0.006509016 2.62 16
## te(precip,lag).9 -0.147131191 -0.048821551 0.018651506 2.77 24
## te(precip,lag).10 -0.048968351 -0.007174150 0.040370247 1.54 32
## te(precip,lag).11 -0.104064177 -0.006225787 0.157496538 5.68 23
## te(precip,lag).12 -0.168643971 0.019737218 0.105814914 2.61 17
## te(precip,lag).13 -0.172524845 -0.030901172 0.050911662 3.09 17
## te(precip,lag).14 -0.184084976 -0.104629391 -0.028170979 2.77 38
## te(precip,lag).15 -0.232830933 -0.106530947 -0.020511567 2.87 15
## te(precip,lag).16 -0.170366087 -0.099754150 -0.020081579 1.37 17
## te(precip,lag).17 -0.214030630 -0.044686545 0.117253510 4.63 15
## te(precip,lag).18 -0.017505640 0.056515070 0.236230448 4.00 17
## te(precip,lag).19 -0.125153338 0.018852186 0.109826556 4.47 22
## te(precip,lag).20 -0.154505115 -0.059714082 0.038154962 2.56 14
## te(precip,lag).21 -0.191131736 -0.091641485 0.003142930 3.94 26
## te(precip,lag).22 -0.210450212 -0.082505927 -0.007768921 1.40 14
## te(precip,lag).23 -0.102406220 0.018897532 0.196081351 3.22 13
## te(precip,lag).24 -0.210833877 -0.112945197 -0.006904847 1.18 12
## te(precip,lag).25 -0.172684394 -0.086591320 0.039223359 2.03 16
## te(precip,lag).26 -0.227636976 -0.073206981 0.046740339 3.25 16
## te(precip,lag).27 -0.286950278 -0.083726036 0.032506094 3.92 19
## te(precip,lag).28 -0.262262627 -0.094193968 -0.011139450 4.56 21
## te(precip,lag).29 -0.219652616 -0.047886299 0.091089487 5.12 21
## te(precip,lag).30 -0.086803539 0.071203029 0.247638618 5.15 19
## 2.5% 50% 97.5% Rhat n.eff
## te(mintemp,lag) 2.4905928 3.546675 4.391608 1.13 237
## te(mintemp,lag)2 2.6371177 3.692525 4.525741 1.13 181
## te(mintemp,lag)3 -0.3526018 1.330835 3.592959 1.75 882
## te(precip,lag) 2.2980497 3.846671 4.855390 1.35 172
## te(precip,lag)2 1.7906079 3.155542 4.591898 2.04 282
## te(precip,lag)3 1.1284969 3.362199 4.539355 1.03 1168
## 2.5% 50% 97.5% Rhat n.eff
## ar1 0.6031134 0.8917930 1.1558248 1.05 56
## ar2 -0.2504091 0.1401597 0.5001433 1.43 47
## ar3 -0.4135877 -0.1102343 0.1367329 1.40 102
## sigma 0.2050275 0.3105242 0.4739982 1.53 64
## Rhats above 1.05 found for 331 parameters
## *Diagnose further to investigate why the chains have not mixed
A pairs plot of the logged versions of the latent trend precision and the overdispersion parameter can give insight into whether there is still any strange behaviour or problematic parameter spaces that we must think more deeply about
plot(log(MCMCvis::MCMCchains(mod3$model_output, 'phi')),
log(MCMCvis::MCMCchains(mod3$model_output, 'tau')),
ylab = 'log(tau)', xlab = 'log(phi)', pch = 16, col = "#8F272740")
Our rootogram has not improved much with the addition of the latent trend
ppc(mod3, type = 'rootogram')
But the residual distributions look better
plot(mod3, type = 'residuals')
Stan for DGAMs with latent Gaussian Process
trendsWe can also demonstrate another feature of mvgam, which
is the ability to use Hamiltonian Monte Carlo for parameter estimation
via the software Stan (using the rstan or
cmdstanr interfaces). Also note that currently there is no
support for fitting Tweedie responses in Stan.
We will therefore stick with a Negative Binomial
observation process for the Stan version. However there are
great advantages when using Stan, which includes the option
to estimate smooth latent trends via Hilbert space approximate
Gaussian Processes. This often makes sense for ecological series,
which we expect to change smoothly over time. As expected when compared
to the inferior Gibbs samplers in JAGS, the
Stan version converges very nicely
mod4 <- mvgam(formula = y ~ te(mintemp, lag, k = c(6, 6)) +
te(precip, lag, k = c(6, 6)),
data = data_train,
newdata = data_test,
family = nb(),
trend_model = 'GP',
use_stan = TRUE)
## Running MCMC with 4 parallel chains...
##
## Chain 1 Iteration: 1 / 1000 [ 0%] (Warmup)
## Chain 2 Iteration: 1 / 1000 [ 0%] (Warmup)
## Chain 3 Iteration: 1 / 1000 [ 0%] (Warmup)
## Chain 4 Iteration: 1 / 1000 [ 0%] (Warmup)
## Chain 2 Iteration: 100 / 1000 [ 10%] (Warmup)
## Chain 4 Iteration: 100 / 1000 [ 10%] (Warmup)
## Chain 3 Iteration: 100 / 1000 [ 10%] (Warmup)
## Chain 1 Iteration: 100 / 1000 [ 10%] (Warmup)
## Chain 2 Iteration: 200 / 1000 [ 20%] (Warmup)
## Chain 3 Iteration: 200 / 1000 [ 20%] (Warmup)
## Chain 4 Iteration: 200 / 1000 [ 20%] (Warmup)
## Chain 1 Iteration: 200 / 1000 [ 20%] (Warmup)
## Chain 2 Iteration: 300 / 1000 [ 30%] (Warmup)
## Chain 3 Iteration: 300 / 1000 [ 30%] (Warmup)
## Chain 1 Iteration: 300 / 1000 [ 30%] (Warmup)
## Chain 2 Iteration: 400 / 1000 [ 40%] (Warmup)
## Chain 4 Iteration: 300 / 1000 [ 30%] (Warmup)
## Chain 3 Iteration: 400 / 1000 [ 40%] (Warmup)
## Chain 1 Iteration: 400 / 1000 [ 40%] (Warmup)
## Chain 2 Iteration: 500 / 1000 [ 50%] (Warmup)
## Chain 2 Iteration: 501 / 1000 [ 50%] (Sampling)
## Chain 4 Iteration: 400 / 1000 [ 40%] (Warmup)
## Chain 3 Iteration: 500 / 1000 [ 50%] (Warmup)
## Chain 3 Iteration: 501 / 1000 [ 50%] (Sampling)
## Chain 1 Iteration: 500 / 1000 [ 50%] (Warmup)
## Chain 1 Iteration: 501 / 1000 [ 50%] (Sampling)
## Chain 4 Iteration: 500 / 1000 [ 50%] (Warmup)
## Chain 2 Iteration: 600 / 1000 [ 60%] (Sampling)
## Chain 4 Iteration: 501 / 1000 [ 50%] (Sampling)
## Chain 1 Iteration: 600 / 1000 [ 60%] (Sampling)
## Chain 3 Iteration: 600 / 1000 [ 60%] (Sampling)
## Chain 4 Iteration: 600 / 1000 [ 60%] (Sampling)
## Chain 2 Iteration: 700 / 1000 [ 70%] (Sampling)
## Chain 1 Iteration: 700 / 1000 [ 70%] (Sampling)
## Chain 3 Iteration: 700 / 1000 [ 70%] (Sampling)
## Chain 4 Iteration: 700 / 1000 [ 70%] (Sampling)
## Chain 2 Iteration: 800 / 1000 [ 80%] (Sampling)
## Chain 1 Iteration: 800 / 1000 [ 80%] (Sampling)
## Chain 3 Iteration: 800 / 1000 [ 80%] (Sampling)
## Chain 4 Iteration: 800 / 1000 [ 80%] (Sampling)
## Chain 2 Iteration: 900 / 1000 [ 90%] (Sampling)
## Chain 1 Iteration: 900 / 1000 [ 90%] (Sampling)
## Chain 3 Iteration: 900 / 1000 [ 90%] (Sampling)
## Chain 4 Iteration: 900 / 1000 [ 90%] (Sampling)
## Chain 2 Iteration: 1000 / 1000 [100%] (Sampling)
## Chain 2 finished in 27.2 seconds.
## Chain 1 Iteration: 1000 / 1000 [100%] (Sampling)
## Chain 1 finished in 27.9 seconds.
## Chain 3 Iteration: 1000 / 1000 [100%] (Sampling)
## Chain 3 finished in 28.2 seconds.
## Chain 4 Iteration: 1000 / 1000 [100%] (Sampling)
## Chain 4 finished in 28.4 seconds.
##
## All 4 chains finished successfully.
## Mean chain execution time: 27.9 seconds.
## Total execution time: 28.6 seconds.
summary(mod4)
## y ~ te(mintemp, lag, k = c(6, 6)) + te(precip, lag, k = c(6,
## 6))
## negative binomial
## log
## GP
## 1
## 184
## Fitted using Stan
## 2.5% 50% 97.5% Rhat n.eff
## phi[1] 2.881133 4.132835 5.983513 1 1752
## 2.5% 50% 97.5% Rhat n.eff
## (Intercept) 2.09556150 2.35250500 2.593088750 1.00 3431
## te(mintemp,lag).1 -1.37725650 -0.39416500 0.396096975 1.01 552
## te(mintemp,lag).2 -1.02177100 -0.21080750 0.507214300 1.00 1081
## te(mintemp,lag).3 -0.91837940 -0.18118300 0.517734175 1.00 1427
## te(mintemp,lag).4 -1.00858425 -0.34373150 0.423574200 1.00 1603
## te(mintemp,lag).5 -1.52026625 -0.56464500 0.566515850 1.00 1298
## te(mintemp,lag).6 -0.99490130 -0.32208150 0.539031800 1.00 615
## te(mintemp,lag).7 -0.72488388 -0.18687300 0.404401975 1.00 566
## te(mintemp,lag).8 -0.53592625 0.04698050 0.696713500 1.00 555
## te(mintemp,lag).9 -0.17545363 0.36622400 1.028240250 1.00 566
## te(mintemp,lag).10 0.13923620 0.59739300 1.259377000 1.01 508
## te(mintemp,lag).11 0.08364091 0.79361700 1.656776500 1.01 607
## te(mintemp,lag).12 -0.66840937 -0.11579050 0.599631825 1.00 601
## te(mintemp,lag).13 -0.41930430 -0.08333770 0.369463575 1.00 568
## te(mintemp,lag).14 -0.45096550 -0.06632260 0.352985450 1.00 615
## te(mintemp,lag).15 -0.34529870 0.05240410 0.474380325 1.00 686
## te(mintemp,lag).16 -0.05141427 0.33289550 0.753164900 1.00 533
## te(mintemp,lag).17 -0.07738934 0.57772400 1.253270500 1.01 543
## te(mintemp,lag).18 -0.43718820 0.12361000 0.866091025 1.00 571
## te(mintemp,lag).19 -0.24139033 0.12892350 0.613180000 1.00 577
## te(mintemp,lag).20 -0.37554130 0.06518235 0.509747950 1.00 620
## te(mintemp,lag).21 -0.43904505 0.00761062 0.482686425 1.00 675
## te(mintemp,lag).22 -0.30901025 0.09906420 0.595758950 1.00 591
## te(mintemp,lag).23 -0.30294003 0.34611300 1.103164000 1.01 543
## te(mintemp,lag).24 -0.16854653 0.42221850 1.227761750 1.00 597
## te(mintemp,lag).25 -0.13598647 0.26619400 0.810101500 1.00 530
## te(mintemp,lag).26 -0.28360978 0.13741850 0.634467175 1.00 560
## te(mintemp,lag).27 -0.47103077 -0.01317475 0.451261700 1.00 599
## te(mintemp,lag).28 -0.58903065 -0.18952800 0.339721550 1.00 543
## te(mintemp,lag).29 -0.91715157 -0.28081800 0.503343700 1.01 543
## te(mintemp,lag).30 -0.07119840 0.61902800 1.520616500 1.00 647
## te(mintemp,lag).31 -0.05126970 0.37922350 0.938163450 1.00 633
## te(mintemp,lag).32 -0.27703360 0.15626650 0.686810700 1.00 812
## te(mintemp,lag).33 -0.54496550 -0.10932950 0.362176900 1.00 917
## te(mintemp,lag).34 -0.87101758 -0.41641000 0.084357977 1.00 780
## te(mintemp,lag).35 -1.57654275 -0.78503600 0.009086367 1.01 684
## te(precip,lag).1 -0.16217245 -0.01486675 0.143194800 1.00 971
## te(precip,lag).2 -0.27096865 -0.08014105 0.080409792 1.00 892
## te(precip,lag).3 -0.27477330 -0.08442045 0.098376805 1.00 946
## te(precip,lag).4 -0.25840100 -0.06782490 0.096830047 1.00 851
## te(precip,lag).5 -0.33868042 -0.06309145 0.185581125 1.00 734
## te(precip,lag).6 -0.14319105 0.04144290 0.256698650 1.01 624
## te(precip,lag).7 -0.14696977 -0.02673155 0.092198902 1.00 668
## te(precip,lag).8 -0.22086702 -0.06225420 0.079826025 1.00 803
## te(precip,lag).9 -0.24756212 -0.07029200 0.080876047 1.00 724
## te(precip,lag).10 -0.23219705 -0.06378475 0.072724755 1.01 589
## te(precip,lag).11 -0.32112658 -0.06802675 0.147496875 1.00 606
## te(precip,lag).12 -0.28312775 -0.07930025 0.153434000 1.01 611
## te(precip,lag).13 -0.20411562 -0.06189010 0.085992967 1.00 1071
## te(precip,lag).14 -0.23903337 -0.06065880 0.116923975 1.00 937
## te(precip,lag).15 -0.27408000 -0.07268605 0.107949025 1.01 699
## te(precip,lag).16 -0.26619840 -0.08023205 0.087272590 1.01 564
## te(precip,lag).17 -0.34042205 -0.07845605 0.148602075 1.00 636
## te(precip,lag).18 -0.26942150 -0.03148435 0.227976825 1.01 520
## te(precip,lag).19 -0.18248185 -0.04765015 0.123922425 1.00 686
## te(precip,lag).20 -0.21296640 -0.07447375 0.086886822 1.00 971
## te(precip,lag).21 -0.25806818 -0.10230100 0.043393905 1.00 1022
## te(precip,lag).22 -0.26000350 -0.11325200 0.014511535 1.01 995
## te(precip,lag).23 -0.33709930 -0.10124650 0.130192150 1.00 856
## te(precip,lag).24 -0.44895575 -0.17933350 0.104187475 1.00 796
## te(precip,lag).25 -0.20456255 -0.02634530 0.152873025 1.00 1097
## te(precip,lag).26 -0.16324258 0.04159950 0.238608000 1.00 964
## te(precip,lag).27 -0.18071818 0.05203080 0.250757675 1.00 912
## te(precip,lag).28 -0.17687255 0.05394175 0.234758375 1.00 853
## te(precip,lag).29 -0.21174462 0.09316825 0.370009200 1.00 882
## te(precip,lag).30 -0.25389995 -0.01073295 0.200755600 1.00 2564
## 2.5% 50% 97.5% Rhat n.eff
## te(mintemp,lag) 2.287283 3.396090 4.183368 1.00 1300
## te(mintemp,lag)2 1.800113 3.415265 4.261473 1.01 430
## te(mintemp,lag)3 -1.720837 0.654860 3.638840 1.00 475
## te(precip,lag) 1.859859 3.416360 4.254159 1.00 1121
## te(precip,lag)2 1.993303 3.474540 4.300673 1.01 485
## te(precip,lag)3 1.009984 3.362615 4.261303 1.01 516
## 2.5% 50% 97.5% Rhat n.eff
## alpha_gp[1] 0.5042018 0.8391255 1.360626 1 1844
## rho_gp[1] 1.8950992 5.3760800 14.130147 1 2749
## n_eff / iter looks reasonable for all parameters
## Rhat looks reasonable for all parameters
## 0 of 2000 iterations ended with a divergence (0%)
## 0 of 2000 iterations saturated the maximum tree depth of 12 (0%)
## E-FMI indicated no pathological behavior
As with all other mvgam objects, we can create plots of
the estimated forecast distribution
plot_mvgam_fc(mod4, series = 1, newdata = data_test, ylim = c(0, 100))
## Out of sample DRPS:
## [1] 23.21037
##
The trend now evolves smoothly via an infinite dimensional Gaussian
Process
plot_mvgam_trend(mod4, series = 1)
Traceplots of smooth penalties indicate good mixing and convergence of the four MCMC chains
rstan::stan_trace(mod4$model_output, pars = 'rho')
We can also create quick plots of the estimated smooth tensor product
interactions for the distributed lag terms, which basically follow
mgcv’s two-dimensional plotting utility but uses the
mvgam’s estimated coefficients
plot_mvgam_smooth(mod4, series = 1, smooth = 1)
plot_mvgam_smooth(mod4, series = 1, smooth = 2)
If you are like me then you’ll find these plots rather difficult to
interpret! The more intense yellow/white colours indicate higher
predicted values, with the deeper red colours representing lower
predicted values, but actually making sense of how the functional
response is expected to change over different lags is not easy from
these plots. HOwever, we can use the predict_mvgam function
to generate much more interpretable plots. First we will focus on the
effect of mintemp and generate a series of predictions to
visualise how the estimated function changes over different lags. Set up
prediction data by zeroing out all covariates apart from the covariate
of interest
newdata <- data_test
newdata$year <- rep(0, length(newdata$year))
newdata$season <- rep(0, length(newdata$season))
newdata$precip <- matrix(0, ncol = ncol(newdata$precip),
nrow = nrow(newdata$precip))
Set up viridis plot colours and initiate the plot window
to be centred around zero. We will then keep all mintemp
values at zero apart from the particular lag being predicted so that we
can visualise how the predicted function changes over lags of
mintemp. Predictions are generated on the link scale in
this case, though you could also use the response scale. Note that we
need to first generate predictions with all covariates (including the
mintemp covariate) zeroed out to find the ‘baseline’
prediction so that we can shift by this baseline for generating a
zero-centred plot. That way our resulting plot will roughly follow the
traditional mgcv partial effect plots
cols <- viridis::inferno(6)
plot(1, type = "n",
xlab = 'Mintemp',
ylab = 'Predicted response function',
xlim = c(min(data_train$mintemp), max(data_train$mintemp)),
ylim = c(-1.6, 1.6))
# Calculate predictions for when mintemp is all zeros to find the baseline
# value for centring the plot
newdata$mintemp <- matrix(0, ncol = ncol(newdata$mintemp),
nrow = nrow(newdata$mintemp))
preds <- predict(mod4, newdata = newdata, type = 'link')
offset <- mean(preds)
for(i in 1:6){
# Set up prediction matrix for mintemp with lag i as the prediction sequence;
# use a sequence of mintemp values across the full range of observed values in the training data
newdata$mintemp <- matrix(0, ncol = ncol(newdata$precip),
nrow = nrow(newdata$precip))
newdata$mintemp[,i] <- seq(min(data_train$mintemp),
max(data_train$mintemp),
length.out = length(newdata$year))
# Predict on the link scale and shift by the offset so that values are roughly centred at zero
preds <- predict(mod4, newdata = newdata, type = 'link') - offset
# Calculate empirical prediction quantiles
probs = c(0.05, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.95)
cred <- sapply(1:NCOL(preds),
function(n) quantile(preds[,n],
probs = probs))
# Plot expected function posterior intervals (40-60%) and medians in varying colours per lag
pred_upper <- cred[4,]
pred_lower <- cred[6,]
pred_vals <- seq(min(data_train$mintemp),
max(data_train$mintemp),
length.out = length(newdata$year))
polygon(c(pred_vals, rev(pred_vals)), c(pred_upper, rev(pred_lower)),
col = scales::alpha(cols[i], 0.6), border = scales::alpha(cols[i], 0.7))
lines(pred_vals, cred[5,],
col = scales::alpha(cols[i], 0.8), lwd = 2.5)
}
abline(h = 0, lty = 'dashed')
legend('topleft', legend = paste0('lag', seq(0, 5)),
bg = 'white', bty = 'n',
col = cols, lty = 1, lwd = 6)
This plot demonstrates how the effect of mintemp is
expected to change over different exposure lags, with the 3 - 5 month
lags showing more of a cyclic seasonal pattern (catches expected to
increase in the summer and autumn, roughly 3 - 5 months following cold
minimum winter temperatures) while the recent lags (lags 0 and 1)
demonstrate a more linear response function (catches broadly increasing
as minimum temperature increases). This is hopefully a useful example
for developing a better understanding of how a distributed lag model is
attempting to recreate the data generating process. And here is the same
plot for precipitation, which demonstrates how a u-shaped functional
relationship diminishes toward a flat function at lags 2 - 5 (though
this effect is clearly less important in the model than the mintemp *
lag effect above)
newdata <- data_test
newdata$year <- rep(0, length(newdata$year))
newdata$season <- rep(0, length(newdata$season))
newdata$mintemp <- matrix(0, ncol = ncol(newdata$mintemp),
nrow = nrow(newdata$mintemp))
newdata$precip <- matrix(0, ncol = ncol(newdata$precip),
nrow = nrow(newdata$precip))
preds <- predict(mod4, newdata = newdata, type = 'link')
offset <- mean(preds)
plot(1, type = "n",
xlab = 'Precipitation',
ylab = 'Predicted response function',
xlim = c(min(data_train$precip), max(data_train$precip)),
ylim = c(-1.6, 1.6))
for(i in 1:6){
newdata$precip <- matrix(0, ncol = ncol(newdata$precip),
nrow = nrow(newdata$precip))
newdata$precip[,i] <- seq(min(data_train$precip),
max(data_train$precip),
length.out = length(newdata$year))
preds <- predict(mod4, newdata = newdata, type = 'link') - offset
probs = c(0.05, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.95)
cred <- sapply(1:NCOL(preds),
function(n) quantile(preds[,n],
probs = probs))
pred_upper <- cred[4,]
pred_lower <- cred[6,]
pred_vals <- seq(min(data_train$precip),
max(data_train$precip),
length.out = length(newdata$year))
polygon(c(pred_vals, rev(pred_vals)), c(pred_upper, rev(pred_lower)),
col = scales::alpha(cols[i], 0.6), border = scales::alpha(cols[i], 0.7))
lines(pred_vals, cred[5,],
col = scales::alpha(cols[i], 0.8), lwd = 2.5)
}
abline(h = 0, lty = 'dashed')
legend('topleft', legend = paste0('lag', seq(0, 5)),
bg = 'white', bty = 'n',
col = cols, lty = 1, lwd = 6)
All of the usual functions in mvgam can also be used for
list data objects and for models fitted with Stan, so long
as they contain the necessary fields series,
season and year. For example, posterior
retrodictive checks for the in-sample training period:
ppc(mod4, series = 1, type = 'cdf')
and predictive checks for the out of sample forecast period (which demonstrates how the model tends to overpredict for the forecast period in this particular example):
ppc(mod4, newdata = data_test, series = 1, type = 'cdf')
Logical next steps for interrogating this model would be to trial
different trend types (i.e. random walk), replace the distributed lag
function for precip with a standard smooth function (that
does not include lag interactions, as clearly the model above indicates
that these are not supported) and inspect whether different covariates
(such as ndvi or maxtemp) might play a role in
modulating catches of PP. Finally, once we are satisfied
that we have a well-performing model that we can understand and
interrogate, we could expand up to a multivariate model by including
other species as response variables. This would allow us to capture any
possible unobserved dependencies in the catches of multiple co-occurring
species in a single unified modelling framework
Clark, N.J. and Wells, K. (2022). Dynamic Generalized Additive Models (DGAMs) for forecasting discrete ecological time series. Methods in Ecology and Evolution. DOI: https://doi.org/10.1111/2041-210X.13974