gjam for community dynamics

This is a version of gjam that estimates parameters for community interactions where there are multiple groups interacting over multiple years. A group could be a location, such as the BBS data for each route.

Population growth

Start with the familiar model for density-independent population growth. There is a population at a density \(w(t)\) that increases or declines at rate

\[\begin{equation} \frac{dw}{dt} = r w(t) \end{equation}\]

The parameter \(r\) is termed the ‘density-independent’ (DI) growth rate, because it does not depend on \(w(t)\). It is multiplied by \(w(t)\) and, thus, causes the population to grow (\(r > 0\)) or decline (\(r < 0\)) exponentially.

The term ‘density-dependent’ (DD) growth refers to the case where parameter \(r\) is replaced with a function of \(w(t)\),

\[\begin{equation} \frac{dw}{dt} = f(w(t))w(t) \end{equation}\]

such as the ‘logistic’ growth equation, \(f(w) = r + \alpha w(t)\), where \(\alpha < 0\). The density-dependent model is now

\[\begin{equation} \frac{dw(t)}{dt} = r w(t) + \alpha w^2(t) \end{equation}\]

The notion of a carrying capacity is obtained at zero growth rate \(w^* = - \frac{r}{\alpha}\), provided \(\alpha < 0\).

Multiple species

For \(S\) species there are \(S\) equations. Common examples include the Lotka-Volterra competition and predator-prey equations. In its simplest form, growth of species \(s\) is given by

\[\begin{equation} \frac{dw_s}{dt} = w_s(t) \left( r_s + \sum_{s'} \alpha_{ss'} w_{s'}(t) \right) \end{equation}\]

where species interaction coefficients \(\alpha_{ss'}\) describe the effect of the density of species \(s'\) on the growth rate of species \(s\). The matrix \(\boldsymbol{\alpha}\) holds the species interactions, \(SS'\). For interspecific competition, \(\alpha_{ss'}\) is negative. For predator \(s\) and prey \(s'\), \(\alpha_{ss'} > 0\) (a resource that stimulates growth), and \(\alpha_{s's} < 0\) (predation loss). In general we expect asymmetry, \(\alpha_{ss'} \neq \alpha_{s's}\).

Let \(\mathbf{r} = r_1, \dots, r_S\) be a vector of species growth rates and \(\boldsymbol{\alpha}\) be the \(S \times S\) matrix of interaction coefficients \(\alpha_{ss'}\) for the effect of species \(s'\) (columns) on \(s\) (rows),

\[\begin{equation} \boldsymbol{\alpha} = \begin{pmatrix} \alpha_{11} & \alpha_{12} & \ldots & \alpha_{1S} \\ \alpha_{21} & \alpha_{22} & \ldots & \alpha_{2S} \\ \vdots & \vdots & \ddots & \vdots\\ \alpha_{S1} & \alpha_{S2} &\ldots & \alpha_{SS} \end{pmatrix} \label{eqn:alphaMat} \end{equation}\]

There is a vector of growth rates,

\[\begin{equation} \frac{d\mathbf{w}}{dt} = diag(\mathbf{w}(t)) \mathbf{r} + diag(\mathbf{w}(t)) \boldsymbol{\alpha} \mathbf{w} \end{equation}\]

At equilibrium, there can be a vector of ‘carrying capacities’ \(\mathbf{w}^* = -(\boldsymbol{\alpha}' \boldsymbol{\alpha})^{-1} \boldsymbol{\alpha}' \mathbf{r}\), provided all values are positive and all eigenvalues of the Jacobian of \(\boldsymbol{\alpha}\) have negative real parts.

Interaction matrix

The structure of \(\boldsymbol{\alpha}\) is the subject of an extensive literature on trophic interactions, food webs, and species coexistence. Trophic (food-web) theory is often organized around this community matrix of interactions, which summarizes the effects of each species on growth rates of others (Neill 1974, …). For two species, the community matrix is

\[\begin{equation} \boldsymbol{\alpha}_ = \begin{pmatrix} \alpha_{11} & \alpha_{12} \\ \alpha_{21} & \alpha_{22} \end{pmatrix} \end{equation}\]

In food web models, interactions are often summarized by the signs of interactions. Inter- and intraspecific competition can be represented as

\[\begin{equation} \boldsymbol{\alpha}_{\pm} = \begin{pmatrix} - & - \\ - & - \end{pmatrix} \end{equation}\]

and predation as

\[\begin{equation} \boldsymbol{\alpha}_{\pm} = \begin{pmatrix} - & + \\ - & - \end{pmatrix} \end{equation}\] Amensalism (one-way) competition is \[\begin{equation} \boldsymbol{\alpha}_{\pm} = \begin{pmatrix} - & 0 \\ - & - \end{pmatrix} \end{equation}\]

Food web theory treats communities as networks and considers how degree of connectedness might affect community stability (Cohen **).

Beyond the signs of the coefficients there is substantial literature on their relative magnitudes. For coexistence of competitors (\(\alpha_{ss'} < 0\)) the diagonal elements must be sufficiently negative to insure coexistence. Do competing sessile organisms recognize the identities of their neighbors (all \(\alpha_{ss'}\) unique), only some, or none (all \(\alpha_{ss'}\) equivalent)? Do competitors affect one another directly (\(SS'\) effects in \(\alpha\)) or by way of shared resources (other terms in a model)?

Simple simulation

This is a simple deterministic simulation to show convergence toward an equlibrium.

S <- 3
aa <- matrix(runif(S^2,-3,0), S,S)
diag(aa) <- runif(S,-50,-10)
aa <- solve(aa)

gam <- matrix(1.04,S)   # this is the discrete time growth rate 
r   <- log(gam)

wstar <- solve(crossprod(aa))%*%t(aa)%*%(1 - gam)
wistar <- (1 - gam)/diag(aa)

oo <- aa
diag(oo) <- 0
DD <- diag(aa) - apply(oo,1,min)

nt <- 200
w  <- matrix(1,nt,S)
w[1,] <- runif(S,.5,3)

for(t in 2:nt){
  
  w[t,] <- w[t-1,]*(gam + aa%*%t(w[drop=F,t-1,])) 
  w[t,w[t,] < .00000001] <- .00000001
}
ylim <- range(w)
par(bty='n', cex=1.5)
xlab <- expression( paste("Time ", italic(t), sep=''))
ylab <- expression( paste("Abundance ", italic(w[s][t]), sep=''))
plot(w[,1],type='l', ylim=ylim, xlab=xlab,ylab=ylab)
for(s in 1:S){
  lines(w[,s],col=s,lwd=2)
  abline(h=wstar[s],lty=2, col=s, lwd=2)
}

Extrinsic factors and activity

If individuals enter and leave the population independent of population growth, enter or leave dormancy, or otherwise change activity levels, there can be an additional term,

\[\begin{equation} \frac{d\mathbf{w}}{dt} = \boldsymbol{\beta}(t) + diag(\mathbf{w}(t)) \mathbf{r} + diag(\mathbf{w}(t)) \boldsymbol{\alpha} \mathbf{w} \label{eqn:difeq} \end{equation}\]

The length-\(S\) vector \(\boldsymbol{\beta}(t)\) describes extrinsic forces that affect the population, such as movement, onset or completion of diapause, and losses that occur at a rate that does not depend on population density \(w(t)\). We refer to this term as the environment effect \(E\) to distinguish it from population growth and environment-species interactions, which are introduced below.

Environment

Environmental variation and environment-environment (\(E\)) interactions can affect any of the terms on the right-hand side, but we limit them to the first two terms, summarized as

\[\begin{equation} \frac{d\mathbf{w}}{dt} = f_{E}(X) + f_{ES}(W, X \times W) + f_{SS'}(W \times W) \label{eqn:difeq2} \end{equation}\]

The first term represents how gains from and losses to elsewhere are influenced by \(E\). In the second term, environment enters as an interaction with species abundance \(W\). These \(ES\) interactions describe how population growth is influenced by \(E\). Because the third term is already an \(SS'\) interaction, including \(E\) would result in four-way interactions, \(ESS'\).

Incorporating data

Connecting theory to data involves predictors, discretization, and some reorganization. In discrete time, there is a length-\(S\) vector \(\mathbf{w}_t\) representing abundances at time \(t\),

\[\begin{align} \mathbf{w}_{t+dt} &= \mathbf{w}_{t} + d \mathbf{w}_t + dt \cdot \boldsymbol{\Sigma}^{1/2} \boldsymbol{\epsilon}_t \\ d\mathbf{w}_t &= dt \cdot \left( \mathbf{B}' \mathbf{x}_t + \boldsymbol{\Lambda}' \mathbf{v}_t + \mathbf{A}' \mathbf{u}_{t} \right) - \mathbf{w}_t \label{eqnDT} \end{align}\]

where \(\boldsymbol{\epsilon}_t\) is a random vector, and \(\boldsymbol{\Sigma}\) is an \(S \times S\) covariance matrix. For \(dt = 1\),

\[\begin{equation} \mathbf{w}_{t+1} = \mathbf{B}' \mathbf{x}_t + \boldsymbol{\Lambda}' \mathbf{v}_t + \mathbf{A}' \mathbf{u}_{t} + \boldsymbol{\Sigma}^{1/2} \boldsymbol{\epsilon}_t \\ \label{eqnDisc} \end{equation}\]

The first term includes the environment (\(EE'\)) effects in the design vector \(\mathbf{x}_t\).

The second term includes environment-species (\(ES\)) interactions, through \(\mathbf{v}_t\), which holds all combinations of \(\mathbf{w}_t\) and \(\mathbf{x}_t\) (Appendix). Matrix \(\boldsymbol{\Lambda}\) includes these growth rates for all species and their interactions with all environmental variables. It is a sparse version of a \(Q \times S\) matrix \(\boldsymbol{\lambda}\), reorganized to optimize posterior simulation (Appendix). The terms \(\mathbf{w}_t + d\mathbf{w}_t = \mathbf{w}_t + dt \cdot \boldsymbol{\Lambda} \mathbf{v}_t + \dots\) in introduce a term \((1 + r_s dt)w_{t,s} = \lambda_s w_{t,s}\), which is the first-order relationship between the ‘per-capita rate of increase’ \(r\) and the ‘finite rate of increase’ \(\lambda\) of population biology. This is the first element of the length-\(S\) vector \(\boldsymbol{\Lambda}' \mathbf{v}_t\).

The density-dependent (\(SS\)) interactions are contained in \(\mathbf{u}_{t-1}\). Matrix \(\mathbf{A}\) holds the \(SS\) interaction coefficients from \(\boldsymbol{\alpha}\), reorganized to optimize model fitting (Appendix).

BBS data

Use source files for now

#library(gjam)
library(MASS)

# from source files:
source("/Users/jimclark/makeGJAMcurrent/RFunctions/gjamHfunctions.R")
source("/Users/jimclark/makeGJAMcurrent/RFunctions/gjamTrimY.r")
source("/Users/jimclark/makeGJAMcurrent/RFunctions/gjamPriorTemplate.R")
Rcpp::sourceCpp('/Users/jimclark/makeGJAMcurrent/RcppFunctions/cppFns.cpp')
#

#setwd("~/Documents/Manuscripts/gjamTime")
load('../bbsExample.Rdata')
colnames(xdata)[1:2] <- c('lon','lat')

maps::map('county', xlim = c(-85, -75), ylim = c(33.6, 36.8), col='grey')
maps::map('state', xlim = c(-85, -75), ylim = c(33.6, 36.8), add=T)
points(xdata$lon, xdata$lat, col=3)

This is a bit of the BBS data, xdata:

The counts in ydata have S columns. I trim it down here:

ydata <- gjamTrimY(ydata,1800)$y
## [1] "numeric"
knitr::kable(head(ydata[1:17,1:4]), digits=3)
MourningDove Red-belliedWoodpecker ChimneySwift EasternWood-Pewee
1_1966 12 5 4 2
1_1967 51 5 4 8
1_1968 15 5 6 4
1_1969 16 5 2 3
1_1970 19 4 1 1
1_1971 18 6 6 0

Time series data need at least a times variable, indicating sequence. If there are multiple sequences, they also need a groups variable. These are the names of columns in xdata. Here are observations by year and Route:

times  <- 'year'
groups <- 'Route'
knitr::kable(table(xdata[,c(times,groups)])[,1:15], digits=2)

Zeros in this table indicate routes that were not sampled in a given year. There are many missing values.

Missing values in time series data

The function gjamFillMissingTimes provides some options for initializing the missing values for analysis.

First, it will insert one observation at the beginning of the sequence for each group. This initial “time zero” is the prior for the observation at time 1.

Second, gjamFillMissingTimes can insert a placeholder for missing sample times. They will be filled with interpolated (covariates) or neighboring values (factors) in xdata. These would typically then be replaced in xdata with known covariate values.

Third, gjamFillMissingTimes will insert values for ydata. These can be NA (fillNA = T) or will nearby values, which would serve as prior values, with weight determined by the effort matrix edata.

Because the effort controls the partition for the data and, thus, the weight of the observations. The n by S input matrix edata can specify the effort used for missing values.

Finally, gjamFillMissingTimes will insert columns for groups and times. Here is an example:

groupVars <- c( "lon","lat","Route","soil","nlcd" )
xdata <- xdata[,c( groupVars, 'year', 'temp', 'StartWind', 'StartSky')]

edata <- ydata*0 + 100
tmp <- gjamFillMissingTimes(xdata, ydata, edata, groups, times,
                            groupVars = groupVars, FILLMEANS = T,
                            typeNames = 'DA', missingEffort = 1)    
xdata   <- tmp$xdata
ydata   <- tmp$ydata
timeZero <- tmp$timeZero
timeLast <- tmp$timeLast
rowInserts <- tmp$rowInserts
snames   <- colnames(ydata)
effMat <- tmp$edata

I would now replace the missing observations in xdata with known values for those locations and years. I leave the NAs in ydatSS so they will be fitted. Missing values in xdata will also be fitted. Recall that high levels of missingness will destroy the fit.

tpath <- "/Users/jimclark/makeMastOnJimClark/makeMast/climateBuild/terraClimateFiles"
cf    <- "/Users/jimclark/makeMastOnJimClark/makeMast/climateBuild/climateFunctions.r"
source(cf)

sites  <- sort(unique(xdata$groups))
mm     <- match( sites, xdata$groups )
lonLat <- xdata[mm,c('lon','lat')]
rownames(lonLat) <- sites

uvars <- c("tmin","tmax","ppt")

years <- c(min(xdata$year, na.rm=T):max(xdata$year, na.rm=T))

tmp <- getTileTerraClimate(lonLat, years, 
                           months = 1:12, vars = uvars, 
                           DOWNSCALE = T, path = tpath)
#tmpt <- tmp$tallFormat                  
tmpw <- tmp$wideFormat
tempVec <- as.matrix( (tmpw$tmin[,-c(1:2)] + tmpw$tmax[,-c(1:2)])/2 )
precVec <- as.matrix( tmpw$ppt[,-c(1:2)] )
tminVec <- as.matrix( tmpw$tmin[,-c(1:2)] )

cc <- columnSplit(colnames(tempVec),'_')
yi <- as.numeric(cc[,1])
mi <- as.numeric(cc[,2])

# annual precip
ii <- rep(rownames(precVec), ncol(precVec))
jj <- rep(yi, each = nrow(precVec))

prec <- tapply( precVec, list(site = ii, year = jj), sum)
precAnom <- sweep( prec, 1, rowMeans(prec), '-' )
precSite <- matrix( rowMeans(prec), nrow(prec), ncol(prec))

ii <- match( xdata$groups, rownames(prec))
jj <- match( as.character(xdata$year), colnames(prec) )

xdata$precSite <- signif( precSite[ cbind(ii, jj) ], 3)
xdata$precAnom <- signif( precAnom[ cbind(ii, jj) ], 3)

# summer deficit
pet <- monthlyPET(yi, mi, tempVec, precVec, lat=tmpw$ppt[,'lat'])[[1]]  #FIX LAT 
rownames(pet) <- rownames(tempVec)
colnames(pet) <- colnames(tempVec)
def <- pet - precVec
rownames(def) <- rownames(precVec)

ws <- which(mi %in% 6:8)
ds <- def[,ws]
ii <- rep(rownames(ds), ncol(ds))
jj <- rep(yi[ws], each = nrow(ds))
def <- tapply( ds, list(site = ii, year = jj), sum)

defAnom <- sweep( def, 1, rowMeans(def), '-' )
defSite <- matrix( rowMeans(def), nrow(def), ncol(def))

ii <- match( xdata$groups, rownames(defAnom))
jj <- match( as.character(xdata$year), colnames(defAnom) )
xdata$defSite <- signif( defSite[ cbind(ii, jj) ], 3)
xdata$defAnom <- signif( defAnom[ cbind(ii, jj) ], 3)


#winterTmin
ws <- which(mi %in% c(12, 1, 2) )
ds <- tempVec[,ws]

ii <- rep(rownames(ds), ncol(ds))
jj <- rep(yi[ws], each = nrow(ds))

winterTmin <- tapply( ds, list(site = ii, year = jj), min)

ii <- match( xdata$groups, rownames(summerDeficit))
jj <- match( as.character(xdata$year), colnames(summerDeficit) )

xdata$winterTmin <- signif( winterTmin[ cbind(ii, jj) ], 3)
colnames(tmpt)[1] <- 'plot'

#temp at observation month

ws <- which(mi == 6 )
ds <- tempVec[,ws]
ii <- rep(rownames(ds), ncol(ds))
jj <- rep(yi[ws], each = nrow(ds))

juneTemp <- tapply( ds, list(site = ii, year = jj), min)

ii <- match( xdata$groups, rownames(juneTemp))
jj <- match( as.character(xdata$year), colnames(juneTemp) )

xdata$juneTemp <- signif( juneTemp[ cbind(ii, jj) ], 3)

save( xdata, ydata, timeZero, timeLast, rowInserts, snames, effMat,
      file = 'BBSexample.rdata')

Coefficient matrices

The basic model for latent states in matrix normal form is

\[MVN_{n,S}(\mathbf{W}_1 | \mathbf{X}_1 \mathbf{B}' + \mathbf{V}_1 \boldsymbol{\Lambda}' + \mathbf{U}_0 \mathbf{A}', \mathbf{I}_n, \mathbf{\Sigma})\] for environmental effects in \(n \times Q\) matrix \(\mathbf{X}\), environment-species interactions in \(n \times V\) matrix \(\mathbf{V}\), where \(V \leq SQ\), and density-dependence (species-species interactions) in \(n \times U\) matrix \(\mathbf{U}\), where \(U \leq S(S+1)/2\). The subscript \(0\) refers to observation times \(0, \dots, T-1\) and the subscript \(1\) to observation times \(1, \dots, T\). To simplify notation I omit a subscript for route.

Each of the coefficient matrices could be sparse. I have several options for priors. For environmental effects in \(\mathbf{B}\) and species interactions in \(\mathbf{A}\) I can specify the following in the prior matrix: * 0 - fit with non-informative prior * 1 - uniform, but positive * -1 - uniform, but negative * NA - do not fit (an effect that is omitted for a species-environment combination)

Importantly, the model assumes that predictors in \(\mathbf{V}\) also occur in \(\mathbf{X}\).

This is easiest to see with an example. Here is a prior setup for \(\mathbf{B}\):

load('BBSexample.rdata')

formula <- as.formula(~ StartWind + defSite + temp)

hi <- list(StartWind2 = 0, StartWind3 = 0, defAnom = 0) # negative combinations
g  <- gjamPriorTemplate(formula, xdata, ydata = ydata, hi = hi)
loBeta <- g[[1]]; hiBeta <- g[[2]]
loBeta[!is.finite(loBeta)] <- -.5
hiBeta[!is.finite(hiBeta)] <- .5

loBeta['intercept',] <- hiBeta['intercept',] <- NA       # omit intercept
betaPrior <- list(lo = loBeta, hi = hiBeta)

lform <- as.formula(~ defSite + temp)

lo <- list(temp = 0, intercept=.8)                                # winter temp
hi <- list(def = 0, intercept=1.5) 
g <- gjamPriorTemplate(lform, xdata, ydata = ydata, lo = lo, hi = hi)
loLambda <- g[[1]]; hiLambda <- g[[2]]
loLambda[!is.finite(loLambda)] <- -.5
hiLambda[!is.finite(hiLambda)] <- .5
lambda <- loLambda*0

lambdaPrior <- list(lo = loLambda, hi = hiLambda)

I do not need to specify a prior for \(\boldsymbol{\Lambda}\), because it will inherit the prior for \(\mathbf{B}\), albeit reorganized as an environment by species interaction matrix.

The prior for \(\mathbf{A}\) is \(S \times S\), with values given by the interaction matrix. Here is an example emphasizing competition:

S <- ncol(ydata)
alpha <- matrix(NA,S,S)

wstar <- apply(ydata/effMat,2,max,na.rm=T)  # carrying capacity based on observed
wstar <- wstar*10                           # assume it is higher

astar <- (1 - 1.2)/wstar                             # reasonable lambda

z <- sqrt( crossprod( as.matrix(ydata) ) )
alpha[z > 1000] <- 2*matrix(astar,S,S,byrow=T)[z > 1000]
diag(alpha) <- astar

loAlpha <- alpha
hiAlpha <- loAlpha*0
hiAlpha[3,2] <- 1          # a predator and prey
loAlpha[2,3] <- -1

alphaPrior <- list(lo = loAlpha, hi = hiAlpha)

alpha[1:5,1:5]
##              [,1]        [,2]        [,3]        [,4]        [,5]
## [1,] -0.008538462 -0.07547170 -0.03571429          NA -0.05555556
## [2,] -0.017076923 -0.03773585          NA          NA          NA
## [3,] -0.017076923          NA -0.01785714          NA          NA
## [4,]           NA          NA          NA -0.04545455          NA
## [5,] -0.017076923          NA          NA          NA -0.02777778

Here are model fitting and plots

timeList <- list(times = 'times', rowInserts = rowInserts,
                 alphaPrior = alphaPrior, betaPrior = betaPrior, 
                 lambdaPrior = lambdaPrior )

effort <- list(columns = 1:S, values = effMat)

rl <- list(N = 8, r = 5)
modelList <- list(typeNames = 'DA',ng=2000, burnin=500, reductList = rl, 
           timeList=timeList, effort = effort)

output <- .gjam(formula, xdata=xdata, ydata=ydata, modelList=modelList)
## 
## Observations and responses:
## [1] 2639   20
## [1] "1121  values ( 14 %) missing in x imputed"
## [1] "numeric"
## [1] "561  values ( 11 %) missing in x imputed"
## [1] "numeric"
## 
## Dimension reduced from 20 X 20 -> 8 X 5 responses
## ===========================================================================
## expanding covariance chains
## ===========================================================================
specColor <- rep('black',ncol(ydata))
specColor[grep('WoodThrush',colnames(ydata))] <- 'red'
plotPars <- list(specColor = specColor, GRIDPLOTS=T, PLOTALLY=T)
.gjamPlot(output,plotPars)
## no. species, effective species -- return to continue
## expanding covariance chains

## ===========================================================================

## obs y vs predicted y -- return to continue

## x inverse prediction, covariates -- return to continue

## y prediction -- return to continue

## y prediction -- return to continue

## sensitivity over full model -- return to continue

## sensitivity over species pairs -- return to continue

## beta coefficient thinned chains -- return to continue

## correlation thinned chains -- return to continue

## lambda coefficient chains -- return to continue

## alpha coefficient chains -- return to continue

## standardized for W/X, 95% posterior -- return to continue

## standardized for W/X, 95% posterior -- return to continue

## standardized for W/X, 95% posterior -- return to continue

## 95% posterior -- return to continue

## 95% posterior -- return to continue

## 95% posterior -- return to continue

## 95% posterior -- return to continue

## 95% posterior -- return to continue

## Data and E responses to X -- return to continue
## ordination of E matrix -- return to continue
## reduction from sigma to Z -- return to continue
## posterior correlation for model -- return to continue
## E: model-based response to X -- return to continue
## comparison R vs E -- return to continue
## raw data vs E -- return to continue
## beta ordered by response to X -- return to continue
## beta ordered by response to X -- return to continue
## lambda ordered by response to X -- return to continue