Motivation

Some time ago, a student of mine got stuck when fitting dynamic occupancy models to real data in Jags because of the computational burden. We had a dataset with several thousands sites, more than 20 seasons and 4 surveys per season (yeah!). We thought of using Unmarked instead (the likelihood is written in C++ and used through Rcpp), but dynamic models with false positives and/or random effects are not (yet?) implemented, and we were interested in considering both in our analysis. Some years ago, I had the opportunity to learn ADMB in a NCEAS meeting (thanks Hans Skaug!), I thought I would give it a try. ADMB allows you to write down any likelihood functions yourself and to incorporate random effects in an efficient way. It’s known to be fast for reasons I won’t go into here. Last but not least, ADMB can be run from R like JAGS and Unmarked (thanks Ben Bolker!).

Below I first simulate some data, then fit a dynamic model using ADMB, JAGS and Unmarked and finally perform a quick benchmarking. I’m going for a standard dynamic model, because the aims are i) to verify that JAGS is slower than Unmarked, ii) that ADMB is closer to Unmarked than JAGS in terms of time computation. If ii) is verified, then it will be worth the effort coding everything in ADMB.

Simule data

Using code from Kery and Schaub book, I simulate occupancy data from a dynamic occupancy model using the following parameters:

R = 100 # number of sites
J = 5 # number of replicate surveys/visits
K = 10 # number of years/seasons
psi1 = 0.6 # occupancy prob in first year/season
p = 0.7 # detection prob
epsilon = 0.5 # extinction prob
gamma = 0.3 # colonization prob
real_param = c(psi1,gamma,epsilon,p)

Let’s simulate:

# pre-allocate memory
site <- 1:R # Sites
year <- 1:K # Years
psi <- rep(NA, K) # Occupancy probability
muZ <- z <- array(dim = c(R, K)) # Expected and realized occurrence
y <- array(NA, dim = c(R, J, K)) # Detection histories

# define state process
# first year/season
z[,1] <- rbinom(R, 1, psi1) # Initial occupancy state
# subsequent years/seasons
for(i in 1:R){ # Loop over sites
    for(k in 2:K){ # Loop over years
        muZ[k] <- z[i, k-1]*(1-epsilon) + (1-z[i, k-1])*gamma # Prob for occ.
        z[i,k] <- rbinom(1, 1, muZ[k])
        }
}

# define observation process
for(i in 1:R){
    for(k in 1:K){
        prob <- z[i,k] * p
        for(j in 1:J){
            y[i,j,k] <- rbinom(1, 1, prob)
            }
        }
}

# format data
yy <- matrix(y, R, J*K)

Model fitting in ADMB

I recommend going through the vignette of the R2admb package first if you’d like to use ADMB with R. I use the HMM formulation of dynamic occupancy models to write down the likelihood (e.g. this paper for a general presentation, and that one for details on the likelihood).

library(R2admb)
model <- 
paste("
DATA_SECTION
 init_int K // Number of seasons
 init_int N // Number of seasons x surveys
 init_int JJ // Number of time intervals between primary occasions
 init_int nh // Number of sites
 init_ivector e(1,nh) // Date of first capture
 init_imatrix data(1,nh,1,N) // Data matrix
 init_ivector eff(1,nh) // Number of individuals per capture history
 init_ivector primary(1,K) // seasons
 init_ivector secondarybis(1,JJ) // time intervals between primary occasions
PARAMETER_SECTION
 init_bounded_number logit_psi(-20.0,20.0,1) // init occupancy
 init_bounded_number logit_det(-20.0,20.0,1) // detection
 init_bounded_number logit_gam(-20.0,20.0,1) // colonization
 init_bounded_number logit_eps(-20.0,20.0,1) // extinction
 objective_function_value g
// number psi
// number det
// number gam
// number eps
 sdreport_number psi
 sdreport_number det
 sdreport_number gam
 sdreport_number eps
 PROCEDURE_SECTION
 psi = mfexp(logit_psi);
 psi = psi/(1+psi); 
 det = mfexp(logit_det);
 det = det/(1+det); 
 gam = mfexp(logit_gam);
 gam = gam/(1+gam); 
 eps = mfexp(logit_eps);
 eps = eps/(1+eps); 
 dvar_vector prop(1,2);
 prop(1) = 1-psi; prop(2) = psi;
 dvar_matrix B(1,2,1,2);
 B(1,1) = 1;
 B(1,2) = 1-det;
 B(2,1) = 0.0;
 B(2,2) = det;
 dvar3_array PHI(1,N,1,2,1,2);
 for(int i=1;i<=K;i++){
    PHI(primary(i),1,1) = 1-gam;
    PHI(primary(i),1,2) = gam;
    PHI(primary(i),2,1) = eps;
    PHI(primary(i),2,2) = 1-eps;
 }
 for(int j=1;j<=JJ;j++){
    PHI(secondarybis(j),1,1) = 1;
    PHI(secondarybis(j),1,2) = 0;
    PHI(secondarybis(j),2,1) = 0;
    PHI(secondarybis(j),2,2) = 1;
 }

 for(int i=1;i<=nh;i++){
    int oe = e(i) + 1; // initial obs
    ivector evennt = data(i)+1; //
    dvar_vector ALPHA = elem_prod(prop,B(oe));
    for(int j=2;j<=N;j++){
        ALPHA = elem_prod(ALPHA*PHI(j-1),B(evennt(j)));
        g -= log(sum(ALPHA))*eff(i);
    }
 }
")
writeLines(model,"model.tpl")
setup_admb("/Applications/ADMBTerminal.app/admb")
## [1] "/Applications/ADMBTerminal.app/admb"

We then format the data appropriately:

# various quantities
nh <- nrow(yy) # nb sites
eff = rep(1,nh) # nb sites with this particular history
garb = yy[,1] # initial states

# primary and secondary occasions
primary = seq(J,J*K,by=J)
secondary = 1:(J*K)
secondary_bis = secondary[-primary]

# further various quantities that will be useful later on
K <- length(primary)
J2 <- length(secondary)
J <- J2/K
N <- J * K
JJ <- length(secondary_bis) # Number of time intervals between primary occasions

# list of data
df = list(K=K,N=N,JJ=JJ,nh=nh,e=garb,data=yy,eff=eff,primary=primary,secondarybis=secondary_bis)

We’re now ready to fit the model:

params <- list(logit_psi = 0, logit_det = 0, logit_gam = 0, logit_eps = 0) ## starting parameters
deb=Sys.time()
res.admb <- do_admb('model', data=df, params = params,verbose=T)
## writing data and parameter files ...
## compiling with args: '  -s ' ...
## compile output:
##   *** Parse: model.tpl tpl2cpp model || tpl2rem model  *** Compile: model.cpp c++ -c -std=c++14 -O3 -I. -I"/Applications/ADMBTerminal.app/admb/include" -I"/Applications/ADMBTerminal.app/admb/contrib/include" -omodel.obj model.cpp  *** Linking: model.obj  c++ -std=c++14 -O3 -omodel model.obj "/Applications/ADMBTerminal.app/admb/lib/libadmb-contrib.a"  Successfully built executable.  
## compile log:
## 
## running compiled executable with args: '  '...
## Run output:
## 
## 
## 
## 
## 
## Initial statistics: 4 variables; iteration 0; function evaluation 0; phase 1
## Function value   5.7224876e+04; maximum gradient component mag  -3.1249e+05
## Var   Value    Gradient   |Var   Value    Gradient   |Var   Value    Gradient
##   1  0.00000 -2.32022e+04 |  2  0.00000 -3.12491e+05 |  3  0.00000  5.56839e+04
##   4  0.00000  4.11686e+03 |
## 
## Intermediate statistics: 4 variables; iteration 10; function evaluation 22; phase 1
## Function value   5.2181570e+04; maximum gradient component mag   2.8173e-04
## Var   Value    Gradient   |Var   Value    Gradient   |Var   Value    Gradient
##   1  0.01844 -1.14083e-04 |  2  0.02639 -2.65374e-04 |  3 -0.02411  2.81729e-04
##   4  0.00008 -1.61880e-04 |
## 
##  - final statistics:
## 4 variables; iteration 11; function evaluation 23
## Function value   5.2182e+04; maximum gradient component mag   5.4574e-06
## Exit code = 1;  converg criter   1.0000e-04
## Var   Value    Gradient   |Var   Value    Gradient   |Var   Value    Gradient
##   1  0.01844 -1.74899e-06 |  2  0.02639 -2.19830e-06 |  3 -0.02411  5.45743e-06
##   4  0.00008 -1.58112e-06 |
## Estimating row 1 out of 4 for hessian
## Estimating row 2 out of 4 for hessian
## Estimating row 3 out of 4 for hessian
## Estimating row 4 out of 4 for hessian
## 
## 
## reading output ...
fin=Sys.time()
fin-deb
## Time difference of 1.493876 secs
res.admb$coefficients[5:8]
##     psi     det     gam     eps 
## 0.64086 0.69609 0.31922 0.50064

Model fitting in Unmarked

library(unmarked)
simUMF <- unmarkedMultFrame(y = yy,numPrimary=K)
deb=Sys.time()
unmarked_code <- colext(psiformula= ~1, gammaformula = ~ 1, epsilonformula = ~ 1,
pformula = ~ 1, data = simUMF, method="BFGS")
fin=Sys.time()
unmarked_code_time = fin-deb 
unmarked_code_time
## Time difference of 0.623075 secs
psi_unmarked <- backTransform(unmarked_code, type="psi")
col_unmarked <- backTransform(unmarked_code, type="col")
ext_unmarked <- backTransform(unmarked_code, type="ext")
p_unmarked <- backTransform(unmarked_code, type="det")
psi_unmarked
## Backtransformed linear combination(s) of Initial estimate(s)
## 
##  Estimate     SE LinComb (Intercept)
##     0.642 0.0481   0.582           1
## 
## Transformation: logistic
col_unmarked
## Backtransformed linear combination(s) of Colonization estimate(s)
## 
##  Estimate     SE LinComb (Intercept)
##     0.317 0.0203  -0.766           1
## 
## Transformation: logistic
ext_unmarked
## Backtransformed linear combination(s) of Extinction estimate(s)
## 
##  Estimate    SE LinComb (Intercept)
##     0.518 0.026  0.0713           1
## 
## Transformation: logistic
p_unmarked
## Backtransformed linear combination(s) of Detection estimate(s)
## 
##  Estimate     SE LinComb (Intercept)
##     0.705 0.0102    0.87           1
## 
## Transformation: logistic

Model fitting in JAGS

Let’s write the model first:

model <- 
paste("
    model{
    #priors
    p ~ dunif(0,1)
    psi1 ~ dunif(0,1)
    epsilon ~ dunif(0,1)
    gamma~dunif(0,1)
   
    for(i in 1:n.sites){
        # process
        z[i,1] ~ dbern(psi1)
        for(t in 2:n.seasons){ 
            mu[i,t]<-((1-epsilon)*z[i,t-1])+(gamma*(1-z[i,t-1]))
            z[i,t]~dbern(mu[i,t])
        }
        # obs
        for(t in 1:n.seasons){ 
            for(j in 1:n.occas){
                p.eff[i,j,t] <- z[i,t]*p
                y[i,j,t]~dbern(p.eff[i,j,t])
            }
        }
    }
}
")
writeLines(model,"dynocc.txt")

Create lists of data and initial values, set up which parameters to monitor

jags.data <- list(y=y,n.seasons=dim(y)[3],n.occas=dim(y)[2],n.sites=dim(y)[1])
z.init <- apply(jags.data$y,c(1,3),max)
initial <- function()list(p=runif(1,0,1),psi1=runif(1,0,1),z=z.init,epsilon=runif(1,0,1),gamma=runif(1,0,1))
params.to.monitor <- c("psi1","gamma","epsilon","p")
inits <- list(initial(),initial())

Fit model

library(jagsUI)
res <- jagsUI(data=jags.data, inits, parameters.to.save=params.to.monitor, model.file="dynocc.txt",n.thin = 1,n.chains = 2, n.burnin = 500, n.iter =1000,parallel=TRUE)
## 
## Processing function input....... 
## 
## Done. 
##  
## Beginning parallel processing using 2 cores. Console output will be suppressed.
## 
## Parallel processing completed.
## 
## Calculating statistics....... 
## 
## Done.
estim.jags = c(res$mean$psi1,res$mean$gamma,res$mean$epsilon,res$mean$p)

Compare estimates

cat("parameters\n",c('occ','col','ext','det'),'\n')
## parameters
##  occ col ext det
cat("real values\n",real_param,'\n')
## real values
##  0.6 0.3 0.5 0.7
cat("Admb estimates\n",as.vector(res.admb$coefficients[c(5,7,8,6)]),'\n')
## Admb estimates
##  0.64086 0.31922 0.50064 0.69609
cat("Unmarked estimates\n",c(psi_unmarked@estimate,col_unmarked@estimate,ext_unmarked@estimate,p_unmarked@estimate),'\n')
## Unmarked estimates
##  0.6416045 0.3174047 0.517822 0.7047413
cat("Jags estimates\n",estim.jags,'\n')
## Jags estimates
##  0.63988 0.3189305 0.5179611 0.7044753

Not too bad.

Benchmarking

Can ADMB compete with Unmarked in terms of computation times? What about JAGS?

library(microbenchmark)
microbenchmark(
do_admb('model', data=df, params = params),
colext(psiformula= ~1, gammaformula = ~ 1, epsilonformula = ~ 1,pformula = ~ 1, data = simUMF, method="BFGS"),
jagsUI(data=jags.data, inits, parameters.to.save=params.to.monitor, model.file="dynocc.txt",n.thin = 1,n.chains = 2, n.burnin = 500, n.iter =1000,parallel=TRUE),times=3)
## 
## Processing function input....... 
## 
## Done. 
##  
## Beginning parallel processing using 2 cores. Console output will be suppressed.
## 
## Parallel processing completed.
## 
## Calculating statistics....... 
## 
## Done. 
## 
## Processing function input....... 
## 
## Done. 
##  
## Beginning parallel processing using 2 cores. Console output will be suppressed.
## 
## Parallel processing completed.
## 
## Calculating statistics....... 
## 
## Done. 
## 
## Processing function input....... 
## 
## Done. 
##  
## Beginning parallel processing using 2 cores. Console output will be suppressed.
## 
## Parallel processing completed.
## 
## Calculating statistics....... 
## 
## Done.
## Unit: milliseconds
##                                                                                                                                                                                    expr
##                                                                                                                                            do_admb("model", data = df, params = params)
##                                                                     colext(psiformula = ~1, gammaformula = ~1, epsilonformula = ~1,      pformula = ~1, data = simUMF, method = "BFGS")
##  jagsUI(data = jags.data, inits, parameters.to.save = params.to.monitor,      model.file = "dynocc.txt", n.thin = 1, n.chains = 2, n.burnin = 500,      n.iter = 1000, parallel = TRUE)
##       min        lq      mean    median        uq       max neval cld
##  1398.919 1404.1410 1415.4963 1409.3629 1423.7849 1438.2069     3  b 
##   516.319  521.6611  524.7083  527.0031  528.9029  530.8028     3 a  
##  5318.953 5412.2891 5544.2737 5505.6250 5656.9340 5808.2430     3   c

As expected, Unmarked is the fastest, JAGS the slowest (2 parallel chains, I didn’t check whether convergence was achieved). ADMB is not doing so bad. What if I had 1000 sites and 20 years:

## [1] "/Applications/ADMBTerminal.app/admb"
## 
## Processing function input....... 
## 
## Done. 
##  
## Beginning parallel processing using 2 cores. Console output will be suppressed.
## 
## Parallel processing completed.
## 
## Calculating statistics....... 
## 
## Done. 
## 
## Processing function input....... 
## 
## Done. 
##  
## Beginning parallel processing using 2 cores. Console output will be suppressed.
## 
## Parallel processing completed.
## 
## Calculating statistics....... 
## 
## Done. 
## 
## Processing function input....... 
## 
## Done. 
##  
## Beginning parallel processing using 2 cores. Console output will be suppressed.
## 
## Parallel processing completed.
## 
## Calculating statistics....... 
## 
## Done.
## Unit: seconds
##                                                                                                                                                                                    expr
##                                                                                                                                            do_admb("model", data = df, params = params)
##                                                                     colext(psiformula = ~1, gammaformula = ~1, epsilonformula = ~1,      pformula = ~1, data = simUMF, method = "BFGS")
##  jagsUI(data = jags.data, inits, parameters.to.save = params.to.monitor,      model.file = "dynocc.txt", n.thin = 1, n.chains = 2, n.burnin = 500,      n.iter = 1000, parallel = TRUE)
##        min        lq      mean    median        uq       max neval cld
##   17.72053  17.87465  18.04338  18.02878  18.20481  18.38084     3  a 
##   10.73005  10.82544  10.88046  10.92082  10.95566  10.99051     3  a 
##  330.02588 330.36539 341.42388 330.70491 347.12287 363.54084     3   b

The discrepancy is quite big! Not that time is in seconds here, in constrast with the previous benchmarking in which it was in milliseconds.

Conclusions

Based on these investigations, I decided to look into ADMB a bit further. I extended the model we used in our recent paper on wolf recolonization to account for false positives. It wasn’t too difficult either to add covariates (whether they acted at the site, survey or season level). Now ADMB won’t estimate the latent occupancy states, in contrast with JAGS which treats them as parameters. However, if needed, one could use the HMM machinery to get them, namely the Viterbi algorithm like in Fiske et al. (2014).

Hope this is useful.