Modelos espacio-estado

Estos modelos asumen que las observaciones no representan perfectamente el valor verdadero de las variables de estado ** latentes **. Modelamos el proceso con esos valores latentes y verdaderos, y vinculamos el proceso a las observaciones con un “modelo de observación”.

y_t-1    y_t    y_t+1   modelo de observación
  |       |       |
x_t-1 -> x_t -> x_t+1   modelo de proceso

Es decir, modelamos por separado * el proceso - como el sistema cambia en el tiempo o espacio * las observaciones - con error o indirectas

Veamos un ejemplo con datos simulados de una dinámica poblacional logística. Para eso vamos a usar una función copiada del libro de Bolker. En la simulación, tenemos

simlogistdata = function(seed=1001,
                         r=1,K=10,
                         n0=1,t0=1,
                         tot.t=10,
                         dt=0.5, 
                         sd.proc=1,
                         sd.obs=1) {
  if (!is.null(seed)) set.seed(seed)
  tvec = seq(1,tot.t,by=dt)
  n = length(tvec)
  y = numeric(n)
  ydet = numeric(n)
  y[1] = n0
  ydet[1] = n0
  e.proc = rnorm(n,mean=0,sd=sd.proc)
  e.obs = rnorm(n,mean=0,sd=sd.obs)
  for (i in 2:n) {
    ydet[i] = ydet[i-1]+r*ydet[i-1]*(1-ydet[i-1]/K)*dt   # deterministic
    y[i] = y[i-1]+(r*y[i-1]*(1-y[i-1]/K)+e.proc[i-1])*dt # process only
  }
  y.procobs = y + e.obs    # both process and observation error
  y.obs = ydet + e.obs     # only observation error
  cbind(tvec,y.procobs, ydet, y)
}

# set up parameter values
rval = 0.25
n0val = 3
procvar = 0.5
obsvar = 0.5
nt = 100
sim_data = simlogistdata(tot.t=nt,sd.proc=sqrt(0.5),sd.obs=sqrt(0.5), n0=3,r=0.25,seed=1002,dt=1)
y_obs = sim_data[,2]

plot(y_obs,type="l",lwd=2, xlab="time", ylab="population")
lines(sim_data[,4], col = 2)
lines(sim_data[,3], col = 3)

Escribimos el modelo de BUGS

cat(file = "dl_state-space.bug", 
    "
model {
t[1] <- n0
o[1] ~ dnorm(t[1], 1/sd.obs^2)
    for (i in 2:N) {
    v[i] <- t[i-1] + r*t[i-1] * (1-t[i-1]/K) # deterministic
    t[i] ~ dnorm(v[i], 1/sd.proc^2)
    o[i] ~ dnorm(t[i], 1/sd.obs^2)
}

r ~ dunif(0.1, 3)
K ~ dnorm(10,1/25)T(0,)

sd.obs ~ dnorm(0,1)T(0,)
sd.proc ~ dnorm(0,1)T(0,)
n0 ~ dnorm(1, 1/25)T(0,)
}"
)

Usemos JAGS para simular muestras de las posteriores

library(jagsUI)

data <- list(N = length(y_obs),
             o = y_obs)

 
inits <- function() list(r = runif(1),
                         K = runif(1, 8, 12),
                         sd.obs = runif(1),
                         sd.proc = runif(1))

params <- c("r", "K", "sd.obs", "sd.proc", "n0")

ni <- 10000  
nt <- 5     
nb <- 5000   
nc <- 3  

m.sim <- jags(data = data, 
              inits = inits, 
              parameters.to.save = params,
              model.file = "dl_state-space.bug", 
              n.chains = nc, n.iter = ni, n.burnin = nb, n.thin = nt)
## 
## Processing function input....... 
## 
## Done. 
##  
## Compiling model graph
##    Resolving undeclared variables
##    Allocating nodes
## Graph information:
##    Observed stochastic nodes: 100
##    Unobserved stochastic nodes: 104
##    Total graph size: 26465
## 
## Initializing model
## 
## Adaptive phase..... 
## Adaptive phase complete 
##  
## 
##  Burn-in phase, 5000 iterations x 3 chains 
##  
## 
## Sampling from joint posterior, 5000 iterations x 3 chains 
##  
## 
## Calculating statistics....... 
## 
## Done.

Veamos las estimacions

print(m.sim)
## JAGS output for model 'dl_state-space.bug', generated by jagsUI.
## Estimates based on 3 chains of 10000 iterations,
## adaptation = 100 iterations (sufficient),
## burn-in = 5000 iterations and thin rate = 5,
## yielding 3000 total samples from the joint posterior. 
## MCMC ran for 0.239 minutes at time 2021-12-10 08:24:09.
## 
##             mean     sd    2.5%     50%   97.5% overlap0 f  Rhat n.eff
## r          0.309  0.080   0.166   0.303   0.470    FALSE 1 1.007   511
## K         10.457  0.257   9.978  10.448  10.995    FALSE 1 1.001  1574
## sd.obs     0.719  0.107   0.496   0.723   0.921    FALSE 1 1.002  1123
## sd.proc    0.619  0.124   0.400   0.613   0.885    FALSE 1 1.007   288
## n0         2.379  0.503   1.419   2.361   3.350    FALSE 1 1.001  1141
## deviance 214.675 28.801 142.179 218.891 257.393    FALSE 1 1.003   723
## 
## Successful convergence based on Rhat values (all < 1.1). 
## Rhat is the potential scale reduction factor (at convergence, Rhat=1). 
## For each parameter, n.eff is a crude measure of effective sample size. 
## 
## overlap0 checks if 0 falls in the parameter's 95% credible interval.
## f is the proportion of the posterior with the same sign as the mean;
## i.e., our confidence that the parameter is positive or negative.
## 
## DIC info: (pD = var(deviance)/2) 
## pD = 413.9 and DIC = 628.562 
## DIC is an estimate of expected predictive error (lower is better).