Spatial-temporal data analysis has become increasingly important in
many fields including epidemiology, environmental science, economics,
and public health. When analyzing data collected over both space and
time, it is crucial to properly model the interactions between these two
dimensions. INLA (Integrated Nested Laplace Approximations) provides a
flexible framework for fitting such complex models using the
f() function.
The choice of interaction structure can significantly impact model performance and interpretation. Knorr-Held (2000) proposed four types of space-time interactions, which have become standard in spatio-temporal modeling with INLA. These interaction types allow researchers to capture different patterns of how spatial and temporal effects combine.
In many real-world applications, the relationship between space and
time is not simply additive. For example: - Public
Health: Disease outbreaks may spread through neighboring areas
over time - Environmental Science: Air pollution may
show different spatial patterns across seasons
- Economics: Regional economic growth may follow
different temporal trajectories
Ignoring these interactions can lead to: - Biased estimates - Overly simplistic models - Poor predictive performance
Before diving into the interaction types, it’s important to understand the shared components that all four models include.
All four models share the same main effects:
# Main Effects (same for ALL types)
f(ID.area, model="bym", graph=Georgia.adj) # Spatial main effect (structured)
f(ID.year, model="rw2") # Temporal main effect (structured)
f(ID.year1, model="iid") # Temporal main effect (unstructured)
These capture: - Spatial structure: BYM model with adjacency matrix (accounts for spatial autocorrelation) - Temporal structure: RW2 for smooth trends (second-order random walk) - Temporal unstructured: IID for random year effects (independent year-specific deviations)
The main effects represent the average spatial and temporal patterns. The interaction term captures how these patterns deviate from the average across space-time combinations. Only the interaction term changes across the four types!
Before we dive into the interaction types, let’s clarify some key terminology:
In the interaction formulas, you’ll notice variables like
ID.area.int and ID.year.int. These are
copies of the original ID variables (e.g.,
ID.area and ID.year) but are used specifically
for the interaction term. This allows the interaction to have a
different model structure than the main effects.
f(ID.area.year, model="iid")
ID.area.year (unique
combination of area and year)iid (Independent and
Identically Distributed)The interaction term can be expressed as:
\[\gamma_{it} = \phi_{it}\]
where \(\phi_{it} \sim N(0, \sigma^2)\) independently for all \(i\) and \(t\).
Each area-year combination gets its own completely independent random effect. There is no borrowing of information across either space or time.
Year 1 Year 2 Year 3 Year 4
Area 1: [ a1 a2 a3 a4 ] ← All independent
Area 2: [ b1 b2 b3 b4 ] ← All independent
Area 3: [ c1 c2 c3 c4 ] ← All independent
↑ All independent!
f(ID.area.int, model="iid",
group=ID.year.int,
control.group=list(model="rw2"))
ID.area.int with
model="iid" → Areas are unstructured (independent)ID.year.int → Grouped by
yearmodel="rw2" → Years
have temporal structure (smooth across time)\[\gamma_{it} = \delta_{it}\]
where for each area \(i\), \(\delta_{i1}, \delta_{i2}, ..., \delta_{iT}\) follows an RW2 process independently across areas.
Each area has its own smooth temporal trajectory, but areas are independent of each other. This captures the idea that each location has its own unique time trend, but these trends are not influenced by neighboring areas.
Year 1 → Year 2 → Year 3 → Year 4
Area 1: [ a1 → a2 → a3 → a4 ] ← RW2 across years
Area 2: [ b1 → b2 → b3 → b4 ] ← RW2 across years
Area 3: [ c1 → c2 → c3 → c4 ] ← RW2 across years
↑ ↑
IID (independent) IID (independent)
Tracking crime rates across neighborhoods: each neighborhood may have its own temporal trend (e.g., gentrification effects), but these trends might be independent across neighborhoods.
f(ID.year.int, model="iid",
group=ID.area.int,
control.group=list(model="besag", graph=Georgia.adj))
ID.year.int with
model="iid" → Years are unstructured (independent)ID.area.int → Grouped by
areamodel="besag" with
adjacency → Areas have spatial structure\[\gamma_{it} = \theta_{ti}\]
where for each year \(t\), \(\theta_{t1}, \theta_{t2}, ..., \theta_{tI}\) follows a Besag (ICAR) process independently across years.
Each year has its own spatial pattern, but years are independent of each other. This captures the idea that spatial patterns can change from year to year, with no temporal smoothing.
Year 1 Year 2 Year 3 Year 4
Area 1: [ a1 b1 c1 d1 ]
Area 2: [ a2 b2 c2 d2 ] ← IID across years
Area 3: [ a3 b3 c3 d3 ]
↑ ↑ ↑ ↑
BYM BYM BYM BYM
(spatial) (spatial) (spatial) (spatial)
Analyzing annual election results: each election year may have its own spatial pattern (e.g., urban-rural divides), but patterns might not smoothly evolve due to political shifts.
f(ID.area.int, model="besag", graph=Georgia.adj,
group=ID.year.int,
control.group=list(model="rw2"))
ID.area.int with
model="besag" → Areas have spatial structureID.year.int → Grouped by
yearmodel="rw2" → Years
have temporal structure\[\gamma_{it} = \psi_{it}\]
where the vector \(\psi = (\psi_{11}, ..., \psi_{IT})\) follows a spatio-temporal model with both spatial and temporal dependence structures. Typically, this is achieved through the Kronecker product of the spatial and temporal precision matrices.
The interaction is smooth in both space AND time. - Neighboring areas have similar effects - Consecutive years have similar effects - This is the most complex interaction structure
Year 1 → Year 2 → Year 3 → Year 4
Area 1: [ a1 → a2 → a3 → a4 ] ← RW2
Area 2: [ b1 → b2 → b3 → b4 ] ← RW2
Area 3: [ c1 → c2 → c3 → c4 ] ← RW2
↑ ↑ ↑ ↑
BYM BYM BYM BYM
(spatial) (spatial) (spatial) (spatial)
Modeling disease spread: diseases often spread from neighboring areas and evolve smoothly over time. This structure captures both the spatial and temporal dependencies simultaneously.
| Type | Formula Component | Spatial Structure | Temporal Structure | Complexity |
|---|---|---|---|---|
| I | f(ID.area.year, model="iid") |
✗ IID | ✗ IID | Simplest |
| II | f(ID.area.int, model="iid", group=ID.year.int, control.group=list(model="rw2")) |
✗ IID | ✓ RW2 | Medium |
| III | f(ID.year.int, model="iid", group=ID.area.int, control.group=list(model="besag", graph=Georgia.adj)) |
✓ Besag | ✗ IID | Medium |
| IV | f(ID.area.int, model="besag", graph=Georgia.adj, group=ID.year.int, control.group=list(model="rw2")) |
✓ Besag | ✓ RW2 | Most complex |
The interaction type is determined by which dimension has structure:
Important: The .int variables are just
copies of the ID variables used specifically for the interaction term,
allowing them to have different model structures than the main
effects.
The selection depends on:
Typically, you should: 1. Fit all four types to your data 2. Compare their fit statistics 3. Perform sensitivity analysis 4. Choose the model with the best trade-off between fit and complexity
# Load required libraries
library(INLA)
library(spdep)
# Assuming you have your data and adjacency matrix ready
# For this example, we'll use Georgia dataset
data(georgia)
Georgia.adj <- as.matrix(georgia.adj)
# Create unique identifiers
data$ID.area <- as.numeric(factor(data$county))
data$ID.year <- as.numeric(factor(data$year))
data$ID.area.year <- as.numeric(factor(paste(data$ID.area, data$ID.year, sep="_")))
data$ID.area.int <- data$ID.area # Copy for interaction
data$ID.year.int <- data$ID.year # Copy for interaction
# Type I: Unstructured × Unstructured
model1 <- inla(y ~ 1 +
f(ID.area, model="bym", graph=Georgia.adj) +
f(ID.year, model="rw2") +
f(ID.year1, model="iid") +
f(ID.area.year, model="iid"),
data = data,
family = "poisson",
control.predictor = list(compute = TRUE)
)
# Type II: Unstructured × Structured (Temporal)
model2 <- inla(y ~ 1 +
f(ID.area, model="bym", graph=Georgia.adj) +
f(ID.year, model="rw2") +
f(ID.year1, model="iid") +
f(ID.area.int, model="iid",
group=ID.year.int,
control.group=list(model="rw2")),
data = data,
family = "poisson",
control.predictor = list(compute = TRUE)
)
# Type III: Structured (Spatial) × Unstructured
model3 <- inla(y ~ 1 +
f(ID.area, model="bym", graph=Georgia.adj) +
f(ID.year, model="rw2") +
f(ID.year1, model="iid") +
f(ID.year.int, model="iid",
group=ID.area.int,
control.group=list(model="besag", graph=Georgia.adj)),
data = data,
family = "poisson",
control.predictor = list(compute = TRUE)
)
# Type IV: Structured (Spatial) × Structured (Temporal)
model4 <- inla(y ~ 1 +
f(ID.area, model="bym", graph=Georgia.adj) +
f(ID.year, model="rw2") +
f(ID.year1, model="iid") +
f(ID.area.int, model="besag", graph=Georgia.adj,
group=ID.year.int,
control.group=list(model="rw2")),
data = data,
family = "poisson",
control.predictor = list(compute = TRUE)
)
# Create comparison table
model_comparison <- data.frame(
Model = c("Type I", "Type II", "Type III", "Type IV"),
DIC = c(model1$dic$dic, model2$dic$dic,
model3$dic$dic, model4$dic$dic),
WAIC = c(model1$waic$waic, model2$waic$waic,
model3$waic$waic, model4$waic$waic),
p.eff = c(model1$dic$p.eff, model2$dic$p.eff,
model3$dic$p.eff, model4$dic$p.eff)
)
# Print comparison table
kable(model_comparison,
caption = "Model Comparison Results",
digits = 2)
# Identify the best model
best_model <- model_comparison[which.min(model_comparison$DIC), "Model"]
cat("Best model based on DIC:", best_model, "\n")
# Plot fitted values for each model
par(mfrow = c(2, 2))
# Type I
plot(data$y, model1$summary.fitted.values$mean,
xlab = "Observed", ylab = "Fitted",
main = "Type I: IID × IID")
abline(0, 1, col = "red")
# Type II
plot(data$y, model2$summary.fitted.values$mean,
xlab = "Observed", ylab = "Fitted",
main = "Type II: IID × RW2")
abline(0, 1, col = "red")
# Type III
plot(data$y, model3$summary.fitted.values$mean,
xlab = "Observed", ylab = "Fitted",
main = "Type III: Besag × IID")
abline(0, 1, col = "red")
# Type IV
plot(data$y, model4$summary.fitted.values$mean,
xlab = "Observed", ylab = "Fitted",
main = "Type IV: Besag × RW2")
abline(0, 1, col = "red")
The choice of space-time interaction structure is a crucial decision in spatio-temporal modeling with INLA. The four types proposed by Knorr-Held (2000) provide a flexible framework that balances complexity and interpretability:
The selection should be guided by both statistical criteria (DIC, WAIC, CPO) and domain knowledge about the underlying phenomenon. In practice, fitting all four models and comparing their performance is recommended to find the most appropriate structure for your data.