1. The Core SPDE Equation

In spatial analysis, the starting point of the SPDE approach is the linear fractional stochastic partial differential equation:

\[ (\kappa^2 - \Delta)^{\alpha/2}(\tau \xi(s)) = \mathcal{W}(s), \quad s \in \mathbb{R}^d \]

Where:

  • \(\xi(s)\) = The continuous spatial field (e.g., rainfall, temperature, pollution).
  • \(\Delta\) = The Laplacian operator (measures spatial roughness).
  • \(\alpha\) = Controls the smoothness of the field (larger \(\alpha\) = smoother).
  • \(\kappa > 0\) = The scale parameter (controls the correlation range; larger \(\kappa\) = shorter range).
  • \(\tau\) = Controls the variance (overall magnitude of the field).
  • \(\mathcal{W}(s)\) = Gaussian spatial white noise (mean 0, variance 1).

2. Why Do We Use This Equation?

The Problem with Traditional Geostatistics

Traditionally, we define \(\xi(s)\) directly using the Matérn covariance function:

\[ Cov(\xi(s_i), \xi(s_j)) = \frac{\sigma^2}{2^{\nu-1}\Gamma(\nu)}(\kappa ||h||)^\nu K_\nu(\kappa ||h||) \]

However, using this covariance matrix for \(n\) locations requires \(O(n^3)\) operations—impossible for large datasets (e.g., satellite data with millions of pixels).

The SPDE Solution

The SPDE equation is a mathematical shortcut:

  1. It produces the exact same Matérn covariance as the traditional formula.
  2. It is local—when \(\alpha\) is an integer (or half-integer), the operator \((\kappa^2 - \Delta)\) only links a point to its nearest neighbors.
  3. It yields a sparse precision matrix (GMRF), allowing computation in \(O(n^{3/2})\) or even \(O(n)\) time.

In plain English: The SPDE disguises a global, dense covariance matrix as a local, sparse “nearest-neighbor” model that computers can solve incredibly fast.


3. What is \(\xi(s)\)? Is it the Outcome Variable \(Y(s)\)?

Yes, but with important distinctions:

Scenario Mathematical Formulation What is \(\xi(s)\)?
No covariates, no error \(Y(s) = \xi(s)\) The raw observed data.
No covariates, with error \(Y(s) = \xi(s) + \varepsilon(s)\) The true, noise-free spatial surface.
With covariates, no error \(Y(s) = \mathbf{X}(s)\boldsymbol{\beta} + \xi(s)\) The spatial residual after removing covariate effects.
With covariates and error (Standard) \(Y(s) = \mathbf{X}(s)\boldsymbol{\beta} + \xi(s) + \varepsilon(s)\) The latent (hidden) spatial random effect.

Where:

  • \(\mathbf{X}(s)\boldsymbol{\beta}\) = Fixed effects (e.g., altitude, temperature).
  • \(\varepsilon(s)\) = Independent measurement error (white noise).

4. Including Covariates: The Full Hierarchical Model

The SPDE equation itself NEVER changes when you add covariates:

\[ (\kappa^2 - \Delta)^{\alpha/2}(\tau \xi(s)) = \mathcal{W}(s) \]

Covariates are added to the observation equation (the likelihood), not inside the differential operator.

The Complete Bayesian Hierarchical Model

Data level (Likelihood):

\[ Y(s_i) \mid \xi(s_i) \sim \text{Normal}\left(\mathbf{X}(s_i)\boldsymbol{\beta} + \xi(s_i), \sigma^2_e\right) \]

Process level (SPDE):

\[ (\kappa^2 - \Delta)^{\alpha/2}(\tau \xi(s)) = \mathcal{W}(s) \]

Parameter level (Priors):

\[ \boldsymbol{\beta} \sim \text{Normal}(0, 1000) \]

\[ \log(\kappa) \sim \text{Normal}(0, 1) \]

\[ \log(\tau) \sim \text{Normal}(0, 1) \]

\[ \sigma^2_e \sim \text{Inverse-Gamma}(1, 0.01) \]


5. Understanding the Simulation Parameters

In the simulation example from Krainski and Lindgren (2013), the model is:

\[ y_i \sim \text{Normal}(\eta_i, \sigma^2_e), \quad i = 1, \dots, 200 \]

where:

  • \(\eta_i = b_0 + \xi_i\) = The linear predictor (mean of the observation).
  • \(b_0 = 10\) = The overall intercept (global mean).
  • \(\xi_i\) = The spatial random effect at location \(i\).
  • \(\sigma^2_e = 0.3\) = Variance of the measurement error \(\varepsilon_i\).
  • \(\xi(s) \sim \text{MVNormal}(\mathbf{0}, \boldsymbol{\Sigma})\) = The spatial field follows a multivariate normal distribution with mean 0 and covariance matrix \(\boldsymbol{\Sigma}\).
  • \(\boldsymbol{\Sigma}\) = Defined by the Matérn covariance function with parameters:
    • \(\sigma^2 = 5\) = The marginal variance of the spatial field \(\xi(s)\).
    • \(\kappa = 7\) = The scale parameter (controls range).
    • \(\lambda = 1\) = The smoothness parameter (related to \(\alpha\)).

5.1 What Does \(\sigma^2 = 5\) Mean?

\(\sigma^2 = 5\) is the variance of the latent spatial field \(\xi(s)\):

\[ \text{Var}(\xi(s)) = \sigma^2 = 5 \]

This means:

  • The spatial field \(\xi(s)\) has a standard deviation of \(\sqrt{5} \approx 2.24\).
  • The values of \(\xi(s)\) typically range between approximately \(-2 \times 2.24 = -4.47\) and \(+4.47\) (for 95% of the area under a normal distribution).
  • It represents the magnitude of spatial variation after accounting for the intercept.

5.2 The Full Variance Decomposition

In this simulation, the total variance of your observations \(y_i\) comes from two independent sources:

5.2.1 Spatial Variance (The SPDE/Matérn part)

  • \(\sigma^2 = 5\) → Variance of the spatial field \(\xi(s)\).
  • This is the correlated variation across space.

5.2.2 Measurement Error Variance (The “nugget” part)

  • \(\sigma^2_e = 0.3\) → Variance of the independent noise \(\varepsilon_i\).
  • This is the uncorrelated variation (pure noise).

5.2.3 Total Variance

Since \(\xi(s)\) and \(\varepsilon_i\) are independent:

\[ \text{Var}(y_i) = \text{Var}(\xi_i + \varepsilon_i) = \sigma^2 + \sigma^2_e = 5 + 0.3 = 5.3 \]

So the total variance of your simulated observations is 5.3.

5.3 The Proportion of Variance Explained by Spatial Structure

This is often called the proportion of spatial variance or the “fraction”:

\[ \frac{\sigma^2}{\sigma^2 + \sigma^2_e} = \frac{5}{5 + 0.3} = \frac{5}{5.3} \approx 0.943 \]

Interpretation: About 94.3% of the total variation in your data is due to spatial correlation, and only 5.7% is pure measurement noise. This means the spatial field is very strong and the data are highly spatially correlated.

Example: - If at location \(s\), \(\xi(s) = 2\) (slightly above average spatial effect) and \(\varepsilon = 0.2\) (small measurement error), then: - \(y = 10 + 2 + 0.2 = 12.2\)

  • If at another location, \(\xi(s) = -3\) (below average) and \(\varepsilon = -0.5\), then:
    • \(y = 10 - 3 - 0.5 = 6.5\)

5.5 Relationship to the SPDE Parameters

In the SPDE formulation:

\[ (\kappa^2 - \Delta)^{\alpha/2}(\tau \xi(s)) = \mathcal{W}(s) \]

The variance \(\sigma^2\) is related to the SPDE parameters \(\tau\), \(\kappa\), and \(\alpha\):

For a Matérn field in \(\mathbb{R}^d\):

\[ \sigma^2 = \frac{\Gamma(\nu)}{\Gamma(\nu + d/2)} \cdot \frac{1}{(4\pi)^{d/2} \kappa^{2\nu} \tau^2} \]

where:

  • \(\nu = \alpha - d/2\) (the smoothness parameter).
  • In your simulation, \(\lambda = 1\) likely corresponds to \(\nu = 1\) (or \(\alpha = 2\) in 2D).

So in the simulation, the parameter values were chosen such that:

Parameter Value Meaning
\(b_0\) 10 Global mean
\(\sigma^2\) 5 Variance of spatial field \(\xi(s)\)
\(\sigma^2_e\) 0.3 Variance of measurement error
\(\kappa\) 7 Controls spatial range (larger = shorter range)
\(\lambda\) 1 Smoothness parameter (\(\nu = 1\))

6. Interpretation of the Full Model

“The observed rainfall \(Y\) at location \(s\) equals the linear effect of altitude and other covariates, plus a spatially correlated residual \(\xi(s)\), plus independent measurement noise. The SPDE equation forces \(\xi(s)\) to follow a Matérn covariance function with range \(\kappa\) and smoothness \(\alpha\).”

Key takeaway: The SPDE only models the residual spatial dependence after accounting for covariates. This separates:

  • Large-scale variation → Explained by covariates (\(\mathbf{X}\boldsymbol{\beta}\)).
  • Medium-scale variation → Explained by the spatial field (\(\xi(s)\)).
  • Small-scale/Noise → Explained by measurement error (\(\varepsilon\)).

7. Why Not Put Covariates Inside the SPDE?

  • The SPDE defines the covariance structure (how nearby points relate).
  • Covariates are fixed effects—they explain the mean structure.
  • The operator \((\kappa^2 - \Delta)\) only applies to \(\xi(s)\) because we assume covariates have no spatial correlation (they are deterministic).
  • Putting covariates inside the operator would break the mathematical properties that make the solution a Matérn field.

8. Summary Table

Concept Equation Role
SPDE \((\kappa^2 - \Delta)^{\alpha/2}(\tau \xi(s)) = \mathcal{W}(s)\) Defines the spatial covariance of \(\xi(s)\).
Observation Model \(Y(s) = \mathbf{X}(s)\boldsymbol{\beta} + \xi(s) + \varepsilon(s)\) Links data \(Y\) to the spatial field and covariates.
Covariates \(\mathbf{X}(s)\boldsymbol{\beta}\) Explain large-scale trends (fixed effects).
Spatial Random Effect \(\xi(s)\) Captures residual spatial correlation (random effects).
Measurement Error \(\varepsilon(s) \sim \text{N}(0, \sigma^2_e)\) Captures independent noise.
Spatial Variance \(\sigma^2 = 5\) Magnitude of spatial variation in \(\xi(s)\).

9. Practical Interpretation for the Analysis

When we fit this model to real data:

  1. \(\sigma^2\) tells you how much spatial variability exists in your data after accounting for covariates.
    • Large \(\sigma^2\) → Strong spatial patterns.
    • Small \(\sigma^2\) → Weak spatial patterns.
  2. \(\sigma^2_e\) tells you how much noise/measurement error is in your data.
    • Large \(\sigma^2_e\) → Data are noisy, spatial signal is weak.
    • Small \(\sigma^2_e\) → Data are precise, spatial signal is strong.
  3. The ratio \(\sigma^2 / (\sigma^2 + \sigma^2_e)\) is useful for:
    • Determining if spatial modeling is necessary.
    • Comparing different models or datasets.
    • Understanding the reliability of your spatial predictions.

10. Practical Implementation (R-INLA Example)

Here is a minimal example of fitting this model using the R-INLA package:

# Load required libraries
library(INLA)
library(sf)

# Assume we have:
# - coords: matrix of spatial coordinates
# - y: vector of observations
# - x1, x2: covariates

# 1. Create the mesh (triangulation of the spatial domain)
mesh <- inla.mesh.2d(
  loc = coords,
  max.edge = c(0.5, 1),
  cutoff = 0.2
)

# 2. Build the SPDE model object
spde <- inla.spde2.matern(
  mesh = mesh,
  alpha = 2  # for 2D space, alpha = 2 gives smoothness nu = 1
)

# 3. Create the projection matrix (A) to map mesh to data locations
A <- inla.spde.make.A(mesh, loc = coords)

# 4. Create the stack (combine data, covariates, and SPDE)
stack <- inla.stack(
  data = list(y = y),
  A = list(A, 1),
  effects = list(
    list(spde = 1:spde$n.spde),  # SPDE random effect
    list(b0 = 1, x1 = x1, x2 = x2)  # Fixed effects
  )
)

# 5. Fit the model
formula <- y ~ 0 + b0 + x1 + x2 + f(spde, model = spde)
result <- inla(formula,
  data = inla.stack.data(stack),
  control.predictor = list(A = inla.stack.A(stack), compute = TRUE)
)

# 6. View results
summary(result)

# 7. Extract key parameters
# Range parameter
range <- inla.spde2.result(result, "spde", spde, do.transf = TRUE)
print(range$summary.range)

# Variance parameter
variance <- inla.spde2.result(result, "spde", spde, do.transf = TRUE)
print(variance$summary.variance)

# 8. Predict on a grid for mapping
# Create prediction coordinates
grid_points <- expand.grid(
  x = seq(min(coords[,1]), max(coords[,1]), length = 50),
  y = seq(min(coords[,2]), max(coords[,2]), length = 50)
)

# Project predictions to grid
A_pred <- inla.spde.make.A(mesh, loc = as.matrix(grid_points))

# Create prediction stack
stack_pred <- inla.stack(
  data = list(y = NA),
  A = list(A_pred, 1),
  effects = list(
    list(spde = 1:spde$n.spde),
    list(b0 = 1, x1 = grid_points$x, x2 = grid_points$y)
  )
)

# Combine with estimation stack
stack_full <- inla.stack(stack, stack_pred)

# Refit with predictions
result_pred <- inla(formula,
  data = inla.stack.data(stack_full),
  control.predictor = list(
    A = inla.stack.A(stack_full),
    compute = TRUE
  )
)

# Extract predictions
pred_mean <- result_pred$summary.linear.predictor[
  inla.stack.index(stack_full, "pred")$data, "mean"
]

11. Alternative Implementation with Multiple Covariates

Here’s a more general version with multiple covariates:

# Suppose we have covariates in a data frame
df <- data.frame(
  y = y,
  coords_x = coords[,1],
  coords_y = coords[,2],
  altitude = altitude,
  temperature = temperature,
  precipitation = precipitation
)

# Create mesh
mesh <- inla.mesh.2d(
  loc = as.matrix(df[, c("coords_x", "coords_y")]),
  max.edge = c(0.5, 1),
  cutoff = 0.2
)

# Build SPDE model
spde <- inla.spde2.matern(mesh, alpha = 2)

# Projection matrix
A <- inla.spde.make.A(mesh, loc = as.matrix(df[, c("coords_x", "coords_y")]))

# Stack with all covariates
stack <- inla.stack(
  data = list(y = df$y),
  A = list(A, 1),
  effects = list(
    list(spde = 1:spde$n.spde),
    list(
      intercept = 1,
      altitude = df$altitude,
      temperature = df$temperature,
      precipitation = df$precipitation
    )
  )
)

# Formula with all covariates
formula <- y ~ 0 + intercept + altitude + temperature + precipitation + 
           f(spde, model = spde)

# Fit model
result <- inla(formula,
  data = inla.stack.data(stack),
  control.predictor = list(A = inla.stack.A(stack), compute = TRUE),
  control.compute = list(dic = TRUE, waic = TRUE)
)

# Summary
summary(result)

# Extract spatial variance
variance <- inla.spde2.result(result, "spde", spde, do.transf = TRUE)
print(variance$summary.variance)

# Extract measurement error
print(result$summary.hyperpar)

12. Simulating Data with the Given Parameters

Here’s how we would simulate data with the parameters from the Krainski and Lindgren (2013) example:

# Set seed for reproducibility
set.seed(123)

# Parameters from the simulation
b0 <- 10        # Intercept
sigma2 <- 5     # Spatial variance
sigma2_e <- 0.3 # Measurement error variance
kappa <- 7      # Scale parameter
nu <- 1         # Smoothness parameter (lambda = 1)

# Generate 200 random locations in a unit square
n <- 200
coords <- cbind(runif(n), runif(n))

# Build the Matérn covariance matrix
# This uses the fields package for demonstration
library(fields)

# Compute distances
dist_matrix <- rdist(coords)

# Matérn covariance function
matern_cov <- function(h, kappa, nu, sigma2) {
  # h: distance
  # kappa: scale parameter
  # nu: smoothness
  # sigma2: variance
  h <- h * kappa
  if (nu == 0.5) {
    return(sigma2 * exp(-h))
  } else {
    return(sigma2 * (h^nu) * besselK(h, nu) / (2^(nu-1) * gamma(nu)))
  }
}

# Build covariance matrix
Sigma <- matrix(0, n, n)
for (i in 1:n) {
  for (j in 1:n) {
    Sigma[i,j] <- matern_cov(dist_matrix[i,j], kappa, nu, sigma2)
  }
}

# Simulate spatial field
library(MASS)
xi <- mvrnorm(1, mu = rep(0, n), Sigma = Sigma)

# Simulate measurement error
epsilon <- rnorm(n, mean = 0, sd = sqrt(sigma2_e))

# Generate observations
y <- b0 + xi + epsilon

# Check variance decomposition
cat("Variance of xi:", var(xi), "\n")
cat("Variance of epsilon:", var(epsilon), "\n")
cat("Total variance of y:", var(y), "\n")
cat("Proportion spatial:", var(xi) / var(y), "\n")

# Fit SPDE model to the simulated data
mesh <- inla.mesh.2d(loc = coords, max.edge = c(0.2, 0.5))
spde <- inla.spde2.matern(mesh, alpha = 2)
A <- inla.spde.make.A(mesh, loc = coords)

stack <- inla.stack(
  data = list(y = y),
  A = list(A, 1),
  effects = list(
    list(spde = 1:spde$n.spde),
    list(intercept = rep(1, n))
  )
)

formula <- y ~ 0 + intercept + f(spde, model = spde)
result <- inla(formula,
  data = inla.stack.data(stack),
  control.predictor = list(A = inla.stack.A(stack), compute = TRUE)
)

# Compare estimated parameters
summary(result)

13. Final Takeaway

The SPDE approach is revolutionary because:

  1. It preserves the Matérn covariance—the gold standard in spatial statistics.
  2. It replaces dense matrices with sparse ones, enabling big data spatial analysis.
  3. It separates covariate effects (mean structure) from spatial dependence (covariance structure) clearly.
  4. It provides a unified framework for spatial prediction, uncertainty quantification, and parameter estimation.

The equation \((\kappa^2 - \Delta)^{\alpha/2}(\tau \xi(s)) = \mathcal{W}(s)\) is the engine; covariates are the steering wheel. They work together to model complex spatial phenomena efficiently.

Key understanding: When you see \(\sigma^2 = 5\) in the SPDE context, it’s the variance of the hidden spatial field \(\xi(s)\), not the observation noise! This represents the magnitude of spatial variation in your process after accounting for fixed effects.


14. References

  • Lindgren, F., Rue, H., & Lindström, J. (2011). An explicit link between Gaussian fields and Gaussian Markov random fields: the stochastic partial differential equation approach. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 73(4), 423-498.
  • Rue, H., Martino, S., & Chopin, N. (2009). Approximate Bayesian inference for latent Gaussian models by using integrated nested Laplace approximations. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 71(2), 319-392.
  • Krainski, E., Gómez-Rubio, V., Bakka, H., Lenzi, A., Castro-Camilo, D., Simpson, D., … & Rue, H. (2018). Advanced Spatial Modeling with Stochastic Partial Differential Equations Using R and INLA. Chapman and Hall/CRC.