=================================================================================================== PREVIOUS: https://rpubs.com/sherloconan/1198601
THIS: https://rpubs.com/sherloconan/1339374
MORE: https://github.com/zhengxiaoUVic/Spatial
Remark 1: None of the existing approaches take advantage of the additional spatial structure of the ST data; that is, the gene expression levels are measured on a lattice grid (\(\texttt{BOOST-MI}\); Jiang et al., 2022).
nx <- 80; ny <- 80 # N = 6,400 points
coords <- expand.grid(x=seq(0, 1, length.out=nx),
y=seq(0, 1, length.out=ny))
coords <- as.matrix(coords) # convert to numeric matrix N × 2 for INLA functions
For some models, INLA considers data sorted by column, i.e., a vector with the first column of the grid from top to bottom, followed by the second column and so on (Gómez-Rubio, 2021, ch. 7).
lower left corner by rows: internally stored in expand.grid
upper left corner by columns: mapping required by INLA
Build a slightly “tight” polygonal boundary around the points (negative convex shrinks).
Create a Delaunay triangulation constrained to bnd.
The max.edge parameter controls the mesh’s granularity
(i.e., the largest allowed triangle edge length).
A lower
max.edge=c(inside the boundary triangle, outside the boundary triangle)
indicates a higher resolution.
bnd <- inla.nonconvex.hull(coords, convex=-0.05) # polygonal boundary around the points
mesh <- inla.mesh.2d(boundary=bnd,
max.edge=c(0.03, 0.07),
cutoff=0.01) # minimum distance to avoid duplicating very close vertices
cat("The number of vertices is", mesh$n)
## The number of vertices is 4165
Result: mesh$loc is \(n_{\text{spde}}\times3\
(x,y,\color{gray}{z=0})\), and mesh$n is the vertex
count.
What is a Mesh?
A mesh is a triangulated representation of a spatial domain \(\mathcal{D}\). It is created by dividing the domain into triangles (or tetrahedrons in 3D). This structure allows the model to approximate the spatial processes over these smaller regions. The mesh helps in defining the precision matrix \({\mathbf{Q}}\) of the GMRF that represents the spatial random effect. The mesh is passed to the inla.spde2.matern or inla.spde2.pcmatern function to define a spatial model.
Reasons for Triangles Outside the Domain
By including these extra triangles offset, the mesh
ensures more accurate modeling of spatial processes near the boundaries,
improves numerical stability, and enhances the overall quality of the
spatial analysis.
Remark 2: For \(n\times n\) lattice data, a more straightforward approach may involve using an intrinsic GMRF directly. Neighbors are the points directly adjacent on the lattice.
Besag: Used for spatial data that are defined on a lattice. This model is part of the family of intrinsic conditional autoregressive (ICAR) models. Read more.
SPDE: Used for modeling spatial fields that are continuous over
an area, particularly useful for unstructured meshes and non-lattice
data. Increasing alpha implies that the resulting GRF will
have more differentiable sample paths.
SPDE representation of a GRF, \(d=2\)
Remark 3: Non-stationary models are supported for \(\alpha=2\) only in the current implementation of INLA (v25.06.07).
Smoothness: \(\alpha=\nu+d/2\equiv\color{red}{2}\implies \nu=1\) (non-mean-square differentiable)
Spatial scale: \(\kappa=2\sqrt{2\nu}\ /\ \rho=2\sqrt{2}\ /\ \rho\) (autocorrelation is around 0.14 at \(\lVert\mathbf{s}_i-\mathbf{s}_j\rVert=\rho\))
Marginal precision: \(\tau^2=\frac{\Gamma(\nu)}{\Gamma(\alpha)(4\pi)^{d/2}\kappa^{2\nu}}\sigma^{-2}=\frac{1}{4\pi\kappa^2}\sigma^{-2}\)
nu <- 1 # smoothness
range0 <- 0.30 # baseline spatial range
kappa0 <- sqrt(8*nu) / range0 # baseline spatial scale
sigma0 <- 1 # baseline marginal standard deviation
tau0 <- 1 / (2 * sqrt(pi) * kappa0 * sigma0)
Non-stationarity is achieved when \(\tau(\mathbf{s})\) and/or \(\kappa(\mathbf{s})\) are non-constant for \(\mathbf{s}\).
Let \(\ln\kappa\) jump on the right half and \(\ln\tau\) vary smoothly with \(x>0.5\).
xv <- mesh$loc[,1] # extract the x-coordinate at each mesh vertex
x_centered <- as.numeric(scale(xv, scale=F)) # mean-centered x (for identifiability)
right_half <- as.numeric(xv > 0.5) # indicator basis (0 left, 1 right) to allow a jump across x=0.5
The lengths of x_centered and right_half
are each equal to the vertex count.
\[\begin{cases} \ \ln\tau(\mathbf{s})=\sum_i\theta_i^{(\tau)}\!\cdot B_i^{(\tau)}(\mathbf{s})=B_0^{(\tau)}(\mathbf{s})+\theta_1\!\cdot B^{(\tau)}(\mathbf{s}) \\ \\ \ \ln\kappa(\mathbf{s})=\sum_i\theta_i^{(\kappa)}\!\cdot B_i^{(\kappa)}(\mathbf{s})=B_0^{(\kappa)}(\mathbf{s})+\theta_2\!\cdot B^{(\kappa)}(\mathbf{s}) \end{cases}\]Build \(\mathbf{B}\) matrices with offsets from Section 2 and \(x\) coordinates from Section 3
log_tau0_mesh <- rep(log(tau0), mesh$n) # offsets at each mesh vertex (NOT multiplied by theta)
log_kappa0_mesh <- rep(log(kappa0), mesh$n)
B.tau <- cbind(log_tau0_mesh, x_centered, 0) # offset + theta1 * x_centered
B.kappa <- cbind(log_kappa0_mesh, 0, right_half) # offset + theta2 * 1{xv > 0.5}
Each \(\mathbf{B}\) is \(n_{\text{spde}}\times3\): one offset and
two \(\theta\)-columns
(n.theta=2).
Build the non-stationary Matérn‐SPDE on the mesh, using the \(\mathbf{B}\)-matrices
Assume weak Gaussian priors, \(\theta_i\sim\mathcal{N}(0,10^2)\) for \(i\in\{1,2\}\).
spde <- inla.spde2.matern(
mesh,
B.tau = B.tau,
B.kappa = B.kappa,
theta.prior.mean = c(0, 0), # priors on theta1, theta2 (log-scale coefficients)
theta.prior.prec = c(0.01, 0.01) # weakly informative (sd = 10)
)
\(\mathbf{w}\sim\boldsymbol{\mathcal{N}}(\mathbf{0},\mathbf{Q}^{-1})\)
Choose ground-truth non-stationary effects (\(\theta_1=a_1\) and \(\theta_2=b_2\)).
theta_true <- c(
a1 = 0.30, # increases log tau with x --> lowers variance to the right
b1 = -0.50 # decreases log kappa on right_half --> increases range on right
)
Q <- inla.spde2.precision(spde, theta=theta_true) # sparse precision matrix of the GMRF for the SPDE
# one (n = 1) sample path of the latent GMRF on mesh vertices
w <- as.numeric(inla.qsample(n=1, Q=Q, seed=277))
Projection matrix \(\mathbf{A}\) of size \(N\times n_{\text{spde}}\) that interpolates from mesh vertices to observation points.
Observations: \(y_i=w(\mathbf{s}_i)+\epsilon_i\).
Baseline spatial variance component (partial sill): \(\sigma_s^2=1\).
Non-spatial variance component (nugget): \(\sigma_\epsilon^2=0.2^2\).
A_grid <- inla.spde.make.A(mesh, loc=coords) # projection matrix
noiseless <- as.numeric(A_grid %*% w)
noise_sd <- 0.2 # observation noise (nugget effect) sd
set.seed(277)
y <- noiseless + rnorm(nrow(coords), 0, noise_sd)
image(matrix(noiseless, ny, nx)[ny:1, ], # flip the row to make yv increase upwards
col=gray.colors(64), main="True latent field (noiseless)") # quick looks
image(matrix(y, ny, nx)[ny:1, ], # flip the row to make yv increase upwards
col=gray.colors(64), main="Observed latent field") # quick looks
In the model \(y_i=w(\mathbf{s}_i)+\epsilon_i\), no intercept is included, implying a zero mean.
# build the indexing list for the random effect `f()`
w_idx <- inla.spde.make.index("w", n.spde=spde$n.spde)
stk <- inla.stack(
data = list(y=y),
A = list(A_grid), # projection matrix
effects = list(w=w_idx), # effects list containing the SPDE index for `w`
tag = "est"
)
fit <- inla(
y ~ -1 + f(w, model=spde), # remove the intercept
data = inla.stack.data(stk), # unwraps the stack to data lists
family = "gaussian",
control.family = list(hyper = list(
prec = list(prior="pc.prec", param=c(1, 0.01)) # PC prior on the noise precision
)),
control.predictor = list(A=inla.stack.A(stk), compute=T),
control.compute=list("config"=T, "dic"=T, "waic"=T)
) # roughly 3 seconds of run time
# display posterior summary for all hyperparameters
cat("\n--- Hyperparameters ---\n"); print(fit$summary.hyperpar)
##
## --- Hyperparameters ---
## mean sd 0.025quant
## Precision for the Gaussian observations 25.4776175 0.53238957 24.44367369
## Theta1 for w 0.2382861 0.08051712 0.08025481
## Theta2 for w -0.7298762 0.23936825 -1.24268395
## 0.5quant 0.975quant mode
## Precision for the Gaussian observations 25.4727039 26.5397836 25.4642917
## Theta1 for w 0.2381205 0.3972812 0.2374330
## Theta2 for w -0.7158623 -0.3064614 -0.6471118
cat("True precision for the Gaussian observations is", 1/noise_sd^2,
"\nTrue Theta1 for w is", unname(theta_true[1]),
"\nTrue Theta2 for w is", unname(theta_true[2]))
## True precision for the Gaussian observations is 25
## True Theta1 for w is 0.3
## True Theta2 for w is -0.5
## summary(fit)
fit$mlik # log marginal-likelihood
## [,1]
## log marginal-likelihood (integration) -281.5851
## log marginal-likelihood (Gaussian) -280.3409
Warning: The \(\mathbf{A}\)-matrix
in the predictor (see ?control.predictor) is specified but
an intercept is in the formula. This will likely result in the intercept
being applied multiple times in the model, and is likely not what you
want. See ?inla.stack for more information. You can remove
the intercept adding -1 to the formula.
Two methods of computing the log marginal likelihood are
Integration: This approach computes the mlik by
directly integrating the likelihood function over the parameter space.
This method is typically more accurate but can be computationally
intensive. It’s often used when precision is essential for model
comparison.
Gaussian: This method approximates mlik using a
normal approximation. It is generally faster and less computationally
demanding than direct integration, but it might be less accurate,
particularly in cases where the true likelihood is not well-approximated
by a normal distribution (Rue, Martino, & Chopin, 2009,
p. 345).
Keep the same mesh, A_grid, and
y.
Replace the SPDE block with a stationary one (Option A with
spde2.pcmatern, or Option B with constant \(\mathbf{B}\) matrices).
Rebuild the stack with a new index name, fit, and compare (WAIC/DIC, posterior range/sigma).
spde_st <- inla.spde2.pcmatern(
mesh = mesh, alpha = 2,
prior.range = c(0.2, 0.05), # P(range < 0.2) = 0.05
prior.sigma = c(1.0, 0.05) # P(sigma > 1.0) = 0.05
)
wst_idx <- inla.spde.make.index("w_st", n.spde=spde_st$n.spde) # index & stack
stk_st <- inla.stack(
data = list(y = y),
A = list(A_grid),
effects = list(w_st = wst_idx)
)
# fit stationary model
fit_st <- inla(
y ~ -1 + f(w_st, model=spde_st),
data = inla.stack.data(stk_st),
family = "gaussian",
control.family = list(hyper = list(
prec = list(prior="pc.prec", param=c(1, 0.01)) # noise prior (same as before)
)),
control.predictor = list(A=inla.stack.A(stk_st), compute=T),
control.compute=list("config"=T, "dic"=T, "waic"=T)
) # roughly 3 seconds of run time
## summary(fit_st)
fit_st$mlik # log marginal-likelihood
## [,1]
## log marginal-likelihood (integration) -284.2216
## log marginal-likelihood (Gaussian) -282.9829
bf10 <- exp(fit$mlik - fit_st$mlik) # Bayes factor against the null (stationarity)
rownames(bf10) <- c("integration", "Gaussian"); colnames(bf10) <- "BF10.est"
bf10
## BF10.est
## integration 13.96376
## Gaussian 14.04169
INLA reports Theta1 and Theta2 for the SPDE
(our two non-stationary coefficients).
th_rows <- grep("^Theta", rownames(fit$summary.hyperpar))
theta_hat <- fit$summary.hyperpar[th_rows, "mean"] # posterior means (theta1_hat, theta2_hat)
# ln tau(s) = B.tau[,1] + B.tau[,2:3] %*% theta
log_tau_mesh_hat <- as.numeric(B.tau[,1] + B.tau[, -1, drop=F] %*% theta_hat)
# ln kappa(s) = B.kappa[,1]+ B.kappa[,2:3] %*% theta
log_kappa_mesh_hat <- as.numeric(B.kappa[,1] + B.kappa[, -1, drop=F] %*% theta_hat)
tau_mesh_hat <- exp(log_tau_mesh_hat)
kappa_mesh_hat <- exp(log_kappa_mesh_hat)
range_mesh_hat <- sqrt(8) / kappa_mesh_hat
sigma2_mesh_hat <- 1 / (4*pi * kappa_mesh_hat^2 * tau_mesh_hat^2)
proj <- inla.mesh.projector(
mesh, xlim=c(0,1), ylim=c(0,1), dims=c(nx, ny)
) # create a projector for a regular nx × ny image domain
range_grid <- inla.mesh.project(proj, range_mesh_hat) # interpolate mesh values onto the grid
sigma2_grid <- inla.mesh.project(proj, sigma2_mesh_hat)
par(mfrow=c(1,2), mar=c(3,3,3,3))
image(proj$x, proj$y, range_grid, useRaster=T,
xlab ="x", ylab="y", main="Posterior mean range(s)")
abline(v=0.5, lty=2) # larger on the right (x > 0.5)
image(proj$x, proj$y, sigma2_grid, useRaster=T,
xlab="x", ylab="y", main="Posterior mean sigma^2(s)")
abline(v=0.5, lty=2) # a left–right gradient from the tau trend plus a step at x=0.5 from kappa
log_tau_mesh_true <- as.numeric(B.tau[,1] + B.tau[, -1, drop=F] %*% theta_true)
log_kappa_mesh_true <- as.numeric(B.kappa[,1] + B.kappa[, -1, drop=F] %*% theta_true)
tau_mesh_true <- exp(log_tau_mesh_true)
kappa_mesh_true <- exp(log_kappa_mesh_true)
range_mesh_true <- sqrt(8) / kappa_mesh_true
sigma2_mesh_true <- 1 / (4*pi * kappa_mesh_true^2 * tau_mesh_true^2)
# check correlations between truth and estimates on mesh
cat("\n--- Correlation checks on mesh ---\n", "corr(range_true, range_hat): ",
cor(range_mesh_true, range_mesh_hat), "\ncorr(sigma2_true, sigma2_hat): ",
cor(sigma2_mesh_true, sigma2_mesh_hat), "\n")
##
## --- Correlation checks on mesh ---
## corr(range_true, range_hat): 1
## corr(sigma2_true, sigma2_hat): 0.9819166