This document provides a comprehensive derivation of the Conditional Autoregressive (CAR) model, specifically the proper CAR specification implemented in R-INLA (Besag, 1974). We’ll walk through:
Consider \(n\) areas. For each area \(i\):
The CAR model specifies the conditional distribution of \(u_i\) given all other areas \(u_{-i}\) as:
\[ u_i \mid u_{-i} \sim \text{Normal}\left(\mu_i + \sum_{j=1}^n r_{ij}(u_j - \mu_j),\; s_i^2\right) \]
where \(r_{ij} = \phi W_{ij}\).
For centered variables (\(\mu_i = 0\) for all \(i\)), this simplifies to:
\[ u_i \mid u_{-i} \sim \text{Normal}\left(\phi \sum_{j=1}^n W_{ij} u_j,\; \frac{\sigma_u^2}{n_i}\right) \]
\(\phi\): Controls the strength of spatial dependence
\(\sigma_u^2\): Controls the overall variation between spatial random effects
\(n_i\): Areas with more neighbors have smaller conditional variance (more information about their value)
For any multivariate Normal vector \(u \sim \text{MVN}(0, \Sigma)\) with precision matrix \(Q = \Sigma^{-1}\), the conditional distribution of \(u_i\) given all other components \(u_{-i}\) is:
\[ u_i \mid u_{-i} \sim \text{Normal}\left(-\frac{1}{Q_{ii}}\sum_{j \neq i} Q_{ij} u_j,\; \frac{1}{Q_{ii}}\right) \]
Why? This comes from expanding the quadratic form \(u^T Q u\):
\[ u^T Q u = Q_{ii}u_i^2 + 2u_i\sum_{j \neq i} Q_{ij}u_j + \underbrace{\sum_{j \neq i}\sum_{k \neq i} u_j Q_{jk} u_k}_{\text{constant w.r.t. } u_i} \]
When conditioning on \(u_{-i}\), the constant terms combine into a proportionality constant, leaving:
\[ f(u_i \mid u_{-i}) \propto \exp\left(-\frac{1}{2}\left[Q_{ii}u_i^2 + 2u_i\sum_{j \neq i} Q_{ij}u_j\right]\right) \]
Completing the square gives the Normal distribution with: - Mean: \(-\frac{1}{Q_{ii}}\sum_{j \neq i} Q_{ij} u_j\) - Variance: \(\frac{1}{Q_{ii}}\)
We have from the CAR model:
Matching these to the general MVN formula:
\[ \phi \sum_{j} W_{ij} u_j = -\frac{1}{Q_{ii}} \sum_{j \neq i} Q_{ij} u_j \]
Substitute \(Q_{ii} = n_i/\sigma_u^2\):
\[ \phi \sum_{j} W_{ij} u_j = -\frac{\sigma_u^2}{n_i} \sum_{j \neq i} Q_{ij} u_j \]
Multiply both sides by \(n_i/\sigma_u^2\):
\[ \frac{n_i}{\sigma_u^2} \sum_{j} \phi W_{ij} u_j = -\sum_{j \neq i} Q_{ij} u_j \]
Since \(n_i W_{ij} = a_{ij}\):
\[ \frac{1}{\sigma_u^2} \sum_{j} \phi a_{ij} u_j = -\sum_{j \neq i} Q_{ij} u_j \]
Therefore, for \(j \neq i\):
\[ Q_{ij} = -\frac{\phi}{\sigma_u^2} a_{ij} \]
We now have all entries of the precision matrix:
In matrix form:
\[ Q = \frac{1}{\sigma_u^2}(D - \phi A) \]
where: - \(D = \text{diag}(n_1, \ldots, n_n)\) is the degree matrix - \(A = (a_{ij})\) is the adjacency matrix
Since \(W = D^{-1}A\) (row-standardized), we have \(A = DW\):
\[ Q = \frac{1}{\sigma_u^2}(D - \phi DW) = \frac{1}{\sigma_u^2} D (I - \phi W) \]
We have \(S^2 = \text{diag}(s_1^2, \ldots, s_n^2) = \text{diag}(\sigma_u^2/n_1, \ldots, \sigma_u^2/n_n) = \sigma_u^2 D^{-1}\).
Therefore, \(D = \sigma_u^2 S^{-2}\).
Substitute into \(Q\):
\[ Q = \frac{1}{\sigma_u^2}(\sigma_u^2 S^{-2})(I - \phi W) = S^{-2}(I - \phi W) \]
Thus:
\[ Q = (I - \phi W) S^{-2} \]
The covariance matrix is the inverse of the precision matrix:
\[ \Sigma = Q^{-1} = \left[(I - \phi W) S^{-2}\right]^{-1} \]
\[ \Sigma = (I - \phi W)^{-1} S^2 \]
Therefore:
\[ u \sim \text{MVN}(0, (I - \phi W)^{-1} S^2) \]
For the CAR model to be proper (valid covariance matrix), \((I - \phi W)^{-1}\) must exist, meaning \(I - \phi W\) must be positive definite.
Let \(\kappa_1 \leq \kappa_2 \leq \cdots \leq \kappa_n\) be the eigenvalues of \(W\).
The matrix \(I - \phi W\) is positive definite if and only if:
\[ \frac{1}{\min_i \kappa_i} < \phi < \frac{1}{\max_i \kappa_i} \]
For row-standardized \(W\) (rows sum to 1): - \(\max_i \kappa_i = 1\) (always) - \(\min_i \kappa_i\) is typically between \(-1\) and \(0\)
Thus the constraint simplifies to:
\[ \frac{1}{\min_i \kappa_i} < \phi < 1 \]
In practice, this usually means \(\phi \in (0, 1)\) for positive spatial correlation.
For \(u \sim \text{MVN}(0, \Sigma)\) with precision matrix \(Q = \Sigma^{-1}\):
\[ f(u_1, \ldots, u_n) = (2\pi)^{-n/2} |\Sigma|^{-1/2} \exp\left(-\frac{1}{2} u^T Q u\right) \]
\[ u^T Q u = Q_{ii}u_i^2 + 2u_i\sum_{j \neq i} Q_{ij}u_j + \sum_{j \neq i}\sum_{k \neq i} u_j Q_{jk} u_k \]
Let \(C(u_{-i}) = \sum_{j \neq i}\sum_{k \neq i} u_j Q_{jk} u_k\) (constant w.r.t. \(u_i\)).
By definition, \(f(u_i \mid u_{-i}) = \frac{f(u_1, \ldots, u_n)}{f(u_{-i})}\):
\[ f(u_i \mid u_{-i}) = \frac{(2\pi)^{-n/2} |\Sigma|^{-1/2} \exp\left(-\frac{1}{2}\left[Q_{ii}u_i^2 + 2u_i\sum_{j \neq i} Q_{ij}u_j + C(u_{-i})\right]\right)}{f(u_{-i})} \]
\[ f(u_i \mid u_{-i}) \propto \exp\left(-\frac{1}{2}\left[Q_{ii}u_i^2 + 2u_i\sum_{j \neq i} Q_{ij}u_j\right]\right) \]
\[ Q_{ii}u_i^2 + 2u_i\sum_{j \neq i} Q_{ij}u_j = Q_{ii}\left(u_i + \frac{1}{Q_{ii}}\sum_{j \neq i} Q_{ij}u_j\right)^2 - \frac{1}{Q_{ii}}\left(\sum_{j \neq i} Q_{ij}u_j\right)^2 \]
\[ f(u_i \mid u_{-i}) \propto \exp\left(-\frac{Q_{ii}}{2}\left(u_i - \mu\right)^2\right) \]
where:
\[ \mu = -\frac{1}{Q_{ii}}\sum_{j \neq i} Q_{ij} u_j \]
and variance:
\[ \sigma^2 = \frac{1}{Q_{ii}} \]
For \(\begin{pmatrix} u_1 \\ u_2 \end{pmatrix} \sim \text{MVN}\left(0, \begin{pmatrix} \sigma_1^2 & \rho\sigma_1\sigma_2 \\ \rho\sigma_1\sigma_2 & \sigma_2^2 \end{pmatrix}\right)\):
The precision matrix is:
\[ Q = \frac{1}{1-\rho^2} \begin{pmatrix} 1/\sigma_1^2 & -\rho/(\sigma_1\sigma_2) \\ -\rho/(\sigma_1\sigma_2) & 1/\sigma_2^2 \end{pmatrix} \]
So \(Q_{11} = \frac{1}{(1-\rho^2)\sigma_1^2}\) and \(Q_{12} = -\frac{\rho}{(1-\rho^2)\sigma_1\sigma_2}\).
Applying the formula:
\[ E[u_1 \mid u_2] = -\frac{1}{Q_{11}} Q_{12} u_2 = \frac{\rho\sigma_1}{\sigma_2}u_2 \]
This matches the known result ✓
\[ u_i \mid u_{-i} \sim \text{Normal}\left(\mu_i + \phi\sum_{j=1}^n W_{ij}(u_j - \mu_j),\; \frac{\sigma_u^2}{n_i}\right) \]
\[ Q = \frac{1}{\sigma_u^2}(D - \phi A) = (I - \phi W) S^{-2} \]
where: - \(D = \text{diag}(n_1, \ldots, n_n)\) - \(A = (a_{ij})\) - \(S^2 = \text{diag}(\sigma_u^2/n_1, \ldots, \sigma_u^2/n_n)\)
\[ \Sigma = (I - \phi W)^{-1} S^2 \]
\[ u \sim \text{MVN}(0, (I - \phi W)^{-1} S^2) \]
For row-standardized \(W\):
\[ \frac{1}{\min_i \kappa_i} < \phi < 1 \]
where \(\kappa_i\) are the eigenvalues of \(W\).
In R-INLA, the CAR model is typically implemented using the
"bym2" model:
# Example of BYM2 model in R-INLA
library(INLA)
# Define the model formula
formula <- y ~ 1 + f(id, model = "bym2",
graph = adjacency_matrix,
hyper = list(
prec.unstruct = list(prior = "pc.prec", param = c(1, 0.01)),
prec.spatial = list(prior = "pc.prec", param = c(1, 0.01)),
phi = list(prior = "pc", param = c(0.5, 2/3))
))
# Fit the model
result <- inla(formula, data = data, family = "gaussian")
Key parameters in R-INLA: - \(\phi\): Proportion of variance attributed to spatial structure (0 = unstructured, 1 = purely spatial) - \(\sigma_u^2\): Overall variance parameter - PC priors: Used to regularize estimates and shrink toward simpler models
Besag, J. (1974). Spatial interaction and the statistical analysis of lattice systems. Journal of the Royal Statistical Society: Series B, 36(2), 192-225.
Cressie, N. A. C. (1993). Statistics for Spatial Data. Wiley.
Rue, H., & Held, L. (2005). Gaussian Markov Random Fields: Theory and Applications. Chapman & Hall/CRC.
R-INLA Project: https://www.r-inla.org/