Introduction

This document provides a comprehensive derivation of the Conditional Autoregressive (CAR) model, specifically the proper CAR specification implemented in R-INLA (Besag, 1974). We’ll walk through:

  1. The conditional specification of the model
  2. Derivation of the joint precision matrix
  3. Derivation of the covariance matrix
  4. Key properties and parameters
  5. Mathematical foundations of conditional Normal distributions

1. The CAR Model Specification

1.1 Notation

Consider \(n\) areas. For each area \(i\):

  • \(\mathcal{N}(i)\): Set of neighbors of area \(i\)
  • \(n_i = \#\mathcal{N}(i)\): Number of neighbors of area \(i\)
  • \(a_{ij}\): Adjacency indicator (\(a_{ij} = 1\) if areas \(i\) and \(j\) are neighbors, 0 otherwise; \(a_{ii} = 0\))
  • \(W_{ij} = a_{ij} / n_i\): Row-standardized weight matrix
  • \(\mu_i\): Mean for area \(i\)
  • \(\sigma_u^2\): Overall variance parameter
  • \(s_i^2 = \sigma_u^2 / n_i\): Conditional variance for area \(i\)
  • \(\phi\): Spatial autocorrelation parameter

1.2 Conditional Distribution

The CAR model specifies the conditional distribution of \(u_i\) given all other areas \(u_{-i}\) as:

\[ u_i \mid u_{-i} \sim \text{Normal}\left(\mu_i + \sum_{j=1}^n r_{ij}(u_j - \mu_j),\; s_i^2\right) \]

where \(r_{ij} = \phi W_{ij}\).

For centered variables (\(\mu_i = 0\) for all \(i\)), this simplifies to:

\[ u_i \mid u_{-i} \sim \text{Normal}\left(\phi \sum_{j=1}^n W_{ij} u_j,\; \frac{\sigma_u^2}{n_i}\right) \]

Key Parameters:

  • \(\phi\): Controls the strength of spatial dependence

    • \(\phi = 0\): No spatial correlation (independence)
    • \(\phi > 0\): Positive spatial correlation (clustering)
    • \(\phi < 0\): Negative spatial correlation (checkerboard pattern)
    • \(\phi \to 1\): Strong positive spatial dependence
  • \(\sigma_u^2\): Controls the overall variation between spatial random effects

  • \(n_i\): Areas with more neighbors have smaller conditional variance (more information about their value)

2. Derivation of the Precision Matrix

2.1 Key Property of Multivariate Normal

For any multivariate Normal vector \(u \sim \text{MVN}(0, \Sigma)\) with precision matrix \(Q = \Sigma^{-1}\), the conditional distribution of \(u_i\) given all other components \(u_{-i}\) is:

\[ u_i \mid u_{-i} \sim \text{Normal}\left(-\frac{1}{Q_{ii}}\sum_{j \neq i} Q_{ij} u_j,\; \frac{1}{Q_{ii}}\right) \]

Why? This comes from expanding the quadratic form \(u^T Q u\):

\[ u^T Q u = Q_{ii}u_i^2 + 2u_i\sum_{j \neq i} Q_{ij}u_j + \underbrace{\sum_{j \neq i}\sum_{k \neq i} u_j Q_{jk} u_k}_{\text{constant w.r.t. } u_i} \]

When conditioning on \(u_{-i}\), the constant terms combine into a proportionality constant, leaving:

\[ f(u_i \mid u_{-i}) \propto \exp\left(-\frac{1}{2}\left[Q_{ii}u_i^2 + 2u_i\sum_{j \neq i} Q_{ij}u_j\right]\right) \]

Completing the square gives the Normal distribution with: - Mean: \(-\frac{1}{Q_{ii}}\sum_{j \neq i} Q_{ij} u_j\) - Variance: \(\frac{1}{Q_{ii}}\)

2.2 Matching the CAR Specification

We have from the CAR model:

  • Conditional mean: \(E[u_i \mid u_{-i}] = \phi \sum_{j} W_{ij} u_j\)
  • Conditional precision: \(Q_{ii} = n_i / \sigma_u^2\)

Matching these to the general MVN formula:

\[ \phi \sum_{j} W_{ij} u_j = -\frac{1}{Q_{ii}} \sum_{j \neq i} Q_{ij} u_j \]

Substitute \(Q_{ii} = n_i/\sigma_u^2\):

\[ \phi \sum_{j} W_{ij} u_j = -\frac{\sigma_u^2}{n_i} \sum_{j \neq i} Q_{ij} u_j \]

Multiply both sides by \(n_i/\sigma_u^2\):

\[ \frac{n_i}{\sigma_u^2} \sum_{j} \phi W_{ij} u_j = -\sum_{j \neq i} Q_{ij} u_j \]

Since \(n_i W_{ij} = a_{ij}\):

\[ \frac{1}{\sigma_u^2} \sum_{j} \phi a_{ij} u_j = -\sum_{j \neq i} Q_{ij} u_j \]

Therefore, for \(j \neq i\):

\[ Q_{ij} = -\frac{\phi}{\sigma_u^2} a_{ij} \]

2.3 Constructing the Full Precision Matrix

We now have all entries of the precision matrix:

  • Diagonal: \(Q_{ii} = \frac{n_i}{\sigma_u^2}\)
  • Off-diagonal: \(Q_{ij} = -\frac{\phi}{\sigma_u^2} a_{ij}\) for \(i \neq j\)

In matrix form:

\[ Q = \frac{1}{\sigma_u^2}(D - \phi A) \]

where: - \(D = \text{diag}(n_1, \ldots, n_n)\) is the degree matrix - \(A = (a_{ij})\) is the adjacency matrix

Since \(W = D^{-1}A\) (row-standardized), we have \(A = DW\):

\[ Q = \frac{1}{\sigma_u^2}(D - \phi DW) = \frac{1}{\sigma_u^2} D (I - \phi W) \]

3. Derivation of the Covariance Matrix

3.1 From Precision to Covariance

We have \(S^2 = \text{diag}(s_1^2, \ldots, s_n^2) = \text{diag}(\sigma_u^2/n_1, \ldots, \sigma_u^2/n_n) = \sigma_u^2 D^{-1}\).

Therefore, \(D = \sigma_u^2 S^{-2}\).

Substitute into \(Q\):

\[ Q = \frac{1}{\sigma_u^2}(\sigma_u^2 S^{-2})(I - \phi W) = S^{-2}(I - \phi W) \]

Thus:

\[ Q = (I - \phi W) S^{-2} \]

3.2 The Covariance Matrix

The covariance matrix is the inverse of the precision matrix:

\[ \Sigma = Q^{-1} = \left[(I - \phi W) S^{-2}\right]^{-1} \]

\[ \Sigma = (I - \phi W)^{-1} S^2 \]

Therefore:

\[ u \sim \text{MVN}(0, (I - \phi W)^{-1} S^2) \]

4. Properness Condition

For the CAR model to be proper (valid covariance matrix), \((I - \phi W)^{-1}\) must exist, meaning \(I - \phi W\) must be positive definite.

Let \(\kappa_1 \leq \kappa_2 \leq \cdots \leq \kappa_n\) be the eigenvalues of \(W\).

The matrix \(I - \phi W\) is positive definite if and only if:

\[ \frac{1}{\min_i \kappa_i} < \phi < \frac{1}{\max_i \kappa_i} \]

For row-standardized \(W\) (rows sum to 1): - \(\max_i \kappa_i = 1\) (always) - \(\min_i \kappa_i\) is typically between \(-1\) and \(0\)

Thus the constraint simplifies to:

\[ \frac{1}{\min_i \kappa_i} < \phi < 1 \]

In practice, this usually means \(\phi \in (0, 1)\) for positive spatial correlation.

5. Mathematical Foundation: Conditional Normal Distribution

5.1 Detailed Derivation of \(E[u_i \mid u_{-i}] = -\frac{1}{Q_{ii}}\sum_{j \neq i} Q_{ij} u_j\)

Step 1: Joint Density

For \(u \sim \text{MVN}(0, \Sigma)\) with precision matrix \(Q = \Sigma^{-1}\):

\[ f(u_1, \ldots, u_n) = (2\pi)^{-n/2} |\Sigma|^{-1/2} \exp\left(-\frac{1}{2} u^T Q u\right) \]

Step 2: Expand Quadratic Form

\[ u^T Q u = Q_{ii}u_i^2 + 2u_i\sum_{j \neq i} Q_{ij}u_j + \sum_{j \neq i}\sum_{k \neq i} u_j Q_{jk} u_k \]

Let \(C(u_{-i}) = \sum_{j \neq i}\sum_{k \neq i} u_j Q_{jk} u_k\) (constant w.r.t. \(u_i\)).

Step 3: Conditional Density

By definition, \(f(u_i \mid u_{-i}) = \frac{f(u_1, \ldots, u_n)}{f(u_{-i})}\):

\[ f(u_i \mid u_{-i}) = \frac{(2\pi)^{-n/2} |\Sigma|^{-1/2} \exp\left(-\frac{1}{2}\left[Q_{ii}u_i^2 + 2u_i\sum_{j \neq i} Q_{ij}u_j + C(u_{-i})\right]\right)}{f(u_{-i})} \]

\[ f(u_i \mid u_{-i}) \propto \exp\left(-\frac{1}{2}\left[Q_{ii}u_i^2 + 2u_i\sum_{j \neq i} Q_{ij}u_j\right]\right) \]

Step 4: Complete the Square

\[ Q_{ii}u_i^2 + 2u_i\sum_{j \neq i} Q_{ij}u_j = Q_{ii}\left(u_i + \frac{1}{Q_{ii}}\sum_{j \neq i} Q_{ij}u_j\right)^2 - \frac{1}{Q_{ii}}\left(\sum_{j \neq i} Q_{ij}u_j\right)^2 \]

Step 5: Identify the Normal Distribution

\[ f(u_i \mid u_{-i}) \propto \exp\left(-\frac{Q_{ii}}{2}\left(u_i - \mu\right)^2\right) \]

where:

\[ \mu = -\frac{1}{Q_{ii}}\sum_{j \neq i} Q_{ij} u_j \]

and variance:

\[ \sigma^2 = \frac{1}{Q_{ii}} \]

5.2 Verification with 2D Example

For \(\begin{pmatrix} u_1 \\ u_2 \end{pmatrix} \sim \text{MVN}\left(0, \begin{pmatrix} \sigma_1^2 & \rho\sigma_1\sigma_2 \\ \rho\sigma_1\sigma_2 & \sigma_2^2 \end{pmatrix}\right)\):

The precision matrix is:

\[ Q = \frac{1}{1-\rho^2} \begin{pmatrix} 1/\sigma_1^2 & -\rho/(\sigma_1\sigma_2) \\ -\rho/(\sigma_1\sigma_2) & 1/\sigma_2^2 \end{pmatrix} \]

So \(Q_{11} = \frac{1}{(1-\rho^2)\sigma_1^2}\) and \(Q_{12} = -\frac{\rho}{(1-\rho^2)\sigma_1\sigma_2}\).

Applying the formula:

\[ E[u_1 \mid u_2] = -\frac{1}{Q_{11}} Q_{12} u_2 = \frac{\rho\sigma_1}{\sigma_2}u_2 \]

This matches the known result ✓

6. Summary of All Key Formulas

6.1 Conditional Specification

\[ u_i \mid u_{-i} \sim \text{Normal}\left(\mu_i + \phi\sum_{j=1}^n W_{ij}(u_j - \mu_j),\; \frac{\sigma_u^2}{n_i}\right) \]

6.2 Precision Matrix

\[ Q = \frac{1}{\sigma_u^2}(D - \phi A) = (I - \phi W) S^{-2} \]

where: - \(D = \text{diag}(n_1, \ldots, n_n)\) - \(A = (a_{ij})\) - \(S^2 = \text{diag}(\sigma_u^2/n_1, \ldots, \sigma_u^2/n_n)\)

6.3 Covariance Matrix

\[ \Sigma = (I - \phi W)^{-1} S^2 \]

6.4 Joint Distribution

\[ u \sim \text{MVN}(0, (I - \phi W)^{-1} S^2) \]

6.5 Properness Constraint

For row-standardized \(W\):

\[ \frac{1}{\min_i \kappa_i} < \phi < 1 \]

where \(\kappa_i\) are the eigenvalues of \(W\).

7. Practical Implementation in R-INLA

In R-INLA, the CAR model is typically implemented using the "bym2" model:

# Example of BYM2 model in R-INLA
library(INLA)

# Define the model formula
formula <- y ~ 1 + f(id, model = "bym2", 
                     graph = adjacency_matrix,
                     hyper = list(
                       prec.unstruct = list(prior = "pc.prec", param = c(1, 0.01)),
                       prec.spatial = list(prior = "pc.prec", param = c(1, 0.01)),
                       phi = list(prior = "pc", param = c(0.5, 2/3))
                     ))

# Fit the model
result <- inla(formula, data = data, family = "gaussian")

Key parameters in R-INLA: - \(\phi\): Proportion of variance attributed to spatial structure (0 = unstructured, 1 = purely spatial) - \(\sigma_u^2\): Overall variance parameter - PC priors: Used to regularize estimates and shrink toward simpler models

References

  1. Besag, J. (1974). Spatial interaction and the statistical analysis of lattice systems. Journal of the Royal Statistical Society: Series B, 36(2), 192-225.

  2. Cressie, N. A. C. (1993). Statistics for Spatial Data. Wiley.

  3. Rue, H., & Held, L. (2005). Gaussian Markov Random Fields: Theory and Applications. Chapman & Hall/CRC.

  4. R-INLA Project: https://www.r-inla.org/