Terminology breakdown for easier understanding of INLA

What is a model?

A model is a simplified representation or abstraction of reality designed to help us understand, explain, or predict phenomena or data. Models can take different forms.

What are the general mathematical/statistical models?

The general mathematical/statistical models are broadly categorized into several types, each serving distinct purposes.

  1. General linear models: Apply to continuous outcomes, predicting values based on one or more predictors. They are also commonly referred to as linear regression models.
  2. Generalized linear models: Extend continuous-outcome (linear regression) modeling to handle other outcome types, such as counts, binary outcomes, and rates (discrete outcomes), through specific link functions.
  3. Bayesian models: Incorporate prior knowledge with observed data to update beliefs or predictions probabilistically.
  4. Latent variable models: Handle unobserved or hidden variables influencing observed data. Includes Latent Gaussian Models, Factor Analysis, and Structural Equation Modeling.
  5. Time series models: Analyze data collected over time to identify patterns or forecast future observations (e.g., ARIMA, Seasonal models).
  6. Spatial models: Used for data involving spatial relationships, helping identify geographic patterns or clusters (e.g., spatial GLMs, geostatistics).
  7. Machine learning models: Algorithms (e.g., Decision Trees, Random Forests, Neural Networks) that learn patterns from data to predict outcomes or classify data.

Out of these types of models, Integrated Nested Laplace Approximation (INLA) is a tool for handling General/Generalized Linear Models (with spatial/hierarchical structure) and Latent Gaussian Models within a Bayesian analytical framework.

What is INLA?

INLA (Integrated Nested Laplace Approximation) is a computational approach specifically designed to efficiently perform Bayesian inference for Latent Gaussian Models (LGMs). It quickly estimates complex models that include hidden or latent factors, especially when traditional methods (like Markov Chain Monte Carlo, MCMC) would be slow or computationally demanding.

Why use INLA?

INLA Function

What are the commonly used arguments?

General format for INLA formula

\[\text{response} \sim \text{fixed effects + f(spatial/temporal/random effects)}\]
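As a minimal sketch of this format (assuming the R-INLA package is installed, and using a hypothetical data frame my_data with placeholder column names cases, temperature, rainfall, week, and district), an INLA call could look like:

library(INLA)

# A minimal, hypothetical call: fixed effects plus one structured temporal
# effect and one unstructured random effect
fit <- inla(cases ~ temperature + rainfall +
              f(week, model = "rw1") +        # structured temporal effect
              f(district, model = "iid"),     # unstructured random effect
            family = "poisson",               # likelihood for count outcomes
            data = my_data)
summary(fit)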

General Linear Models (GLM)

A General Linear Model (GLM) is a broad framework for modeling the relationship between one or more predictor variables (independent variables) and a response variable (dependent variable) using linear equations.

  1. Basic Structure of GLM

The general form of a GLM is:

\[Y=X\beta+\epsilon\] where:

\(Y\) is the dependent variable (response variable)

\(X\) is the design matrix (independent variables)

\(\beta\) is the vector of coefficients (parameters)

\(\epsilon\) is the error term (assumed to follow a normal distribution with mean 0 and variance \(\sigma^2\))

Types of General Linear Models:

GLMs cover different statistical models, including:

  • Simple Linear Regression: One predictor variable

\[Y_i = \beta_0 + \beta_1X_i + \epsilon_i\] R code for simple linear regression.

simpleLinearModel <- lm(formula = y ~ X, data = my_data)
  • Multiple Linear Regression: Multiple predictor variables

\[Y_i = \beta_0 + \beta_1X_{i1} + \beta_2X_{i2} + ... + \beta_pX_{ip} + \epsilon_i\]

R code for multiple linear regression.

multipleLinearModel <- lm(formula = y ~ X1 + X2, data = my_data)
  • Analysis of Variance (ANOVA): Compares means of multiple groups

  • Analysis of Covariance (ANCOVA): ANOVA + covariates (continuous variables)

  • Multivariate Linear Regression (MLR): Multiple dependent variables modeled together

Assumptions of GLM

For a standard GLM, the key assumptions are:

  • Linearity: The relationship between predictors and response is linear.

  • Independence: Observations are independent of each other.

  • Homoscedasticity: Equal variance of residuals (constant variance).

  • Normality: Residuals follow a normal distribution.

Generalized Linear Models

A General Linear Model (GLM) is traditionally used for modeling continuous response variables. However, many real-world applications involve discrete data types, such as count data (e.g., the number of dengue cases) or dichotomous (binary) outcomes (e.g., disease presence: Yes/No). Since standard linear regression assumes a continuous response and normally distributed errors, it does not natively support these types of data. To overcome this limitation, Generalized Linear Models extend the linear regression framework by introducing a link function, which transforms the expected value of the response variable into a continuous scale. This transformation allows discrete response variables to be modeled effectively while maintaining interpretability and statistical validity.

A link function in a Generalized Linear Model plays a crucial role in transforming the relationship between the predictor variables and the response variable. It ensures that predictions remain within a valid range while maintaining linearity in the predictor space. A link function, usually denoted \(g(\mu)\), maps the expected value \(\mu\) of the response variable onto a continuous, unbounded scale so that a linear model can be applied. Let’s explore how link functions affect GLM modeling.

Why is the link function important?

A link function helps in cases where a direct linear relationship between predictors and response variables does not make sense.

Let’s say we want to predict the probability of a person having a disease using temperature and humidity.

If we fit a simple linear regression:

\[P(disease) = \beta_0 + \beta_1*Temperature + \beta_2*Humidity\] This model can predict probabilities less than 0 or greater than 1, which is not valid.

The issue can be resolved using the logit link function, which maps the unbounded linear predictor (ranging from \(-\infty\) to \(+\infty\)) onto the bounded probability space (0 to 1) of the response variable. Let’s see how it works.

\[log\left(\frac{P}{1-P}\right) = \beta_0 + \beta_1*Temperature + \beta_2*Humidity\] This gives the relationship between the log odds of the response variable and the linear predictor. However, we are interested in estimating the probability of having the disease, so we need to transform the log odds back to a probability by applying the sigmoid (inverse logit) function.

If

\[x = log\left(\frac{P}{1-P}\right) = \beta_0 + \beta_1*Temperature + \beta_2*Humidity\] then \[P(disease) = \frac{e^x}{1+e^x}\] Note: exponentiation (\(e^x\)) is the inverse of the log-odds transformation \(log\left(\frac{P}{1-P}\right)\).

Numerical example:

  1. If \(log(odds) = \beta_0 + \beta_1*Temperature + \beta_2*Humidity = 2\)
  2. then \(e^x = e^2\), which is the odds of having the disease (7.39)
  3. \(P(disease) = \frac{e^x}{1+e^x} = \frac{7.39}{1+7.39} = 0.88\)
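To connect the arithmetic above to code, here is a small R check (the value 2 for the log odds is the assumed example value, not an estimate from data):

log_odds <- 2                  # assumed beta_0 + beta_1*Temperature + beta_2*Humidity
odds     <- exp(log_odds)      # about 7.39, the odds of having the disease
prob     <- odds / (1 + odds)  # about 0.88; identical to plogis(log_odds)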

Graphical example:

R code for Generalized Linear Models

generalizedLinearModel <- glm(formula = y ~ x1 + x2 + x3, 
                              family = binomial(link = "logit"), data = my_data)

Latent Gaussian Models

A Latent Gaussian Model (LGM) is a probabilistic model where the observed data is influenced by latent (hidden) Gaussian variables. These models are useful in spatial statistics, time-series modeling, and machine learning when we suspect hidden structures exist in our data but are not directly observable. In many real-world situations, we measure observable variables but some unobserved factors influence the outcome. Instead of ignoring these hidden effects, LGMs help us estimate them and make better predictions.

Latent effects (or latent variables) are unobserved factors that influence the final outcome of the disease process but are not directly measured. Instead, they are inferred from observed data using statistical models. The latent variables are assumed to follow a Gaussian distribution.

Imagine you’re studying dengue cases and you collect data on temperature, humidity, and rainfall. You notice patterns, but the numbers don’t perfectly explain why some areas have more dengue cases than others.

That’s because there might be hidden factors at play—things we can’t directly measure, like:

  • How urbanization affects mosquito breeding.
  • How local water drainage impacts stagnant water.
  • Human behaviors, like leaving water containers open.

These hidden factors are called latent variables. Even though we don’t have direct measurements for them, they influence the number of dengue cases. We can still estimate their effects indirectly by looking at patterns in the data.

How are Latent Variables Linked to Observed Data?

Observed variables (like temperature, humidity, and rainfall) influence dengue cases directly. Latent variables (like unmeasured mosquito breeding conditions) also influence dengue cases, but we don’t see them directly.

Latent variables act like hidden forces shaping the observed data. We don’t see them, but we infer them by modeling patterns in the data. LGMs help us estimate these hidden effects, making our predictions more accurate even when important factors aren’t directly measured.

We assume that latent variables follow a smooth pattern over time and space. This allows us to estimate them using statistical models.

How to Identify a Plausible Latent Variable for a Latent Gaussian Model (LGM)?

When choosing a latent variable for an LGM, consider the following steps:

  1. Identify the Unobserved Factor: Look for an underlying phenomenon affecting observed data.

    Example: In spatial modeling, temperature at unmeasured locations can be a latent variable.

  2. Check for Correlation Structures: If your data has spatial, temporal, or hierarchical dependencies, a latent Gaussian model may be useful.

    Example: A latent Gaussian random field can model smooth spatial variations.

  3. Consider a Probabilistic Relationship: If your data is modeled with a non-Gaussian likelihood (e.g., Poisson, Binomial), a Gaussian latent variable may smooth variations.

  4. Use Domain Knowledge: Experts in the field often know which variables might be latent.

    Example: In economics, “economic confidence” is not directly measured but inferred from stock market indices and surveys.

  5. Use Factor Analysis or PCA

  • Exploratory Factor Analysis (EFA): Helps detect latent factors from observed variables.
  • Principal Component Analysis (PCA): Identifies hidden patterns in correlated variables.

An LGM consists of three main layers:

Layer | Explanation | Examples in disease modeling
Likelihood layer (observed data) | Defines how we observe the data | Dengue case reports at different locations
Latent Gaussian process (hidden layer) | Represents the hidden structure influencing the observed data | Unmeasured mosquito breeding conditions, human mobility, spatial structure, seasonality
Prior on hyperparameters | Controls the latent Gaussian structure | Controls the smoothness of spatial or time effects

What is a hyper-parameter?

A hyperparameter is a parameter that controls the behavior or structure of the model itself, rather than being directly estimated as part of the model’s fit like usual (fixed or random) parameters.

General LGM formula

\[Y_i = f(X_i) + L_i + \epsilon_i\]

where:

\(Y_i\) = Observed variable (e.g., number of dengue cases at location)

\(f(X_i)\) = Known effect of covariates (e.g., temperature, humidity).

\(L_i\) = Latent effect (hidden factor like mosquito breeding patterns).

\(ϵ_i\) = Random noise (unexplained variation).

The latent effect \(L_i\) follows a Gaussian process:

\[L \sim N(0,\Sigma)\]

Here is a worked example,

Scenario: We aim to predict weekly dengue cases using reported weekly dengue case numbers, along with mean weekly temperature, humidity, and rainfall data.

In this example, we will construct a Latent Gaussian Model (LGM) to predict weekly dengue cases based on observed environmental factors and latent effects. Our goal is to model:

  • Observed variables:
    • Weekly dengue case numbers (count data and response variable)
    • Mean temperature, humidity, and rainfall (continuous predictors)
  • Latent factors:
    • Temporal variation (seasonality): To capture periodic trends in dengue cases
    • Spatial structure of reporting health units: To account for geographic differences in case reporting.

Step 1:

Layer | Explanation | Examples in disease modeling
Likelihood layer (observed data) | Defines the distribution of the observed response variable | Weekly dengue case counts modeled using a Poisson or Negative Binomial distribution
Latent Gaussian process (hidden layer) | Models hidden influences in the data, such as trends and spatial correlations | Seasonal (temporal) effect using a random walk model; spatial correlation (geographic effect) across health units
Prior on hyperparameters | Defines how smooth or structured the latent effects should be | Parameters controlling the smoothness of time and spatial variation

Step 2:

Since dengue cases are count data, we use a Poisson likelihood function:

\[Y_i \sim Poisson(\lambda_i)\]

where:

  • \(Y_i\) is the observed number of dengue cases in week \(i\)
  • \(\lambda_i\) is the expected number of cases, modeled using a log-link function

\[log(\lambda_i) = f(Temperature_i, Humidity_i, Rainfall_i) + L_{time,i} + L_{space,i}\] \[log(\lambda_i) = \beta_0 + \beta_{Temperature}Temperature_i + \beta_{Humidity}Humidity_i + \beta_{Rainfall}Rainfall_i + L_{time,i} + L_{space,i}\] where,

\(\lambda_i\) = Expected number of dengue cases in week \(i\)

\(\beta_0\) = Intercept term

\(Temperature, Humidity, Rainfall\) = Observed covariates

\(\beta\) = Coefficient for each predictor

\(L_{time,i}\) = Latent temporal effect (captures seasonality)

\(L_{space,i}\) = Latent spatial effect (captures geographical variations)

Step by step explanation:

The log-link function transforms the linear predictor into a Poisson mean parameter.

\[Y_i \sim Poisson(\lambda_i)\]

Since \(\lambda_i\) must be non-negative, we model it as,

\[\lambda_i = exp(\eta_i)\]

where \(\eta_i\) is the linear predictor.

The linear predictor \(\eta_i\) includes:

  • Fixed effects (known covariates like temperature, humidity, rainfall).
  • Latent temporal variation \(L_{time,i}\)
  • Latent spatial effect \(L_{space,i}\)

Therefore, we can write,

  1. \[\eta_i = \beta_0+\beta_{Temperature}Temperature_i+\beta_{Humidity}Humidity_i+\beta_{Rainfall}Rainfall_i+L_{time,i}+L_{space,i}\]

  2. \[\lambda_i = exp(\beta_0+\beta_{Temperature}Temperature_i+\beta_{Humidity}Humidity_i+\beta_{Rainfall}Rainfall_i+L_{time,i}+L_{space,i})\]

  3. \[log(\lambda_i) = \beta_0+\beta_{Temperature}Temperature_i+\beta_{Humidity}Humidity_i+\beta_{Rainfall}Rainfall_i+L_{time,i}+L_{space,i}\]

This model can be summarized as follows,

\[\lambda_i = exp\left(\beta_0+\sum_{j=1}^{p}\beta_{j}X_{ij}+L_{time,i}+L_{space,i}\right)\]

Use of log-link function guarantees that \(\lambda_i\) remains positive, ensuring that the Poisson model correctly estimates dengue cases.
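A minimal R-INLA sketch of this model might look like the following; the data frame dengue_data, its column names, and the adjacency graph district_graph are hypothetical placeholders, not part of a real dataset:

library(INLA)

# Fixed effects plus the two latent effects described above
dengue_formula <- cases ~ temperature + humidity + rainfall +
  f(week, model = "rw1") +                              # latent temporal effect (seasonality)
  f(district, model = "besag", graph = district_graph)  # latent spatial effect

dengue_fit <- inla(dengue_formula,
                   family = "poisson",   # Poisson likelihood; log link is the default
                   data   = dengue_data)
summary(dengue_fit)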

A numerical example to demonstrate why we need to use a generalized linear model with a log-link function rather than a general linear model.

Scenario: We want to predict weekly dengue cases (Y) based on temperature (X) using a General Linear Model (GLM).

We have the following data:

Week | Temperature (X) | Observed dengue cases (Y)
1 | \(28^\circ C\) | 5
2 | \(30^\circ C\) | 10
3 | \(32^\circ C\) | 18
4 | \(34^\circ C\) | 25

Approach 1: Linear Regression (Without Log-Link)

A standard linear regression assumes:

\[Y_i = \beta_0+\beta_1X_i+\epsilon_i\] Let’s assume we fit a linear model and estimate the coefficients:

\[\hat{Y} = -50+2.5X\] Predicted dengue patient counts based on temperature observations:

Temperature (X) | Predicted cases using the model \(\hat{Y} = -50+2.5X\)
\(28^\circ C\) | -50+(2.5*28) = 20
\(30^\circ C\) | -50+(2.5*30) = 25
\(32^\circ C\) | -50+(2.5*32) = 30
\(22^\circ C\) | -50+(2.5*22) = 5
\(18^\circ C\) | -50+(2.5*18) = -5

Problem:

The linear model predicts negative counts, which is not possible for count data.
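A quick way to see this in R, using the assumed coefficients -50 and 2.5 from above (not values estimated from real data):

temps <- c(28, 30, 32, 22, 18)      # temperatures from the table above
pred_linear <- -50 + 2.5 * temps    # linear-model predictions
pred_linear                         # 20 25 30  5 -5  (a negative count appears)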

Approach 2: Poisson Regression (With Log-Link Function)

To ensure non-negative predictions, we use a log-link function.

\[log(\lambda_i) = \beta_0+\beta_1X_i\] Rearrange to get

\[\lambda_i = e^{\beta_0+\beta_1X_i}\] Let’s assume our Poisson regression model estimates the parameters as,

\[log(\lambda) = -5+0.2X\]

Now, let’s compute the prediction,

Predicted Values Using the Log-Link Model

Temperature | Linear predictor (-5+0.2X) | Exponentiated predictor (\(e^{-5+0.2X}\))
\(28^\circ C\) | -5+(0.2*28) = 0.6 | \(e^{0.6}\) = 1.82
\(30^\circ C\) | -5+(0.2*30) = 1.0 | \(e^{1.0}\) = 2.72
\(32^\circ C\) | -5+(0.2*32) = 1.4 | \(e^{1.4}\) = 4.06
\(22^\circ C\) | -5+(0.2*22) = -0.6 | \(e^{-0.6}\) = 0.55
\(18^\circ C\) | -5+(0.2*18) = -1.4 | \(e^{-1.4}\) = 0.25

Therefore, all predictions are positive, avoiding invalid negative values.
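The same temperatures under the log-link model, again using the assumed coefficients -5 and 0.2:

temps <- c(28, 30, 32, 22, 18)
pred_poisson <- exp(-5 + 0.2 * temps)   # log-link predictions are exponentiated
round(pred_poisson, 2)                  # 1.82 2.72 4.06 0.55 0.25 -- all positive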

Three words/terms that may be important in the next few paragraphs

  • Independent: means that the random variables in question do not influence each other; the occurrence or value of one variable has no effect on the other.

  • Identical: means that all random variables have the same probability distribution (e.g., normal, binomial, Poisson, etc.). In the case of a normal distribution, all variables have the same mean and variance.

  • Exchangeable: refers to a property where the joint distribution of the random variables remains the same when the order of the variables is permuted (swapped). This means that the variables are interchangeable without changing the joint distribution.

    \[P(X_1,X_2,X_3) = P(X_2,X_1,X_3)\]

Understanding different model representations for latent predictors of spatial and time in INLA

In the R-INLA package, latent predictors (also called latent effects or latent models) are used to model structured and unstructured dependencies in a Bayesian framework. The Integrated Nested Laplace Approximation (INLA) method allows for flexible latent model representations. Below are the key model representations for latent predictors in R-INLA:

  1. Gaussian Random Walk Models
  2. Autoregressive Models (AR)
  3. Gaussian Markov Random Fields (GMRF)
  4. Stochastic Partial Differential Equation (SPDE) Models
  5. Splines & Nonparametric Smoothing
  6. Independent and Exchangeable Random Effects
  7. Nested and Grouped Random Effects
  8. Survival & Hazard Models

Choosing the right latent model

Scenario | Latent models to be used
Structured time effects | rw1, rw2 or ar1
Spatial data | besag, spde or bym
Smoothing trends | rw2, gp or pspline
Unstructured variability | iid or exchangeable
Hierarchical models | iid with group
Survival models | rw1 or coxph

What is a Random Walk model?

Imagine you’re taking a random walk—each step you take depends on where you were before, but with a little randomness. That’s the basic idea of a Gaussian Random Walk Model. It helps us model trends where each value depends on the past, with some uncertainty.

In R-INLA, two common types of random walks are used:

  • RW1 (Random Walk of Order 1): Captures short-term fluctuations.
  • RW2 (Random Walk of Order 2): Captures smooth, long-term trends.

Random Walk of Order 1 (RW1):

This is like a drunk person walking—each step is close to the last one, but not exactly the same.

Mathematical Formula

\[x_t = x_{t-1}+\epsilon_t\]

  • \(x_t\) is the value at time \(t\)

  • \(x_{t-1}\) is the value at the previous step

  • \(\epsilon_t\) is a small random change (normally distributed error)

Example in Real Life

Imagine tracking daily temperature:

Today’s temperature is very close to yesterday’s, plus some small variation. It does not jump randomly but follows a smooth trend.

How it Works

If \(\epsilon_t\) is small, changes are gradual. If \(\epsilon_t\) is large, values change more abruptly.
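A tiny R simulation can make this concrete (the step size of 0.5 is an arbitrary illustrative choice):

set.seed(1)
steps <- rnorm(100, mean = 0, sd = 0.5)   # the random changes epsilon_t
x <- cumsum(steps)                        # x_t = x_{t-1} + epsilon_t
plot(x, type = "l", main = "RW1: each value stays close to the previous one")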

Random Walk of Order 2 (RW2)

This is like a car driving on a highway—it doesn’t just change randomly; it follows a trend based on previous values.

Mathematical Formula

\[x_t = 2x_{t-1}-x_{t-2}+\epsilon_t\]

\(x_t\) depends not only on \(x_{t-1}\) but also on \(x_{t-2}\)

It smooths out fluctuations more than RW1.

Example in Real Life

Imagine tracking stock market trends:

Stock prices don’t change randomly each day but follow an overall trend. RW2 captures this long-term trend.

How it Works?

Helps smooth out short-term noise and captures acceleration/deceleration. Good for modeling economic growth, climate change, and stock prices.
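A short R simulation of the RW2 recursion above (the noise level is chosen only for illustration):

set.seed(2)
n <- 100
eps <- rnorm(n, mean = 0, sd = 0.1)
x <- numeric(n)
for (t in 3:n) {
  x[t] <- 2 * x[t - 1] - x[t - 2] + eps[t]   # RW2: depends on the last two values
}
plot(x, type = "l", main = "RW2: a smoother, trend-following path")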

What are Independent and Exchangeable Random Effects models?

When modeling dengue cases in different regions, we may need to account for random variations that are not explained by known factors (e.g., temperature, rainfall, population density). Independent and Exchangeable Random Effects help handle this random variability.

  1. Independent Random Effects (IID Model)

Think of each region acting on its own—dengue cases in one region do not influence another.

  • Each region’s effect is independent of others.
  • They follow a random distribution with no structure.
  • Captures unexplained variability at the regional level.

Example: Dengue Cases in Cities

Suppose we are studying dengue outbreaks in different cities. Some cities might have better mosquito control or different human immunity levels. We assume these city-specific effects are random and uncorrelated i.e., each city is affected differently, but there is no common trend.

\[\text{Dengue cases}\sim \alpha+x\beta+u_i\]

where,

  • \(\alpha\) = baseline dengue cases

  • \(x\beta\) = known effects (e.g., temperature, humidity, etc.)

  • \(u_i \sim N(0,\sigma^2)\) = random effect for each city, independent from others
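A minimal R-INLA sketch of this IID model; the data frame dengue_city_data and its column names are hypothetical placeholders:

library(INLA)

iid_fit <- inla(cases ~ temperature + humidity +
                  f(city, model = "iid"),   # independent random effect per city
                family = "poisson",
                data = dengue_city_data)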

  2. Exchangeable Random Effects

Now, imagine that some regions have similar risks due to shared factors (e.g., mosquito breeding patterns, healthcare quality).

  • Regions within the same group (e.g., coastal vs. inland, urban vs. rural) might share some similarities

  • We assume all regional effects come from the same distribution, meaning they are not fully independent

Example: Dengue Cases in Districts

  • Suppose dengue cases are studied across districts

  • Districts in the same province might have similar mosquito control programs, health services, or weather patterns

  • So, districts within the same province share a common random effect, but the provinces themselves are independent i.e., some regions share a common risk factor, but still have random variation.

\[\text{Dengue cases}\sim \alpha+x\beta+v_g\] where,

\(v_g\sim N(0,\sigma^2)\) = random effects for each province (not individual districts)
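One common way to encode this shared province-level effect in R-INLA is an iid effect indexed by province, so that all districts in the same province share the same \(v_g\); the data frame dengue_district_data and its column names are hypothetical placeholders:

library(INLA)

prov_fit <- inla(cases ~ temperature + humidity +
                   f(province, model = "iid"),  # one shared effect per province
                 family = "poisson",
                 data = dengue_district_data)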

Feature | Independent (IID) | Exchangeable
Relationship | Each region acts alone | Regions share some similarities
Example | Different cities | Provinces with shared conditions
Use case | City-specific random variations | Province-level random variations
INLA code | model = "iid" | model = "exchangeable"

What is autoregression?

Autoregression is when a variable is correlated with itself over time. It means that past values influence future values. This is common in time-series data, where today’s values depend on yesterday’s. Instead of using external factors, we look at previous values of the same variable to predict the next value.

Autoregressive models are based on the idea that past values of the variable have an influence on its future values. The model tries to explain the variable’s behavior by examining how past values (lags) relate to future values.

Simple Example: Dengue Cases

Month | Dengue cases
January | 100
February | 120
March | 130
April | 125
May | ??? (predict this)

If we notice that,

  • When cases were high last month, they tend to stay high.
  • When cases were low, they tend to stay low.
  • This suggests that past months influence future months.

So, we can predict May’s dengue cases using April’s cases -> This is Autoregression (AR).

How Does an AR Model Work?

Autoregression assumes,

\[\text{Next month's cases = (Some factor) x (Last month's cases) + Random change}\]

It can be written more formally

\[x_t = \beta x_{t-1}+\epsilon_t\]

Types of AR Models (All percent values are indicative only)

  • AR(1): Uses only last month’s data. Example: May = 80% of April + random noise

Math notation: \(x_t = \beta x_{t-1}+\epsilon_t\)

  • AR(2): Uses last 2 months. Example: May = 50% of April + 30% of March + noise

Math notation: \(x_t = \beta_1 x_{t-1}+\beta_2 x_{t-2}+\epsilon_t\)

  • AR(3) or more: Uses 3+ months

Math notation: \(x_t = \beta_1 x_{t-1}+\beta_2 x_{t-2}+\beta_3 x_{t-3}+...+\beta_p x_{t-p}+\epsilon_t\)

The higher the AR order, the more past months we consider.
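As an illustration, an AR(1) series can be simulated with base R; the 0.8 coefficient echoes the indicative "80% of last month" example above, and the f() term shown in the comment is how a structured AR(1) effect over months would appear in an INLA formula:

set.seed(3)
x <- arima.sim(model = list(ar = 0.8), n = 120)   # AR(1) with beta = 0.8
# In an INLA formula, the corresponding latent effect would be written as:
# f(month, model = "ar1")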

What are Gaussian Markov Random Field (GMRF) models?

A Gaussian Markov Random Field (GMRF) is a way to model spatial or structured dependencies in data. GMRFs are powerful tools for modeling spatial and temporal dependencies efficiently: each value is assumed to depend only on its neighbors (the Markov property), which keeps computations fast.

E.g., imagine we are studying dengue cases across different districts. Under a GMRF, cases in one district are likely influenced by cases in neighboring districts, while far-away districts have less influence.

How Does GMRF Work?

A GMRF assumes that each location \(x_i\) depends only on its neighboring locations:

\[x_i = \sum_{j \in \text{neighbors}} \phi_j x_j+\epsilon_i\]

where:

\(x_i\) = value at location \(i\)

\(\phi_j\) = weight for neighbor \(j\)

\(\epsilon_i \sim N(0,\sigma^2)\) = random variation.

Example: Dengue Cases in Districts

Suppose we model dengue spread across 5 districts:

District | Dengue cases | Neighbors
A | 120 | B, C
B | 150 | A, C, D
C | 130 | A, B, D, E
D | 170 | B, C, E
E | 160 | C, D

District B is strongly affected by A, C, and D

District A is only affected by B and C

District E is only affected by C and D

A GMRF model ensures that:

If dengue increases in B, it likely also increases in A, C, and D. If dengue is low in A, it doesn’t strongly affect E.
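As a sketch, the neighborhood structure above can be written as a symmetric adjacency matrix and passed to a "besag" term; the district index and variable names here are illustrative placeholders:

library(INLA)

# Adjacency matrix for districts A-E (1 = neighbors, 0 = not)
W <- matrix(0, 5, 5, dimnames = list(LETTERS[1:5], LETTERS[1:5]))
W["A", c("B", "C")]           <- 1
W["B", c("A", "C", "D")]      <- 1
W["C", c("A", "B", "D", "E")] <- 1
W["D", c("B", "C", "E")]      <- 1
W["E", c("C", "D")]           <- 1

# In the model formula, `district` would be an index 1..5 in the data:
# f(district, model = "besag", graph = W)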

GMRF vs. Standard Gaussian Models

Feature | Standard Gaussian Model | GMRF
Dependency | All values are correlated | Only neighbors affect each other
Computational speed | Slow for large data sets | Fast, because only neighbors matter
Example | Dengue cases in random locations | Dengue cases in connected districts

This keeps computations fast because each location depends only on its neighbors, not on the entire data set.

Feature | GMRF
What it does | Models dependencies between neighboring locations
Example | Dengue spread across districts
Key idea | Each value depends only on its neighbors (Markov property)
INLA code | f(district, model = "besag", graph = district_neighbors)
Why use it? | Efficient for spatial modeling

GMRF variants

  • Intrinsic Conditional Autoregressive (ICAR) Models: ICAR models assume that each location depends only on its neighbors, with a smoothing effect

    • Model spatial dependence across discrete areas (e.g., districts, neighborhoods)
    • Each region’s value depends on the average of its neighbors
    • If neighboring districts have high dengue cases, this one probably will too

    \[x_i \mid x_{-i} \sim \text{Average of neighbors + random noise}\]

    • Used for applications like disease mapping, census mapping, and regional epidemiology
  • Stochastic Partial Differential Equation (SPDE) Models: SPDE models approximate continuous spatial processes (e.g., Gaussian random fields)

    • Model continuous spatial variation (not just area-to-area, but smooth fields)
    • It approximates a Matérn Gaussian field using triangles (mesh)
    • Captures how something like dengue cases might smoothly vary over space (e.g., from one end of a city to another)
    • Used for applications like air pollution, temperature, and environmental modeling; high-resolution spatial data; and cases where spatial locations are points, not areas
  • Random Walk (RW) Models for Structured Time-Series Data: Although not strictly spatial, RW1 and RW2 behave like GMRF models for temporal structures

    • Model temporal (or ordered) dependence — how values evolve over time or sequence
      • RW1: Each value depends on the previous one, \(x_t = x_{t-1} + \epsilon_t\)
      • RW2: Each value depends on the last two (smoother trend), \(x_t = 2x_{t-1} - x_{t-2} + \epsilon_t\)
  • Used for time series data, trend estimation, and seasonality modeling

Feature | ICAR (besag) | SPDE | RW (rw1, rw2)
Domain | Discrete spatial regions | Continuous space (e.g., coordinates) | Ordered data (e.g., time, age)
Dependency | Between neighbors (areas) | Between nearby spatial points | Between adjacent time steps
Smoothness | Low to medium | High (depends on mesh + Matérn) | Medium (RW1) to high (RW2)
Precision matrix | Sparse | Sparse (via mesh) | Sparse (chain-like)
Use case | District-level disease mapping | Pollution, weather, landscape modeling | Time trends, progression over years
INLA model name | "besag" | "spde" | "rw1" / "rw2"
Data structure | Region index + adjacency matrix | Coordinates + mesh | Time index

Scenario | Best model
Modeling dengue cases by district | ICAR / besag
Modeling temperature across space | SPDE
Modeling dengue trend over months/years | RW1 or RW2
Modeling continuous smooth risk surface (not tied to areas) | SPDE
Modeling structured time-series effects | RW models

In a nutshell

  • ICAR (besag): Spatial model for regions
  • SPDE: Smooth spatial model for coordinates
  • RW1/RW2: Temporal model for time or sequences