Spatial Interpolation

Spatial interpolation is a technique used to estimate unknown values at specific geographic locations based on known values from surrounding points. It assumes that things that are closer together in space tend to be more similar than things that are farther apart — a concept known as spatial autocorrelation.

Why Use Spatial Interpolation?

  1. To fill in gaps in spatial data.
  2. To create continuous surfaces (e.g., temperature, elevation, pollution).
  3. To support decision-making in fields like environmental science, agriculture, urban planning, and public health.

Common Methods

  1. Inverse Distance Weighting (IDW) – weights nearby points more heavily.
  2. Kriging – uses statistical models of spatial autocorrelation.
  3. Spline – fits smooth surfaces through known points.
  4. Natural Neighbor – uses surrounding points based on proximity.

Inverse Distance Weighting (IDW)

Inverse Distance Weighting (IDW) is a deterministic spatial interpolation method used to estimate values at unsampled locations based on values from nearby known points.

Core Idea: The closer a known point is to the location being estimated, the more influence it has on the predicted value.

IDW calculates the value at an unknown location as a weighted average of nearby known values. The weights are inversely related to the distance — meaning:

  1. Closer points get higher weights
  2. Farther points get lower weights

The formula is: \[ \hat{Z}(x_0)=\frac{\sum^n_{i=1}\frac{Z(x_i)}{d(x_0,x_i)^p}}{\sum^n_{i=1}\frac{1}{d(x_0,x_1)^p}}\] where:

  • \(\hat{Z}(x_0)\) is the estimated value at location \(x_0\)
  • \(\hat{Z}(x_i)\) is the estimated value at location \(x_i\)
  • \(d(x_0,x_i)\) is the is the distance between \(x_0\) and \(x_i\)
  • \(p\) is the power parameter (commonly set to 2)

Note: The power parameter controls how quickly the influence of a point decreases with distance.Higher values (e.g., 2 or 3) give more weight to closer points.Lower values (e.g., 1) spread influence more evenly.

When to Use IDW

  • When you have dense and evenly distributed sample points.
  • When you want a quick and simple interpolation.
  • When you don’t need statistical error estimates (unlike Kriging).

In this exercise, we are performing spatial interpolation using the Inverse Distance Weighting (IDW) method. The goal is to estimate zinc concentrations at unsampled locations based on known measurements from sampled points. This is a common task in environmental science, geography, and other spatial disciplines where data is collected at discrete locations but decisions or analyses require continuous surfaces.

We use the Meuse dataset, which contains soil sample data from a region in the Netherlands. This dataset is included in the sp and gstat packages.

We start by loading two essential libraries:

  • sp: This package provides classes and methods for spatial data, including spatial points, grids, and plotting functions.
  • gstat: This package offers geostatistical modeling tools, including interpolation methods like IDW and Kriging.

These libraries are foundational for spatial analysis in R. Without them, we cannot work with spatial objects or perform interpolation.

Then we load and prepare the Sample Data:

  • data(meuse) loads a built-in dataset containing measurements of heavy metals (like zinc, copper, and lead) at specific geographic locations.

  • coordinates(meuse) = ~x + y transforms the data frame into a spatial object by assigning the x and y columns as spatial coordinates.

This transformation is crucial because spatial interpolation methods require data to be in a format that includes both attribute values (e.g., zinc concentration) and spatial location.

# Load required libraries
library(sp)
## Warning: package 'sp' was built under R version 4.4.3
library(gstat)
## Warning: package 'gstat' was built under R version 4.4.3
# Load sample data
data(meuse)
coordinates(meuse) = ~x + y

To visualize the raw meuse data in R,we can use the basic spatial plot of sample locations.

plot(meuse, main = "Sample Locations in Meuse Dataset")

or you can plot with Zinc concentration as symbol size or color using bubble().

bubble(meuse, "zinc", main = "Zinc Concentration at Sample Points", maxsize = 2)

or Colored Plot Using spplot()

spplot(meuse, "zinc", main = "Zinc Concentration (spplot)")

You can also plot multiple variables side-by-side to create a multi-panel plot showing spatial distributions of each variable.

spplot(meuse, c("zinc", "lead", "copper"), main = "Heavy Metal Concentrations")

Anyway, going back to IDW process, we can now load and prepare the grid for prediction

# Load and prepare grid data
data(meuse.grid)
coordinates(meuse.grid) = ~x + y
gridded(meuse.grid) = TRUE
  • data(meuse.grid) loads a grid of regularly spaced locations across the study area.

  • coordinates(meuse.grid) = ~x + y assigns spatial coordinates to each grid point. gridded(meuse.grid) = TRUE tells R that this object represents a regular grid, which is required for interpolation output.

This grid defines the locations where we want to predict zinc concentrations. Think of it as a blank canvas where we will “paint” estimated values based on the known sample points.

Next,perform IDW interpolation. This is the core step where interpolation happens.

  • idw() is the function that performs Inverse Distance Weighting.
  • zinc ~ 1 specifies that we are interpolating the zinc variable using a constant model (no additional predictors).
  • meuse is the source of known data points.
  • newdata = meuse.grid specifies the grid where predictions should be made.
  • idp = 2.0 sets the power parameter, which controls how strongly distance affects the weighting. A value of 2 means that the influence of a point decreases with the square of the distance.

The result, idw_result, is a spatial object containing predicted zinc values (var1.pred) at each grid location.

# Perform IDW interpolation
idw_result <- idw(zinc ~ 1, meuse, newdata = meuse.grid, idp = 2.0)
## [inverse distance weighted interpolation]

Finally, visualize the Interpolated Surface.

  • spplot() is used to create a spatial plot of the interpolated values.
  • “var1.pred” is the column in idw_result that contains the predicted zinc concentrations.

The plot shows a smooth surface representing estimated zinc levels across the study area, based on the influence of nearby sample points.

This visualization helps us understand the spatial distribution of zinc and identify areas of high or low concentration.

# Plot the result
spplot(idw_result, "var1.pred", main = "IDW Interpolation of Zinc")

The originally sampled data can be saved in a csv file using the following codes.

meuse_export <- data.frame(x = meuse$x, y = meuse$y, zinc = meuse$zinc)
write.csv(meuse_export, "meuse_zinc_coordinates.csv", row.names = FALSE)

If I want to save the interpolated values instead,use the following code

# Extract interpolated values
interpolated_data <- as.data.frame(idw_result)

# Save to CSV
write.csv(interpolated_data, "meuse_idw_interpolated.csv", row.names = FALSE)

Kriging

Kriging is a geostatistical interpolation technique that not only considers the distance between known and unknown points (like IDW), but also models the spatial autocorrelation — the statistical relationship between measured points. In simpler terms:

Kriging predicts values at unsampled locations by using both the distance and the spatial structure of the data.

It’s named after Danie Krige, a South African mining engineer who developed the method to estimate ore grades.

Key Features of Kriging

  • Statistical model-based: Unlike IDW, Kriging uses a model of spatial correlation (called a variogram) to make predictions.
  • Provides uncertainty estimates: Kriging not only predicts values but also gives an estimate of the error or variance of each prediction.
  • Flexible: Can be adapted to different types of spatial patterns (e.g., trends, anisotropy).

Types of Kriging

  • Ordinary Kriging – assumes a constant mean across the study area.
  • Universal Kriging – allows for a spatial trend (e.g., increasing pollution from west to east).
  • Simple Kriging – assumes a known mean.
  • Indicator Kriging, Co-Kriging, etc. – specialized forms for categorical data or multiple variables.

Ordinary Kriging

Ordinary Kriging estimates the value at an unsampled location \(x_0\) as a weighted linear combination of known values: \[\hat{Z}(x_0)=\sum^n_{i=1}\lambda_i Z(x_i) \]

where:

  • \(\hat{Z}(x_0)\) is the estimated value at location \(x_0\).
  • \(\hat{Z}(x_i)\) is the estimated value at location \(x_i\).
  • \(\lambda_i\) is the weights assigned to each known point.
  • \(n\) is the number of known sample points.

Ordinary Kriging is a best linear unbiased estimator (BLUE). It uses both distance and spatial structure (via the variogram) to make predictions. It provides not only predictions but also error estimates, making it more powerful than simpler methods like IDW.

In this example, we perform Ordinary Kriging, a geostatistical interpolation method that estimates values at unsampled locations by modeling both the distance and the spatial correlation between known data points. Unlike IDW, Kriging provides not only predictions but also error estimates, making it a more statistically robust method.

We use the Meuse dataset, which contains measurements of heavy metals in soil from a region in the Netherlands.

We begin by loading the necessary libraries. Then load and prepare the sample data, and the prediction grid similar to the process in IDW.

library(sp)
library(gstat)

# Load and prepare data
data(meuse)
coordinates(meuse) = ~x + y
data(meuse.grid)
coordinates(meuse.grid) = ~x + y
gridded(meuse.grid) = TRUE

Next we compute the experimental variogram. The variogram measures how the similarity between data points changes with distance. We use the log-transformed zinc values to stabilize variance and improve model fit.

This step produces an empirical variogram, which shows how spatial dependence behaves in the data.

# Step 1: Compute the experimental variogram
v <- variogram(log(zinc) ~ 1, meuse)

We then fit a theoretical variogram Model. We fit a spherical variogram model to the empirical variogram. vgm(1, “Sph”, 900, 1) specifies:

  • Sill = 1: the maximum variance.
  • Range = 900: the distance beyond which points are no longer spatially correlated.
  • Nugget = 1: small-scale variation or measurement error.

This model is used to determine how much influence each known point has on the prediction.

# Step 2: Fit a theoretical variogram model
model <- fit.variogram(v, vgm(1, "Sph", 900, 1))

Next we will perform Ordinary Kriging.

  • krige() performs the interpolation using the fitted variogram model.
  • log(zinc) ~ 1 indicates we are interpolating the log-transformed zinc values.

The result includes both predicted values (var1.pred) and prediction variances (var1.var), which quantify uncertainty.

# Step 3: Perform Kriging
kriging_result <- krige(log(zinc) ~ 1, meuse, meuse.grid, model)
## [using ordinary kriging]

Finally, visualize the Interpolated Surface.

  • spplot() is used to create a spatial plot of the interpolated values.
  • “var1.pred” is the column in idw_result that contains the predicted zinc concentrations.
# Step 4: Plot the predicted surface
spplot(kriging_result, "var1.pred", main = "Ordinary Kriging of log(Zinc)")

Lastly, you can save the results in a csv file by using the R codes below.

# Convert kriging result to a data frame
kriging_df <- as.data.frame(kriging_result)

# Save to CSV
write.csv(kriging_df, "kriging_predictions.csv", row.names = FALSE)

Universal Kriging

Universal Kriging (also called Kriging with a trend) is a geostatistical interpolation method that extends Ordinary Kriging by allowing the mean of the variable to vary across space. Instead of assuming a constant mean, Universal Kriging models the mean as a function of spatial coordinates (e.g., a linear or polynomial trend).

This is useful when your data shows a spatial trend, such as increasing pollution levels from west to east or elevation rising from south to north.

The Universal Kriging estimator is: \[\hat{Z}(x_0)=\sum^n_{i=1}\lambda_i Z(x_i) \] subject to: \[Z(x)=m(x)+\epsilon(x) \] where:

  • \(Z(x)\) is observed value at location \(x\).
  • \(m(x)\) is trend function (e.g., linear or polynomial function of coordinates).
  • \(\epsilon(x)\) is spatially correlated residuals.

The trend \(m(x)\) is is typically modeled as:

\[m(x)=\beta_0+\beta_1x+\beta_2y\] or higher-order polynomials depending on the complexity of the spatial trend.

Here’s how to implement Universal Kriging using the meuse dataset:

library(sp)
library(gstat)

# Load and prepare data
data(meuse)
coordinates(meuse) = ~x + y
data(meuse.grid)
coordinates(meuse.grid) = ~x + y
gridded(meuse.grid) = TRUE

# Step 1: Compute experimental variogram with a trend
v <- variogram(log(zinc) ~ x + y, meuse)

# Step 2: Fit a theoretical variogram model
model <- fit.variogram(v, vgm(1, "Sph", 900, 1))

# Step 3: Perform Universal Kriging
uk_result <- krige(log(zinc) ~ x + y, meuse, meuse.grid, model)
## [using universal kriging]
# Step 4: Plot the predicted surface
spplot(uk_result, "var1.pred", main = "Universal Kriging of log(Zinc)")

Explanation of Key Differences from Ordinary Kriging

  • log(zinc) ~ x + y: This formula includes a spatial trend in the model.
  • The variogram is computed and fitted after removing the trend, so it models only the spatial structure of the residuals.
  • The final prediction combines the trend and the spatial correlation.

Simple Kriging

Simple Kriging is the most basic form of Kriging. It assumes that the mean of the variable is known and constant across the entire study area. This is different from:

  • Ordinary Kriging, which assumes an unknown but constant mean.
  • Universal Kriging, which allows the mean to vary spatially.

Because the mean is known, Simple Kriging is mathematically simpler but less flexible and rarely used in practice unless the mean is truly known from external sources.

The Simple Kriging estimator is: \[\hat{Z}(x_0)=\mu +\sum^n_{i=1}\lambda_i(Z(x_i)-\mu) \] where:

  • \(\hat{Z}(x_0)\) is predicted value at location \(x_0\).
  • \(Z(x_i)\) observed value at location \(x_i\).
  • \(\mu\) is the known mean of the variable.
  • \(\lambda_i\) is the weights derived from the variogram and spatial configuration.

The weights \(\lambda_i\) are chosen to minimize the estimation variance, using the spatial autocorrelation structure defined by the variogram.

Here’s how to perform Simple Kriging using the gstat package:

library(sp)
library(gstat)

# Load and prepare data
data(meuse)
coordinates(meuse) = ~x + y
data(meuse.grid)
coordinates(meuse.grid) = ~x + y
gridded(meuse.grid) = TRUE

# Step 1: Compute experimental variogram
v <- variogram(log(zinc) ~ 1, meuse)

# Step 2: Fit a theoretical variogram model
model <- fit.variogram(v, vgm(1, "Sph", 900, 1))

# Step 3: Define known mean (e.g., mean of log(zinc))
known_mean <- mean(log(meuse$zinc))

# Step 4: Perform Simple Kriging
sk_result <- krige(log(zinc) ~ 1, meuse, meuse.grid, model, beta = known_mean)
## [using simple kriging]
# Step 5: Plot the result
spplot(sk_result, "var1.pred", main = "Simple Kriging of log(Zinc)")

Explanation of Key Differences

  • The beta = known_mean argument explicitly sets the known mean.
  • Simple Kriging uses this mean in both the prediction and the weighting process.
  • It’s best used when the mean is known from theory or extensive sampling.

Indicator Kriging

Indicator Kriging is used when the variable of interest is binary or categorical, rather than continuous. Instead of interpolating actual values, it estimates the probability that a condition is met at unsampled locations.

Use Case:

  • Mapping the probability of contamination exceeding a threshold.
  • Estimating the likelihood of a disease presence.
  • Land cover classification (e.g., forest vs. non-forest).

how it works:

  • Convert the variable into indicators (0 or 1) based on a threshold.
  • Perform Kriging on the indicator values.
  • The result is a probability surface showing the likelihood of the condition being true.

In this example, we perform Indicator Kriging, a geostatistical method used to estimate the probability that a variable exceeds a certain threshold at unsampled locations. Unlike traditional Kriging methods that predict continuous values, Indicator Kriging is designed for binary or categorical data, making it useful for risk mapping, environmental thresholds, or presence/absence modeling. We use the Meuse dataset, which contains measurements of heavy metals in soil from a region in the Netherlands.

We follow the same procedures described earlier in other Kriging methods from loading libraries, prepararing sample data.

library(sp)
library(gstat)

# Load and prepare data
data(meuse)
coordinates(meuse) = ~x + y

To use Indicator Kriging, We define a threshold of 1000 for zinc concentration. The new variable zinc_ind is binary:

  • 1 if zinc > 1000
  • 0 otherwise

This transforms the continuous zinc data into a format suitable for Indicator Kriging.

# Create indicator variable (e.g., zinc > 1000)
meuse$zinc_ind <- ifelse(meuse$zinc > 1000, 1, 0)

Next, we calculate the experimental variogram: The variogram measures how the similarity between indicator values changes with distance. This step produces an empirical variogram, which shows the spatial structure of the binary data.

# Compute experimental variogram for indicator
v <- variogram(zinc_ind ~ 1, meuse)

We fit a spherical variogram model to the empirical variogram. vgm(0.5, “Sph”, 900, 0.1) specifies:

  • Sill = 0.5: the maximum variance.
  • Range = 900: the distance beyond which points are no longer spatially correlated.
  • Nugget = 0.1: small-scale variation or measurement error.
# Fit a variogram model
model <- fit.variogram(v, vgm(0.5, "Sph", 900, 0.1))
## Warning in fit.variogram(v, vgm(0.5, "Sph", 900, 0.1)): linear model has
## singular covariance matrix
## Warning in fit.variogram(v, vgm(0.5, "Sph", 900, 0.1)): singular model in
## variogram fit

Then, prepare the Prediction Grid:

  • meuse.grid is a grid of locations where we want to estimate the probability that zinc exceeds 1000.

We assign coordinates and declare it as a gridded spatial object, which is required for Kriging output.

# Prepare prediction grid
data(meuse.grid)
coordinates(meuse.grid) = ~x + y
gridded(meuse.grid) = TRUE

Then, perform Indicator Kriging and plot the probability surface.

# Perform Indicator Kriging
ik_result <- krige(zinc_ind ~ 1, meuse, meuse.grid, model)
## [using ordinary kriging]
# Plot probability surface
spplot(ik_result, "var1.pred", main = "Indicator Kriging: P(Zinc > 1000)")

  • krige() performs the interpolation using the fitted variogram model. The result includes:

  • var1.pred: the estimated probability that zinc > 1000 at each grid location. var1.var: the variance of the prediction.

Co-Kriging

Co-Kriging is a multivariate extension of Kriging that uses multiple correlated variables to improve predictions of the primary variable. It’s useful when the secondary variable is more densely sampled or easier to measure.

🔹 Use Case:

  • Predicting soil moisture using both moisture and elevation.
  • Estimating pollution using both pollutant concentration and land use.

🔹 How it works:

  • Model the cross-variogram between the primary and secondary variables.
  • Use both variables in the Kriging system to estimate the primary variable.

In this example, we perform Co-Kriging, a multivariate geostatistical interpolation method that uses two or more spatially correlated variables to improve the prediction of a primary variable. Unlike Ordinary Kriging, which uses only one variable, Co-Kriging leverages the relationship between variables to enhance accuracy, especially when the secondary variable is more densely sampled or easier to measure. We use the Meuse dataset, which contains measurements of heavy metals in soil, including zinc and lead.

The preliminary process involving loading of necessary libraries and data preparation is similar to the previous illustrations.

library(sp)
library(gstat)

# Load and prepare data
data(meuse)
coordinates(meuse) = ~x + y

We need to define a multivariate gstat object to perform co-kriging.

We create a gstat object that includes two variables: zinc (primary) and lead (secondary). Each variable is modeled independently at first, but their spatial relationship will be captured through cross-variograms.

# Define gstat object with two variables
g <- gstat(NULL, id = "zinc", formula = zinc ~ 1, data = meuse)
g <- gstat(g, id = "lead", formula = lead ~ 1, data = meuse)

Then, we compute for cross-variograms:auto-variograms for zinc and lead, and cross-variograms between them. Cross-variograms measure how the spatial patterns of one variable relate to another.

We fit a Linear Model of Coregionalization (LMC), which models the spatial structure of both variables and their interaction. - vgm(1, “Sph”, 900, 1) defines a spherical variogram model. - fit.method = 7 uses weighted least squares to fit the model.

# Compute cross-variograms
v <- variogram(g)
model <- fit.lmc(v, g, vgm(1, "Sph", 900, 1), fit.method = 7)

Then, prepare the Prediction Grid and perform co-kriging and visualize the results.

 #Prediction grid
data(meuse.grid)
coordinates(meuse.grid) = ~x + y
gridded(meuse.grid) = TRUE

# Perform Co-Kriging
ck_result <- predict(model, meuse.grid)
## Linear Model of Coregionalization found. Good.
## [using ordinary cokriging]
# Plot result
spplot(ck_result, "zinc.pred", main = "Co-Kriging Prediction of Zinc")

🔹 Summary

  1. Co-Kriging improves predictions by using multiple correlated variables.
  2. It requires modeling both auto-variograms and cross-variograms.
  3. It is especially useful when the secondary variable is more densely sampled or easier to measure.
  4. The output includes predicted values and uncertainty estimates, just like Ordinary Kriging.