Spatial interpolation is a technique used to estimate unknown values at specific geographic locations based on known values from surrounding points. It assumes that things that are closer together in space tend to be more similar than things that are farther apart — a concept known as spatial autocorrelation.
Why Use Spatial Interpolation?
Common Methods
Inverse Distance Weighting (IDW) is a deterministic spatial interpolation method used to estimate values at unsampled locations based on values from nearby known points.
Core Idea: The closer a known point is to the location being estimated, the more influence it has on the predicted value.
IDW calculates the value at an unknown location as a weighted average of nearby known values. The weights are inversely related to the distance — meaning:
The formula is: \[ \hat{Z}(x_0)=\frac{\sum^n_{i=1}\frac{Z(x_i)}{d(x_0,x_i)^p}}{\sum^n_{i=1}\frac{1}{d(x_0,x_1)^p}}\] where:
Note: The power parameter controls how quickly the influence of a point decreases with distance.Higher values (e.g., 2 or 3) give more weight to closer points.Lower values (e.g., 1) spread influence more evenly.
When to Use IDW
In this exercise, we are performing spatial interpolation using the Inverse Distance Weighting (IDW) method. The goal is to estimate zinc concentrations at unsampled locations based on known measurements from sampled points. This is a common task in environmental science, geography, and other spatial disciplines where data is collected at discrete locations but decisions or analyses require continuous surfaces.
We use the Meuse dataset, which contains soil sample data from a region in the Netherlands. This dataset is included in the sp and gstat packages.
We start by loading two essential libraries:
These libraries are foundational for spatial analysis in R. Without them, we cannot work with spatial objects or perform interpolation.
Then we load and prepare the Sample Data:
data(meuse) loads a built-in dataset containing measurements of heavy metals (like zinc, copper, and lead) at specific geographic locations.
coordinates(meuse) = ~x + y transforms the data frame into a spatial object by assigning the x and y columns as spatial coordinates.
This transformation is crucial because spatial interpolation methods require data to be in a format that includes both attribute values (e.g., zinc concentration) and spatial location.
## Warning: package 'sp' was built under R version 4.4.3
## Warning: package 'gstat' was built under R version 4.4.3
To visualize the raw meuse data in R,we can use the basic spatial plot of sample locations.
or you can plot with Zinc concentration as symbol size or color using bubble().
or Colored Plot Using spplot()
You can also plot multiple variables side-by-side to create a multi-panel plot showing spatial distributions of each variable.
Anyway, going back to IDW process, we can now load and prepare the grid for prediction
# Load and prepare grid data
data(meuse.grid)
coordinates(meuse.grid) = ~x + y
gridded(meuse.grid) = TRUEdata(meuse.grid) loads a grid of regularly spaced locations across the study area.
coordinates(meuse.grid) = ~x + y assigns spatial coordinates to each grid point. gridded(meuse.grid) = TRUE tells R that this object represents a regular grid, which is required for interpolation output.
This grid defines the locations where we want to predict zinc concentrations. Think of it as a blank canvas where we will “paint” estimated values based on the known sample points.
Next,perform IDW interpolation. This is the core step where interpolation happens.
The result, idw_result, is a spatial object containing predicted zinc values (var1.pred) at each grid location.
## [inverse distance weighted interpolation]
Finally, visualize the Interpolated Surface.
The plot shows a smooth surface representing estimated zinc levels across the study area, based on the influence of nearby sample points.
This visualization helps us understand the spatial distribution of zinc and identify areas of high or low concentration.
The originally sampled data can be saved in a csv file using the following codes.
meuse_export <- data.frame(x = meuse$x, y = meuse$y, zinc = meuse$zinc)
write.csv(meuse_export, "meuse_zinc_coordinates.csv", row.names = FALSE)If I want to save the interpolated values instead,use the following code
Kriging is a geostatistical interpolation technique that not only considers the distance between known and unknown points (like IDW), but also models the spatial autocorrelation — the statistical relationship between measured points. In simpler terms:
Kriging predicts values at unsampled locations by using both the distance and the spatial structure of the data.
It’s named after Danie Krige, a South African mining engineer who developed the method to estimate ore grades.
Key Features of Kriging
Types of Kriging
Ordinary Kriging estimates the value at an unsampled location \(x_0\) as a weighted linear combination of known values: \[\hat{Z}(x_0)=\sum^n_{i=1}\lambda_i Z(x_i) \]
where:
Ordinary Kriging is a best linear unbiased estimator (BLUE). It uses both distance and spatial structure (via the variogram) to make predictions. It provides not only predictions but also error estimates, making it more powerful than simpler methods like IDW.
In this example, we perform Ordinary Kriging, a geostatistical interpolation method that estimates values at unsampled locations by modeling both the distance and the spatial correlation between known data points. Unlike IDW, Kriging provides not only predictions but also error estimates, making it a more statistically robust method.
We use the Meuse dataset, which contains measurements of heavy metals in soil from a region in the Netherlands.
We begin by loading the necessary libraries. Then load and prepare the sample data, and the prediction grid similar to the process in IDW.
library(sp)
library(gstat)
# Load and prepare data
data(meuse)
coordinates(meuse) = ~x + y
data(meuse.grid)
coordinates(meuse.grid) = ~x + y
gridded(meuse.grid) = TRUENext we compute the experimental variogram. The variogram measures how the similarity between data points changes with distance. We use the log-transformed zinc values to stabilize variance and improve model fit.
This step produces an empirical variogram, which shows how spatial dependence behaves in the data.
We then fit a theoretical variogram Model. We fit a spherical variogram model to the empirical variogram. vgm(1, “Sph”, 900, 1) specifies:
This model is used to determine how much influence each known point has on the prediction.
Next we will perform Ordinary Kriging.
The result includes both predicted values (var1.pred) and prediction variances (var1.var), which quantify uncertainty.
## [using ordinary kriging]
Finally, visualize the Interpolated Surface.
# Step 4: Plot the predicted surface
spplot(kriging_result, "var1.pred", main = "Ordinary Kriging of log(Zinc)")Lastly, you can save the results in a csv file by using the R codes below.
Universal Kriging (also called Kriging with a trend) is a geostatistical interpolation method that extends Ordinary Kriging by allowing the mean of the variable to vary across space. Instead of assuming a constant mean, Universal Kriging models the mean as a function of spatial coordinates (e.g., a linear or polynomial trend).
This is useful when your data shows a spatial trend, such as increasing pollution levels from west to east or elevation rising from south to north.
The Universal Kriging estimator is: \[\hat{Z}(x_0)=\sum^n_{i=1}\lambda_i Z(x_i) \] subject to: \[Z(x)=m(x)+\epsilon(x) \] where:
The trend \(m(x)\) is is typically modeled as:
\[m(x)=\beta_0+\beta_1x+\beta_2y\] or higher-order polynomials depending on the complexity of the spatial trend.
Here’s how to implement Universal Kriging using the meuse dataset:
library(sp)
library(gstat)
# Load and prepare data
data(meuse)
coordinates(meuse) = ~x + y
data(meuse.grid)
coordinates(meuse.grid) = ~x + y
gridded(meuse.grid) = TRUE
# Step 1: Compute experimental variogram with a trend
v <- variogram(log(zinc) ~ x + y, meuse)
# Step 2: Fit a theoretical variogram model
model <- fit.variogram(v, vgm(1, "Sph", 900, 1))
# Step 3: Perform Universal Kriging
uk_result <- krige(log(zinc) ~ x + y, meuse, meuse.grid, model)## [using universal kriging]
# Step 4: Plot the predicted surface
spplot(uk_result, "var1.pred", main = "Universal Kriging of log(Zinc)")Explanation of Key Differences from Ordinary Kriging
Simple Kriging is the most basic form of Kriging. It assumes that the mean of the variable is known and constant across the entire study area. This is different from:
Because the mean is known, Simple Kriging is mathematically simpler but less flexible and rarely used in practice unless the mean is truly known from external sources.
The Simple Kriging estimator is: \[\hat{Z}(x_0)=\mu +\sum^n_{i=1}\lambda_i(Z(x_i)-\mu) \] where:
The weights \(\lambda_i\) are chosen to minimize the estimation variance, using the spatial autocorrelation structure defined by the variogram.
Here’s how to perform Simple Kriging using the gstat package:
library(sp)
library(gstat)
# Load and prepare data
data(meuse)
coordinates(meuse) = ~x + y
data(meuse.grid)
coordinates(meuse.grid) = ~x + y
gridded(meuse.grid) = TRUE
# Step 1: Compute experimental variogram
v <- variogram(log(zinc) ~ 1, meuse)
# Step 2: Fit a theoretical variogram model
model <- fit.variogram(v, vgm(1, "Sph", 900, 1))
# Step 3: Define known mean (e.g., mean of log(zinc))
known_mean <- mean(log(meuse$zinc))
# Step 4: Perform Simple Kriging
sk_result <- krige(log(zinc) ~ 1, meuse, meuse.grid, model, beta = known_mean)## [using simple kriging]
Explanation of Key Differences
Indicator Kriging is used when the variable of interest is binary or categorical, rather than continuous. Instead of interpolating actual values, it estimates the probability that a condition is met at unsampled locations.
Use Case:
how it works:
In this example, we perform Indicator Kriging, a geostatistical method used to estimate the probability that a variable exceeds a certain threshold at unsampled locations. Unlike traditional Kriging methods that predict continuous values, Indicator Kriging is designed for binary or categorical data, making it useful for risk mapping, environmental thresholds, or presence/absence modeling. We use the Meuse dataset, which contains measurements of heavy metals in soil from a region in the Netherlands.
We follow the same procedures described earlier in other Kriging methods from loading libraries, prepararing sample data.
To use Indicator Kriging, We define a threshold of 1000 for zinc concentration. The new variable zinc_ind is binary:
This transforms the continuous zinc data into a format suitable for Indicator Kriging.
Next, we calculate the experimental variogram: The variogram measures how the similarity between indicator values changes with distance. This step produces an empirical variogram, which shows the spatial structure of the binary data.
We fit a spherical variogram model to the empirical variogram. vgm(0.5, “Sph”, 900, 0.1) specifies:
## Warning in fit.variogram(v, vgm(0.5, "Sph", 900, 0.1)): linear model has
## singular covariance matrix
## Warning in fit.variogram(v, vgm(0.5, "Sph", 900, 0.1)): singular model in
## variogram fit
Then, prepare the Prediction Grid:
We assign coordinates and declare it as a gridded spatial object, which is required for Kriging output.
# Prepare prediction grid
data(meuse.grid)
coordinates(meuse.grid) = ~x + y
gridded(meuse.grid) = TRUEThen, perform Indicator Kriging and plot the probability surface.
## [using ordinary kriging]
# Plot probability surface
spplot(ik_result, "var1.pred", main = "Indicator Kriging: P(Zinc > 1000)")krige() performs the interpolation using the fitted variogram model. The result includes:
var1.pred: the estimated probability that zinc > 1000 at each grid location. var1.var: the variance of the prediction.
Co-Kriging is a multivariate extension of Kriging that uses multiple correlated variables to improve predictions of the primary variable. It’s useful when the secondary variable is more densely sampled or easier to measure.
🔹 Use Case:
🔹 How it works:
In this example, we perform Co-Kriging, a multivariate geostatistical interpolation method that uses two or more spatially correlated variables to improve the prediction of a primary variable. Unlike Ordinary Kriging, which uses only one variable, Co-Kriging leverages the relationship between variables to enhance accuracy, especially when the secondary variable is more densely sampled or easier to measure. We use the Meuse dataset, which contains measurements of heavy metals in soil, including zinc and lead.
The preliminary process involving loading of necessary libraries and data preparation is similar to the previous illustrations.
We need to define a multivariate gstat object to perform co-kriging.
We create a gstat object that includes two variables: zinc (primary) and lead (secondary). Each variable is modeled independently at first, but their spatial relationship will be captured through cross-variograms.
# Define gstat object with two variables
g <- gstat(NULL, id = "zinc", formula = zinc ~ 1, data = meuse)
g <- gstat(g, id = "lead", formula = lead ~ 1, data = meuse)Then, we compute for cross-variograms:auto-variograms for zinc and lead, and cross-variograms between them. Cross-variograms measure how the spatial patterns of one variable relate to another.
We fit a Linear Model of Coregionalization (LMC), which models the spatial structure of both variables and their interaction. - vgm(1, “Sph”, 900, 1) defines a spherical variogram model. - fit.method = 7 uses weighted least squares to fit the model.
# Compute cross-variograms
v <- variogram(g)
model <- fit.lmc(v, g, vgm(1, "Sph", 900, 1), fit.method = 7)Then, prepare the Prediction Grid and perform co-kriging and visualize the results.
#Prediction grid
data(meuse.grid)
coordinates(meuse.grid) = ~x + y
gridded(meuse.grid) = TRUE
# Perform Co-Kriging
ck_result <- predict(model, meuse.grid)## Linear Model of Coregionalization found. Good.
## [using ordinary cokriging]
🔹 Summary