Universal Kriging (UK) presents an alternative to Ordinary Kriging, tailored to non-stationary scenarios where the mean exhibits deterministic variations across different locations (such as trends or drifts), while the variance remains constant. The nature of this trend can span from a local scale, capturing immediate neighborhood influences, to a global scale encompassing the entire area. This approach, known as second-order stationarity or “weak stationarity,” is frequently applicable in contexts involving environmental exposures.
In Universal Kriging, the first step typically involves the calculation of the trend, which is expressed as a function of geographical coordinates. Subsequently, this trend is combined with the remaining variation (referred to as residuals), modeled as a random field. The integration of the trend and residuals is crucial in generating the final predictions.
The UK model conceptualizes the value of a variable at a specific location as the composite of a regionally varying non-stationary trend and a locally correlated random component, which represents the deviations from the regional trend.
library(plyr)
library(dplyr)
library(gstat)
library(raster)
library(ggplot2)
library(car)
library(classInt)
library(RStoolbox)
library(spatstat)
library(dismo)
library(fields)
library(gridExtra)
The soil organic carbon data (train and test data set) could be found here.
# Define data folder
dataFolder<-"D:\\Punjab University Courses\\pu notes\\Spatial Statistics\\Lecture 5\\geospatial-r-github.io-sites\\Data\\DATA_08\\"
train<-read.csv(paste0(dataFolder,"train_data.csv"), header= TRUE)
test<-read.csv(paste0(dataFolder,"test_data.csv"), header= TRUE)
state<-shapefile(paste0(dataFolder,"GP_STATE.shp"))
grid<-read.csv(paste0(dataFolder, "GP_prediction_grid_data.csv"), header= TRUE)
Power Transform uses the maximum likelihood-like approach of Box and Cox (1964) to select a transformation of a univariate or multivariate response for normality. First we have to calculate appropriate transformation parameters using powerTransform() function of car package and then use this parameter to transform the data using bcPower() function.
powerTransform(train$SOC)
## Estimated transformation parameter
## train$SOC
## 0.2523339
train$SOC.bc<-bcPower(train$SOC, 0.2523339)
First. we have to define x & y variables to coordinates
coordinates(train) = ~x+y
coordinates(grid) = ~x+y
First, we will compute and visualize a first-order trend surface using krige() function.
trend<-krige(SOC.bc~x+y, train, grid, model=NULL)
## [ordinary or weighted least squares prediction]
trend.r<-rasterFromXYZ(as.data.frame(trend)[, c("x", "y", "var1.pred")])
ggR(trend.r, geom_raster = TRUE) +
scale_fill_gradientn("", colours = c("orange", "yellow", "green", "sky blue","blue"))+
theme_bw()+
theme(axis.title.x=element_blank(),
axis.text.x=element_blank(),
axis.ticks.x=element_blank(),
axis.title.y=element_blank(),
axis.text.y=element_blank(),
axis.ticks.y=element_blank())+
ggtitle("Global Trend of BoxCox-SOC")+
theme(plot.title = element_text(hjust = 0.5))
In UK, the semivariances are based on the residuals, not the original data, because the random part of the spatial structure applies only to these residuals. The model parameters for the residuals will usually be very different from the original variogram model (often: lower sill, shorter range), since the global trend has taken out some of the variation and the long-range structure. In gstat, we can compute residual varoigram directly, if we provide an appropriate model formula; you do not have to compute the residuals manually.
We use the variogram method and specify the spatial dependence with the formula SOC.bc ~ x+y (as opposed to SOC.bc ~ 1 in the ordinary variogram). This has the same meaning as in the lm (linear regression) model specification: the SOC concentration is to be predicted using Ist order trend; then the residuals are to be modeled spatially.
# Variogram
v<-variogram(SOC.bc~ x+y, data = train, cloud=F)
# Intial parameter set by eye esitmation
m<-vgm(1.5,"Exp",40000,0.5)
# least square fit
m.f<-fit.variogram(v, m)
m.f
## model psill range
## 1 Nug 0.5186164 0.00
## 2 Exp 1.0744976 81954.85
#### Plot varigram and fitted model:
plot(v, pl=F,
model=m.f,
col="black",
cex=0.9,
lwd=0.5,
lty=1,
pch=19,
main="Variogram of Residuals",
xlab="Distance (m)",
ylab="Semivariance")
krige() function in gstat package use for simple, ordinary or universal kriging (sometimes called external drift kriging), kriging in a local neighborhood, point kriging or kriging of block mean values (rectangular or irregular blocks), and conditional (Gaussian or indicator) simulation equivalents for all kriging varieties, and function for inverse distance weighted interpolation. For multivariate prediction.
UK<-krige(SOC.bc~x+y,
loc= train, # Data frame
newdata=grid, # Prediction grid
model = m.f) # fitted varigram model
## [using universal kriging]
summary(UK)
## Object of class SpatialPointsDataFrame
## Coordinates:
## min max
## x -1245285 114715
## y 1003795 2533795
## Is projected: NA
## proj4string : [NA]
## Number of points: 10674
## Data attributes:
## var1.pred var1.var
## Min. :-0.1549 Min. :0.7235
## 1st Qu.: 1.2799 1st Qu.:0.9298
## Median : 1.8873 Median :1.0023
## Mean : 1.8862 Mean :1.0244
## 3rd Qu.: 2.5169 3rd Qu.:1.1020
## Max. : 4.1413 Max. :1.4817
We will back transformation using transformation parameters that have used Box-cos transformation
k1<-1/0.2523339
UK$UK.pred <-((UK$var1.pred *0.2523339+1)^k1)
UK$UK.var <-((UK$var1.var *0.2523339+1)^k1)
summary(UK)
## Object of class SpatialPointsDataFrame
## Coordinates:
## min max
## x -1245285 114715
## y 1003795 2533795
## Is projected: NA
## proj4string : [NA]
## Number of points: 10674
## Data attributes:
## var1.pred var1.var UK.pred UK.var
## Min. :-0.1549 Min. :0.7235 Min. : 0.8539 Min. :1.944
## 1st Qu.: 1.2799 1st Qu.:0.9298 1st Qu.: 3.0319 1st Qu.:2.305
## Median : 1.8873 Median :1.0023 Median : 4.6814 Median :2.444
## Mean : 1.8862 Mean :1.0244 Mean : 5.2142 Mean :2.497
## 3rd Qu.: 2.5169 3rd Qu.:1.1020 3rd Qu.: 7.0191 3rd Qu.:2.644
## Max. : 4.1413 Max. :1.4817 Max. :17.0323 Max. :3.521
UK.pred<-rasterFromXYZ(as.data.frame(UK)[, c("x", "y", "UK.pred")])
UK.var<-rasterFromXYZ(as.data.frame(UK)[, c("x", "y", "UK.var")])
p1<-ggR(UK.pred, geom_raster = TRUE) +
scale_fill_gradientn("", colours = c("orange", "yellow", "green", "sky blue","blue"))+
theme_bw()+
theme(axis.title.x=element_blank(),
axis.text.x=element_blank(),
axis.ticks.x=element_blank(),
axis.title.y=element_blank(),
axis.text.y=element_blank(),
axis.ticks.y=element_blank())+
ggtitle("UK Predicted SOC")+
theme(plot.title = element_text(hjust = 0.5))
p2<-ggR(UK.var, geom_raster = TRUE) +
scale_fill_gradientn("", colours = c("blue", "green","yellow", "orange"))+
theme_bw()+
theme(axis.title.x=element_blank(),
axis.text.x=element_blank(),
axis.ticks.x=element_blank(),
axis.title.y=element_blank(),
axis.text.y=element_blank(),
axis.ticks.y=element_blank())+
ggtitle("UK Predition Variance")+
theme(plot.title = element_text(hjust = 0.5))
grid.arrange(p1,p2, ncol = 2) # Multiplot
The visual representations above depict interpolated maps of soil organic carbon (SOC) along with their respective error assessments within each prediction grid. The Ordinary Kriging (OK) predicted map unveils the overarching trends and identifies hotspots of SOC concentration. It’s worth noting that kriging variance tends to be elevated in areas where no actual samples were collected, as it is contingent on the spatial arrangement of the sampling locations. In contrast, lower variance is observed in proximity to the actual sampling sites. This kriging variance is influenced by the covariance model but remains independent of the specific data values.
Note: The data and text are copied from https://github.com/zia207.