1. Introduction

This notebook illustrates how to conduct spatial interpolation in R using precipitation data as example. This notebook is written to help Geomatica Basica students at Universidad Nacional de Colombia to become familiar with geospatial functionalities provided by the R software environment.

Please note that this notebook explores geostatistics functions but does not discuss spatial analysis nor geostatistics concepts. Before proceeding, make sure you understand these concepts. In case of doubt, a visit to this link may help.

For a complete tour on R geospatial functionalities, please visit Intro to GIS and Spatial Analysis

This notebook includes the following sections: - Downloading global precipitation data - Pre-processing precipitation data - Interpolating precipitation data

The interpolation section follows closely the code written by Manuel Gimond. Many thanks!

2. Downloading global precipitation data

Climate Hazards Group InfraRed Precipitation with Station data (CHIRPS) is a 35+ year quasi-global rainfall data set. Spanning 50°S-50°N (and all longitudes) and ranging from 1981 to near-present, CHIRPS incorporates our in-house climatology, CHPclim, 0.05° resolution satellite imagery, and in-situ station data to create gridded rainfall time series for trend analysis and seasonal drought monitoring. You can find more information about CHIRPS here

You can download CHIRPS data going to this place. If you go there you will find this:

By clicking on the global_pentad directory, you will find:

library(knitr)
include_graphics('./chirps/chirps_pentad.png')

Under the tifs directory, several compressed files are waiting for you.

library(knitr)
include_graphics('./chirps/chirps_pentad_2020.png')

Download the most recent file to your work directory. Uncompress it to obtain a tif file.

Now, let’s install (not shown) and load the required libraries:

library(rgdal)
## Loading required package: sp
## rgdal: version: 1.4-8, (SVN revision 845)
##  Geospatial Data Abstraction Library extensions to R successfully loaded
##  Loaded GDAL runtime: GDAL 2.4.2, released 2019/06/28
##  Path to GDAL shared files: /Library/Frameworks/R.framework/Versions/3.6/Resources/library/rgdal/gdal
##  GDAL binary built with GEOS: FALSE 
##  Loaded PROJ.4 runtime: Rel. 5.2.0, September 15th, 2018, [PJ_VERSION: 520]
##  Path to PROJ.4 shared files: /Library/Frameworks/R.framework/Versions/3.6/Resources/library/rgdal/proj
##  Linking to sp version: 1.3-2
library(raster)
library(sf)
## Linking to GEOS 3.7.2, GDAL 2.4.2, PROJ 5.2.0
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.0 ──
## ✓ ggplot2 3.3.0     ✓ purrr   0.3.4
## ✓ tibble  3.0.1     ✓ dplyr   0.8.5
## ✓ tidyr   1.0.3     ✓ stringr 1.4.0
## ✓ readr   1.3.1     ✓ forcats 0.5.0
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x tidyr::extract() masks raster::extract()
## x dplyr::filter()  masks stats::filter()
## x dplyr::lag()     masks stats::lag()
## x dplyr::select()  masks raster::select()
library(tmap)
library(gstat)
library(sp)

3. Pre-processing CHIRPS data

Read the uncompressed CHIRPS file:

## Read the CHIRPS dataset
precip <- raster("./chirps/chirps-v2.0.2020.03.6.tif")

As you may have already noted, this file represents the cumulative rainfall in the last days of March 2020 (i.e. the precipitation from 26th to 31 th March).

Check the contents of the precip object:

## What is precip?
precip
## class      : RasterLayer 
## dimensions : 2000, 7200, 14400000  (nrow, ncol, ncell)
## resolution : 0.05, 0.05  (x, y)
## extent     : -180, 180, -50, 50  (xmin, xmax, ymin, ymax)
## crs        : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0 
## source     : /Users/ivanlizarazo/Documents/ivan/GB/notebooks/chirps/chirps-v2.0.2020.03.6.tif 
## names      : chirps.v2.0.2020.03.6

Note that this is a dataset with global coverage. Its CRS is a geographic coordinate system.

Let’s load a shapefile representing our area-of-interest:

(aoi <- shapefile("/Users/ivanlizarazo/Documents/ivan/pr/montes/MontesdeMaria_WGS84.shp"))
## class       : SpatialPolygonsDataFrame 
## features    : 15 
## extent      : -75.70418, -74.77214, 9.22535, 10.14548  (xmin, xmax, ymin, ymax)
## crs         : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0 
## variables   : 11
## names       : DPTO_CCDGO, MPIO_CCDGO, MPIO_CNMBR,                        MPIO_CRSLC,    MPIO_NAREA, MPIO_NANO, DPTO_CNMBR,    Shape_Leng,    Shape_Area,      layer,                                                                             path 
## min values  :         13,      13212,     CHALÁN,                              1780,   83.71001417,      2017,    BOLÍVAR,  0.5720244142, 0.00689090159, bol_montes, /Users/ivanlizarazo/Documents/ivan/PR/montes/bol_montes.shp|layername=bol_montes 
## max values  :         70,      70823,   ZAMBRANO, Ordenanza 6 de Octubre 30 de 1968, 1035.07793303,      2017,      SUCRE, 2.08052417308, 0.08525488217, suc_montes, /Users/ivanlizarazo/Documents/ivan/PR/montes/suc_montes.shp|layername=suc_montes

Note that the coordinate reference systems for both datasets are the same.

Now, let’s clip the precipitation data:

# Crop precipitation data by extent of area-of-interest
precip.crop <- raster::crop(precip, extent(aoi))

Now, mask the raster layer:

precip.mask <- mask(x = precip.crop, mask = aoi)

Check the output:

precip.mask 
## class      : RasterLayer 
## dimensions : 18, 19, 342  (nrow, ncol, ncell)
## resolution : 0.05, 0.05  (x, y)
## extent     : -75.7, -74.75, 9.249999, 10.15  (xmin, xmax, ymin, ymax)
## crs        : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0 
## source     : memory
## names      : chirps.v2.0.2020.03.6 
## values     : 1.061518, 5.766768  (min, max)

Plot the masked raster layer:

plot(precip.mask, main= "CHIRPS rainfall in Montes from 26.03. to 30.03 in 2020 [mm]")
plot(aoi, add=TRUE)

A better map can be made with Leaflet using the addRasterImage function:

library(leaflet)
library(RColorBrewer)
pal <- colorNumeric(c("red", "orange", "yellow", "blue", "darkblue"), values(precip.mask),
  na.color = "transparent")

leaflet() %>% addTiles() %>%
  addRasterImage(precip.mask, colors = pal, opacity = 0.6) %>%
  addLegend(pal = pal, values = values(precip.mask),
    title = "CHIRPS rainfall in Montes\n From 26.03 to 30.03 in 2020 [mm]")

Conversion of raster into points may be conducted using the rasterToPoints function from the raster library:

precip.points <- rasterToPoints(precip.mask, spatial = TRUE)
precip.points
## class       : SpatialPointsDataFrame 
## features    : 213 
## extent      : -75.675, -74.825, 9.274999, 10.075  (xmin, xmax, ymin, ymax)
## crs         : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0 
## variables   : 1
## names       : chirps.v2.0.2020.03.6 
## min values  :      1.06151795387268 
## max values  :      5.76676845550537
names(precip.points) <- "rain"
precip.points
## class       : SpatialPointsDataFrame 
## features    : 213 
## extent      : -75.675, -74.825, 9.274999, 10.075  (xmin, xmax, ymin, ymax)
## crs         : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0 
## variables   : 1
## names       :             rain 
## min values  : 1.06151795387268 
## max values  : 5.76676845550537
str(precip.points)
## Formal class 'SpatialPointsDataFrame' [package "sp"] with 5 slots
##   ..@ data       :'data.frame':  213 obs. of  1 variable:
##   .. ..$ rain: num [1:213] 3.64 3.63 3.51 4.95 5.31 ...
##   ..@ coords.nrs : num(0) 
##   ..@ coords     : num [1:213, 1:2] -75.3 -75.3 -75.2 -75.2 -75.1 ...
##   .. ..- attr(*, "dimnames")=List of 2
##   .. .. ..$ : NULL
##   .. .. ..$ : chr [1:2] "x" "y"
##   ..@ bbox       : num [1:2, 1:2] -75.67 9.27 -74.82 10.07
##   .. ..- attr(*, "dimnames")=List of 2
##   .. .. ..$ : chr [1:2] "x" "y"
##   .. .. ..$ : chr [1:2] "min" "max"
##   ..@ proj4string:Formal class 'CRS' [package "sp"] with 1 slot
##   .. .. ..@ projargs: chr "+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0"

Let’s plot the points:

plot(precip.mask, main= "CHIRPS rainfall from 26 to 30.03.2020 [mm]")
plot(aoi, add=TRUE)
points(precip.points$x, precip.points$y, col = "red", cex = .6)

Let’s write the points spatial object to a disk file. Just to save time and work in case anything goes wrong.

We will use the rgdal library as illustrated here.

geojsonio::geojson_write(precip.points, file = "./chirps/ppoints.geojson")
## Success! File is at ./chirps/ppoints.geojson
## <geojson-file>
##   Path:       ./chirps/ppoints.geojson
##   From class: SpatialPointsDataFrame

Let’s read the precipitation points. See the geojsonio documentation to understant what parameters need to be passed to the geojson_read function.

precip.points <- geojsonio::geojson_read("./chirps/ppoints.geojson", what="sp")

Let’s check what is the precip.points object:

precip.points
## class       : SpatialPointsDataFrame 
## features    : 213 
## extent      : -75.675, -74.825, 9.274999, 10.075  (xmin, xmax, ymin, ymax)
## Warning in proj4string(x): CRS object has comment, which is lost in output
## crs         : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0 
## variables   : 1
## names       :             rain 
## min values  : 1.06151795387268 
## max values  : 5.76676845550537

What is the point of converting raster precipitation data into point precipitation data?

Well, there are few World Meteorological Organization (WMO)’s weather stations in our country:

Thus, CHIRPS may be an option to get the point precipitation data needed to obtain a continuous precipitation surface.

Further preparation

In case the connection is lost, we can read again the shapefile with the area of interest:

(aoi <- shapefile("/Users/ivanlizarazo/Documents/ivan/pr/montes/MontesdeMaria_WGS84.shp"))
## class       : SpatialPolygonsDataFrame 
## features    : 15 
## extent      : -75.70418, -74.77214, 9.22535, 10.14548  (xmin, xmax, ymin, ymax)
## crs         : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0 
## variables   : 11
## names       : DPTO_CCDGO, MPIO_CCDGO, MPIO_CNMBR,                        MPIO_CRSLC,    MPIO_NAREA, MPIO_NANO, DPTO_CNMBR,    Shape_Leng,    Shape_Area,      layer,                                                                             path 
## min values  :         13,      13212,     CHALÁN,                              1780,   83.71001417,      2017,    BOLÍVAR,  0.5720244142, 0.00689090159, bol_montes, /Users/ivanlizarazo/Documents/ivan/PR/montes/bol_montes.shp|layername=bol_montes 
## max values  :         70,      70823,   ZAMBRANO, Ordenanza 6 de Octubre 30 de 1968, 1035.07793303,      2017,      SUCRE, 2.08052417308, 0.08525488217, suc_montes, /Users/ivanlizarazo/Documents/ivan/PR/montes/suc_montes.shp|layername=suc_montes

We need to convert it to a spatial feature:

montes_sf <-  sf::st_as_sf(aoi)

Now, dissolve the internal boundaries:

(border_sf <-
  montes_sf %>%
  summarise(area = sum(MPIO_NAREA)))
## Simple feature collection with 1 feature and 1 field
## geometry type:  POLYGON
## dimension:      XY
## bbox:           xmin: -75.70418 ymin: 9.22535 xmax: -74.77214 ymax: 10.14548
## CRS:            +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
##       area                       geometry
## 1 6513.845 POLYGON ((-75.11082 9.44062...

Convert the spatial feature to a spatial dataframe:

 (border <- as(border_sf, 'Spatial')) 
## class       : SpatialPolygonsDataFrame 
## features    : 1 
## extent      : -75.70418, -74.77214, 9.22535, 10.14548  (xmin, xmax, ymin, ymax)
## Warning in proj4string(x): CRS object has comment, which is lost in output
## crs         : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0 
## variables   : 1
## names       :          area 
## value       : 6513.84545314

Another conversion:

(montes.sf <- st_as_sf(aoi) %>% mutate(MUNIC = MPIO_CNMBR, CODIGO = MPIO_CCDGO) %>% select(MUNIC, CODIGO))
## Simple feature collection with 15 features and 2 fields
## geometry type:  POLYGON
## dimension:      XY
## bbox:           xmin: -75.70418 ymin: 9.22535 xmax: -74.77214 ymax: 10.14548
## CRS:            +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
## First 10 features:
##                   MUNIC CODIGO                       geometry
## 1               CÓRDOBA  13212 POLYGON ((-74.96119 9.68082...
## 2  EL CARMEN DE BOLÍVAR  13244 POLYGON ((-75.3143 9.878536...
## 3              EL GUAMO  13248 POLYGON ((-74.94394 10.1252...
## 4         MARIA LA BAJA  13442 POLYGON ((-75.30714 10.0921...
## 5           SAN JACINTO  13654 POLYGON ((-75.20899 9.92045...
## 6   SAN JUAN NEPOMUCENO  13657 POLYGON ((-75.05241 10.1210...
## 7              ZAMBRANO  13894 POLYGON ((-74.87798 9.85810...
## 8             SINCELEJO  70001 POLYGON ((-75.51313 9.40564...
## 9     RICAURTE (COLOSO)  70204 POLYGON ((-75.3837 9.639216...
## 10               CHALÁN  70230 POLYGON ((-75.35705 9.64703...

Now, we will try a st_intersection task. This is done using the sf library:

# conversion
p.sf <- st_as_sf(precip.points)
## intersect polygons with points, keeping the information from both
(precip.sf = st_intersection(montes.sf, p.sf))
## although coordinates are longitude/latitude, st_intersection assumes that they are planar
## Warning: attribute variables are assumed to be spatially constant throughout all
## geometries
## Simple feature collection with 213 features and 3 fields
## geometry type:  POINT
## dimension:      XY
## bbox:           xmin: -75.675 ymin: 9.274999 xmax: -74.825 ymax: 10.075
## CRS:            +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
## First 10 features:
##                   MUNIC CODIGO     rain               geometry
## 4         MARIA LA BAJA  13442 3.638230 POINT (-75.325 10.075)
## 4.1       MARIA LA BAJA  13442 3.631608 POINT (-75.275 10.075)
## 4.2       MARIA LA BAJA  13442 3.505968 POINT (-75.225 10.075)
## 6   SAN JUAN NEPOMUCENO  13657 4.947118 POINT (-75.175 10.075)
## 6.1 SAN JUAN NEPOMUCENO  13657 5.308706 POINT (-75.125 10.075)
## 6.2 SAN JUAN NEPOMUCENO  13657 5.208715 POINT (-75.075 10.075)
## 3              EL GUAMO  13248 4.233178 POINT (-75.025 10.075)
## 3.1            EL GUAMO  13248 3.827909 POINT (-74.975 10.075)
## 3.2            EL GUAMO  13248 3.400251 POINT (-74.925 10.075)
## 3.3            EL GUAMO  13248 2.834286 POINT (-74.875 10.075)

Two reprojection tasks:

p.sf.magna <- st_transform(precip.sf, crs=3116)
montes.sf.magna <- st_transform(montes.sf, crs=3116)

A conversion:

(precip2 <- as(p.sf.magna, 'Spatial'))
## class       : SpatialPointsDataFrame 
## features    : 213 
## extent      : 824685.3, 918049.7, 1517694, 1606178  (xmin, xmax, ymin, ymax)
## Warning in proj4string(x): CRS object has comment, which is lost in output
## crs         : +proj=tmerc +lat_0=4.596200416666666 +lon_0=-74.07750791666666 +k=1 +x_0=1000000 +y_0=1000000 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs 
## variables   : 3
## names       :    MUNIC, CODIGO,             rain 
## min values  :   CHALÁN,  13212, 1.06151795387268 
## max values  : ZAMBRANO,  70823, 5.76676845550537

Again, we can write the intermediate dataset (just in case):

shapefile(precip2, filename='./chirps/precip2.shp', overwrite=TRUE)

If needed, read it. Otherwise, keep going:

precip2$rainfall <- round(precip2$rain, 1)
##precip2$rainfall <- round(precip2$chirps.v2.0.2020.03.6, 1)

What we got:

precip2
## class       : SpatialPointsDataFrame 
## features    : 213 
## extent      : 824685.3, 918049.7, 1517694, 1606178  (xmin, xmax, ymin, ymax)
## Warning in proj4string(x): CRS object has comment, which is lost in output
## crs         : +proj=tmerc +lat_0=4.596200416666666 +lon_0=-74.07750791666666 +k=1 +x_0=1000000 +y_0=1000000 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs 
## variables   : 4
## names       :    MUNIC, CODIGO,             rain, rainfall 
## min values  :   CHALÁN,  13212, 1.06151795387268,      1.1 
## max values  : ZAMBRANO,  70823, 5.76676845550537,      5.8
(montes2 <- as(montes.sf.magna, 'Spatial'))
## class       : SpatialPolygonsDataFrame 
## features    : 15 
## extent      : 821468.2, 923749.4, 1512192, 1614054  (xmin, xmax, ymin, ymax)
## Warning in proj4string(x): CRS object has comment, which is lost in output
## crs         : +proj=tmerc +lat_0=4.596200416666666 +lon_0=-74.07750791666666 +k=1 +x_0=1000000 +y_0=1000000 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs 
## variables   : 2
## names       :    MUNIC, CODIGO 
## min values  :   CHALÁN,  13212 
## max values  : ZAMBRANO,  70823

You may want to write the new object (just in case):

shapefile(montes2, filename='./chirps/montes2.shp', overwrite=TRUE)

Make sure the two extents match:

# Replace point boundary extent with that of Montes
precip2@bbox <- montes2@bbox

Plotting the data with tmap

tm_shape(montes2) + tm_polygons() +
  tm_shape(precip2) +
  tm_dots(col="rainfall", palette = "RdBu", midpoint = 3.0,
             title="Sampled precipitation \n(in mm)", size=0.2) +
  tm_text("rainfall", just="center", xmod=.6, size = 0.5) +
  tm_legend(legend.outside=TRUE)
## Warning in sp::proj4string(obj): CRS object has comment, which is lost in output

## Warning in sp::proj4string(obj): CRS object has comment, which is lost in output

4. Interpolating precipitation data

4.1 Thiessen polygons

The Thiessen polygons (or proximity interpolation) can be created using spatstat’s dirichlet function.

## Loading required package: spatstat.data
## Loading required package: nlme
## 
## Attaching package: 'nlme'
## The following object is masked from 'package:dplyr':
## 
##     collapse
## The following object is masked from 'package:raster':
## 
##     getData
## Loading required package: rpart
## Registered S3 method overwritten by 'spatstat':
##   method     from
##   print.boxx cli
## 
## spatstat 1.63-3       (nickname: 'Wet paint') 
## For an introduction to spatstat, type 'beginner'
## 
## Note: spatstat version 1.63-3 is out of date by more than 12 weeks; a newer version should be available.
## 
## Attaching package: 'spatstat'
## The following object is masked from 'package:gstat':
## 
##     idw
## The following objects are masked from 'package:raster':
## 
##     area, rotate, shift
## Checking rgeos availability: TRUE
# Create a tessellated surface
th  <-  as(dirichlet(as.ppp(precip2)), "SpatialPolygons")

# The dirichlet function does not carry over projection information
# requiring that this information be added manually
crs(th) <- crs(precip2)
crs(montes2) <- crs(precip2)
crs(th) 
## CRS arguments:
##  +proj=tmerc +lat_0=4.596200416666666 +lon_0=-74.07750791666666 +k=1
## +x_0=1000000 +y_0=1000000 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m
## +no_defs
crs(precip2)
## Warning in proj4string(x): CRS object has comment, which is lost in output
## CRS arguments:
##  +proj=tmerc +lat_0=4.596200416666666 +lon_0=-74.07750791666666 +k=1
## +x_0=1000000 +y_0=1000000 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m
## +no_defs

In case of getting lost at any step, please review the corresponding sp and sf functionalities.

# The tessellated surface does not store attribute information
# from the point data layer. We'll use the over() function (from the sp
# package) to join the point attributes to the tesselated surface via
# a spatial join. The over() function creates a dataframe that will need to
# be added to the `th` object thus creating a SpatialPolygonsDataFrame object
th.z     <- over(th, precip2, fn=mean)
th.spdf  <-  SpatialPolygonsDataFrame(th, th.z)

# Finally, we'll clip the tessellated  surface to the AOI boundaries
th.clp   <- raster::intersect(montes2,th.spdf)
# Map the data
tm_shape(th.clp) + 
  tm_polygons(col="rainfall", palette="RdBu", midpoint=3.0,
              title="Thiessen Polygons \nPredicted precipitation \n(in mm)") +
  tm_legend(legend.outside=TRUE)

4.2 Inverse Distance Weighted Interpolation

Many packages share the same function names. This can be a problem when these packages are loaded in a same R session. For example, the intersect function is available in the base, spatstat and raster packages–all of which are loaded in this current session. To ensure that the proper function is selected, it’s a good idea to preface the function name with the package name as in raster::intersect().

This tip will be used in the next chunk of code when calling the idw function which is available in both spatstat and gstat.

Note that the dirichlet function (like most functions in the spatsat package) require that the point object be in a ppp format hence the inline as.ppp(precip2) syntax.

The IDW output is a raster. This requires that we first create an empty raster grid, then interpolate the precipitation values to each unsampled grid cell. An IDW power value of 2 (idp=2.0) will be used.

# Create an empty grid where n is the total number of cells
grd              <- as.data.frame(spsample(precip2, "regular", n=100000))
# You need to figure out what is the expected size of the output grd
names(grd)       <- c("X", "Y")
coordinates(grd) <- c("X", "Y")
gridded(grd)     <- TRUE  # Create SpatialPixel object
fullgrid(grd)    <- TRUE  # Create SpatialGrid object

# Add P's projection information to the empty grid
proj4string(grd) <- proj4string(precip2)

# Interpolate the grid cells using a power value of 2 (idp=2.0)
P.idw <- gstat::idw(rainfall ~ 1, precip2, newdata=grd, idp=2.0)

# Convert to raster object then clip to AOI
r       <- raster(P.idw)
r
## class      : RasterLayer 
## dimensions : 316, 317, 100172  (nrow, ncol, ncell)
## resolution : 322.7781, 322.7781  (x, y)
## extent     : 821467.1, 923787.8, 1512070, 1614068  (xmin, xmax, ymin, ymax)
## crs        : +proj=tmerc +lat_0=4.596200416666666 +lon_0=-74.07750791666666 +k=1 +x_0=1000000 +y_0=1000000 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs 
## source     : memory
## names      : var1.pred 
## values     : 1.111297, 5.793569  (min, max)
montes2
## class       : SpatialPolygonsDataFrame 
## features    : 15 
## extent      : 821468.2, 923749.4, 1512192, 1614054  (xmin, xmax, ymin, ymax)
## crs         : +proj=tmerc +lat_0=4.596200416666666 +lon_0=-74.07750791666666 +k=1 +x_0=1000000 +y_0=1000000 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs 
## variables   : 2
## names       :    MUNIC, CODIGO 
## min values  :   CHALÁN,  13212 
## max values  : ZAMBRANO,  70823
r.m   <- raster::mask(r, montes2)
r.m 
## class      : RasterLayer 
## dimensions : 316, 317, 100172  (nrow, ncol, ncell)
## resolution : 322.7781, 322.7781  (x, y)
## extent     : 821467.1, 923787.8, 1512070, 1614068  (xmin, xmax, ymin, ymax)
## crs        : +proj=tmerc +lat_0=4.596200416666666 +lon_0=-74.07750791666666 +k=1 +x_0=1000000 +y_0=1000000 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs 
## source     : memory
## names      : var1.pred 
## values     : 1.111297, 5.793569  (min, max)
# Plot
tm_shape(r.m) + 
  tm_raster(n=10,palette = "RdBu", auto.palette.mapping = FALSE,
            title="Inverse Distance Weighted \nPredicted precipitation \n(in mm)") + 
  tm_shape(precip2) + tm_dots(size=0.2) +
  tm_legend(legend.outside=TRUE)

A better visualization of the interpolated surface:

library(leaflet)
library(RColorBrewer)
pal <- colorNumeric(c("red", "orange", "yellow", "blue", "darkblue"), values(precip.mask),
  na.color = "transparent")

leaflet() %>% addTiles() %>%
  addRasterImage(r.m, colors = pal, opacity = 0.6) %>%
  addLegend(pal = pal, values = values(r.m),
    title = "IDW interpolated rainfall in Montes\n From 26.03 to 30.03 in 2020 [mm]")

Fine-tuning the interpolation

The choice of power function can be subjective. To fine-tune the choice of the power parameter, you can perform a leave-one-out validation routine to measure the error in the interpolated values.

precip2
## class       : SpatialPointsDataFrame 
## features    : 213 
## extent      : 821468.2, 923749.4, 1512192, 1614054  (xmin, xmax, ymin, ymax)
## Warning in proj4string(x): CRS object has comment, which is lost in output
## crs         : +proj=tmerc +lat_0=4.596200416666666 +lon_0=-74.07750791666666 +k=1 +x_0=1000000 +y_0=1000000 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs 
## variables   : 4
## names       :    MUNIC, CODIGO,             rain, rainfall 
## min values  :   CHALÁN,  13212, 1.06151795387268,      1.1 
## max values  : ZAMBRANO,  70823, 5.76676845550537,      5.8
P <- precip2
IDW.out <- vector(length = length(P))
for (i in 1:length(P)) {
  IDW.out[i] <- gstat::idw(rainfall ~ 1, P[-i,], P[i,], idp=2.0)$var1.pred
}
# Plot the differences
OP <- par(pty="s", mar=c(4,3,0,0))
  plot(IDW.out ~ P$rainfall, asp=1, xlab="Observed", ylab="Predicted", pch=16,
       col=rgb(0,0,0,0.5))
  abline(lm(IDW.out ~ P$rainfall), col="red", lw=2,lty=2)
  abline(0,1)

par(OP)

The root mean square error (RMSE) can be computed from IDW.out as follows:

# Compute RMSE
sqrt( sum((IDW.out - P$rainfall)^2) / length(P))
## [1] 0.4970752

Cross-validation

In addition to generating an interpolated surface, you can create a 95% confidence interval map of the interpolation model. Here we’ll create a 95% CI map from an IDW interpolation that uses a power parameter of 2 (idp=2.0).

# Implementation of a jackknife technique to estimate 
# a confidence interval at each unsampled point.

# Create the interpolated surface
img <- gstat::idw(rainfall~1, P, newdata=grd, idp=2.0)
n   <- length(P)
Zi  <- matrix(nrow = length(img$var1.pred), ncol = n)

# Remove a point then interpolate (do this n times for each point)
st <- stack()
for (i in 1:n){
  Z1 <- gstat::idw(rainfall~1, P[-i,], newdata=grd, idp=2.0)
  st <- addLayer(st,raster(Z1,layer=1))
  # Calculated pseudo-value Z at j
  Zi[,i] <- n * img$var1.pred - (n-1) * Z1$var1.pred
}

# Jackknife estimator of parameter Z at location j
Zj <- as.matrix(apply(Zi, 1, sum, na.rm=T) / n )

# Compute (Zi* - Zj)^2
c1 <- apply(Zi,2,'-',Zj)            # Compute the difference
c1 <- apply(c1^2, 1, sum, na.rm=T ) # Sum the square of the difference

# Compute the confidence interval
CI <- sqrt( 1/(n*(n-1)) * c1)

# Create (CI / interpolated value) raster
img.sig   <- img
img.sig$v <- CI /img$var1.pred 
# Clip the confidence raster to AOI
r <- raster(img.sig, layer="v")
r.m <- raster::mask(r, montes2)

# Plot the map
tm_shape(r.m) + tm_raster(n=7,title="IDW\n95% confidence interval \n(in mm)") +
  tm_shape(P) + tm_dots(size=0.2) +
  
  tm_legend(legend.outside=TRUE)
## Warning in sp::proj4string(obj): CRS object has comment, which is lost in output

4.3 1st order polynomial fit

To fit a first order polynomial model of the form \(precip = intercept + aX +bY\)

# Define the 1st order polynomial equation
f.1 <- as.formula(rainfall ~ X + Y) 
 
# Add X and Y to P
P$X <- coordinates(P)[,1]
P$Y <- coordinates(P)[,2]

# Run the regression model
lm.1 <- lm( f.1, data=P)

# Use the regression model output to interpolate the surface
dat.1st <- SpatialGridDataFrame(grd, data.frame(var1.pred = predict(lm.1, newdata=grd))) 
# Clip the interpolated raster to Texas
r   <- raster(dat.1st)
r.m <- raster::mask(r, montes2)

# Plot the map
tm_shape(r.m) + 
  tm_raster(n=10, palette="RdBu", auto.palette.mapping=FALSE, 
            title="1st order polinomial fir \nPredicted precipitation \n(in mm)") +
  tm_shape(P) + tm_dots(size=0.2) +
  tm_legend(legend.outside=TRUE)

4.4 2nd order polynomial fit

To fit a second order polynomial model of the form \(precip = intercept + aX +bY +dX^2 +eY^2 +fXY\)

# Define the 2nd order polynomial equation
f.2 <- as.formula(rainfall ~ X + Y + I(X*X)+I(Y*Y) + I(X*Y))

# Add X and Y to P
P$X <- coordinates(P)[,1]
P$Y <- coordinates(P)[,2]
 
# Run the regression model
lm.2 <- lm( f.2, data=P)

# Use the regression model output to interpolate the surface
dat.2nd <- SpatialGridDataFrame(grd, data.frame(var1.pred = predict(lm.2, newdata=grd))) 
# Clip the interpolated raster to Texas
r   <- raster(dat.2nd)
r.m <- raster::mask(r, montes2)

# Plot the map
tm_shape(r.m) + 
  tm_raster(n=10, palette="RdBu", auto.palette.mapping=FALSE,midpoint = NA,
            title="2nd order polinomial fit\nPredicted precipitation \n(in mm)") +
  tm_shape(P) + tm_dots(size=0.2) +
  tm_legend(legend.outside=TRUE)

4.5 Kriging

Fit the variogram model

First, we need to create a variogram model. Note that the variogram model is computed on the de-trended data. This is implemented in the following chunk of code by passing the 1st order trend model (defined in an earlier code chunk as formula object f.1) to the variogram function.

# Define the 1st order polynomial equation
f.1 <- as.formula(rainfall ~ X + Y) 

# Compute the sample variogram; note that the f.1 trend model is one of the
# parameters passed to variogram(). This tells the function to create the 
# variogram on the de-trended data.
var.smpl <- variogram(f.1, P, cloud = FALSE, cutoff=100000, width=8990)

# Compute the variogram model by passing the nugget, sill and range values
# to fit.variogram() via the vgm() function.
dat.fit  <- fit.variogram(var.smpl, fit.ranges = TRUE, fit.sills = TRUE,
                          vgm(psill=3, model="Mat", range=150000, nugget=0.0))

## model can be Exp, Gau, Sph, Mat
## See gstat package - vgm function - to get the names of these models
## Review kriging theory to understand their meaning
dat.fit
##   model     psill    range kappa
## 1   Nug 0.0000000     0.00   0.0
## 2   Mat 0.6828444 23216.39   0.5
# The following plot allows us to assess the fit
plot(var.smpl, dat.fit, xlim=c(0,130000))

Generate Kriged surface

Next, use the variogram model dat.fit to generate a kriged interpolated surface. The krige function allows us to include the trend model thus saving us from having to de-trend the data, krige the residuals, then combine the two rasters. Instead, all we need to do is pass krige the trend formula f.1.

# Define the trend model
f.1 <- as.formula(rainfall ~ X + Y) 

# Perform the krige interpolation (note the use of the variogram model
# created in the earlier step)
dat.krg <- krige( f.1, P, grd, dat.fit)
## Warning in proj4string(d$data): CRS object has comment, which is lost in output
## [using universal kriging]
# Convert kriged surface to a raster object for clipping
r <- raster(dat.krg)
r.m <- raster::mask(r, montes2)
r.m
## class      : RasterLayer 
## dimensions : 316, 317, 100172  (nrow, ncol, ncell)
## resolution : 322.7781, 322.7781  (x, y)
## extent     : 821467.1, 923787.8, 1512070, 1614068  (xmin, xmax, ymin, ymax)
## crs        : +proj=tmerc +lat_0=4.596200416666666 +lon_0=-74.07750791666666 +k=1 +x_0=1000000 +y_0=1000000 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs 
## source     : memory
## names      : var1.pred 
## values     : 1.10649, 5.785779  (min, max)
# Plot the map
tm_shape(r.m) + 
  tm_raster(n=10, palette="RdBu", auto.palette.mapping=FALSE, 
            title="Universal Kriging\nPredicted precipitation \n(in mm)") +
  tm_shape(P) + tm_dots(size=0.2) +
  tm_legend(legend.outside=TRUE)

A better visualization of the interpolated surface:

library(leaflet)
library(RColorBrewer)
pal <- colorNumeric(c("red", "orange", "yellow", "blue", "darkblue"), values(precip.mask),
  na.color = "transparent")

leaflet() %>% addTiles() %>%
  addRasterImage(r.m, colors = pal, opacity = 0.6) %>%
  addLegend(pal = pal, values = values(r.m),
    title = "Kriging interpolated rainfall in Montes\n From 26.03 to 30.03 in 2020 [mm]")

Generate the variance and confidence interval maps

The dat.krg object stores not just the interpolated values, but the variance values as well. These can be passed to the raster object for mapping as follows:

r   <- raster(dat.krg, layer="var1.var")
r.m <- raster::mask(r, montes2)

tm_shape(r.m) + 
  tm_raster(n=7, palette ="Reds",
            title="Kriging Interpolation\nVariance map \n(in squared mm)") +tm_shape(P) + tm_dots(size=0.2) +
  tm_legend(legend.outside=TRUE)
## Warning in sp::proj4string(obj): CRS object has comment, which is lost in output

A more easily interpretable map is the 95% confidence interval map which can be generated from the variance object as follows (the map values should be interpreted as the number of mm above and below the estimated rainfall amount).

r   <- sqrt(raster(dat.krg, layer="var1.var")) * 1.96
r.m <- raster::mask(r, montes2)

tm_shape(r.m) + 
  tm_raster(n=7, palette ="Reds",
            title="Kriging Interpolation\n95% CI map \n(in mm)") +tm_shape(P) + tm_dots(size=0.2) +
  tm_legend(legend.outside=TRUE)
## Warning in sp::proj4string(obj): CRS object has comment, which is lost in output

I did not interpret the output of each interpolation method nor compare the different outputs. These are very important tasks for your report.

sessionInfo()
## R version 3.6.3 (2020-02-29)
## Platform: x86_64-apple-darwin15.6.0 (64-bit)
## Running under: macOS Mojave 10.14.6
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] maptools_0.9-9      spatstat_1.63-3     rpart_4.1-15       
##  [4] nlme_3.1-147        spatstat.data_1.4-3 RColorBrewer_1.1-2 
##  [7] leaflet_2.0.3       gstat_2.0-6         tmap_3.0           
## [10] forcats_0.5.0       stringr_1.4.0       dplyr_0.8.5        
## [13] purrr_0.3.4         readr_1.3.1         tidyr_1.0.3        
## [16] tibble_3.0.1        ggplot2_3.3.0       tidyverse_1.3.0    
## [19] sf_0.9-3            raster_3.1-5        rgdal_1.4-8        
## [22] sp_1.4-2            knitr_1.28         
## 
## loaded via a namespace (and not attached):
##  [1] leafem_0.1.1          colorspace_1.4-1      deldir_0.1-25        
##  [4] ellipsis_0.3.1        class_7.3-17          base64enc_0.1-3      
##  [7] fs_1.4.1              dichromat_2.0-0       rstudioapi_0.11      
## [10] httpcode_0.3.0        farver_2.0.3          fansi_0.4.1          
## [13] lubridate_1.7.8       xml2_1.3.2            splines_3.6.3        
## [16] codetools_0.2-16      polyclip_1.10-0       jsonlite_1.6.1       
## [19] tmaptools_3.0         broom_0.5.6           dbplyr_1.4.3         
## [22] png_0.1-7             rgeos_0.5-3           compiler_3.6.3       
## [25] httr_1.4.1            backports_1.1.7       assertthat_0.2.1     
## [28] Matrix_1.2-18         lazyeval_0.2.2        cli_2.0.2            
## [31] htmltools_0.4.0       tools_3.6.3           gtable_0.3.0         
## [34] glue_1.4.1            geojson_0.3.2         V8_3.0.2             
## [37] Rcpp_1.0.4.6          cellranger_1.1.0      vctrs_0.3.0          
## [40] crul_0.9.0            leafsync_0.1.0        crosstalk_1.1.0.1    
## [43] lwgeom_0.2-3          xfun_0.14             rvest_0.3.5          
## [46] lifecycle_0.2.0       XML_3.99-0.3          goftest_1.2-2        
## [49] jqr_1.1.0             zoo_1.8-8             scales_1.1.1         
## [52] hms_0.5.3             spatstat.utils_1.17-0 parallel_3.6.3       
## [55] yaml_2.2.1            curl_4.3              stringi_1.4.6        
## [58] e1071_1.7-3           intervals_0.15.2      rlang_0.4.6          
## [61] pkgconfig_2.0.3       evaluate_0.14         lattice_0.20-41      
## [64] tensor_1.5            htmlwidgets_1.5.1     tidyselect_1.1.0     
## [67] magrittr_1.5          R6_2.4.1              geojsonio_0.9.2      
## [70] generics_0.0.2        DBI_1.1.0             mgcv_1.8-31          
## [73] pillar_1.4.4          haven_2.2.0           foreign_0.8-75       
## [76] withr_2.2.0           units_0.6-6           stars_0.4-2          
## [79] xts_0.12-0            abind_1.4-5           spacetime_1.2-3      
## [82] modelr_0.1.6          crayon_1.3.4          KernSmooth_2.23-17   
## [85] rmarkdown_2.1         grid_3.6.3            readxl_1.3.1         
## [88] FNN_1.1.3             reprex_0.3.0          digest_0.6.25        
## [91] classInt_0.4-3        munsell_0.5.0         viridisLite_0.3.0