Spatial Downscaling (Soil Texture Classes)

This is a code to spatially downscale categorical soil maps based off of the dissever R function developed by Malone et al., (2012) and improved by Roudier et al., (2017). In contrast to the dissever algorithm, the rafiki() function downscales soil categorical maps through Monte Carlo sampling. Since this code is in Google Earth Engine, it works well for large geographic areas at a fine resolution. Therefore, we can at least give an idea of what the soil is at a resolution suitable for ones needs.

The data for this example comes from the AfSIS dataset of soil texture classes aggregated to the ERZD (effective root zone depth) of maize found at https://www.data.isric.com (1 km2 resolution) and the covariates from a 30 m NASADEM calculated through Terrain Analysis in Google Earth Engine (TAGEE; Safanelli et al., 2020), climate from www.worldclim.org and oblique geographic coordinates (OGC) from Møller et al. (2020).

Note that images may take a little time to render online.

Load libraries

#Load libraries needed
library(rafikisol)
library(rgee)
ee_Initialize()

## ── rgee 1.1.5 ─────────────────────────────────────── earthengine-api 0.1.356 ── 
##  ✔ user: not_defined
##  ✔ Initializing Google Earth Engine:
 ✔ Initializing Google Earth Engine:  DONE!
## 
 ✔ Earth Engine account: users/trevanflynn 
## ────────────────────────────────────────────────────────────────────────────────

We now set labels and where we want to predict.

#set label of response and parameters
label = "Txt" #response
scale = 30 #resolution to scale to
country = "Rwanda" #which area to predict over

#Load AfSIS texture class grids for EZD
txt = ee$Image("projects/rafkisol/assets/AfSIS_TXTclss")$
  rename("Txt")$#keep at same resolution as original
  int()$reproject(crs = "EPSG: 3857")

#Load samples from Rwanda
validation = ee$FeatureCollection("projects/rafkisol/assets/Rwanda_TXTclss")

#Load country FeatureCollection (bounderies)
worldcountries = ee$FeatureCollection("FAO/GAUL_SIMPLIFIED_500m/2015/level1");
filterCountry = ee$Filter$eq('ADM0_NAME', country);
region = worldcountries$filter(filterCountry);

Lets make a legend for the maps. Its easier to set up the legend to be used for the final map. The names must match the number of colors and must specify “character” in color_mapping argument.

#labels of colors
color_name = c("C", "SiC", "SC", "CL", "SiCL", "SCL", "L", "SiL", "SL", "Si", "LS", "S")

#visParams
vis = list(min = 1, max = 12, palette = rainbow(12), values = color_name)

#make legend
leg = Map$addLegend(vis, name = "Texture", color_mapping = "character")

For the terrain derivatives we use TAGEE which must be installed using:

reticulate::py_install("tagee", pip = T),

Great package for some terrain derivatives but we will remove min, max and shape index (leave null values) unfortunate because they are good covariates for texture. Then we use the ogcCalc() function for the OGC, all bioclim variables and add them to the terrain covariate image.

#we will use terrain data from TAGEE (must be installed)
tagee = reticulate::import("tagee")

#get a water mask
waterMask = ee$Image('UMD/hansen/global_forest_change_2016_v1_4')$
  select('datamask')$neq(1)$eq(0)

#Get NASADEM 
dem = ee$Image("NASA/NASADEM_HGT/001")$select('elevation')$
  clip(region)$rename('DEM')

#Make kernel for smoothing
gauFilter = ee$Kernel$gaussian(
  radius= 3, sigma= 2, units= 'pixels', normalize= T)

#use convolutions 
dem = dem$convolve(gauFilter)$
  resample("bilinear")

#Terrain analysis (13 covariates)
taImg = tagee$terrainAnalysis(dem, region)$
  reproject(crs = "EPSG:3857", scale = 30)

#lets add some more covariates (19 climates and 6 OGC)
coImg = taImg$
  addBands(ogcCalc(area = region, crs = taImg$projection(), nr = 6,
  scale = taImg$projection()$nominalScale()))$#add OGC
  addBands(ee$Image("WORLDCLIM/V1/BIO")$clip(region))$#climate
  reproject(crs = "EPSG:3857", scale = 30)$#Make sure at same
  resample("bilinear")

#remove unwanted covariates
labs = coImg$bandNames()$getInfo() #get names
labs = labs[!labs %in% c("MinimalCurvature", "MaximalCurvature", "ShapeIndex")]#remove
coImg = coImg$select(labs)$updateMask(waterMask) #select covariates and mask

#min
mn = coImg$reduceRegion(reducer = 'min', geometry = region, 
                     maxPixels = 1e13, scale = 30)

#max
mx = coImg$reduceRegion(reducer = 'max', geometry = region, 
                     maxPixels = 1e13, scale = 30)

#map just to check
Map$centerObject(region, 8)#set center

visEl = list(min = mn$getNumber("Elevation")$getInfo(), max = mx$getNumber("Elevation")$getInfo(), palette = rev(topo.colors(10)))#set visual

Map$addLayer(coImg$select("Elevation"), visEl)+Map$addLegend(visEl, name = "Elevation (m)")

Seems really high but Rwanda is on the African plateau and is part of the Virunga Mountain range so it is pretty high.

For the downscaling, the rafiki() function is used. Here we return the modal of bootstrap predictions [[1]], all predictions [[2]], each model [[3]] and the samples [[4]].

The rafiki() function inputs are:

covar = covariate image.

response = response image of higher scale.

scale = if you want to evaluate at a certain scale (default = scale of response).

model = character (default = "rf"), model to use (rf, gbt, cart, svm, ).  

stype = sampling type, either "s" (stratified) or "r" (random) (default = "r").

sampNum = number of samples per bootstrap.

boot = number of bootstraps to perform. 

... = other params of each model. For rf, gbt, svm there are requirements (see GEE doc) such as numberOfTrees for rf and gbt

#Downscale
preds = rafiki(covar = coImg, response = txt,
                 model = 'gtb', stype = "r", sampNum = 500,
                 numberOfTrees = 50)

Now plot the predictions from the spatial downscaling.

#check out switching to terrain background (givers better representation)
Map$addLayer(preds[['modal']], vis)+leg

See how it compares to just resampling.

#clip and nearest 
example = txt$clip(region)$
  reproject(crs = "EPSG:4326", scale = 30)

#plot same way
Map$addLayer(example,vis)+leg

There are some abnormalities relative to the ISRIC map but now we can use the ground truth to evaluate the predictions.

#extract prediction values
eval = preds[['modal']]$sampleRegions(
  collection = validation,
  properties = list(label),
  scale = scale,
  geometries = T)

#confusion matrix
errMat = eval$errorMatrix(label, "modal")

#some metricse
print(errMat$kappa()$getInfo())

## [1] 0.5865394

print(errMat$accuracy()$getInfo())

## [1] 0.7065728

print(errMat$consumersAccuracy()$getInfo())

## [[1]]
## [[1]][[1]]
## [1] 0
## 
## [[1]][[2]]
## [1] 0.921875
## 
## [[1]][[3]]
## [1] 0
## 
## [[1]][[4]]
## [1] 0
## 
## [[1]][[5]]
## [1] 0.5919283
## 
## [[1]][[6]]
## [1] 0
## 
## [[1]][[7]]
## [1] 0.7807018
## 
## [[1]][[8]]
## [1] 1

print(errMat$producersAccuracy()$getInfo())

## [[1]]
## [1] 0
## 
## [[2]]
## [1] 0.7763158
## 
## [[3]]
## [1] 0
## 
## [[4]]
## [1] 0
## 
## [[5]]
## [1] 0.9041096
## 
## [[6]]
## [1] 0
## 
## [[7]]
## [1] 0.712
## 
## [[8]]
## [1] 0.5121951

Predicted 4 out of the 7 (exclude 0) of the classes well but not bad its actually pretty good results. Lets see what the covariates can tell us.

Calculate importance and make a chart with impCalc(). Inputs are:

models = Results from rafiki() function.
mn = The number of covariates to display in plot.

#get importance
coImp = impCalc(models = preds, mn = 10)

#plot results
coImp[["plot"]]

Looks like OGC are extremely good covariates with being the most important.

Get the probabilities of the predictions with probCalc(). Unfortunetly, we have to manually figure out how many classes were predicted from the model first or we will run out of mememory in the free GEE. The inputs are:

models = results from rafiki() function.
covar = covariate image.
levels = The number of classes actually predicted.

#probabilities
probs = probCalc(models = preds, covar = coImg, levels = 4)

Lets see the confusion index

#calculate
ci = ciCalc(probs = probs)

#vis
visCI = list(min = 0, max = 1, palette = c("white", "grey", "black"))

#Plot
Map$addLayer(ci, visCI) + Map$addLegend(visCI, name = "CI")

Confusion index looks pretty high, so needs improvement. However, at this scale its difficult to know if this confusion index was the right one to use.

REFERENCES

Burrough, P.A., Van Gaans, P.F.M., Hootsmans, R., (1997). Continuous classification in soil survey: spatial correlation, confusion and boundaries. Geoderma 77, 115–135. https://doi.org/10.1016/S0016-7061(97)00018-9.

Brendan P. Malone, Alex B. McBratney, Budiman Minasny, Ichsani Wheeler, (2012). A general method for downscaling earth resource information. Computers & Geosciences, Volume 41, Pages 119-125. https://doi.org/10.1016/j.cageo.2011.08.021.

P. Roudier, B.P. Malone, C.B. Hedley, B. Minasny, A.B. McBratney, (2017). Comparison of regression methods for spatial downscaling of soil organic carbon stocks maps. Computers and Electronics in Agriculture. Volume 142, Part A, Pages 91-100. https://doi.org/10.1016/j.compag.2017.08.021.

Møller, A. B., Beucher, A. M., and Pouladi, N., Greve, M. H. (2020) Oblique geographic coordinates as covariates for digital soil mapping, SOIL 6(2) 269 - 289https:/10.5194/soil-6-269-2020

Safanelli, J.L.; Poppiel, R.R.; Ruiz, L.F.C.; Bonfatti, B.R.; Mello, F.A.O.; Rizzo, R.; Demattê, J.A.M. Terrain Analysis in Google Earth Engine: A Method Adapted for High-Performance Global-Scale Analysis. ISPRS Int. J. Geo-Inf. 2020, 9, 400. DOI: https://doi.org/10.3390/ijgi9060400

Spatial Downscaling (Soil Texture Classes)

Trevan Flynn

2023-06-27