Objetivo de la practica. Introducir los conceptos basicos de geoestadistica utilizando el conjunto de datos de “Meuse”.
Esta practica se basa en el documento: “An introduction to geostatistics with R/gstat” de D G Rossiter, Cornell University, School of Integrative Plant Science, Crop and Soil Sciences Section. Febr, 2022. http://www.css.cornell.edu/faculty/dgr2/_static/files/R_PDF/gs_intro_sf.pdf
Otros materiales del mismo autor: http://www.css.cornell.edu/faculty/dgr2/tutorials/index.html
Hengl, T. (2009). A practical guide to geostatistical mapping. http://spatial-analyst.net/book/system/files/Hengl_2009_GEOSTATe2c1w.pdf
# install.packages(c("sf", "sp", "gstat", "ggplot2"), dependencies=TRUE)
library(sf)## Warning: package 'sf' was built under R version 4.1.1
library(sp)
library(gstat)
library(ggplot2)Utilizaremos el conjunto de datos “Meuse” de polucion del suelo con muestras de concentracion de metales pesados (paquete sp). Meuse esta previamente cargado en R, de modo que solo lo llamamos con data()
data(package="sp")
data(meuse)
str(meuse) #estructura de datos meuse. Algunas variables son continuas y otras categoricas (factors)## 'data.frame': 155 obs. of 14 variables:
## $ x : num 181072 181025 181165 181298 181307 ...
## $ y : num 333611 333558 333537 333484 333330 ...
## $ cadmium: num 11.7 8.6 6.5 2.6 2.8 3 3.2 2.8 2.4 1.6 ...
## $ copper : num 85 81 68 81 48 61 31 29 37 24 ...
## $ lead : num 299 277 199 116 117 137 132 150 133 80 ...
## $ zinc : num 1022 1141 640 257 269 ...
## $ elev : num 7.91 6.98 7.8 7.66 7.48 ...
## $ dist : num 0.00136 0.01222 0.10303 0.19009 0.27709 ...
## $ om : num 13.6 14 13 8 8.7 7.8 9.2 9.5 10.6 6.3 ...
## $ ffreq : Factor w/ 3 levels "1","2","3": 1 1 1 1 1 1 1 1 1 1 ...
## $ soil : Factor w/ 3 levels "1","2","3": 1 1 1 2 2 2 2 1 1 2 ...
## $ lime : Factor w/ 2 levels "0","1": 2 2 2 1 1 1 1 1 1 1 ...
## $ landuse: Factor w/ 15 levels "Aa","Ab","Ag",..: 4 4 4 11 4 11 4 2 2 15 ...
## $ dist.m : num 50 30 150 270 380 470 240 120 240 420 ...
EDA Exploratory Data Analysis
En este caso nos centraremos en la variable ZINC.
Se comenzara explorando el histograma. El mismo muestra una distribucion no simetrica, sesgada hacia la derecha. Del mismo modo que summary muestra que la media es mayor a la mediana (sesgada hacia la derecha, distribucion no-normal).
meuse$zinc## [1] 1022 1141 640 257 269 281 346 406 347 183 189 251 1096 504 326
## [16] 1032 606 711 735 1052 673 402 343 218 200 194 207 180 240 180
## [31] 208 198 250 192 213 321 569 833 906 1454 298 167 176 258 746
## [46] 746 464 365 282 375 222 812 1548 1839 1528 933 432 550 1571 1190
## [61] 907 761 659 643 801 784 1060 119 778 703 676 793 685 593 549
## [76] 680 539 560 1136 1383 1161 1672 765 279 241 317 545 505 420 332
## [91] 400 553 577 155 224 180 226 186 198 187 199 157 203 143 136
## [106] 117 113 130 192 240 221 140 128 166 191 232 203 722 210 198
## [121] 139 253 703 832 262 142 119 152 415 474 126 210 220 133 141
## [136] 158 129 206 451 296 189 154 169 403 471 612 601 783 258 214
## [151] 166 496 342 162 375
sort(meuse$zinc)## [1] 113 117 119 119 126 128 129 130 133 136 139 140 141 142 143
## [16] 152 154 155 157 158 162 166 166 167 169 176 180 180 180 183
## [31] 186 187 189 189 191 192 192 194 198 198 198 199 200 203 203
## [46] 206 207 208 210 210 213 214 218 220 221 222 224 226 232 240
## [61] 240 241 250 251 253 257 258 258 262 269 279 281 282 296 298
## [76] 317 321 326 332 342 343 346 347 365 375 375 400 402 403 406
## [91] 415 420 432 451 464 471 474 496 504 505 539 545 549 550 553
## [106] 560 569 577 593 601 606 612 640 643 659 673 676 680 685 703
## [121] 703 711 722 735 746 746 761 765 778 783 784 793 801 812 832
## [136] 833 906 907 933 1022 1032 1052 1060 1096 1136 1141 1161 1190 1383 1454
## [151] 1528 1548 1571 1672 1839
hist(meuse$zinc, breaks = 16) summary(meuse$zinc) ## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 113.0 198.0 326.0 469.7 674.5 1839.0
Al ser una distribucion no simetrica, se aplica logaritmo para transformar los valores y obtener una distribucion simetrica (normal). Esto, ademas, reduce los posibles outliers.
En este caso se utiliza log. Luego para transformar a los valores originales se utiliza antilog (antilog (x)=10^x)
meuse$logZn <- log10(meuse$zinc)
hist(meuse$logZn, breaks = 16)summary(meuse$logZn)## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 2.053 2.297 2.513 2.556 2.829 3.265
head(meuse$zinc)## [1] 1022 1141 640 257 269 281
head(meuse$logZn)## [1] 3.009451 3.057286 2.806180 2.409933 2.429752 2.448706
head(10^(meuse$logZn)) ## [1] 1022 1141 640 257 269 281
Estructura espacial local –> autocorrelacion
Se comienza georrefereciando los datos mesue, y para ello se utiliza el paquete sf. Luego se explora el tipo de dato.
class(meuse)## [1] "data.frame"
meuse.sf <- st_as_sf(meuse, coords = c("x","y"))
class(meuse.sf)## [1] "sf" "data.frame"
str(meuse.sf)## Classes 'sf' and 'data.frame': 155 obs. of 14 variables:
## $ cadmium : num 11.7 8.6 6.5 2.6 2.8 3 3.2 2.8 2.4 1.6 ...
## $ copper : num 85 81 68 81 48 61 31 29 37 24 ...
## $ lead : num 299 277 199 116 117 137 132 150 133 80 ...
## $ zinc : num 1022 1141 640 257 269 ...
## $ elev : num 7.91 6.98 7.8 7.66 7.48 ...
## $ dist : num 0.00136 0.01222 0.10303 0.19009 0.27709 ...
## $ om : num 13.6 14 13 8 8.7 7.8 9.2 9.5 10.6 6.3 ...
## $ ffreq : Factor w/ 3 levels "1","2","3": 1 1 1 1 1 1 1 1 1 1 ...
## $ soil : Factor w/ 3 levels "1","2","3": 1 1 1 2 2 2 2 1 1 2 ...
## $ lime : Factor w/ 2 levels "0","1": 2 2 2 1 1 1 1 1 1 1 ...
## $ landuse : Factor w/ 15 levels "Aa","Ab","Ag",..: 4 4 4 11 4 11 4 2 2 15 ...
## $ dist.m : num 50 30 150 270 380 470 240 120 240 420 ...
## $ logZn : num 3.01 3.06 2.81 2.41 2.43 ...
## $ geometry:sfc_POINT of length 155; first list element: 'XY' num 181072 333611
## - attr(*, "sf_column")= chr "geometry"
## - attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA NA NA NA NA NA NA NA NA ...
## ..- attr(*, "names")= chr [1:13] "cadmium" "copper" "lead" "zinc" ...
Se reasigna el sistema de referencia correcto.
st_crs(meuse.sf)## Coordinate Reference System: NA
st_crs(meuse.sf) <- 28992
print(st_crs(meuse.sf))## Coordinate Reference System:
## User input: EPSG:28992
## wkt:
## PROJCRS["Amersfoort / RD New",
## BASEGEOGCRS["Amersfoort",
## DATUM["Amersfoort",
## ELLIPSOID["Bessel 1841",6377397.155,299.1528128,
## LENGTHUNIT["metre",1]]],
## PRIMEM["Greenwich",0,
## ANGLEUNIT["degree",0.0174532925199433]],
## ID["EPSG",4289]],
## CONVERSION["RD New",
## METHOD["Oblique Stereographic",
## ID["EPSG",9809]],
## PARAMETER["Latitude of natural origin",52.1561605555556,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8801]],
## PARAMETER["Longitude of natural origin",5.38763888888889,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8802]],
## PARAMETER["Scale factor at natural origin",0.9999079,
## SCALEUNIT["unity",1],
## ID["EPSG",8805]],
## PARAMETER["False easting",155000,
## LENGTHUNIT["metre",1],
## ID["EPSG",8806]],
## PARAMETER["False northing",463000,
## LENGTHUNIT["metre",1],
## ID["EPSG",8807]]],
## CS[Cartesian,2],
## AXIS["easting (X)",east,
## ORDER[1],
## LENGTHUNIT["metre",1]],
## AXIS["northing (Y)",north,
## ORDER[2],
## LENGTHUNIT["metre",1]],
## USAGE[
## SCOPE["Engineering survey, topographic mapping."],
## AREA["Netherlands - onshore, including Waddenzee, Dutch Wadden Islands and 12-mile offshore coastal zone."],
## BBOX[50.75,3.2,53.7,7.22]],
## ID["EPSG",28992]]
A contuniacion visualizaremos la distribucion espacial de los puntos. Los puntos no estan regularmente distribuidos sino que son mas densos en la cercania del rio. Previo a esto se cargan la capa de rio.
data(meuse.riv, package="sp")
class(meuse.riv)## [1] "matrix" "array"
meuse.riv.sf <- st_sfc(st_linestring(meuse.riv, dim="XY"), crs = st_crs(meuse.sf))
class(meuse.riv.sf)## [1] "sfc_LINESTRING" "sfc"
summary(meuse.riv.sf)## LINESTRING epsg:28992 +proj=ster...
## 1 0 0
ggplot(data = meuse.sf) +
geom_sf(mapping = aes(size=zinc, color=dist.m)) +
labs(x = "Longitude", y = "Latitude", title = "Meuse River (NL)",
color="Distance to river [m]", size = "Zn concentration, ppm")Otra representacion.
plot(meuse.sf["zinc"],
reset = FALSE,
nbreaks = 64, pch = 20,
cex=4*meuse.sf$zinc/max(meuse.sf$zinc),
main = "Zn concentration [ppm]")
plot(meuse.riv.sf, add = TRUE)La dependencia espacial local significa que cuanto mas cerca esten los puntos en el espacio geografico, mas cerca tambien lo estan en el espacio de atributos. Esto se llama autocorrelacion.
Los valores de un atributo pueden estar correlacionados con si mismos, y la fuerza de esta correlacion depende de la distancia de separacion entre puntos.
Cada par de puntos, estara separado por una distancia en el espacio geografico y una semivarianza en el espacio de atributos.
La semivarianza se calcula de la siguiente manera: hagamos algunos calculos a mano.
#Calcule cuantos pares de puntos hay en el dataset meuse.
n <- length(meuse.sf$logZn)
n*(n-1)/2## [1] 11935
#calcule la distancia y semivarianza entre los dos puntos primeros puntos del dataset
dim(st_coordinates(meuse.sf))## [1] 155 2
st_coordinates(meuse.sf)[1,]## X Y
## 181072 333611
st_coordinates(meuse.sf)[2,]## X Y
## 181025 333558
(sep <- dist(st_coordinates(meuse.sf)[1:2,]))## 1
## 2 70.83784
(gamma <- 0.5 * (meuse.sf$logZn[1] - meuse.sf$logZn[2])^2) # semivarianza, unidades log(mg kg-1)^2## [1] 0.001144082
gamma## [1] 0.001144082
Como la semivarianza y distancia se relacionan en toda el area de estudio? Es a traves del variograma empirico o experimental. Se define como la semivarianza media dividida para un rango de separacion determinado.
Graficaremos el variograma experimental de las concentraciones de Zinc (log). Es decir, la media de las semivarianzas, respecto de la distancia media (bin o lag). En este caso de 90m, hasta una distancia maxima (cutoff) de 1300m.
(ve <- variogram(logZn ~ 1, meuse.sf, cutoff = 1300,width = 90))## np dist gamma dir.hor dir.ver id
## 1 41 72.24836 0.02649954 0 0 var1
## 2 212 142.88031 0.03242411 0 0 var1
## 3 320 227.32202 0.04818895 0 0 var1
## 4 371 315.85549 0.06543093 0 0 var1
## 5 423 406.44801 0.08025949 0 0 var1
## 6 458 496.09401 0.09509850 0 0 var1
## 7 455 586.78634 0.10656591 0 0 var1
## 8 466 677.39566 0.10333481 0 0 var1
## 9 503 764.55712 0.11461332 0 0 var1
## 10 480 856.69422 0.12924402 0 0 var1
## 11 468 944.02864 0.12290106 0 0 var1
## 12 460 1033.62277 0.12820318 0 0 var1
## 13 422 1125.63214 0.13206510 0 0 var1
## 14 408 1212.62350 0.11591294 0 0 var1
## 15 173 1280.65364 0.11719960 0 0 var1
variogram: genera la nube de variograma.
logZn ~ 1: logZn es dependiente de si misma –> autocorrelacion.
Por defecto (si no se especifica cutoff y with) el cutoff es 1/3 de la maxima distancia (diagonal del bbox). Esto es dividido en 15 clases igualmente espaciadas.
Cual es la evidencia de dependencia espacial local?
np=numero de pares de puntos para cada una de las 15 clases,
dist=distancia media,
gamma=semivarianza media.
A medida que la distancia aumenta, tambien lo hace la semivarianza, pero hasta una distancia determinada donde la semivarianza se estabiliza.
En el siguiente grafico, identifica los argumentos del semivaririograma: nugget, sill y range.
print(plot(ve, plot.numbers = TRUE, pch = 20,
xlab = "separacion", ylab = "semivarianza [log10(Zn)^2]"))Modelo de variograma o variograma teorico
Funcion que ajusta el variograma teorico. Hay diferentes tipos de funciones que pueden ser utiles:
show.vgms()Metodos para seleccionar el modelo que mejor ajusta al variograma teorico.
Conocer el proceso espacial que gobierna a la variable
Ajuste visual
Ajuste automatico y comparar la bondad del ajuste
Range o rango: separacion o distancia entre pares de puntos en la cual ya no hay dependencia espacial. aprox 850m
Nugget o pepita: semivarianza a la separacion de 0m. aprox 0.01
Total-sill o meseta: semivarianza a la distancia del rango. aprox 0.13
Partial-sill o meseta parcial: total sill - nugget. aprox 0.12
vgm genera el modelo de variograma
vt <- vgm(psill = 0.12, model = "Sph", range = 850,nugget = 0.01)
vt## model psill range
## 1 Nug 0.01 0
## 2 Sph 0.12 850
plot(ve, pl = T, model = vt)Ajuste automatico fit.variogram: ajuta el modelo de variograma a un variograma empirico.
va <- fit.variogram(ve, vt)
va## model psill range
## 1 Nug 0.01004123 0.0000
## 2 Sph 0.11525698 967.2634
plot(ve, pl = T, model = va)Usualmente kriging se utiliza para predecir los pixeles (o nodos) de una malla regular que cubre la zona de estudio. kriging ordinario “ordinary” significa que (1) la variable es modelada a partir de si misma; (2) la media espacial no es conocida a priori, sino estimada de los datos.
Primero se obtiene una grilla en la cual nos basaremos para la interpolacion.
data(meuse.grid, package="sp")
class(meuse.grid)## [1] "data.frame"
names(meuse.grid)## [1] "x" "y" "part.a" "part.b" "dist" "soil" "ffreq"
meuse.grid.sf <- st_as_sf(meuse.grid, coords = c("x", "y"))
st_crs(meuse.grid.sf) <- st_crs(meuse.sf)ok <- krige(logZn ~ 1, locations = meuse.sf, newdata = meuse.grid.sf, model = va) ## [using ordinary kriging]
ok$pred <- 10^(ok$var1.pred)#volver a valores originales
str(ok)## Classes 'sf' and 'data.frame': 3103 obs. of 4 variables:
## $ var1.pred: num 2.83 2.88 2.83 2.78 2.94 ...
## $ var1.var : num 0.0596 0.0471 0.0509 0.055 0.0335 ...
## $ geometry :sfc_POINT of length 3103; first list element: 'XY' num 181180 333740
## $ pred : num 679 763 680 605 871 ...
## - attr(*, "sf_column")= chr "geometry"
## - attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA NA
## ..- attr(*, "names")= chr [1:3] "var1.pred" "var1.var" "pred"
plot(ok["var1.pred"], pch=15,
nbreaks = 64,
main="OK prediction [log10(Zn ppm)]")plot(ok["var1.var"], pch=15,
nbreaks = 64,
pal = cm.colors,
main="OK prediction variance [log10(Zn ppm)^2]")plot(ok["var1.pred"], pch=15,
nbreaks = 64,
reset = FALSE, # allow further elements to be plotted
main="OK prediction [log10(Zn ppm)]")
plot(meuse.sf["logZn"], add = TRUE, pch = 21,
col = "lightgray",
cex=4*meuse.sf$zinc/max(meuse.sf$zinc))plot(ok["var1.var"], pch=15,
nbreaks = 64,
reset = FALSE, # allow further elements to be plotted
pal = cm.colors,
main="OK prediction variance [log10(Zn ppm)^2]")
plot(meuse.sf["logZn"], add = TRUE, pch = 20,
col = "darkgray")Otros metodos de interpolacion no-geoestadisticos
Poligonos de Thiessen: 1. Cambios abruptos de bordes.
Inverso de la distancia: 1. No hay una manera objetiva de seleccionar el peso (inverso o inveso cuadrado…).
En todos los casos, la distribucion no regular puede sobre-enfatizar areas (o lo contrario). La varianza debe ser estimada a partir de un conjunto de datos de validacion, especificamente reservado para ello.
thiessen = krige(log10(zinc) ~ 1, meuse.sf, meuse.grid.sf, nmax = 1)## [inverse distance weighted interpolation]
pts.s <- list("sp.points", meuse.sf, col="white",pch=20)
plot(thiessen["var1.pred"], pch=15,main="Thiessen")idw = idw(log10(zinc) ~ 1, meuse.sf, meuse.grid.sf)## [inverse distance weighted interpolation]
plot(idw["var1.pred"], pch=15,main="IDW")#Cambiando el numero de vecinos
idw$vecino<- krige(log10(zinc) ~ 1, meuse.sf, meuse.grid.sf,nmax=6)## [inverse distance weighted interpolation]
#Cambiando pesos
idw$idp05 = idw(log10(zinc) ~ 1, meuse.sf, meuse.grid.sf, idp = 0.5)$var1.pred## [inverse distance weighted interpolation]
idw$idp5 = idw(log10(zinc) ~ 1, meuse.sf, meuse.grid.sf, idp = 5)$var1.pred## [inverse distance weighted interpolation]
idw$idp10 = idw(log10(zinc) ~ 1, meuse.sf, meuse.grid.sf, idp = 10)$var1.pred## [inverse distance weighted interpolation]
plot(idw[c("var1.pred","idp05", "idp5", "idp10")], pch=15,main="IDW")thiessen.cv.a <- krige.cv(log10(zinc) ~ 1, meuse.sf,nmax = 1)
idw.cv.a <- krige.cv(log10(zinc) ~ 1, meuse.sf)
ok.cv.a <- krige.cv(log10(zinc) ~ 1, locations = meuse.sf, model = va)thiessen.cv.a[1:5,]## Simple feature collection with 5 features and 6 fields
## Geometry type: POINT
## Dimension: XY
## Bounding box: xmin: 181025 ymin: 333330 xmax: 181307 ymax: 333611
## Projected CRS: Amersfoort / RD New
## var1.pred var1.var observed residual zscore fold geometry
## 1 3.057286 NA 3.009451 -0.04783475 NA 1 POINT (181072 333611)
## 2 3.009451 NA 3.057286 0.04783475 NA 2 POINT (181025 333558)
## 3 3.009451 NA 2.806180 -0.20327092 NA 3 POINT (181165 333537)
## 4 2.806180 NA 2.409933 -0.39624685 NA 4 POINT (181298 333484)
## 5 2.448706 NA 2.429752 -0.01895404 NA 5 POINT (181307 333330)
idw.cv.a[1:5,]## Simple feature collection with 5 features and 6 fields
## Geometry type: POINT
## Dimension: XY
## Bounding box: xmin: 181025 ymin: 333330 xmax: 181307 ymax: 333611
## Projected CRS: Amersfoort / RD New
## var1.pred var1.var observed residual zscore fold geometry
## 1 2.830957 NA 3.009451 0.1784941 NA 1 POINT (181072 333611)
## 2 2.797944 NA 3.057286 0.2593420 NA 2 POINT (181025 333558)
## 3 2.687633 NA 2.806180 0.1185473 NA 3 POINT (181165 333537)
## 4 2.605734 NA 2.409933 -0.1958009 NA 4 POINT (181298 333484)
## 5 2.505657 NA 2.429752 -0.0759050 NA 5 POINT (181307 333330)
ok.cv.a[1:5,]## Simple feature collection with 5 features and 6 fields
## Geometry type: POINT
## Dimension: XY
## Bounding box: xmin: 181025 ymin: 333330 xmax: 181307 ymax: 333611
## Projected CRS: Amersfoort / RD New
## var1.pred var1.var observed residual zscore fold
## 1 2.941046 0.03408723 3.009451 0.068404693 0.37050156 1
## 2 2.938452 0.03300608 3.057286 0.118833193 0.65409497 2
## 3 2.735836 0.03416563 2.806180 0.070343768 0.38056679 3
## 4 2.627050 0.04264164 2.409933 -0.217116908 -1.05142115 4
## 5 2.425835 0.03281227 2.429752 0.003917739 0.02162805 5
## geometry
## 1 POINT (181072 333611)
## 2 POINT (181025 333558)
## 3 POINT (181165 333537)
## 4 POINT (181298 333484)
## 5 POINT (181307 333330)
Correlacion
par(mfrow=c(1,3))
print(plot(var1.pred~observed,thiessen.cv.a, main="Thiessen"), split=c(1,1,3,1), more=TRUE)## NULL
print(plot(var1.pred~observed,idw.cv.a, main="IDW"), split=c(2,1,3,1), more=TRUE)## NULL
print(plot(var1.pred~observed,ok.cv.a, main="OK"), split=c(3,1,2,1), more=FALSE)## NULL
Correlacion, idealmente 1
cor(thiessen.cv.a$var1.pred,thiessen.cv.a$observed)## [1] 0.685839
cor(idw.cv.a$var1.pred,idw.cv.a$observed)## [1] 0.7640391
cor(ok.cv.a$var1.pred,ok.cv.a$observed)## [1] 0.8340013
Media de residuos, idealmente 0
mean(thiessen.cv.a$residual)## [1] 0.003457601
mean(idw.cv.a$residual)## [1] -0.005565866
mean(ok.cv.a$residual)## [1] -0.0001470167
Desviaciones estandar del error de interpolacion (residuos). Idealmente pequeno
sd(thiessen.cv.a$residual)## [1] 0.2463506
sd(idw.cv.a$residual)## [1] 0.2238086
sd(ok.cv.a$residual)## [1] 0.1731181
Boxplots
boxplot(thiessen.cv.a$residual, idw.cv.a$residual, ok.cv.a$residual, main="Thiessen, IDW, OK")MSPE (mean square predictor error), idealmente pequeno
mean(thiessen.cv.a$residual^2)## [1] 0.06030903
mean(idw.cv.a$residual^2)## [1] 0.0497981
mean(ok.cv.a$residual^2)## [1] 0.02977656
Error medio cuadratico (RMSE) es una medida general. Idealmente pequeno
sqrt(sum(thiessen.cv.a$residual^2)/length(thiessen.cv.a$residual))## [1] 0.245579
sqrt(sum(idw.cv.a$residual^2)/length(idw.cv.a$residual))## [1] 0.2231549
sqrt(sum(ok.cv.a$residual^2)/length(ok.cv.a$residual))## [1] 0.1725589
Varianza de residuos, idealmente cercanos a cero
var(thiessen.cv.a$residual, na.rm=T)## [1] 0.06068862
var(idw.cv.a$residual, na.rm=T)## [1] 0.05009028
var(ok.cv.a$residual, na.rm=T)## [1] 0.02996989
Guardar y recuperar archivos en memoria y espacio de trabajo
save.image(file = "geostat_b.RData")
load(file = "geostat_b.RData")library(automap)Funcion autofitVariogram
coordinates(meuse) = ~x+y
proj4string(meuse)=CRS("+init=epsg:28992")## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO", prefer_proj =
## prefer_proj): Discarded datum Amersfoort in Proj4 definition
autoZn=autofitVariogram(log(zinc)~1, meuse)
summary(autoZn)## Experimental variogram:
## np dist gamma dir.hor dir.ver id
## 1 17 59.33470 0.1102869 0 0 var1
## 2 36 86.01449 0.1382850 0 0 var1
## 3 114 131.02870 0.1541601 0 0 var1
## 4 149 176.18845 0.2506480 0 0 var1
## 5 184 226.75652 0.2499933 0 0 var1
## 6 711 337.60359 0.3776705 0 0 var1
## 7 830 502.04773 0.4874784 0 0 var1
## 8 1349 713.21485 0.5928954 0 0 var1
## 9 1314 961.27179 0.6711660 0 0 var1
## 10 1139 1213.41157 0.6374432 0 0 var1
## 11 1355 1506.55052 0.5748673 0 0 var1
##
## Fitted variogram model:
## model psill range
## 1 Nug 0.04848089 0.0000
## 2 Sph 0.58754741 889.9084
## Sums of squares betw. var. model and sample var.[1] 1.434433e-05
plot(autoZn, pch=19, col="black")Funcion autokrige
#paletas de colores
bluepal=colorRampPalette(c("azure1", "steelblue4"))
brks=c(4.5,5,5.5,6,6.5,7,7.5,8)
mycol=bluepal(length(brks)-1)
coordinates(meuse.grid)=~x+y
proj4string(meuse.grid)=CRS("+init=epsg:28992")## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO", prefer_proj =
## prefer_proj): Discarded datum Amersfoort in Proj4 definition
meuse.grid=as(meuse.grid,"SpatialPixelsDataFrame")
ZnAKrig=autoKrige(log(zinc)~1, meuse, meuse.grid, model="Sph")## Warning in proj4string(input_data): CRS object has comment, which is lost in
## output
## Warning in proj4string(new_data): CRS object has comment, which is lost in
## output
## Warning in proj4string(input_data): CRS object has comment, which is lost in
## output
## Warning in proj4string(new_data): CRS object has comment, which is lost in
## output
## [using ordinary kriging]
plot(ZnAKrig, col.regions=mycol)automapPlot(ZnAKrig$krige_output, "var1.pred",
sp.layout = list("sp.points", meuse))# Extracting parts from the autoKrige object
prediction_spdf = ZnAKrig$krige_output
sample_variogram = ZnAKrig$exp_var
variogram_model = ZnAKrig$var_modelFuncion autoKrige.cv
kr.cv = autoKrige.cv(log(zinc)~1, meuse, model = c("Exp"), nfold = 10)##
|
| | 0%
|
|======== | 11%
|
|================ | 22%
|
|======================= | 33%
|
|=============================== | 44%
|
|======================================= | 56%
|
|=============================================== | 67%
|
|====================================================== | 78%
|
|============================================================== | 89%
|
|======================================================================| 100%
kr.cv$krige.cv_output[1:5,]## coordinates var1.pred var1.var observed residual zscore fold
## 1 (181072, 333611) 6.826644 0.1635531 6.929517 0.10287273 0.25437293 10
## 2 (181025, 333558) 6.783560 0.1628050 7.039660 0.25610041 0.63471162 4
## 3 (181165, 333537) 6.294457 0.1852956 6.461468 0.16701147 0.38798392 2
## 4 (181298, 333484) 6.052150 0.2418010 5.549076 -0.50307389 -1.02306381 6
## 5 (181307, 333330) 5.600439 0.1763132 5.594711 -0.00572719 -0.01363952 5
idw.cv = autoKrige.cv(log(zinc)~1, meuse, nfold = 10)##
|
| | 0%
|
|======== | 11%
|
|================ | 22%
|
|======================= | 33%
|
|=============================== | 44%
|
|======================================= | 56%
|
|=============================================== | 67%
|
|====================================================== | 78%
|
|============================================================== | 89%
|
|======================================================================| 100%
idw.cv$krige.cv_output[1:5,]## coordinates var1.pred var1.var observed residual zscore fold
## 1 (181072, 333611) 6.769779 0.1782166 6.929517 0.1597377 0.37838458 7
## 2 (181025, 333558) 6.768315 0.1730285 7.039660 0.2713450 0.65232344 9
## 3 (181165, 333537) 6.302869 0.1804978 6.461468 0.1585987 0.37330497 4
## 4 (181298, 333484) 6.041531 0.2267957 5.549076 -0.4924553 -1.03406877 3
## 5 (181307, 333330) 5.580586 0.1728805 5.594711 0.0141255 0.03397275 6
Funcion compare.cv
compare.cv(kr.cv, idw.cv,bubbleplots = TRUE, col.names = c("OK","IDW"))## OK IDW
## mean_error -0.002162 0.007606
## me_mean -0.0003673 0.001292
## MAE 0.2953 0.3028
## MSE 0.1589 0.1658
## MSNE 0.8358 0.8722
## cor_obspred 0.8334 0.8247
## cor_predres 0.06811 0.03551
## RMSE 0.3986 0.4072
## RMSE_sd 0.5521 0.5641
## URMSE 0.3986 0.4072
## iqr 0.4415 0.424
compare.cv(kr.cv, idw.cv,bubbleplots = TRUE, col.names = c("OK","IDW"),plot.diff = TRUE)## OK IDW
## mean_error -0.002162 0.007606
## me_mean -0.0003673 0.001292
## MAE 0.2953 0.3028
## MSE 0.1589 0.1658
## MSNE 0.8358 0.8722
## cor_obspred 0.8334 0.8247
## cor_predres 0.06811 0.03551
## RMSE 0.3986 0.4072
## RMSE_sd 0.5521 0.5641
## URMSE 0.3986 0.4072
## iqr 0.4415 0.424
Realiza las interpolaciones para una variable: Selecciona el modelo/metodo de interpolacion que realice una mejor prediccion.
Ha llegado al final de este material!