Laporan Praktikum 2
Tugas STA553 - Analisis Statistika Spasial
Klik disini untuk ke halaman rpubs.
setwd(“D:\Kuliah S2 IPB\Bahan Kuliah\Semester 2 SSD 2020\STA533 Spasial\PRAKTIKUM\R”)
Sebaran Spasial Dua Tipe Titik (Responsi Pertemuan 3)
MultiType Point Patterns
A marked point pattern in which the marks are a categorical variable is usually called a multitype point pattern. The types are the different values or levels of the mark variable.
Packages
For reference, here is a list of the standard point pattern datasets that are supplied with the installation of spatstat.
library(sp)
library(gstat)
library(rgdal) # membaca shapefile
library(spData)
library(spdep) # pembobot data spasial
library(spatstat)Ants Data Set
Data dalam package R
Disini digunakan Ants datasets
help(ants)## starting httpd help server ... done
Harkness-Isham ants’ nests data
Description
These data give the spatial locations of nests of two species of ants, Messor wasmanni and Cataglyphis bicolor, recorded by Professor R.D. Harkness at a site in northern Greece, and described in Harkness & Isham (1983). The full dataset (supplied here) has an irregular polygonal boundary, while most analyses have been confined to two rectangular subsets of the pattern (also supplied here).
The harvester ant M. wasmanni collects seeds for food and builds a nest composed mainly of seed husks. C. bicolor is a heat-tolerant desert foraging ant which eats dead insects and other arthropods. Interest focuses on whether there is evidence in the data for intra-species competition between Messor nests (i.e. competition for resources) and for preferential placement of Cataglyphis nests in the vicinity of Messor nests.
The full dataset is displayed in Figure 1 of Harkness & Isham (1983). See Usage below to produce a comparable plot. It comprises 97 nests (68 Messor and 29 Cataglyphis) inside an irregular convex polygonal boundary, together with annotations showing a foot track through the region, the boundary between field and scrub areas inside the region, and indicating the two rectangular subregions A and B used in their analysis.
Rectangular subsets of the data were analysed by Harkness & Isham (1983), Isham (1984), Takacs & Fiksel (1986), S"arkk"a (1993, section 5.3), H"ogmander and S"arkk"a (1999) and Baddeley & Turner (2000). The full dataset (inside its irregular boundary) was first analysed by Baddeley & Turner (2005b).
The dataset ants is the full point pattern enclosed by the irregular polygonal boundary. The x and y coordinates are eastings (E-W) and northings (N-S) scaled so that 1 unit equals 0.5 feet. This is a multitype point pattern object, each point carrying a mark indicating the ant species (with levels Cataglyphis and Messor).
Semut yang diamati
- Loads Ants Data Set *
data(ants)
ants## Marked planar point pattern: 97 points
## Multitype, with levels = Cataglyphis, Messor
## window: polygonal boundary
## enclosing rectangle: [-25, 803] x [-49, 717] units (one unit = 0.5 feet)
plot(ants)plot(split(ants))Metode Kuadran
quad1 <- quadratcount(split(ants))
plot(quad1)catag<-quad1[[1]]
mess<-quad1[[2]]
Obs<-matrix(NA,2,2)
rownames(Obs)<-c("A","0")
colnames(Obs)<-c("B","0")
Obs[1,1]<-sum(((catag>0)+(mess>0))==2) #ab
Obs[2,1]<-sum(((catag>0)-(mess>0))==1) #a0
Obs[1,2]<-sum(((catag>0)-(mess>0))==-1) #0b
Obs[2,2]<-sum(((catag>0)+(mess>0))==0) #00
Tes<-chisq.test(Obs)## Warning in chisq.test(Obs): Chi-squared approximation may be incorrect
chisqhitung<-sum((Obs-Tes$expected)^2/Tes$expected)
chistabel<-qchisq(0.05,df=1,lower.tail=F)
p_valuechi<-pchisq(chisqhitung,1,lower.tail=F)
chisq.test(Obs, correct = F)## Warning in chisq.test(Obs, correct = F): Chi-squared approximation may be
## incorrect
##
## Pearson's Chi-squared test
##
## data: Obs
## X-squared = 11.2, df = 1, p-value = 0.000818
Metode K-function
Kest{spatstat}
K-function Description
Estimates Ripley’s reduced second moment function K(r) from a point pattern in a window of arbitrary shape.
Usage
Kest(X, …, r=NULL, rmax=NULL, breaks=NULL, correction=c(‘border’, “isotropic”, “Ripley”,“translate”), nlarge=3000, domain=NULL, var.approx=FALSE, ratio=FALSE)
The estimator Kest ignores marks. Its counterparts for multitype point patterns are Kcross, Kdot, and for general marked point patterns see Kmulti.
Ripley’s K estimator for Multitype Point
unique(ants$marks)## [1] Messor Cataglyphis
## Levels: Cataglyphis Messor
Kh <- Kdot(ants, "Messor")
plot(Kh)diagnostic plot for independence between Messor and other species
f1 <- function(X) { marks(X) == "Messor"}
f2 <- function(X) { marks(X) == "Cataglyphis"}
K <- Kmulti(ants,f1,f2)
plot(K)K01 <- Kcross(ants, "Messor", "Cataglyphis")
plot(K01)plot(envelope(ants, Kcross,
i="Messor",
j="Cataglyphis", nsim=99,
fix.n=T, fix.marks=T))## Generating 99 simulations of CSR with fixed number of points of each type ...
## 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40,
## 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80,
## 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99.
##
## Done.
plot(envelope(ants, Kdot,
i="Messor",
nsim=99,
fix.n=T, fix.marks=T))## Generating 99 simulations of CSR with fixed number of points of each type ...
## 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40,
## 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80,
## 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99.
##
## Done.
plot(envelope(ants, Kmulti,
I=f1, J=f2, nsim=99,
fix.n=T, fix.marks=T))## Generating 99 simulations of CSR with fixed number of points of each type ...
## 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40,
## 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80,
## 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99.
##
## Done.
Mad Test
mad.test(ants, Kcross, i="Messor", j="Cataglyphis", nsim=99,
fix.n=T, fix.marks=T)## Generating 99 simulations of CSR with fixed number of points of each type ...
## 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40,
## 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80,
## 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99.
##
## Done.
##
## Maximum absolute deviation test of CSR
## Monte Carlo test based on 99 simulations with fixed number of points
## of each type
## Summary function: "K"["Messor", "Cataglyphis"](r)
## Reference function: theoretical
## Alternative: two.sided
## Interval of distance values: [0, 191.5] units (one unit = 0.5 feet)
## Test statistic: Maximum absolute deviation
## Deviation = observed minus theoretical
##
## data: ants
## mad = 5161, rank = 76, p-value = 0.76
Lansing Wood Data
Here is the famous Lansing Woods dataset recording the positions of 2251 trees of 6 different species (hickories, maples, red oaks, white oaks, black oaks and miscellaneous trees).
data(lansing)
lansing## Marked planar point pattern: 2251 points
## Multitype, with levels = blackoak, hickory, maple, misc, redoak, whiteoak
## window: rectangle = [0, 1] x [0, 1] units (one unit = 924 feet)
summary(lansing)## Marked planar point pattern: 2251 points
## Average intensity 2251 points per square unit (one unit = 924 feet)
##
## *Pattern contains duplicated points*
##
## Coordinates are given to 3 decimal places
## i.e. rounded to the nearest multiple of 0.001 units (one unit = 924 feet)
##
## Multitype:
## frequency proportion intensity
## blackoak 135 0.05997335 135
## hickory 703 0.31230560 703
## maple 514 0.22834300 514
## misc 105 0.04664594 105
## redoak 346 0.15370950 346
## whiteoak 448 0.19902270 448
##
## Window: rectangle = [0, 1] x [0, 1] units
## Window area = 1 square unit
## Unit of length: 924 feet
plot(lansing)plot(split(lansing))plot(density(split(lansing)), ribbon = F)hick <- split(lansing)$hickory
plot(hick)Metode Kuadran
plot(quadratcount(split(lansing)))Metode K-function
Kh. <- Kdot(lansing,"hickory")
plot(Kh.)Interpolasi Spasial (Responsi Pertemuan 4)
1 Trend surface interpolation
Data yang digunakan adalah data topo yang mencatat spatial topographic. Namun sebelumnya, kita perlu memanggil beberapa package terlebih dulu.
library(MASS)
library(raster)
library(gstat)
library(sp)data(topo)
str(topo)## 'data.frame': 52 obs. of 3 variables:
## $ x: num 0.3 1.4 2.4 3.6 5.7 1.6 2.9 3.4 3.4 4.8 ...
## $ y: num 6.1 6.2 6.1 6.2 6.2 5.2 5.1 5.3 5.7 5.6 ...
## $ z: int 870 793 755 690 800 800 730 728 710 780 ...
head(topo)## x y z
## 1 0.3 6.1 870
## 2 1.4 6.2 793
## 3 2.4 6.1 755
## 4 3.6 6.2 690
## 5 5.7 6.2 800
## 6 1.6 5.2 800
dim(topo)## [1] 52 3
help(topo)Spatial Topographic Data
Description
The topo data frame has 52 rows and 3 columns, of topographic heights within a 310 feet square.
Usage
topo
Format
This data frame contains the following columns:
x
x coordinates (units of 50 feet)
y
y coordinates (units of 50 feet)
z
heights (feet)
Selanjutnya kita perlu mendefinisikan koordinat pada data topo.
coordinates(topo)<-~x+y
par(mfrow=c(1,2))
plot(topo)
plot(topo, pch=1, cex=topo$z/600)Tahapan selanjutnya adalah mempersiapkan grid untuk interpolasi.
x=seq(0,6.3, by=0.1);y=seq(0,6.3, by=0.1)
topo.grid <- expand.grid(x=x,y=y)
gridded(topo.grid) <- ~x+y
plot(topo.grid)1.1 1st-Order Trend Surface
Selanjutnya akan dilakukan 1st-order trend surface.
x1 <- gstat(formula=topo$z ~ 1, data=topo, degree=1)
kt <- predict(x1, newdata=topo.grid)## [ordinary or weighted least squares prediction]
spplot(kt[1], contour=TRUE,main="topo trend surface (1)-gstat")1.2 2nd-Order Trend Surface
Berikut ini adalah contoh syntax untuk melakukan 2nd-order trend surface.
x <- gstat(formula=topo$z ~ 1, data=topo, degree=2)
kt2 <- predict(x, newdata=topo.grid)## [ordinary or weighted least squares prediction]
spplot( kt2[1], contour=TRUE, main="topo trend surface (2)-gstat")2 Inverse Distance Weighting (IDW)
Beberapa fitur dari pendekatan ini adalah:
titik yang lebih dekat kepada titik yang akan diinterpolasi akan lebih berpengaruh terhadap hasil interpolasi di titik tersebut
teknik yang digunakan untuk memprediksi suatu titik adalah dengan memberikan bobot dari pengaruh titik di sekitarnya berdasarkan jarak terhadap titik yang akan diinterpolasi
2.1 IDW on topo data set
kid <- krige(topo$z ~ 1, topo, topo.grid, NULL)## [inverse distance weighted interpolation]
spplot(kid["var1.pred"], main = "Topo IDW",contour=TRUE)3 Thiessen Polygons
Data meuse akan digunakan sebagai ilustrasi pendekatan ini.
data(meuse)
names(meuse)## [1] "x" "y" "cadmium" "copper" "lead" "zinc" "elev"
## [8] "dist" "om" "ffreq" "soil" "lime" "landuse" "dist.m"
hist(meuse$zinc)plot(meuse$x,meuse$y,asp=1)Perhatikan bahwa data meuse masih berupa data.frame.
class(meuse)## [1] "data.frame"
Oleh karenanya, kita akan mendefinisikan koordinat lokasi pada data meuse, agar data tersebut menjadi class data spatial. Terlihat pula pada gambar bahwa kandungan zinc cenderung lebih tinggi pada daerah di pinggir sungai (daerah paling kiri) yang ditunjukkan dengan lingkaran yang lebih besar.
coordinates(meuse) = ~x + y # convert data to spatial points data frame
class(meuse)## [1] "SpatialPointsDataFrame"
## attr(,"package")
## [1] "sp"
plot(meuse,pch=1, cex=meuse$zinc/500)Selanjutnya, kita perlu mempersiapkan grid untuk interpolasi. Kebetulan, R sudah memepersiapkan data grid tersebut yang berpadanan dengan data meuse.
data(meuse.grid)
str(meuse.grid)## 'data.frame': 3103 obs. of 7 variables:
## $ x : num 181180 181140 181180 181220 181100 ...
## $ y : num 333740 333700 333700 333700 333660 ...
## $ part.a: num 1 1 1 1 1 1 1 1 1 1 ...
## $ part.b: num 0 0 0 0 0 0 0 0 0 0 ...
## $ dist : num 0 0 0.0122 0.0435 0 ...
## $ soil : Factor w/ 3 levels "1","2","3": 1 1 1 1 1 1 1 1 1 1 ...
## $ ffreq : Factor w/ 3 levels "1","2","3": 1 1 1 1 1 1 1 1 1 1 ...
Sama seperti sebelumnya, kita perlu mendefinisikan koordinat lokasi untuk data meuse.grid.
coordinates(meuse.grid) = ~x + y # converts to spatial class
gridded(meuse.grid) <- TRUEKemudian setelah tersedia data titik dan data grid, kita dapat melakukan interpolasi pada data meuse.
zn.tp = krige(log(zinc) ~ 1, meuse, meuse.grid, nmax = 1)## [inverse distance weighted interpolation]
#for this search, the neighborhood is set to nmax=1
image(zn.tp["var1.pred"])
points(meuse, pch = "+", cex = 0.65)cc = coordinates(meuse)Plot interpolasi dapat pula ditampilkan dengan package tripack.
library(tripack)
image(zn.tp["var1.pred"])
points(meuse, pch = "+", cex = 0.65)
cc = coordinates(meuse)
plot(voronoi.mosaic(cc[, 1], cc[, 2]), do.points = FALSE, add = TRUE)4 Spline Interpolation
Pendekatan ini menggunakan fungsi interpolasi pada beberapa bagian yang dipisahkan oleh beberapa titik (knot).
Kita dapat melakukan interpolasi ini dengan menggunakan fungsi pada package akima.
library(akima)
data(meuse)
meuse.zinc <- data.frame(cbind(meuse$x,
meuse$y,meuse$zinc))
meuse.akim <- interp(meuse.zinc$X1, meuse.zinc$X2, meuse.zinc$X3)
image(meuse.akim,,xlab="", ylab="",asp=1)
contour(meuse.akim,add=TRUE, levels = seq(100,2000,200))5 Choosing the Right Interpolation
The quality of sample point set can affect choice of interpolation method as well. If the sample points are poorly distributed or there are few of them, the surface might not represent the actual terrain very well. If we have too few sample points, add more sample points in areas where the terrain changes abruptly or frequently, then try using Kriging.
The real-world knowledge of the subject matter will initially affect which interpolation method to use. If it is know that some of the features in surface exceed the z value, for example, and that IDW will result in a surface that does not exceed the highest or lowest z value in the sample point set, choose the Spline method.
If we know that the splined surface might end up with features that we know don’t exist because Spline interpolation doesn’t work well with sample points that are close together and have extreme differences in value, you might decide to try IDW.
Autokorelasi Spasial (Responsi Pertemuan 5)
Autokorelasi Temporal
set.seed(0)
d <- sample(100, 10)
d## [1] 14 68 39 1 34 87 43 100 82 59
a <- d[-length(d)]
b <- d[-1]
plot(a, b, xlab='t', ylab='t-1')cor(a, b)## [1] 0.1227634
d <- sort(d)
d## [1] 1 14 34 39 43 59 68 82 87 100
a <- d[-length(d)]
b <- d[-1]
plot(a, b, xlab='t', ylab='t-1')acf(d)Autokorelasi Spasial
Contoh Data Spasial
library(raster)
p <- shapefile(system.file("external/lux.shp", package="raster"))
p <- p[p$NAME_1=="Diekirch", ]
p$value <- c(10, 6, 4, 11, 6)
data.frame(p)## ID_1 NAME_1 ID_2 NAME_2 AREA value
## 0 1 Diekirch 1 Clervaux 312 10
## 1 1 Diekirch 2 Diekirch 218 6
## 2 1 Diekirch 3 Redange 259 4
## 3 1 Diekirch 4 Vianden 76 11
## 4 1 Diekirch 5 Wiltz 263 6
par(mai=c(0,0,0,0))
plot(p, col=2:7)
coords <- coordinates(p)
points(coords, cex=6, pch=20, col='white')
text(p, 'ID_2', cex=1.5)Steps in determining the extent of spatial autocorrelation in your data:
Contiguity based neighbors
Areas sharing any boundary point (QUEEN) are taken as neighbors, using the poly2nb function, which accepts a SpatialPolygonsDataFrame
library(spdep)
w <- poly2nb(p)If contiguity is defined as areas sharing more than one boundary point (ROOK), the queen= argument is set to FALSE
w.rook <- poly2nb(p, queen=FALSE)
coords<-coordinates(p)
plot(p)
plot(w, coords, add=T)Distance based neighbors k nearest neighbors:
- Can also choose the k nearest points as neighbors
coords<-coordinates(p)
IDs<-row.names(as(p, "data.frame"))
p_kn1<-knn2nb(knearneigh(coords, k=1), row.names=IDs)
p_kn2<-knn2nb(knearneigh(coords, k=2), row.names=IDs)## Warning in knearneigh(coords, k = 2): k greater than one-third of the number of
## data points
p_kn4<-knn2nb(knearneigh(coords, k=4), row.names=IDs)## Warning in knearneigh(coords, k = 4): k greater than one-third of the number of
## data points
plot(p, main = "k=1")
plot(p_kn1, coords, add=T)plot(p, main = "k=2")
plot(p_kn2, coords, add=T)plot(p, main = "k=4")
plot(p_kn4, coords, add=T)Distance based neighbors :
- Specified distance Can also assign neighbors based on a specified distance
dist<-unlist(nbdists(p_kn1, coords))
summary(dist)## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.07316 0.07316 0.14159 0.11832 0.14159 0.16213
sort(dist)## [1] 0.07315711 0.07315711 0.14158521 0.14158521 0.16213194
max_k1<-max(dist)
p_kd1<-dnearneigh(coords, d1=0, d2=0.75*max_k1, row.names=IDs)
p_kd2<-dnearneigh(coords, d1=0, d2=1*max_k1, row.names=IDs)
p_kd3<-dnearneigh(coords, d1=0, d2=1.5*max_k1, row.names=IDs)
plot(p, main = "Distance=0.75*max_k1")
plot(p_kd1,coords, add=T)plot(p, main = "Distance=1*max_k1")
plot(p_kd2,coords, add=T)plot(p, main = "Distance=1.5*max_k1")
plot(p_kd3,coords, add=T)p_nbq_w<- nb2listw(w)
p_nbq_w$weights## [[1]]
## [1] 0.3333333 0.3333333 0.3333333
##
## [[2]]
## [1] 0.25 0.25 0.25 0.25
##
## [[3]]
## [1] 0.5 0.5
##
## [[4]]
## [1] 0.5 0.5
##
## [[5]]
## [1] 0.3333333 0.3333333 0.3333333
##
## attr(,"mode")
## [1] "binary"
## attr(,"W")
## [1] TRUE
## attr(,"comp")
## attr(,"comp")$d
## [1] 3 4 2 2 3
Binary weights
Row-standardised weights increase the influence of links from observations with few neighbours
Binary weights vary the influence of observations.
Those with many neighbours are upweighted compared to those with few
w_nbq_wb<-nb2listw(w, style="B")
w_nbq_wb$weights## [[1]]
## [1] 1 1 1
##
## [[2]]
## [1] 1 1 1 1
##
## [[3]]
## [1] 1 1
##
## [[4]]
## [1] 1 1
##
## [[5]]
## [1] 1 1 1
##
## attr(,"mode")
## [1] "binary"
## attr(,"B")
## [1] TRUE
Binary vs. row-standardized
- A binary weights matrix looks like:
## [,1] [,2] [,3] [,4]
## [1,] 0 1 0 0
## [2,] 0 0 1 1
## [3,] 1 1 0 0
## [4,] 0 1 1 1- A row-standardized matrix it looks like:
## [,1] [,2] [,3] [,4]
## [1,] 0.0 1.00 0.00 0.00
## [2,] 0.0 0.00 0.50 0.50
## [3,] 0.5 0.50 0.00 0.00
## [4,] 0.0 0.33 0.33 0.33## p_nbq_w<-nb2listw(p_nbq, zero.policy=T)Autokorelasi Spasial
library(spdep)
w <- poly2nb(p, row.names=p$Id)
xy <- coordinates(p)
class(w)## [1] "nb"
summary(w)## Neighbour list object:
## Number of regions: 5
## Number of nonzero links: 14
## Percentage nonzero weights: 56
## Average number of links: 2.8
## Link number distribution:
##
## 2 3 4
## 2 2 1
## 2 least connected regions:
## 2 3 with 2 links
## 1 most connected region:
## 1 with 4 links
str(w)## List of 5
## $ : int [1:3] 2 4 5
## $ : int [1:4] 1 3 4 5
## $ : int [1:2] 2 5
## $ : int [1:2] 1 2
## $ : int [1:3] 1 2 3
## - attr(*, "class")= chr "nb"
## - attr(*, "region.id")= chr [1:5] "0" "1" "2" "3" ...
## - attr(*, "call")= language poly2nb(pl = p, row.names = p$Id)
## - attr(*, "type")= chr "queen"
## - attr(*, "sym")= logi TRUE
plot(p, col='gray', border='blue', lwd=2)
plot(w, xy, col='red', lwd=2, add=TRUE)wm <- nb2mat(w, style='B')
wm## [,1] [,2] [,3] [,4] [,5]
## 0 0 1 0 1 1
## 1 1 0 1 1 1
## 2 0 1 0 0 1
## 3 1 1 0 0 0
## 4 1 1 1 0 0
## attr(,"call")
## nb2mat(neighbours = w, style = "B")
Menghitung Indeks Moran (1)
n <- length(p)
y <- p$value
ybar <- mean(y)
#####1st method
dy <- y - ybar
g <- expand.grid(dy, dy)
yiyj <- g[,1] * g[,2]
#####2nd method
yi <- rep(dy, each=n)
yj <- rep(dy)
yiyj <- yi * yj
pm <- matrix(yiyj, ncol=n)
pmw <- pm * wm
pmw## [,1] [,2] [,3] [,4] [,5]
## 0 0.00 -3.64 0.00 9.36 -3.64
## 1 -3.64 0.00 4.76 -5.04 1.96
## 2 0.00 4.76 0.00 0.00 4.76
## 3 9.36 -5.04 0.00 0.00 0.00
## 4 -3.64 1.96 4.76 0.00 0.00
## attr(,"call")
## nb2mat(neighbours = w, style = "B")
spmw <- sum(pmw)
spmw## [1] 17.04
smw <- sum(wm)
sw <- spmw / smw
vr <- n / sum(dy^2)
MI <- vr * sw
MI## [1] 0.1728896
EI <- -1/(n-1)
EI## [1] -0.25
Menghitung indeks moran(2)
#compute moran's using spdep function
ww <- nb2listw(w, style='B')
ww## Characteristics of weights list object:
## Neighbour list object:
## Number of regions: 5
## Number of nonzero links: 14
## Percentage nonzero weights: 56
## Average number of links: 2.8
##
## Weights style: B
## Weights constants summary:
## n nn S0 S1 S2
## B 5 25 14 28 168
#Note that
Szero(ww)## [1] 14
# is the same as
pmw## [,1] [,2] [,3] [,4] [,5]
## 0 0.00 -3.64 0.00 9.36 -3.64
## 1 -3.64 0.00 4.76 -5.04 1.96
## 2 0.00 4.76 0.00 0.00 4.76
## 3 9.36 -5.04 0.00 0.00 0.00
## 4 -3.64 1.96 4.76 0.00 0.00
## attr(,"call")
## nb2mat(neighbours = w, style = "B")
Moran’s I in R
moran.test(p$value, listw=ww, randomisation = F, alternative = 'less')##
## Moran I test under normality
##
## data: p$value
## weights: ww
##
## Moran I statistic standard deviate = 2.3372, p-value = 0.9903
## alternative hypothesis: less
## sample estimates:
## Moran I statistic Expectation Variance
## 0.1728896 -0.2500000 0.0327381
localmoran(p$value,ww)## Ii E.Ii Var.Ii Z.Ii Pr(z > 0)
## 0 0.2954545 -0.75 1.0456160 1.022396 0.15329681
## 1 -0.2784091 -1.00 0.4324638 1.097277 0.13626023
## 2 1.3522727 -0.50 1.1779227 1.706658 0.04394281
## 3 0.6136364 -0.50 1.1779227 1.026089 0.15242483
## 4 0.4375000 -0.75 1.0456160 1.161308 0.12275828
## attr(,"call")
## localmoran(x = p$value, listw = ww)
## attr(,"class")
## [1] "localmoran" "matrix" "array"
Moran Scatter Plot
n <- length(p)
ms <- cbind(id=rep(1:n, each=n), y=rep(y, each=n), value=as.vector(wm * y))
ms <- ms[ms[,3] > 0, ]
ams <- aggregate(ms[,2:3], list(ms[,1]), FUN=mean)
ams <- ams[,-1]
colnames(ams) <- c('y', 'spatially lagged y')
ams; plot(ams)## y spatially lagged y
## 1 10 7.666667
## 2 6 7.750000
## 3 4 6.000000
## 4 11 8.000000
## 5 6 6.666667
n <- length(p)
ms <- cbind(id=rep(1:n, each=n), y=rep(y, each=n), value=as.vector(wm * y))
ms <- ms[ms[,3] > 0, ]
ams <- aggregate(ms[,2:3], list(ms[,1]), FUN=mean)
ams <- ams[,-1]
colnames(ams) <- c('y', 'spatially lagged y')
ams; plot(ams)## y spatially lagged y
## 1 10 7.666667
## 2 6 7.750000
## 3 4 6.000000
## 4 11 8.000000
## 5 6 6.666667
reg <- lm(ams[,2] ~ ams[,1])
abline(reg, lwd=2)
abline(h=mean(ams[,2]), lt=2)
abline(v=mean(ams[,1]), lt=2)coefficients(reg)[2]## ams[, 1]
## 0.2315341
rwm <- mat2listw(wm, style='W')
# Checking if rows add up to 1
mat <- listw2mat(rwm)
apply(mat, 1, sum)## 0 1 2 3 4
## 1 1 1 1 1
moran.plot(y, rwm)Latihan
Input data berikut:
kemiskinan<-read.csv("http://bit.ly/dataKemiskinan",sep=',',header=T)Input data bobot berikut:
bobot<-read.csv("http://bit.ly/bobot_kemiskinan",sep=',',header=F)Hitung indeks moran data kemiskinan tsb!
Mengubah data bobot ke dalam bentuk matriks
bot<-as.matrix(bobot)Menghitung indeks moran global
w=mat2listw(bot)
moran(kemiskinan$Y, listw=w, n=112, S0=Szero(w))## $I
## [1] 0.5126348
##
## $K
## [1] 2.255458
Menghitung indeks moran lokal
localmoran(kemiskinan$Y, w)## Ii E.Ii Var.Ii Z.Ii Pr(z > 0)
## 1 2.9956708941 -0.009189189 0.16372181 7.426274243 5.584964e-14
## 2 2.9707599924 -0.009189189 0.16372181 7.364708918 8.876691e-14
## 3 3.0286701205 -0.009009009 0.24048871 6.194333091 2.926616e-10
## 4 2.6898081328 -0.009009009 0.19061406 6.181530200 3.174161e-10
## 5 2.2487947570 -0.009009009 0.19061406 5.171407114 1.161689e-07
## 6 0.4077235364 -0.009009009 0.09086477 1.382482564 8.341179e-02
## 7 0.1508819214 -0.009009009 0.19061406 0.366223632 3.570991e-01
## 8 -0.0074689774 -0.009369369 0.12525154 0.005369720 4.978578e-01
## 9 0.2856458209 -0.009369369 0.12525154 0.833590663 2.022558e-01
## 10 -0.0030022172 -0.009189189 0.16372181 0.015290612 4.939002e-01
## 11 -0.0346518471 -0.009009009 0.19061406 -0.058733871 5.234180e-01
## 12 -0.1802319799 -0.009189189 0.16372181 -0.422718741 6.637498e-01
## 13 0.1601571948 -0.009009009 0.19061406 0.387468266 3.492048e-01
## 14 0.4498618110 -0.009009009 0.19061406 1.051024832 1.466236e-01
## 15 0.1186974977 -0.008828829 0.12832333 0.355997574 3.609212e-01
## 16 -0.0008167697 -0.009189189 0.16372181 0.020691773 4.917458e-01
## 17 0.2225974507 -0.009009009 0.24048871 0.472284101 3.183620e-01
## 18 0.0011698851 -0.009189189 0.16372181 0.025601633 4.897875e-01
## 19 0.1465644602 -0.009009009 0.24048871 0.317240185 3.755307e-01
## 20 0.1045004982 -0.009189189 0.16372181 0.280975078 3.893648e-01
## 21 1.4694655705 -0.009189189 0.16372181 3.654378391 1.289029e-04
## 22 -0.1958580510 -0.009189189 0.16372181 -0.461337341 6.777217e-01
## 23 0.6110886565 -0.009189189 0.16372181 1.532967679 6.264191e-02
## 24 0.2206918852 -0.009009009 0.98860840 0.231020509 4.086494e-01
## 25 0.8554215765 -0.008918919 0.31717322 1.534746347 6.242307e-02
## 26 -0.1306815410 -0.009009009 0.98860840 -0.122371532 5.486976e-01
## 27 1.9007112256 -0.009009009 0.24048871 3.894237258 4.925410e-05
## 28 2.4294942847 -0.009189189 0.16372181 6.027013494 8.350853e-10
## 29 0.7520034876 -0.008918919 0.31717322 1.351114393 8.832940e-02
## 30 -0.4339517318 -0.009009009 0.48986194 -0.607146573 7.281232e-01
## 31 -0.2248337606 -0.009009009 0.48986194 -0.308364519 6.210975e-01
## 32 0.5811260804 -0.009189189 0.16372181 1.458917540 7.229390e-02
## 33 1.4762726887 -0.008828829 0.12832333 4.145752115 1.693500e-05
## 34 1.9930121664 -0.009009009 0.24048871 4.082454231 2.228130e-05
## 35 1.6651891010 -0.009189189 0.16372181 4.138093630 1.751017e-05
## 36 2.2053703930 -0.009009009 0.19061406 5.071945385 1.968847e-07
## 37 0.6619985010 -0.009009009 0.24048871 1.368295941 8.560973e-02
## 38 1.2890903248 -0.008828829 0.12832333 3.623221048 1.454785e-04
## 39 0.0737409545 -0.009369369 0.12525154 0.234835332 4.071683e-01
## 40 0.0449553785 -0.008918919 0.09991926 0.170434310 4.323343e-01
## 41 0.2492968996 -0.009009009 0.24048871 0.526728719 2.991910e-01
## 42 -0.3176614306 -0.009189189 0.16372181 -0.762364768 7.770788e-01
## 43 0.3441276444 -0.009189189 0.16372181 0.873194634 1.912785e-01
## 44 0.0247764717 -0.008828829 0.12832333 0.093811261 4.626295e-01
## 45 0.4090979976 -0.009009009 0.24048871 0.852589743 1.969434e-01
## 46 0.2627661390 -0.009369369 0.12525154 0.768942165 2.209638e-01
## 47 0.6223833197 -0.009189189 0.16372181 1.560881547 5.927585e-02
## 48 1.5505834943 -0.008918919 0.31717322 2.769094639 2.810615e-03
## 49 0.1658372041 -0.009009009 0.19061406 0.400478095 3.444022e-01
## 50 -0.1405077380 -0.009009009 0.24048871 -0.268147784 6.057072e-01
## 51 -0.1937253572 -0.008918919 0.31717322 -0.328147307 6.285999e-01
## 52 -0.4282223407 -0.009009009 0.19061406 -0.960190978 8.315204e-01
## 53 -0.0181784492 -0.009369369 0.12525154 -0.024890809 5.099290e-01
## 54 0.0847981667 -0.009009009 0.24048871 0.191288437 4.241498e-01
## 55 0.0420767291 -0.009009009 0.19061406 0.117009792 4.534261e-01
## 56 0.3000974561 -0.009009009 0.19061406 0.707995707 2.394740e-01
## 57 0.5317402982 -0.009009009 0.19061406 1.238564155 1.077535e-01
## 58 1.2361795596 -0.009009009 0.24048871 2.539146640 5.556162e-03
## 59 0.0745403774 -0.009009009 0.24048871 0.170371098 4.323592e-01
## 60 0.7242946555 -0.009189189 0.16372181 1.812747360 3.493540e-02
## 61 -0.1195124323 -0.009009009 0.98860840 -0.111138258 5.442466e-01
## 62 -0.0054993854 -0.008918919 0.31717322 0.006071816 4.975777e-01
## 63 0.6068677056 -0.009009009 0.98860840 0.619414882 2.678215e-01
## 64 -0.3399943206 -0.008918919 0.31717322 -0.587866432 7.216890e-01
## 65 -0.4105879683 -0.009009009 0.48986194 -0.573765065 7.169366e-01
## 66 -0.5194137447 -0.009009009 0.48986194 -0.729252367 7.670763e-01
## 67 0.3164305172 -0.009009009 0.24048871 0.663625334 2.534651e-01
## 68 0.2900569137 -0.009009009 0.24048871 0.609845169 2.709822e-01
## 69 0.4646988766 -0.009009009 0.19061406 1.085008523 1.389589e-01
## 70 -0.2903809828 -0.008828829 0.12832333 -0.785970133 7.840575e-01
## 71 -0.0218161927 -0.009009009 0.48986194 -0.018298555 5.072997e-01
## 72 0.5608175133 -0.008918919 0.31717322 1.011639409 1.558552e-01
## 73 0.1090982758 -0.009369369 0.12525154 0.334740468 3.689104e-01
## 74 0.1877661132 -0.008918919 0.31717322 0.349239259 3.634548e-01
## 75 -0.1872119459 -0.009009009 0.19061406 -0.408166533 6.584243e-01
## 76 0.0212454393 -0.009009009 0.24048871 0.061693853 4.754033e-01
## 77 -0.0001770613 -0.008828829 0.12832333 0.024151940 4.903657e-01
## 78 -0.0027028916 -0.008918919 0.09991926 0.019664745 4.921554e-01
## 79 0.3437759205 -0.009009009 0.24048871 0.719387160 2.359512e-01
## 80 0.3439685845 -0.009009009 0.24048871 0.719780034 2.358302e-01
## 81 -0.1245813948 -0.008918919 0.31717322 -0.205373419 5.813598e-01
## 82 0.7696778122 -0.009009009 0.24048871 1.587871970 5.615765e-02
## 83 0.4415414333 -0.008918919 0.31717322 0.799849578 2.118990e-01
## 84 1.2614533459 -0.008828829 0.12832333 3.546070724 1.955106e-04
## 85 -0.0263476656 -0.008828829 0.12832333 -0.048904909 5.195025e-01
## 86 -0.0559779114 -0.009009009 0.24048871 -0.095777406 5.381513e-01
## 87 0.0091481408 -0.009369369 0.12525154 0.052322809 4.791357e-01
## 88 0.1137183674 -0.009189189 0.16372181 0.303756314 3.806568e-01
## 89 0.3349008822 -0.008828829 0.12832333 0.959542604 1.686427e-01
## 90 0.1568435456 -0.009189189 0.16372181 0.410336784 3.407795e-01
## 91 0.0334094342 -0.009009009 0.19061406 0.097157708 4.613006e-01
## 92 0.4879521558 -0.008828829 0.12832333 1.386794636 8.275220e-02
## 93 1.0176921264 -0.008828829 0.12832333 2.865596305 2.081123e-03
## 94 1.9433857006 -0.009009009 0.24048871 3.981257612 3.427580e-05
## 95 0.8821671049 -0.009189189 0.16372181 2.202916644 1.380031e-02
## 96 -0.2602806958 -0.009009009 0.24048871 -0.512384771 6.958091e-01
## 97 -0.2966078581 -0.009009009 0.98860840 -0.289251084 6.138054e-01
## 98 0.2614821187 -0.009009009 0.48986194 0.386470346 3.495742e-01
## 99 0.0275611608 -0.009009009 0.98860840 0.036780263 4.853301e-01
## 100 2.7966498232 -0.009009009 0.98860840 2.821777142 2.387918e-03
## 101 -0.2636636845 -0.009009009 0.98860840 -0.256117648 6.010700e-01
## 102 0.0912964930 -0.009009009 0.98860840 0.100881750 4.598222e-01
## 103 -0.6503093806 -0.009009009 0.98860840 -0.644984596 7.405314e-01
## 104 0.0550347574 -0.009009009 0.48986194 0.091503987 4.635461e-01
## 105 -0.1422312884 -0.008918919 0.31717322 -0.236713047 5.935603e-01
## 106 0.3896745865 -0.008918919 0.31717322 0.707753402 2.395492e-01
## 107 0.3942031890 -0.009189189 0.16372181 0.996952385 1.593938e-01
## 108 1.6281809136 -0.009369369 0.12525154 4.627038444 1.854658e-06
## 109 1.3019775314 -0.009009009 0.19061406 3.002761012 1.337712e-03
## 110 2.0290430716 -0.008918919 0.31717322 3.618660398 1.480660e-04
## 111 2.0694222962 -0.009009009 0.48986194 2.969605966 1.490910e-03
## 112 1.0569918900 -0.009009009 0.24048871 2.173753172 1.486184e-02
## attr(,"call")
## localmoran(x = kemiskinan$Y, listw = w)
## attr(,"class")
## [1] "localmoran" "matrix" "array"
Membuat Moran scatter plot
moran.plot(kemiskinan$Y, mat2listw(bot,style='W'),
labels=kemiskinan$Nama.Kabupaten)Kriging dan CoKriging (Responsi Pertemuan 6)
Semivariance
Regionalized variable theory uses a related property called the semivariance to express the degree of relationship between points on a surface.
The semivariance is simply half the variance of the differences between all possible points spaced a constant distance apart.
Semivariance is a measure of the degree of spatial dependence between samples (elevation)
##Meuse River Dataset
library(sp)data(meuse)
class(meuse)## [1] "data.frame"
coordinates(meuse) <- c("x", "y")
class(meuse)## [1] "SpatialPointsDataFrame"
## attr(,"package")
## [1] "sp"
str(meuse)## Formal class 'SpatialPointsDataFrame' [package "sp"] with 5 slots
## ..@ data :'data.frame': 155 obs. of 12 variables:
## .. ..$ cadmium: num [1:155] 11.7 8.6 6.5 2.6 2.8 3 3.2 2.8 2.4 1.6 ...
## .. ..$ copper : num [1:155] 85 81 68 81 48 61 31 29 37 24 ...
## .. ..$ lead : num [1:155] 299 277 199 116 117 137 132 150 133 80 ...
## .. ..$ zinc : num [1:155] 1022 1141 640 257 269 ...
## .. ..$ elev : num [1:155] 7.91 6.98 7.8 7.66 7.48 ...
## .. ..$ dist : num [1:155] 0.00136 0.01222 0.10303 0.19009 0.27709 ...
## .. ..$ om : num [1:155] 13.6 14 13 8 8.7 7.8 9.2 9.5 10.6 6.3 ...
## .. ..$ ffreq : Factor w/ 3 levels "1","2","3": 1 1 1 1 1 1 1 1 1 1 ...
## .. ..$ soil : Factor w/ 3 levels "1","2","3": 1 1 1 2 2 2 2 1 1 2 ...
## .. ..$ lime : Factor w/ 2 levels "0","1": 2 2 2 1 1 1 1 1 1 1 ...
## .. ..$ landuse: Factor w/ 15 levels "Aa","Ab","Ag",..: 4 4 4 11 4 11 4 2 2 15 ...
## .. ..$ dist.m : num [1:155] 50 30 150 270 380 470 240 120 240 420 ...
## ..@ coords.nrs : int [1:2] 1 2
## ..@ coords : num [1:155, 1:2] 181072 181025 181165 181298 181307 ...
## .. ..- attr(*, "dimnames")=List of 2
## .. .. ..$ : chr [1:155] "1" "2" "3" "4" ...
## .. .. ..$ : chr [1:2] "x" "y"
## ..@ bbox : num [1:2, 1:2] 178605 329714 181390 333611
## .. ..- attr(*, "dimnames")=List of 2
## .. .. ..$ : chr [1:2] "x" "y"
## .. .. ..$ : chr [1:2] "min" "max"
## ..@ proj4string:Formal class 'CRS' [package "sp"] with 1 slot
## .. .. ..@ projargs: chr NA
plot(meuse, asp = 1, pch = 1)
data(meuse.riv)
lines(meuse.riv)Display a postplot of the untransformed Zn values, that is, plot the sample locations (as above) and represent the data value by the size of the symbol.
plot(meuse, asp = 1,
cex = 4*meuse$zinc/max(meuse$zinc),
pch = 1)
lines(meuse.riv)Compute the number of point-pairs
meuse$logZn<-log10(meuse$zinc)
n <- length(meuse$logZn)
n * (n - 1)/2## [1] 11935
Compute the distance and semivariance between the first two points in the data set
dim(coordinates(meuse))## [1] 155 2
coordinates(meuse)[1, ]## x y
## 181072 333611
coordinates(meuse)[2, ]## x y
## 181025 333558
(sep <- dist(coordinates(meuse)[1:2, ]))## 1
## 2 70.83784
(gamma <- 0.5 * (meuse$logZn[1] - meuse$logZn[2])^2)## [1] 0.001144082
Plot the experimental variogram of the log-Zn concentrations
library(gstat)
v <- variogram(logZn ~ 1, meuse, cutoff = 1300, width = 90)Plot the experimental variogram of the log-Zn concentrations
print(plot(v, plot.numbers = T))The trend in decreasing semivariance with decreasing separation seems to intersect the yaxis (i.e., at 0 separation) at about 0.01 log(mg kg-1)2; this is the nugget.
Variogram
The semivariance at a distance d = 0 should be zero, because there are no differences between points that are compared to themselves. However, as points are compared to increasingly distant points, the semivariance increases.
Display the variogram model forms which can be used in gstat
print(show.vgms())Fit a spherical variogram model
vm <- vgm(psill = 0.12,
model = "Sph",
range = 850,
nugget = 0.01)
print(plot(v, pl = T,
model = vm))Adjust the model with gstat automatic fit
vm## model psill range
## 1 Nug 0.01 0
## 2 Sph 0.12 850
vmf = fit.variogram(v,vm)
print(plot(v, pl = T, model = vmf))Ordinary Kriging
The theory of regionalised variables leads to an ”optimal“ prediction method, in the sense that the kriging variance is minimized.
What is so special about kriging?
Predicts at any point as the weighted average of the values at sampled points
Weights given to each sample point are optimal, given the spatial covariance structure as revealed by the variogram model (in this sense it is “best”)
The kriging variance at each point is automatically generated as part of the process of computing the weights
Ordinary kriging on a regular grid
Load the 40 m x 40 m interpolation grid covering the sample area and convert it to a spatial object
data(meuse.grid)
coordinates(meuse.grid) <- c("x", "y")
gridded(meuse.grid) <- TPredict the attribute value at all grid points using Ordinary Kriging
k40 <- krige(logZn ~ 1, locations = meuse, newdata = meuse.grid, model = vmf)## [using ordinary kriging]
Display the structure of the kriging prediction object
str(k40)## Formal class 'SpatialPixelsDataFrame' [package "sp"] with 7 slots
## ..@ data :'data.frame': 3103 obs. of 2 variables:
## .. ..$ var1.pred: num [1:3103] 2.83 2.88 2.83 2.78 2.94 ...
## .. ..$ var1.var : num [1:3103] 0.0596 0.0471 0.0509 0.055 0.0335 ...
## ..@ coords.nrs : int [1:2] 1 2
## ..@ grid :Formal class 'GridTopology' [package "sp"] with 3 slots
## .. .. ..@ cellcentre.offset: Named num [1:2] 178460 329620
## .. .. .. ..- attr(*, "names")= chr [1:2] "x" "y"
## .. .. ..@ cellsize : Named num [1:2] 40 40
## .. .. .. ..- attr(*, "names")= chr [1:2] "x" "y"
## .. .. ..@ cells.dim : Named int [1:2] 78 104
## .. .. .. ..- attr(*, "names")= chr [1:2] "x" "y"
## ..@ grid.index : int [1:3103] 69 146 147 148 223 224 225 226 300 301 ...
## ..@ coords : num [1:3103, 1:2] 181180 181140 181180 181220 181100 ...
## .. ..- attr(*, "dimnames")=List of 2
## .. .. ..$ : chr [1:3103] "1" "2" "3" "4" ...
## .. .. ..$ : chr [1:2] "x" "y"
## ..@ bbox : num [1:2, 1:2] 178440 329600 181560 333760
## .. ..- attr(*, "dimnames")=List of 2
## .. .. ..$ : chr [1:2] "x" "y"
## .. .. ..$ : chr [1:2] "min" "max"
## ..@ proj4string:Formal class 'CRS' [package "sp"] with 1 slot
## .. .. ..@ projargs: chr NA
Display the map of predicted values
print(spplot(k40, "var1.pred",
asp=1,
main="OK prediction, log-ppm Zn"))Display the map of kriging prediction variances
print(spplot(k40, "var1.var",
col.regions=cm.colors(64),
asp=1,
main="OK prediction variance,
log-ppm Zn^2"))Describe the variances map:
Where is the prediction variance lowest?
Does this depend on the data value?
Show the post-plot: value proportional to circle size
pts.s <- list("sp.points", meuse, col="white",
pch=1, cex=4*meuse$zinc/max(meuse$zinc))
print(spplot(k40, "var1.pred", asp=1,
col.regions=bpy.colors(64),
main="OK prediction, log-ppm Zn",
sp.layout = list(pts.s)))Show the observation locations on the kriging prediction variance
pts.s <- list("sp.points", meuse, col="black",
pch=20)
print(spplot(k40, zcol="var1.var",
col.regions=cm.colors(64),
asp=1,main="OK prediction variance,
log-ppm Zn^2",sp.layout = list(pts.s)))Evaluating the model
zn.okcv<-krige.cv(log(zinc)~1, meuse, meuse.grid, model = vmf)
RMSE.ok<-sqrt(sum(zn.okcv$residual^2)/length(zn.okcv$residual))
RMSE.ok## [1] 0.3973315
Cokriging
Co-kriging allows samples of an auxiliary variable (also called the co-variable), besides the target value of interest, to be used when predicting the target value at unsampled locations. The co-variable may be measured at the same points as the target (co-located samples), at other points, or both.
The most common application of co-kriging is when the co-variable is cheaper to measure, and so has been more densely sampled, than the target variable.
Eksplorasi Data
data(meuse.riv)# outline of the river
meuse.lst <- list(Polygons(list(Polygon(meuse.riv)), "meuse.riv"))
meuse.sr <- SpatialPolygons(meuse.lst)
image(meuse.grid["dist"])# one of the variables in meuse.grid
plot(meuse.sr, add=TRUE)
title("distance to river")zn.lm <- lm(log10(zinc) ~ sqrt(dist), meuse)
meuse$fitted.s <- predict(zn.lm,meuse)
-mean(predict(zn.lm,meuse))## [1] -2.55616
meuse$residuals <- residuals(zn.lm)
spplot(meuse,c("fitted.s","residuals"))Generate a new kriged surface with the covariate - distance to river
vm.ck <- variogram(log10(zinc)~sqrt(dist), meuse)
plot(vm.ck, plot.numbers = TRUE, pch = "+") Secara visual, berapakah nilai sill, range, dan nugget berdasarkan variogram di samping?
m.ck <- fit.variogram(vm.ck,vgm(.03,
"Sph", 800,.015))
plot(vm.ck, model=m.ck)ko.ck <- krige(log(zinc)~ sqrt(dist), meuse,
meuse.grid,model=m.ck)## [using universal kriging]
pts <- list("sp.points",meuse,pch=3,col="black")
meuse.layout <- list(pts)
spplot(ko.ck["var1.pred"], sp.layout=meuse.layout, main="cokriging predictions-Zn/distance river ")ko.ck$sek <- sqrt(ko.ck$var1.var)
spplot(ko.ck,zcol='sek',
sp.layout=meuse.layout, main =
"co-kriging se-Zn(covariate)")summary(ko.ck)## Object of class SpatialPixelsDataFrame
## Coordinates:
## min max
## x 178440 181560
## y 329600 333760
## Is projected: NA
## proj4string : [NA]
## Number of points: 3103
## Grid attributes:
## cellcentre.offset cellsize cells.dim
## x 178460 40 78
## y 329620 40 104
## Data attributes:
## var1.pred var1.var sek
## Min. :4.455 Min. :0.01903 Min. :0.1379
## 1st Qu.:5.212 1st Qu.:0.02187 1st Qu.:0.1479
## Median :5.590 Median :0.02322 Median :0.1524
## Mean :5.702 Mean :0.02450 Mean :0.1561
## 3rd Qu.:6.134 3rd Qu.:0.02595 3rd Qu.:0.1611
## Max. :7.477 Max. :0.03983 Max. :0.1996
Evaluasi
zn.ckcv<-krige.cv(log(zinc)~ sqrt(dist), meuse,meuse.grid,model=m.ck)
RMSE.ck<-sqrt(sum(zn.ckcv$residual^2)/length(zn.ckcv$residual))
RMSE.ck## [1] 0.3752662
Back Transform
ko.ck$predt <- 10^(ko.ck$var1.pred)
spplot(ko.ck["predt"], sp.layout=meuse.layout,
main = "Co-Kriging predictions-Meuse
zinc log/backtrans(Zn)")Target Variable
We select lead (abbreviation “Pb”) as the target variable, i.e. the one we want to map. This metal is a serious human health hazard.
It can be inhaled as dust from disturbed soil or taken up by plants and ingested.
The critical value for Pb in agricultural soils, according to the Berlin Digital Environmental Atlas2, is 600 mg kg-1 for agricultural fields: above this level grain crops can not be grown for human consumption.
Selecting the co-variables
Candidates for co-variables must have:
a feature-space correlation with the target variable;
a spatial structure (i.e. be modelled as a regional variable);
a spatial co-variance with the target variable.
Two main ways to select a co-variable:
theoretically, from knowledge of the spatial process that caused the observed spatial (co-)distribution;
empirically, by examining the feature-space correlations (scatterplots) and then the spatial co-variance (cross-correlograms or crossvariograms).
Kandidat Peubah Penjelas
- organic matter content (OM)
- zinc content (Zn)
Exercise 1
Lakukan prediksi Pb dengan menyertakan co-variables (OM atau Zn), diskusikan hasil yang Anda peroleh.
dataset: meuse
target variabel: lead (abbreviation “Pb”)
Kandidat peubah penjelas: organic matter content (OM) dan zinc content (Zn)
Membuat Variogram
library(gstat)
v <- variogram(log10(lead)~1,meuse)
vm <- vgm(model="Wav",nugget=0.05,range=700)
vmf <- fit.variogram(v,vm)
plot(v,pl=T,model=vmf,col="red",pch=19) Ordinary Kriging
library(RColorBrewer)data(meuse.grid)
coordinates(meuse.grid) <- c("x","y")
gridded(meuse.grid) <- T
k40 <- krige(log10(lead)~1,locations=meuse,newdata=meuse.grid,model=vmf) ## [using ordinary kriging]
pts.s <- list("sp.points",meuse,col="coral",pch=19,cex=4*meuse$lead/max(meuse$lead))
spplot(k40,"var1.pred",asp=1,cuts=8,col.regions=brewer.pal(n=9,name="Greens"),main="Ordinary kriging (log10(Pb))",sp.layout=list(pts.s)) Menampilkan Peta Penduga Varians Kriging
pts.s <- list("sp.points",meuse,col="red",pch=19)
spplot(k40,zcol="var1.var",col.regions=cm.colors(64),asp=1,
main="Ordinary kriging variance (log10(Pb))",sp.layout=list(pts.s)) Evaluasi Model
Pb.okcv<-krige.cv(log10(lead)~1,meuse,meuse.grid,model=vmf)
(RMSE.ok<-sqrt(sum(Pb.okcv$residual^2)/length(Pb.okcv$residual))) ## [1] 0.1789486
Co-Kriging
Eksplorasi Data
par(mfrow=c(2,2))
plot(lead ~ om, meuse)
abline(lm(lead ~ om, meuse),col=2)
plot(lead ~ zinc, meuse)
abline(lm(lead ~ zinc, meuse),col=2)
plot(log10(lead) ~ log10(om), meuse)
abline(lm(log10(lead) ~ log10(om), meuse),col=2)
plot(log10(lead) ~ log10(zinc), meuse)
abline(lm(log10(lead) ~ log10(zinc), meuse),col=2) Berdasarkan plot antara log10(Lead) dengan log10(Zinc) dan log10(Lead) dengan log10(Om), terlihat bahwa log10(Lead) memiliki hubungan yang erat dengan log10(Zinc) (titik-titiknya menyebar di sekitar garis linier).Hal yang sama juga terlihat pada data awal (tidak di log10). Dengan demikian, Zinc akan digunakan sebagai kovariat dalam model Co-Kriging untuk memprediksi Lead (Pb).
Regresi Antara Lead (Pb) dan Zinc
Pb.lm <- lm(log10(lead) ~ log10(zinc), meuse)
meuse$fitted.s <- predict(Pb.lm,meuse)-mean(predict(Pb.lm,meuse))
meuse$residuals <- residuals(Pb.lm)
spplot(meuse,c("fitted.s","residuals"),cuts=8,col.regions=brewer.pal(n=9,name="Reds")) Variogram untuk Co-Kriging
vm.ck <- variogram(log10(lead)~log10(zinc), meuse)
m.ck <- fit.variogram(vm.ck,vgm("Sph"))
plot(vm.ck, model=m.ck, col="red",pch=19) Interpolasi dengan Co-Kriging
data(meuse.grid)
coordinates(meuse.grid) <- c("x","y")
gridded(meuse.grid) <- T
k40 <- krige(zinc~1,locations=meuse,newdata=meuse.grid,model=vmf) #prediksi utk zinc## [using ordinary kriging]
meuse.grid$zinc <- k40$var1.pred #menambahkan prediksi zinc ke meuse.grid
ko.ck <- krige(log10(lead)~log10(zinc),meuse,meuse.grid,model=m.ck) ## [using universal kriging]
pts <- list("sp.points",meuse,pch=19,col="coral")
meuse.layout <- list(pts)
spplot(ko.ck["var1.pred"],sp.layout=meuse.layout,main="Co-kriging predictions-log10(Pb)",cuts=8,
col.regions=brewer.pal(n=9,name="Greens")) Peta Prediksi Standard Error
ko.ck$sek <- sqrt(ko.ck$var1.var)
spplot(ko.ck,zcol='sek',asp=1,sp.layout=meuse.layout,main="Co-kriging se-log10(Pb)",
col.regions=cm.colors(64)) Evaluasi Model Co-Kriging
Pb.ckcv<-krige.cv(log10(lead)~log10(zinc),meuse,meuse.grid,model=m.ck)
(RMSE.ck<-sqrt(sum(Pb.ckcv$residual^2)/length(Pb.ckcv$residual)))## [1] 0.05967417
Kesimpulan
Model Co-Kriging menghasilkan RMSE sebesar 0.05967417. Nilai ini lebih kecil daripada RMSE model Kriging sebesar 0.1789486, sehingga disimpulkan bahwa prediksi kandungan Pb dengan model Co-Kriging lebih baik dibandingkan model Kriging. Model Co-Kriging memanfaatkan informasi peubah lain yaitu Zinc sehingga prediksinya lebih akurat.
Back Transform Model Terbaik (Co-Kriging)
ko.ck$predict <- 10^(ko.ck$var1.pred)
spplot(ko.ck["predict"], sp.layout=meuse.layout,
main = "Co-Kriging predictions-Pb")Exercise 2
Perhatikan data yg tersedia pada alamat berikut: (https://raw.githubusercontent.com/raoy/Spatial-Statistics/master/database_Nickel.csv)
Keterangan variabel:
XCOLLAR: Longitude
YCOLLAR: Latitude
ZCOLLAR: Kedalaman
Ni: Kandungan Nikel (target variable)
Coba lakukan interpolasi dengan beberapa metode yang telah Anda pelajari, bandingkan hasilnya, manakah interpolasi yang paling baik menurut Anda?
Input Data
Nickel <- read.csv("https://raw.githubusercontent.com//raoy//Spatial-Statistics//master//database_Nickel.csv")library(MASS)
library(raster)
library(gstat)
library(sp)
library(RColorBrewer)str(Nickel)## 'data.frame': 2235 obs. of 4 variables:
## $ XCOLLAR: num 408226 408226 408226 408226 408226 ...
## $ YCOLLAR: num 87324 87324 87324 87324 87324 ...
## $ ZCOLLAR: num 496 495 494 493 492 ...
## $ Ni : num 1.19 1.75 0.36 2.56 2.51 2.02 1.31 1.25 1.23 0.91 ...
head(Nickel)## XCOLLAR YCOLLAR ZCOLLAR Ni
## 1 408225.8 87323.75 495.57 1.19
## 2 408225.8 87323.75 494.57 1.75
## 3 408225.8 87323.75 493.57 0.36
## 4 408225.8 87323.75 492.57 2.56
## 5 408225.8 87323.75 491.57 2.51
## 6 408225.8 87323.75 490.57 2.02
dim(Nickel)## [1] 2235 4
Data terdiri dari 2235 observasi dan 4 variabel.
Menghilangkan Duplikat Data
Karena terdapat duplikat data koordinat X dan Y, maka akan dilakukan remove duplikat koordinat ini terlebih dahulu.
Nickel <- Nickel[!duplicated(Nickel[,c("XCOLLAR","YCOLLAR")]),]
Nickel ## XCOLLAR YCOLLAR ZCOLLAR Ni
## 1 408225.8 87323.75 495.57 1.19
## 24 408225.7 87350.05 488.82 0.90
## 50 408224.9 87375.23 492.29 1.16
## 76 408225.5 87149.85 525.24 0.61
## 101 408224.8 87174.98 521.44 0.65
## 126 408224.9 87199.93 519.52 0.54
## 144 408225.1 87226.41 517.23 0.44
## 169 408225.4 87249.54 514.23 0.74
## 189 408225.1 87274.76 510.40 0.94
## 214 408224.9 87300.22 505.49 0.97
## 238 408251.6 87328.22 498.23 1.14
## 252 408249.9 87349.94 496.46 1.13
## 277 408250.7 87374.72 495.73 1.19
## 294 408250.5 87149.92 532.67 0.87
## 315 408247.6 87175.29 529.59 0.67
## 340 408249.8 87199.74 524.91 0.66
## 365 408249.6 87224.16 520.30 0.74
## 377 408250.8 87249.86 515.39 0.94
## 397 408253.2 87276.84 508.21 0.84
## 412 408250.1 87300.19 501.78 0.87
## 433 408275.7 87325.38 505.91 1.11
## 455 408275.2 87350.01 505.22 1.05
## 481 408274.7 87375.24 500.21 0.88
## 497 408275.8 87156.18 538.27 1.35
## 515 408274.6 87175.10 533.32 1.05
## 540 408274.8 87199.87 529.59 0.79
## 565 408274.4 87224.48 524.63 1.02
## 589 408275.0 87249.33 520.31 1.37
## 610 408275.2 87275.05 509.97 0.62
## 635 408274.4 87300.14 510.31 0.90
## 660 408304.0 87324.93 511.26 0.82
## 672 408300.6 87350.02 509.78 1.04
## 698 408298.9 87375.76 505.35 0.94
## 711 408300.2 87149.90 537.43 0.92
## 737 408297.9 87174.17 535.14 0.84
## 762 408300.0 87199.98 529.94 0.92
## 787 408298.9 87229.40 523.10 0.77
## 811 408300.6 87249.22 517.87 0.73
## 837 408299.6 87274.50 518.90 0.83
## 854 408300.2 87300.13 518.41 0.83
## 869 408325.2 87324.81 521.35 1.93
## 894 408325.4 87350.00 517.85 0.83
## 916 408324.8 87375.47 512.35 0.95
## 934 408325.2 87149.69 532.23 0.79
## 959 408325.0 87175.17 532.17 1.18
## 985 408325.5 87199.88 527.30 1.02
## 1011 408325.0 87225.00 524.69 0.75
## 1036 408324.8 87249.37 524.57 0.87
## 1061 408325.1 87274.68 525.49 0.65
## 1087 408324.6 87300.14 525.03 0.60
## 1113 408350.7 87325.83 526.26 0.67
## 1128 408350.3 87350.06 520.86 0.99
## 1148 408346.8 87376.31 512.27 0.96
## 1164 408350.5 87149.55 525.05 1.03
## 1189 408349.8 87176.18 526.79 0.92
## 1206 408349.6 87200.03 522.19 0.49
## 1231 408349.6 87223.75 527.07 0.33
## 1256 408349.3 87249.17 528.49 1.14
## 1281 408350.5 87275.27 529.87 0.45
## 1301 408349.3 87300.30 528.98 0.47
## 1326 408374.9 87325.04 528.73 0.74
## 1351 408375.3 87350.10 519.97 0.77
## 1377 408374.3 87375.51 514.47 1.21
## 1395 408375.0 87149.51 522.49 1.28
## 1420 408375.0 87175.37 519.89 1.03
## 1445 408374.4 87199.60 517.88 0.82
## 1470 408374.9 87225.00 524.91 0.93
## 1495 408375.0 87249.09 528.26 0.69
## 1512 408375.1 87275.04 530.77 0.83
## 1537 408375.0 87300.12 531.38 1.30
## 1553 408400.8 87327.05 530.11 0.55
## 1575 408399.5 87350.15 526.14 0.87
## 1600 408400.0 87377.16 519.90 0.65
## 1619 408399.7 87149.52 516.96 0.85
## 1644 408398.0 87176.48 512.97 2.66
## 1669 408398.6 87199.88 517.62 0.93
## 1694 408399.4 87225.96 522.82 1.04
## 1719 408400.0 87249.02 525.43 0.77
## 1744 408402.9 87276.28 529.38 0.96
## 1766 408400.2 87299.96 531.97 1.00
## 1784 408425.5 87324.96 533.12 0.79
## 1809 408425.4 87350.24 530.71 0.78
## 1828 408423.9 87375.02 525.02 0.66
## 1853 408424.9 87149.30 509.19 1.20
## 1879 408425.2 87175.51 506.70 0.87
## 1905 408425.0 87199.96 513.17 0.78
## 1930 408425.3 87224.99 519.44 0.86
## 1950 408424.9 87248.76 521.30 1.17
## 1969 408424.3 87274.86 526.17 0.87
## 1986 408424.7 87299.96 530.76 1.08
## 2007 408450.5 87324.14 533.97 0.96
## 2031 408450.0 87350.28 535.80 1.18
## 2054 408449.2 87376.84 528.12 0.99
## 2075 408450.2 87149.42 507.87 0.82
## 2100 408447.6 87174.37 511.08 1.26
## 2125 408450.1 87199.98 508.94 0.92
## 2144 408452.1 87224.64 512.06 0.84
## 2169 408449.6 87248.82 515.80 0.90
## 2194 408449.6 87273.68 520.76 1.17
## 2211 408449.9 87299.79 528.43 0.85
Mengubah Dataframe menjadi Spasial Point
Selanjutnya mendefinisikan koordinat pada data Nickel.
coordinates(Nickel)<-~XCOLLAR+YCOLLAR
plot(Nickel)1st-Order Trend Surface
Tahapan selanjutnya adalah mempersiapkan grid untuk interpolasi.
x=seq(408224,408453, by=10)
y=seq(87149,87378, by=10)
Nickel.grid <- expand.grid(x=x,y=y)
gridded(Nickel.grid) <- ~x+y
plot(Nickel.grid) Selanjutnya akan dilakukan 1st-order trend surface.
x1N <- gstat(formula=Nickel$Ni ~ 1, data=Nickel, degree=1)
ktN <- predict(x1N, newdata=Nickel.grid)## [ordinary or weighted least squares prediction]
spplot(ktN[1], contour=TRUE,main="Nickel trend surface (1)-gstat", cuts=8,
col.regions=brewer.pal(n=9,name="Reds"))Evaluasi Model
pred1 <- predict(x1N, newdata=Nickel)## [ordinary or weighted least squares prediction]
sqrt(sum((Nickel@data$Ni-pred1$var1.pred)^2)/length(Nickel@data$Ni))## [1] 0.2901443
2nd-Order Trend Surface
Berikut ini adalah contoh syntax untuk melakukan 2nd-order trend surface.
x2N <- gstat(formula=Nickel$Ni ~ 1, data=Nickel, degree=2)
kt2N <- predict(x2N, newdata=Nickel.grid)## [ordinary or weighted least squares prediction]
spplot(kt2N[1], contour=TRUE, main="Nickel trend surface (2)-gstat", cuts=8,
col.regions=brewer.pal(n=9,name="Reds"))Evaluasi Model
pred2 <- predict(x2N, newdata=Nickel)## [ordinary or weighted least squares prediction]
sqrt(sum((Nickel@data$Ni-pred2$var1.pred)^2)/length(Nickel@data$Ni))## [1] 0.274336
Inverse Distance Weighting (IDW)
krigN <- krige(Nickel$Ni ~ 1, Nickel, Nickel.grid, NULL)## [inverse distance weighted interpolation]
spplot(krigN["var1.pred"], main = "Nickel IDW",contour=TRUE, cuts=8,
col.regions=brewer.pal(n=9,name="Reds"))Evaluasi Model
idwcv<-krige.cv(Ni ~ 1,Nickel)
RMSE.idw<-sqrt(sum(idwcv$residual^2)/length(idwcv$residual))
RMSE.idw## [1] 0.2885053
Thiessen Polygons
Nickel.tp = krige(log(Ni) ~ 1, Nickel, Nickel.grid, nmax = 1)## [inverse distance weighted interpolation]
#for this search, the neighborhood is set to nmax=1
image(Nickel.tp["var1.pred"])
points(Nickel, pch = "+", cex = 0.65)ccc = coordinates(Nickel)Plot interpolasi dapat pula ditampilkan dengan package tripack.
library(tripack)
image(Nickel.tp["var1.pred"])
points(Nickel, pch = "+", cex = 0.65)
ccc = coordinates(Nickel)
plot(voronoi.mosaic(ccc[, 1],ccc[, 2]),do.points =FALSE, add =TRUE)Evaluasi Model
thiessen.cv <- krige.cv(Ni~1, Nickel, nmax=1)
RMSE.thiessen <-sqrt(sum(thiessen.cv$residual^2)/length(thiessen.cv$residual))
RMSE.thiessen## [1] 0.4063853
Kriging
Membuat Variogram
library(gstat)
v <- variogram(Ni~1,Nickel)
vm <- vgm(model="Wav")
vmfN <- fit.variogram(v,vm)## Warning in fit.variogram(v, vm): No convergence after 200 iterations: try
## different initial values?
plot(v,pl=T,model=vmfN,col="red",pch=19)Kriging
krig <- krige(Ni~1,locations=Nickel,newdata=Nickel.grid,model=vmfN) ## [using ordinary kriging]
pts.s <- list("sp.points",Nickel,col="coral",pch=19,cex=4*Nickel$Ni/max(Nickel$Ni))
spplot(krig,"var1.pred",asp=1,cuts=8,col.regions=brewer.pal(n=9,name="Blues"),main="Ordinary kriging (Ni)",sp.layout=list(pts.s)) Evaluasi Model
Nickel.KrigingCV <-krige.cv(Ni~1,Nickel,Nickel.grid,model=vmfN)
(RMSE.KringCV<-sqrt(sum(Nickel.KrigingCV$residual^2)/length(Nickel.KrigingCV$residual))) ## [1] 4.751409
Memilih Interpolasi yang Baik
Beberapa kriteria memilih interpolasi terbaik:
Kualitas set titik sampel dapat mempengaruhi pilihan metode interpolasi juga. Jika titik sampel tidak terdistribusi dengan baik atau jumlahnya sedikit, permukaan mungkin tidak mewakili medan sebenarnya dengan sangat baik. Jika kita memiliki terlalu sedikit titik sampel, tambahkan lebih banyak titik sampel di area di mana medan berubah secara tiba-tiba atau sering, lalu coba gunakan Kriging.
Pengetahuan kondisi sebenarnya akan mempengaruhi metode interpolasi yang akan digunakan. Jika diketahui bahwa beberapa fitur di permukaan melebihi nilai z dan IDW akan menghasilkan permukaan yang tidak melebihi nilai z tertinggi atau terendah dalam kumpulan titik sampel, pilih metode Spline.
Jika kita tahu bahwa permukaan splined mungkin berakhir dengan fitur yang kita tahu tidak ada karena interpolasi Spline tidak bekerja dengan baik dengan titik sampel yang berdekatan dan memiliki perbedaan nilai yang ekstrim, Anda mungkin memutuskan untuk mencoba IDW.
Bandingkan nilai RMSE
Dalam exercise ini digunakan nilai RMSE, metode yang memberikan nilai RMSE yang paling kecil yaitu: metode 2nd-Order Trend Surface.
Kesimpulan
Prediksi kandungan Nikel dengan model 2nd-Order Trend Surface lebih baik dibandingkan model interpolasi lainnya.
TERIMA KASIH
Referensi
Anisa, R. 2021.Sebaran Spasial Dua Tipe Titik. Retrieved from newlms.ipb.ac.id
Anisa, R. September 24, 2020.Interpolasi Spasial. Retrieved from newlms.ipb.ac.id
Anisa, R. 2021.Autokorelasi Spasial. Retrieved from newlms.ipb.ac.id
Anisa, R. 2021.Kriging dan CoKriging. Retrieved from newlms.ipb.ac.id
Mahasiswa Pascasarjana Statistika dan Sains Data, IPB University, reniamelia@apps.ipb.ac.id↩︎