Suatu pemodelan dapat bersifat global maupun lokal. Regresi linier klasik merupakan salah satu model global. Dikatakan global karena terdapat satu model yang berlaku umum untuk semua pengamatan.
Suatu model lokal bersifat lebih fleksibel, yang dalam konteks spasial, artinya setiap daerah/lokasi dapat memiliki model masing-masing.
Geographically Weighted Regression (GWR) merupakan salah satu model yang bersifat lokal. Beberapa keuntungan dengen menggunakan model ini, diantaranya adalah kita dapat:
menduga galat baku lokal
menghitung ukuran leverage lokal
melakukan pengujian terhadap signifikansi keragaman spasial pada penduga parameter lokal
menguji apakah model lokal lebih baik daripada model global
Terdapat salah satu stand-alone software untuk melakukan GWR, yaitu software GWR yang dapat diakses melalui http://ncg.nuim.ie/ncg/GWR/. Selain itu, pada R software, terdapat beberapa package yang dapat digunakan untuk membangun model GWR, yaitu:
GWmodel
spgwr
gwrr
Pada modul ini akan dibahas pemodelan GWR menggunakan package spgwr
.
library(spgwr)
data(columbus)
attach(columbus)
colex0 <- lm(CRIME ~ (INC + HOVAL))
summary(colex0)
##
## Call:
## lm(formula = CRIME ~ (INC + HOVAL))
##
## Residuals:
## Min 1Q Median 3Q Max
## -34.418 -6.388 -1.580 9.052 28.649
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 68.6190 4.7355 14.490 < 2e-16 ***
## INC -1.5973 0.3341 -4.780 1.83e-05 ***
## HOVAL -0.2739 0.1032 -2.654 0.0109 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 11.43 on 46 degrees of freedom
## Multiple R-squared: 0.5524, Adjusted R-squared: 0.5329
## F-statistic: 28.39 on 2 and 46 DF, p-value: 9.341e-09
Berikut ini apabila pemeriksaan sisaan dilakukan secara visual.
resid<-residuals(colex0)
par(mfrow=c(2,2))
qqnorm(resid); qqline(resid, col="red");
plot(resid~fitted(colex0),xlab = "Predicted Values",ylab = "Residuals")
abline(h=0, col="red")
hist(resid) #histogram utk residual
plot(1:nrow(columbus), resid, pch=20,type="b")
abline(h=0, col="red")
Berikut ini salah satu contoh untuk memeriksa asumsi pada sisaan menggunakan uji formal.
shapiro.test(resid)
##
## Shapiro-Wilk normality test
##
## data: resid
## W = 0.97708, p-value = 0.4497
lmtest::bptest(colex0)
##
## studentized Breusch-Pagan test
##
## data: colex0
## BP = 7.2166, df = 2, p-value = 0.0271
library(spdep)
## Loading required package: sf
## Linking to GEOS 3.8.0, GDAL 3.0.4, PROJ 6.3.1
coords<-columbus[c("X","Y")]
jarak<-as.matrix(1/dist(coords))
lm.morantest(colex0,listw=mat2listw(jarak), alternative="two.sided")
##
## Global Moran I for regression residuals
##
## data:
## model: lm(formula = CRIME ~ (INC + HOVAL))
## weights: mat2listw(jarak)
##
## Moran I statistic standard deviate = 5.3215, p-value = 1.029e-07
## alternative hypothesis: two.sided
## sample estimates:
## Observed Moran I Expectation Variance
## 0.090950110 -0.024788597 0.000473023
colex <- lm(CRIME ~ (INC + HOVAL)*(X + Y))
summary(colex)
##
## Call:
## lm(formula = CRIME ~ (INC + HOVAL) * (X + Y))
##
## Residuals:
## Min 1Q Median 3Q Max
## -18.5556 -7.6351 -0.6181 7.8363 30.1948
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 108.97559 69.36676 1.571 0.1241
## INC -5.82949 3.84408 -1.516 0.1373
## HOVAL 0.27337 0.82049 0.333 0.7407
## X -0.76287 1.13692 -0.671 0.5061
## Y -0.26332 1.21420 -0.217 0.8294
## INC:X -0.01854 0.05396 -0.344 0.7329
## INC:Y 0.13949 0.08004 1.743 0.0891 .
## HOVAL:X 0.03159 0.01549 2.040 0.0480 *
## HOVAL:Y -0.05034 0.02196 -2.293 0.0272 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 11.04 on 40 degrees of freedom
## Multiple R-squared: 0.6375, Adjusted R-squared: 0.5649
## F-statistic: 8.791 on 8 and 40 DF, p-value: 7.663e-07
Perhatikan bahwa R-square pada model ini sudah lebih baik dibandingkan dengan model regresi klasik, namun perhatikan pula bahwa peubah yang signifikan hanyalah sedikit.
Selanjutnya, misalkan kita ingin mengekstrak koefisien pada model tersebut.
b <- colex$coefficients
b[3]
## HOVAL
## 0.2733714
b[8]
## HOVAL:X
## 0.03159375
b[9]
## HOVAL:Y
## -0.05034417
Koefisien pada setiap titik lokasi.
bihoval <- b[3] + b[8] * X + b[9] * Y
bihoval
## [1] -0.71945869 -0.73484522 -0.54173838 -0.61340248 -0.37564117 -0.32192096
## [7] -0.60638062 -0.51564482 -0.16255859 -0.05598467 -0.35829853 -0.37198398
## [13] -0.42336113 -0.39035215 -0.23271508 -0.25443306 0.07333582 -0.29218860
## [19] -0.34176508 0.14326490 -0.18138317 -0.04092354 0.19700015 -0.18584181
## [25] -0.13841490 -0.09558335 0.07053751 0.02287442 -0.01427011 -0.04484941
## [31] -0.34667265 0.35074015 0.13619297 -0.19143232 0.18497546 -0.28527600
## [37] 0.03312495 0.12107370 -0.30416609 0.39765999 0.49266738 -0.14792599
## [43] 0.18759738 0.26427001 0.21424089 -0.21628436 0.61049050 0.27143584
## [49] 0.36488625
summary(bihoval)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -0.7348 -0.3418 -0.1479 -0.1096 0.1362 0.6105
Berikut ini kita akan cobe memvisualisasikan koefisien tersebut. Sebelumnya Anda harus mendownload data peta columbus
melalui link berikut: https://github.com/raoy/Spatial-Statistics/blob/master/columbus.rar. Silahkan simpan pada directory yang Anda inginkan, dan impor data shp tersebut dengan fungsi berikut:
library(rgdal)
col.shp<-readOGR(dsn=“your directory", layer=“your shp filename")
col.shp@data$bi<-bihoval
spplot(col.shp, zcol="bi")
library(spgwr)
colg1 <- gwr(CRIME ~ INC + HOVAL,data=columbus,
coords=cbind(X,Y),bandwidth=20)
colg1
## Call:
## gwr(formula = CRIME ~ INC + HOVAL, data = columbus, coords = cbind(X,
## Y), bandwidth = 20)
## Kernel function: gwr.Gauss
## Fixed bandwidth: 20
## Summary of GWR coefficient estimates at data points:
## Min. 1st Qu. Median 3rd Qu. Max. Global
## X.Intercept. 68.41172 68.84934 69.06064 69.24627 69.95574 68.6190
## INC -1.65427 -1.62266 -1.60883 -1.58070 -1.52803 -1.5973
## HOVAL -0.30704 -0.29166 -0.27768 -0.26118 -0.23198 -0.2739
colg2 <- gwr(CRIME ~ INC + HOVAL,data=columbus,
coords=cbind(X,Y),bandwidth=3)
colg2
## Call:
## gwr(formula = CRIME ~ INC + HOVAL, data = columbus, coords = cbind(X,
## Y), bandwidth = 3)
## Kernel function: gwr.Gauss
## Fixed bandwidth: 3
## Summary of GWR coefficient estimates at data points:
## Min. 1st Qu. Median 3rd Qu. Max. Global
## X.Intercept. 24.823915 58.337575 66.888807 70.734368 77.517040 68.6190
## INC -2.910848 -2.055262 -1.370336 -0.489550 0.605579 -1.5973
## HOVAL -0.938862 -0.370668 -0.072330 -0.016077 0.417796 -0.2739
colg3 <- gwr(CRIME ~ INC + HOVAL, data=columbus,
coords=cbind(X,Y),bandwidth=2)
colg3
## Call:
## gwr(formula = CRIME ~ INC + HOVAL, data = columbus, coords = cbind(X,
## Y), bandwidth = 2)
## Kernel function: gwr.Gauss
## Fixed bandwidth: 2
## Summary of GWR coefficient estimates at data points:
## Min. 1st Qu. Median 3rd Qu. Max. Global
## X.Intercept. 22.991392 51.964036 62.876904 69.090317 81.631781 68.6190
## INC -3.386156 -1.956824 -0.702189 -0.325575 1.339862 -1.5973
## HOVAL -1.209784 -0.372961 -0.115808 0.038842 0.879194 -0.2739
hovg3 <- colg2$SDF$HOVAL
hovg20 <- colg1$SDF$HOVAL
hovg2 <- colg3$SDF$HOVAL
boxplot(bihoval,hovg20,hovg3,hovg2,
names=c("Expansion","bw=20","bw=3","bw=2"))
Perhatikan bahwa sebaran pada bandwidth 20 terkonsentrasi di sekitar median, sedangkan pada bandwidth 2 menunjukkan beberapa pencilan. Sebaran pada model linear expansion mirip dengan bandwidth 3, namun median-nya lebih rendah.
bw1 <- gwr.sel(CRIME ~ INC + HOVAL,data=columbus,
coords=cbind(X,Y))
## Bandwidth: 12.65221 CV score: 7432.209
## Bandwidth: 20.45127 CV score: 7462.704
## Bandwidth: 7.83213 CV score: 7323.545
## Bandwidth: 4.853154 CV score: 7307.57
## Bandwidth: 5.125504 CV score: 7322.796
## Bandwidth: 3.012046 CV score: 6461.764
## Bandwidth: 1.874179 CV score: 6473.378
## Bandwidth: 2.475485 CV score: 6109.995
## Bandwidth: 2.447721 CV score: 6098.372
## Bandwidth: 2.228647 CV score: 6064.1
## Bandwidth: 2.264538 CV score: 6060.774
## Bandwidth: 2.280666 CV score: 6060.649
## Bandwidth: 2.274969 CV score: 6060.601
## Bandwidth: 2.2751 CV score: 6060.601
## Bandwidth: 2.27506 CV score: 6060.601
## Bandwidth: 2.275019 CV score: 6060.601
## Bandwidth: 2.27506 CV score: 6060.601
colg4 <- gwr(CRIME ~ INC + HOVAL,data=columbus,
coords=cbind(X,Y),bandwidth=bw1)
bw2 <- gwr.sel(CRIME ~ INC + HOVAL,data=columbus,
coords=cbind(X,Y),method="aic")
## Bandwidth: 12.65221 AIC: 383.2507
## Bandwidth: 20.45127 AIC: 383.5182
## Bandwidth: 7.83213 AIC: 382.7555
## Bandwidth: 4.853154 AIC: 381.4751
## Bandwidth: 3.012046 AIC: 384.5411
## Bandwidth: 5.991021 AIC: 382.3503
## Bandwidth: 4.149913 AIC: 380.7132
## Bandwidth: 3.715287 AIC: 380.7565
## Bandwidth: 3.980563 AIC: 380.6324
## Bandwidth: 3.955538 AIC: 380.6289
## Bandwidth: 3.927578 AIC: 380.6281
## Bandwidth: 3.934794 AIC: 380.628
## Bandwidth: 3.935053 AIC: 380.628
## Bandwidth: 3.934987 AIC: 380.628
## Bandwidth: 3.934946 AIC: 380.628
## Bandwidth: 3.934987 AIC: 380.628
colg5 <- gwr(CRIME ~ INC + HOVAL,data=columbus,
coords=cbind(X,Y),bandwidth=bw2)
bwbs1 <- gwr.sel(CRIME ~ INC + HOVAL,data=columbus,
coords=cbind(X,Y),gweight=gwr.bisquare)
## Bandwidth: 12.65221 CV score: 8180.619
## Bandwidth: 20.45127 CV score: 7552.85
## Bandwidth: 25.27136 CV score: 7508.227
## Bandwidth: 23.68132 CV score: 7519.864
## Bandwidth: 28.25033 CV score: 7491.85
## Bandwidth: 30.09144 CV score: 7486.673
## Bandwidth: 31.69353 CV score: 7483.663
## Bandwidth: 31.08159 CV score: 7484.706
## Bandwidth: 32.21945 CV score: 7482.846
## Bandwidth: 32.54449 CV score: 7482.371
## Bandwidth: 32.74538 CV score: 7482.088
## Bandwidth: 32.86953 CV score: 7481.916
## Bandwidth: 32.94626 CV score: 7481.812
## Bandwidth: 32.99368 CV score: 7481.748
## Bandwidth: 33.02299 CV score: 7481.708
## Bandwidth: 33.04111 CV score: 7481.684
## Bandwidth: 33.0523 CV score: 7481.669
## Bandwidth: 33.05922 CV score: 7481.659
## Bandwidth: 33.0635 CV score: 7481.654
## Bandwidth: 33.06614 CV score: 7481.65
## Bandwidth: 33.06777 CV score: 7481.648
## Bandwidth: 33.06878 CV score: 7481.647
## Bandwidth: 33.06941 CV score: 7481.646
## Bandwidth: 33.06979 CV score: 7481.645
## Bandwidth: 33.07003 CV score: 7481.645
## Bandwidth: 33.07018 CV score: 7481.645
## Bandwidth: 33.07027 CV score: 7481.645
## Bandwidth: 33.07032 CV score: 7481.645
## Bandwidth: 33.07037 CV score: 7481.645
## Bandwidth: 33.07037 CV score: 7481.645
## Warning in gwr.sel(CRIME ~ INC + HOVAL, data = columbus, coords = cbind(X, :
## Bandwidth converged to upper bound:33.0704149683672
colg6 <- gwr(CRIME ~ INC + HOVAL,data=columbus,
coords=cbind(X,Y),gweight=gwr.bisquare,
bandwidth=bwbs1)
bwbs2 <- gwr.sel(CRIME ~ INC + HOVAL,data=columbus,
coords=cbind(X,Y),gweight=gwr.bisquare,method="aic")
## Bandwidth: 12.65221 AIC: 382.7242
## Bandwidth: 20.45127 AIC: 384.3786
## Bandwidth: 7.83213 AIC: 386.7498
## Bandwidth: 15.27372 AIC: 384.0654
## Bandwidth: 10.81111 AIC: 381.6533
## Bandwidth: 9.673238 AIC: 383.0491
## Bandwidth: 11.25258 AIC: 381.6384
## Bandwidth: 11.07023 AIC: 381.6049
## Bandwidth: 11.05193 AIC: 381.6044
## Bandwidth: 11.04548 AIC: 381.6044
## Bandwidth: 11.04647 AIC: 381.6044
## Bandwidth: 11.04651 AIC: 381.6044
## Bandwidth: 11.04655 AIC: 381.6044
## Bandwidth: 11.04651 AIC: 381.6044
colg7 <- gwr(CRIME ~ INC + HOVAL,data=columbus,
coords=cbind(X,Y),gweight=gwr.bisquare,
bandwidth=bwbs2)
Selanjutnya, kita dapat memperoleh penduga koefisien dan prediksi dengan syntax berikut.
bihoval<-colg7$SDF$HOVAL
prediction<-colg7$SDF$pred
col.shp@data$bi7<-bihoval
spplot(col.shp, zcol="bi7")
To practice, plot or map the coefficient vector for the other coefficients in the model. Alternatively, check for continuous spatial heterogeneity in the BOSTON or BALTIMORE data sets. Compare the insights provided by the expansion method to those from GWR, and carry out sensitivity analysis for the choice of bandwidth and kernel function.
Brunsdon, C. 2015. Geographically Weighted Regression. https://rstudio-pubs-static.s3.amazonaws.com/176883_06a3fa1fc77444be85e94dcd97ba9a34.html [18 Desember 2017]
Anonymous. Local regression. http://rspatial.org/analysis/rst/6-local_regression.html#local-regression [18 Desember 2017]