Introduction

Suatu pemodelan dapat bersifat global maupun lokal. Regresi linier klasik merupakan salah satu model global. Dikatakan global karena terdapat satu model yang berlaku umum untuk semua pengamatan.

Suatu model lokal bersifat lebih fleksibel, yang dalam konteks spasial, artinya setiap daerah/lokasi dapat memiliki model masing-masing.

Geographically Weighted Regression (GWR) merupakan salah satu model yang bersifat lokal. Beberapa keuntungan dengen menggunakan model ini, diantaranya adalah kita dapat:

menduga galat baku lokal
menghitung ukuran leverage lokal
melakukan pengujian terhadap signifikansi keragaman spasial pada penduga parameter lokal
menguji apakah model lokal lebih baik daripada model global

Terdapat salah satu stand-alone software untuk melakukan GWR, yaitu software GWR yang dapat diakses melalui http://ncg.nuim.ie/ncg/GWR/. Selain itu, pada R software, terdapat beberapa package yang dapat digunakan untuk membangun model GWR, yaitu:

GWmodel
spgwr
gwrr

Pada modul ini akan dibahas pemodelan GWR menggunakan package spgwr.

Ilustrasi menggunakan Data Columbus

library(spgwr)

data(columbus)
attach(columbus)

Standard Regression

colex0 <- lm(CRIME ~ (INC + HOVAL))
summary(colex0)

## 
## Call:
## lm(formula = CRIME ~ (INC + HOVAL))
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -34.418  -6.388  -1.580   9.052  28.649 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  68.6190     4.7355  14.490  < 2e-16 ***
## INC          -1.5973     0.3341  -4.780 1.83e-05 ***
## HOVAL        -0.2739     0.1032  -2.654   0.0109 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 11.43 on 46 degrees of freedom
## Multiple R-squared:  0.5524, Adjusted R-squared:  0.5329 
## F-statistic: 28.39 on 2 and 46 DF,  p-value: 9.341e-09

Diagnostik Model

Berikut ini apabila pemeriksaan sisaan dilakukan secara visual.

resid<-residuals(colex0)
par(mfrow=c(2,2))
qqnorm(resid); qqline(resid, col="red"); 
plot(resid~fitted(colex0),xlab = "Predicted Values",ylab = "Residuals")
    abline(h=0, col="red")
hist(resid) #histogram utk residual
plot(1:nrow(columbus), resid, pch=20,type="b")
    abline(h=0, col="red")

Berikut ini salah satu contoh untuk memeriksa asumsi pada sisaan menggunakan uji formal.

shapiro.test(resid)

## 
##  Shapiro-Wilk normality test
## 
## data:  resid
## W = 0.97708, p-value = 0.4497

lmtest::bptest(colex0)

## 
##  studentized Breusch-Pagan test
## 
## data:  colex0
## BP = 7.2166, df = 2, p-value = 0.0271

library(spdep)

## Loading required package: sf

## Linking to GEOS 3.8.0, GDAL 3.0.4, PROJ 6.3.1

coords<-columbus[c("X","Y")]
jarak<-as.matrix(1/dist(coords))
lm.morantest(colex0,listw=mat2listw(jarak), alternative="two.sided")

## 
##  Global Moran I for regression residuals
## 
## data:  
## model: lm(formula = CRIME ~ (INC + HOVAL))
## weights: mat2listw(jarak)
## 
## Moran I statistic standard deviate = 5.3215, p-value = 1.029e-07
## alternative hypothesis: two.sided
## sample estimates:
## Observed Moran I      Expectation         Variance 
##      0.090950110     -0.024788597      0.000473023

Spatially Dissagregated Model

colex <- lm(CRIME ~ (INC + HOVAL)*(X + Y))
summary(colex)

## 
## Call:
## lm(formula = CRIME ~ (INC + HOVAL) * (X + Y))
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -18.5556  -7.6351  -0.6181   7.8363  30.1948 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)  
## (Intercept) 108.97559   69.36676   1.571   0.1241  
## INC          -5.82949    3.84408  -1.516   0.1373  
## HOVAL         0.27337    0.82049   0.333   0.7407  
## X            -0.76287    1.13692  -0.671   0.5061  
## Y            -0.26332    1.21420  -0.217   0.8294  
## INC:X        -0.01854    0.05396  -0.344   0.7329  
## INC:Y         0.13949    0.08004   1.743   0.0891 .
## HOVAL:X       0.03159    0.01549   2.040   0.0480 *
## HOVAL:Y      -0.05034    0.02196  -2.293   0.0272 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 11.04 on 40 degrees of freedom
## Multiple R-squared:  0.6375, Adjusted R-squared:  0.5649 
## F-statistic: 8.791 on 8 and 40 DF,  p-value: 7.663e-07

Perhatikan bahwa R-square pada model ini sudah lebih baik dibandingkan dengan model regresi klasik, namun perhatikan pula bahwa peubah yang signifikan hanyalah sedikit.

Selanjutnya, misalkan kita ingin mengekstrak koefisien pada model tersebut.

b <- colex$coefficients
b[3]

##     HOVAL 
## 0.2733714

b[8]

##    HOVAL:X 
## 0.03159375

b[9]

##     HOVAL:Y 
## -0.05034417

Koefisien pada setiap titik lokasi.

bihoval <- b[3] + b[8] * X + b[9] * Y
bihoval

##  [1] -0.71945869 -0.73484522 -0.54173838 -0.61340248 -0.37564117 -0.32192096
##  [7] -0.60638062 -0.51564482 -0.16255859 -0.05598467 -0.35829853 -0.37198398
## [13] -0.42336113 -0.39035215 -0.23271508 -0.25443306  0.07333582 -0.29218860
## [19] -0.34176508  0.14326490 -0.18138317 -0.04092354  0.19700015 -0.18584181
## [25] -0.13841490 -0.09558335  0.07053751  0.02287442 -0.01427011 -0.04484941
## [31] -0.34667265  0.35074015  0.13619297 -0.19143232  0.18497546 -0.28527600
## [37]  0.03312495  0.12107370 -0.30416609  0.39765999  0.49266738 -0.14792599
## [43]  0.18759738  0.26427001  0.21424089 -0.21628436  0.61049050  0.27143584
## [49]  0.36488625

summary(bihoval)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## -0.7348 -0.3418 -0.1479 -0.1096  0.1362  0.6105

Berikut ini kita akan cobe memvisualisasikan koefisien tersebut. Sebelumnya Anda harus mendownload data peta columbus melalui link berikut: https://github.com/raoy/Spatial-Statistics/blob/master/columbus.rar. Silahkan simpan pada directory yang Anda inginkan, dan impor data shp tersebut dengan fungsi berikut:

library(rgdal)
col.shp<-readOGR(dsn=“your directory", layer=“your shp filename")

col.shp@data$bi<-bihoval

spplot(col.shp, zcol="bi")

Basic GWR

library(spgwr)
colg1 <- gwr(CRIME ~ INC + HOVAL,data=columbus,
    coords=cbind(X,Y),bandwidth=20)
colg1

## Call:
## gwr(formula = CRIME ~ INC + HOVAL, data = columbus, coords = cbind(X, 
##     Y), bandwidth = 20)
## Kernel function: gwr.Gauss 
## Fixed bandwidth: 20 
## Summary of GWR coefficient estimates at data points:
##                  Min.  1st Qu.   Median  3rd Qu.     Max.  Global
## X.Intercept. 68.41172 68.84934 69.06064 69.24627 69.95574 68.6190
## INC          -1.65427 -1.62266 -1.60883 -1.58070 -1.52803 -1.5973
## HOVAL        -0.30704 -0.29166 -0.27768 -0.26118 -0.23198 -0.2739

Using Different Bandwidth

colg2 <- gwr(CRIME ~ INC + HOVAL,data=columbus,
    coords=cbind(X,Y),bandwidth=3)
colg2

## Call:
## gwr(formula = CRIME ~ INC + HOVAL, data = columbus, coords = cbind(X, 
##     Y), bandwidth = 3)
## Kernel function: gwr.Gauss 
## Fixed bandwidth: 3 
## Summary of GWR coefficient estimates at data points:
##                   Min.   1st Qu.    Median   3rd Qu.      Max.  Global
## X.Intercept. 24.823915 58.337575 66.888807 70.734368 77.517040 68.6190
## INC          -2.910848 -2.055262 -1.370336 -0.489550  0.605579 -1.5973
## HOVAL        -0.938862 -0.370668 -0.072330 -0.016077  0.417796 -0.2739

colg3 <- gwr(CRIME ~ INC + HOVAL, data=columbus,
    coords=cbind(X,Y),bandwidth=2)
colg3

## Call:
## gwr(formula = CRIME ~ INC + HOVAL, data = columbus, coords = cbind(X, 
##     Y), bandwidth = 2)
## Kernel function: gwr.Gauss 
## Fixed bandwidth: 2 
## Summary of GWR coefficient estimates at data points:
##                   Min.   1st Qu.    Median   3rd Qu.      Max.  Global
## X.Intercept. 22.991392 51.964036 62.876904 69.090317 81.631781 68.6190
## INC          -3.386156 -1.956824 -0.702189 -0.325575  1.339862 -1.5973
## HOVAL        -1.209784 -0.372961 -0.115808  0.038842  0.879194 -0.2739

Membandingkan antar Model Lokal

hovg3 <- colg2$SDF$HOVAL
hovg20 <- colg1$SDF$HOVAL
hovg2 <- colg3$SDF$HOVAL
boxplot(bihoval,hovg20,hovg3,hovg2,
    names=c("Expansion","bw=20","bw=3","bw=2"))

Perhatikan bahwa sebaran pada bandwidth 20 terkonsentrasi di sekitar median, sedangkan pada bandwidth 2 menunjukkan beberapa pencilan. Sebaran pada model linear expansion mirip dengan bandwidth 3, namun median-nya lebih rendah.

Menentukan Bandwidth Optimal

bw1 <- gwr.sel(CRIME ~ INC + HOVAL,data=columbus,
    coords=cbind(X,Y))

## Bandwidth: 12.65221 CV score: 7432.209 
## Bandwidth: 20.45127 CV score: 7462.704 
## Bandwidth: 7.83213 CV score: 7323.545 
## Bandwidth: 4.853154 CV score: 7307.57 
## Bandwidth: 5.125504 CV score: 7322.796 
## Bandwidth: 3.012046 CV score: 6461.764 
## Bandwidth: 1.874179 CV score: 6473.378 
## Bandwidth: 2.475485 CV score: 6109.995 
## Bandwidth: 2.447721 CV score: 6098.372 
## Bandwidth: 2.228647 CV score: 6064.1 
## Bandwidth: 2.264538 CV score: 6060.774 
## Bandwidth: 2.280666 CV score: 6060.649 
## Bandwidth: 2.274969 CV score: 6060.601 
## Bandwidth: 2.2751 CV score: 6060.601 
## Bandwidth: 2.27506 CV score: 6060.601 
## Bandwidth: 2.275019 CV score: 6060.601 
## Bandwidth: 2.27506 CV score: 6060.601

colg4 <- gwr(CRIME ~ INC + HOVAL,data=columbus,
    coords=cbind(X,Y),bandwidth=bw1)

bw2 <- gwr.sel(CRIME ~ INC + HOVAL,data=columbus,
    coords=cbind(X,Y),method="aic")

## Bandwidth: 12.65221 AIC: 383.2507 
## Bandwidth: 20.45127 AIC: 383.5182 
## Bandwidth: 7.83213 AIC: 382.7555 
## Bandwidth: 4.853154 AIC: 381.4751 
## Bandwidth: 3.012046 AIC: 384.5411 
## Bandwidth: 5.991021 AIC: 382.3503 
## Bandwidth: 4.149913 AIC: 380.7132 
## Bandwidth: 3.715287 AIC: 380.7565 
## Bandwidth: 3.980563 AIC: 380.6324 
## Bandwidth: 3.955538 AIC: 380.6289 
## Bandwidth: 3.927578 AIC: 380.6281 
## Bandwidth: 3.934794 AIC: 380.628 
## Bandwidth: 3.935053 AIC: 380.628 
## Bandwidth: 3.934987 AIC: 380.628 
## Bandwidth: 3.934946 AIC: 380.628 
## Bandwidth: 3.934987 AIC: 380.628

colg5 <- gwr(CRIME ~ INC + HOVAL,data=columbus,
    coords=cbind(X,Y),bandwidth=bw2)

bwbs1 <- gwr.sel(CRIME ~ INC + HOVAL,data=columbus,
    coords=cbind(X,Y),gweight=gwr.bisquare)

## Bandwidth: 12.65221 CV score: 8180.619 
## Bandwidth: 20.45127 CV score: 7552.85 
## Bandwidth: 25.27136 CV score: 7508.227 
## Bandwidth: 23.68132 CV score: 7519.864 
## Bandwidth: 28.25033 CV score: 7491.85 
## Bandwidth: 30.09144 CV score: 7486.673 
## Bandwidth: 31.69353 CV score: 7483.663 
## Bandwidth: 31.08159 CV score: 7484.706 
## Bandwidth: 32.21945 CV score: 7482.846 
## Bandwidth: 32.54449 CV score: 7482.371 
## Bandwidth: 32.74538 CV score: 7482.088 
## Bandwidth: 32.86953 CV score: 7481.916 
## Bandwidth: 32.94626 CV score: 7481.812 
## Bandwidth: 32.99368 CV score: 7481.748 
## Bandwidth: 33.02299 CV score: 7481.708 
## Bandwidth: 33.04111 CV score: 7481.684 
## Bandwidth: 33.0523 CV score: 7481.669 
## Bandwidth: 33.05922 CV score: 7481.659 
## Bandwidth: 33.0635 CV score: 7481.654 
## Bandwidth: 33.06614 CV score: 7481.65 
## Bandwidth: 33.06777 CV score: 7481.648 
## Bandwidth: 33.06878 CV score: 7481.647 
## Bandwidth: 33.06941 CV score: 7481.646 
## Bandwidth: 33.06979 CV score: 7481.645 
## Bandwidth: 33.07003 CV score: 7481.645 
## Bandwidth: 33.07018 CV score: 7481.645 
## Bandwidth: 33.07027 CV score: 7481.645 
## Bandwidth: 33.07032 CV score: 7481.645 
## Bandwidth: 33.07037 CV score: 7481.645 
## Bandwidth: 33.07037 CV score: 7481.645

## Warning in gwr.sel(CRIME ~ INC + HOVAL, data = columbus, coords = cbind(X, :
## Bandwidth converged to upper bound:33.0704149683672

colg6 <- gwr(CRIME ~ INC + HOVAL,data=columbus,
    coords=cbind(X,Y),gweight=gwr.bisquare,
    bandwidth=bwbs1)

bwbs2 <- gwr.sel(CRIME ~ INC + HOVAL,data=columbus,
    coords=cbind(X,Y),gweight=gwr.bisquare,method="aic")

## Bandwidth: 12.65221 AIC: 382.7242 
## Bandwidth: 20.45127 AIC: 384.3786 
## Bandwidth: 7.83213 AIC: 386.7498 
## Bandwidth: 15.27372 AIC: 384.0654 
## Bandwidth: 10.81111 AIC: 381.6533 
## Bandwidth: 9.673238 AIC: 383.0491 
## Bandwidth: 11.25258 AIC: 381.6384 
## Bandwidth: 11.07023 AIC: 381.6049 
## Bandwidth: 11.05193 AIC: 381.6044 
## Bandwidth: 11.04548 AIC: 381.6044 
## Bandwidth: 11.04647 AIC: 381.6044 
## Bandwidth: 11.04651 AIC: 381.6044 
## Bandwidth: 11.04655 AIC: 381.6044 
## Bandwidth: 11.04651 AIC: 381.6044

colg7 <- gwr(CRIME ~ INC + HOVAL,data=columbus,
    coords=cbind(X,Y),gweight=gwr.bisquare,
    bandwidth=bwbs2)

Selanjutnya, kita dapat memperoleh penduga koefisien dan prediksi dengan syntax berikut.

bihoval<-colg7$SDF$HOVAL
prediction<-colg7$SDF$pred

col.shp@data$bi7<-bihoval

spplot(col.shp, zcol="bi7")

Practice

To practice, plot or map the coefficient vector for the other coefficients in the model. Alternatively, check for continuous spatial heterogeneity in the BOSTON or BALTIMORE data sets. Compare the insights provided by the expansion method to those from GWR, and carry out sensitivity analysis for the choice of bandwidth and kernel function.

Reference

Brunsdon, C. 2015. Geographically Weighted Regression. https://rstudio-pubs-static.s3.amazonaws.com/176883_06a3fa1fc77444be85e94dcd97ba9a34.html [18 Desember 2017]

Anonymous. Local regression. http://rspatial.org/analysis/rst/6-local_regression.html#local-regression [18 Desember 2017]

Geographically Weighted Regression (GWR)

rahmaanisa@apps.ipb.ac.id

Praktikum 13 | Statistika Spasial