1 Matriks Bobot Spasial

1.1 Definisi

Matriks bobot spasial (W) menggambarkan hubungan spasial antar unit geografis. Elemen Wij > 0 jika unit i dan j bertetangga.

1.2 Jenis Bobot

  • Queen Contiguity: berbagi sisi atau sudut.
  • Rook Contiguity: berbagi sisi.
  • Distance-based: jarak Euclidean ≤ threshold.
  • k-Nearest Neighbors: k terdekat.

1.3 Visualisasi Queen & k-NN

bandung_sf <- st_read("D:/LALA/SMT 2 S2/Spatial/Tugas RMD/bandung_kota_bandung.shp")
## Reading layer `bandung_kota_bandung' from data source 
##   `D:\LALA\SMT 2 S2\Spatial\Tugas RMD\bandung_kota_bandung.shp' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 30 features and 7 fields
## Geometry type: POLYGON
## Dimension:     XY
## Bounding box:  xmin: 107.5459 ymin: -6.969735 xmax: 107.7393 ymax: -6.836885
## Geodetic CRS:  WGS 84
coords      <- st_coordinates(st_centroid(bandung_sf))

nb_q   <- poly2nb(bandung_sf)
w_q    <- nb2listw(nb_q, style = "W")

knn4   <- knearneigh(coords, k = 4)
bb_nb  <- knn2nb(knn4)
w_knn <- nb2listw(bb_nb, style = "W")

par(mar = c(0,0,2,0))
plot(st_geometry(bandung_sf), main = "Queen vs k-NN (k=4)")
plot(nb_q, coords, add = TRUE, col = "red", lwd = 1.5)
plot(bb_nb, coords, add = TRUE, col = "blue", lwd = 1.5)
legend("topright", legend = c("Queen","k-NN"), col = c("red","blue"), lwd = 1.5, bty = "n")

2 Data & Variabel Penelitian

2.1 Load Data Riil & Normalisasi

shp_path   <- "D:/LALA/SMT 2 S2/Spatial/Tugas RMD/bandung_kota_bandung.shp"
csv_path   <- "D:/LALA/SMT 2 S2/Spatial/Tugas RMD/Dataset.csv"
bandung_sf <- st_read(shp_path)
## Reading layer `bandung_kota_bandung' from data source 
##   `D:\LALA\SMT 2 S2\Spatial\Tugas RMD\bandung_kota_bandung.shp' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 30 features and 7 fields
## Geometry type: POLYGON
## Dimension:     XY
## Bounding box:  xmin: 107.5459 ymin: -6.969735 xmax: 107.7393 ymax: -6.836885
## Geodetic CRS:  WGS 84
data_bps   <- read_csv(csv_path)

clean_name <- function(x) str_to_upper(x) %>% str_replace_all("[^A-Z0-9]", "")
bandung_sf <- bandung_sf %>% mutate(Kec_clean = clean_name(Kecamatan))
data_bps   <- data_bps   %>% mutate(Kec_clean = clean_name(Kecamatan))

bandung_sf <- left_join(bandung_sf, data_bps, by = "Kec_clean")

cat("NA pada Y:", sum(is.na(bandung_sf$Y)), "\n")
## NA pada Y: 0

2.2 Definisi Variabel

  • Y : Jumlah Penderita Hipertensi ≥ 15 tahun per kecamatan
  • X1: Jumlah Penderita Obesitas per kecamatan
  • X2: Jumlah Pelayanan Kesehatan usia 15–59 tahun per kecamatan

3 Analisis Data Simulasi

3.1 Persamaan Model

SAR: \(Y = (I - \rho W)^{-1}(X\beta + \varepsilon)\), \(\varepsilon \sim N(0, \sigma^2 I)\)

SEM: \(Y = X\beta + u,\; u = (I - \lambda W)^{-1}\xi,\; \xi \sim N(0, \sigma^2 I)\)

3.2 Simulasi Data & Estimasi

set.seed(123)
bc        <- bandung_sf %>% filter(!is.na(X1) & !is.na(X2))
n         <- nrow(bc)

nb_sim    <- poly2nb(bc)
w_sim     <- nb2listw(nb_sim, style = "W")
Wmat_sim  <- listw2mat(w_sim)

gamma_rho <- 0.4
gamma_lam <- 0.3
beta_vec  <- c(5, 1.2, -0.8)

dat_sim   <- bc %>% mutate(
  X1_sim = rnorm(n, mean = mean(X1, na.rm = TRUE)),
  X2_sim = rnorm(n, mean = mean(X2, na.rm = TRUE)),
  eps     = rnorm(n),
  xi      = rnorm(n)
)
Xmat      <- cbind(1, dat_sim$X1_sim, dat_sim$X2_sim)
I_n       <- diag(n)

Y_sar     <- solve(I_n - gamma_rho * Wmat_sim) %*% (Xmat %*% beta_vec + dat_sim$eps)
Y_sem     <- Xmat %*% beta_vec + solve(I_n - gamma_lam * Wmat_sim) %*% dat_sim$xi

sim_sf    <- bc %>% mutate(
  Y_sar = as.numeric(Y_sar),
  Y_sem = as.numeric(Y_sem),
  X1_sim = dat_sim$X1_sim,
  X2_sim = dat_sim$X2_sim
)

darima_sar <- lagsarlm(Y_sar ~ X1_sim + X2_sim, data = sim_sf, listw = w_sim, method = "eigen")
darima_sem <- errorsarlm(Y_sem ~ X1_sim + X2_sim, data = sim_sf, listw = w_sim, method = "eigen")

print(summary(darima_sar))
## 
## Call:lagsarlm(formula = Y_sar ~ X1_sim + X2_sim, data = sim_sf, listw = w_sim, 
##     method = "eigen")
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2.48466 -0.41047  0.22523  0.52849  1.77836 
## 
## Type: lag 
## Coefficients: (numerical Hessian approximate standard errors) 
##               Estimate Std. Error     z value  Pr(>|z|)
## (Intercept) 167.368532  18.051154      9.2719 < 2.2e-16
## X1_sim        1.337910   0.146993      9.1019 < 2.2e-16
## X2_sim       -1.002339   0.000045 -22274.1300 < 2.2e-16
## 
## Rho: 0.25064, LR test value: 2.2163, p-value: 0.13656
## Approximate (numerical Hessian) standard error: 9.551e-05
##     z-value: 2624.3, p-value: < 2.22e-16
## Wald statistic: 6886900, p-value: < 2.22e-16
## 
## Log likelihood: -36.75023 for lag model
## ML residual variance (sigma squared): 0.66864, (sigma: 0.8177)
## Number of observations: 30 
## Number of parameters estimated: 5 
## AIC: 83.5, (AIC for lm: 83.717)
print(summary(darima_sem))
## 
## Call:errorsarlm(formula = Y_sem ~ X1_sim + X2_sim, data = sim_sf, 
##     listw = w_sim, method = "eigen")
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -1.485755 -0.612013 -0.075146  0.558774  2.251504 
## 
## Type: error 
## Coefficients: (asymptotic standard errors) 
##                Estimate  Std. Error z value  Pr(>|z|)
## (Intercept) -3074.87084  8414.89935 -0.3654  0.714807
## X1_sim          1.23498     0.17262  7.1543 8.411e-13
## X2_sim         -0.73079     0.18928 -3.8609  0.000113
## 
## Lambda: 0.25869, LR test value: 1.0942, p-value: 0.29555
## Approximate (numerical Hessian) standard error: NaN
##     z-value: NaN, p-value: NA
## Wald statistic: NaN, p-value: NA
## 
## Log likelihood: -39.3667 for error model
## ML residual variance (sigma squared): 0.79526, (sigma: 0.89178)
## Number of observations: 30 
## Number of parameters estimated: 5 
## AIC: 88.733, (AIC for lm: 87.828)
sim_sf$res_sar <- residuals(darima_sar)
sim_sf$res_sem <- residuals(darima_sem)
par(mfrow = c(1,2), mar = c(0,0,2,0))
plot(sim_sf["res_sar"], main = "Residual SAR (Simulasi)")

plot(sim_sf["res_sem"], main = "Residual SEM (Simulasi)")

4 Analisis Data Riil

4.1 Uji Autokorelasi & Hipotesis

bc_clean <- bandung_sf %>% filter(!is.na(Y) & !is.na(X1) & !is.na(X2))
bn_r     <- poly2nb(bc_clean)
w_r      <- nb2listw(bn_r, style = "W")

print(moran.test(bc_clean$Y, w_r))
## 
##  Moran I test under randomisation
## 
## data:  bc_clean$Y  
## weights: w_r    
## 
## Moran I statistic standard deviate = 2.3514, p-value = 0.009351
## alternative hypothesis: greater
## sample estimates:
## Moran I statistic       Expectation          Variance 
##        0.23574008       -0.03448276        0.01320629
print(lm.LMtests(lm(Y ~ X1 + X2, data = bc_clean), w_r, test = c("LMerr","LMlag")))
## 
##  Rao's score (a.k.a Lagrange multiplier) diagnostics for spatial
##  dependence
## 
## data:  
## model: lm(formula = Y ~ X1 + X2, data = bc_clean)
## test weights: listw
## 
## RSerr = 0.41307, df = 1, p-value = 0.5204
## 
## 
##  Rao's score (a.k.a Lagrange multiplier) diagnostics for spatial
##  dependence
## 
## data:  
## model: lm(formula = Y ~ X1 + X2, data = bc_clean)
## test weights: listw
## 
## RSlag = 4.4012, df = 1, p-value = 0.03591

4.2 Estimasi SAR & SEM (Riil)

sar_real <- lagsarlm(Y ~ X1 + X2, data = bc_clean, listw = w_r, method = "eigen")
sem_real <- errorsarlm(Y ~ X1 + X2, data = bc_clean, listw = w_r, method = "eigen")

summary(sar_real)
## 
## Call:lagsarlm(formula = Y ~ X1 + X2, data = bc_clean, listw = w_r, 
##     method = "eigen")
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -1808.813  -624.276   -91.494   338.403  3540.413 
## 
## Type: lag 
## Coefficients: (numerical Hessian approximate standard errors) 
##                Estimate  Std. Error z value Pr(>|z|)
## (Intercept) -1641.12751   935.74266 -1.7538  0.07946
## X1              0.43463     0.90724  0.4791  0.63189
## X2              0.15861     0.01259 12.5986  < 2e-16
## 
## Rho: 0.28208, LR test value: 4.6334, p-value: 0.031355
## Approximate (numerical Hessian) standard error: 0.12404
##     z-value: 2.2741, p-value: 0.02296
## Wald statistic: 5.1715, p-value: 0.02296
## 
## Log likelihood: -248.8836 for lag model
## ML residual variance (sigma squared): 923190, (sigma: 960.83)
## Number of observations: 30 
## Number of parameters estimated: 5 
## AIC: 507.77, (AIC for lm: 510.4)
summary(sem_real)
## 
## Call:errorsarlm(formula = Y ~ X1 + X2, data = bc_clean, listw = w_r, 
##     method = "eigen")
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1571.48  -669.46  -297.48   418.62  3580.15 
## 
## Type: error 
## Coefficients: (asymptotic standard errors) 
##               Estimate Std. Error z value Pr(>|z|)
## (Intercept) 234.280503 665.329034  0.3521   0.7247
## X1            0.789532   0.956422  0.8255   0.4091
## X2            0.163734   0.013062 12.5347   <2e-16
## 
## Lambda: 0.2126, LR test value: 0.52343, p-value: 0.46938
## Asymptotic standard error: 0.24364
##     z-value: 0.87257, p-value: 0.3829
## Wald statistic: 0.76138, p-value: 0.3829
## 
## Log likelihood: -250.9386 for error model
## ML residual variance (sigma squared): 1067600, (sigma: 1033.3)
## Number of observations: 30 
## Number of parameters estimated: 5 
## AIC: 511.88, (AIC for lm: 510.4)
AIC(sar_real, sem_real)
##          df      AIC
## sar_real  5 507.7671
## sem_real  5 511.8771

4.3 Visualisasi Residual (Riil)

bc_clean <- bc_clean %>% mutate(res_sar = residuals(sar_real), res_sem = residuals(sem_real))
par(mfrow = c(1,2), mar = c(0,0,2,0))
plot(bc_clean["res_sar"], main = "Residual SAR (Riil)")

plot(bc_clean["res_sem"], main = "Residual SEM (Riil)")

5 Referensi

  • Anselin, L. (1988). Spatial Econometrics: Methods and Models. Kluwer Academic Publishers.
  • LeSage, J., & Pace, R. K. (2009). Introduction to Spatial Econometrics. CRC Press.
  • Bivand, R., Pebesma, E., & Gómez‐Rubio, V. (2013). Applied Spatial Data Analysis with R. Springer.