Berikut ini adalah penjelasan tentang Spatial Durbin Model yang dirujuk dari Zhukov (2010).
Like the SEM, the Spatial Durbin Model can be motivated by concern over omitted variables. Recall the DGP for the SEM:
\[ y = X\beta+(I_n - \lambda W)^{-1}u \] Now suppose that \(X\) an \(u\) are correlated. One way to account for this correlation would be to conceive of \(u\) as a linear combination of \(X\) an error term \(v\) that is independent of \(X\). \[ u = X\gamma +v \] \[ V \sim N(0,\sigma^2I_n) \] where the scalar parameter \(\gamma\) and \(\sigma^2\) govern the strength of the relationship between \(X\) and \(z = (I_n - \lambda W)^{-1}\) Substituting this expression for \(u\), we have the following DGP: \[ y = X\beta +(I_n - \lambda W)^{-1}(\gamma X + v) \] \[ y = X\beta +(I_n - \lambda W)^{-1}\gamma X + (I_n-\lambda W)^{-1}v \] \[ (I_n-\lambda W) y = (I_n - \lambda W)X\beta + \gamma X + v \] \[ y = \lambda Wy + X(\beta + \gamma) + WX(-\lambda\beta) + v \] This is the Spatial Durbin Model (SDM), which includes a spatial lag of the dependent variable \(y\), as well as the explanatory variables \(X\). The Spatial Durbin Model can be also motivated by concern over Spatial heterogenity. Recall the vector of intercepts \(a\): \[ a = (I_n - \lambda W)^{-1}\in \] Now suppose that \(X\) and \(\in\) are correlated. As before, letโs restate \(\in\) as a linear combination of \(X\) and random noise \(v\). \[ a = X \gamma + v \] Substituting this back into SEM yields the same expression of SDM as before: \[ y = \lambda W y + X (\beta + \lambda) + W X (-\lambda \beta) + v \]
Source: Zukov(2010)
rm(list=ls())
library(rgdal);library (spdep); library(spatialreg);
data(columbus)
col.listw <- nb2listw(col.gal.nb)
columbus.lm<- lm(CRIME ~ INC + HOVAL, data=columbus)
summary(columbus.lm)
##
## Call:
## lm(formula = CRIME ~ INC + HOVAL, data = columbus)
##
## Residuals:
## Min 1Q Median 3Q Max
## -34.418 -6.388 -1.580 9.052 28.649
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 68.6190 4.7355 14.490 < 2e-16 ***
## INC -1.5973 0.3341 -4.780 1.83e-05 ***
## HOVAL -0.2739 0.1032 -2.654 0.0109 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 11.43 on 46 degrees of freedom
## Multiple R-squared: 0.5524, Adjusted R-squared: 0.5329
## F-statistic: 28.39 on 2 and 46 DF, p-value: 9.341e-09
col.moran <- lm.morantest(columbus.lm, col.listw)
col.moran
##
## Global Moran I for regression residuals
##
## data:
## model: lm(formula = CRIME ~ INC + HOVAL, data = columbus)
## weights: col.listw
##
## Moran I statistic standard deviate = 2.681, p-value = 0.00367
## alternative hypothesis: greater
## sample estimates:
## Observed Moran I Expectation Variance
## 0.212374153 -0.033268284 0.008394853
columbus.lagrange <- lm.LMtests(columbus.lm, col.listw, test=c("LMerr","RLMerr","LMlag","RLMlag","SARMA"))
summary(columbus.lagrange)
## Lagrange multiplier diagnostics for spatial dependence
## data:
## model: lm(formula = CRIME ~ INC + HOVAL, data = columbus)
## weights: col.listw
##
## statistic parameter p.value
## LMerr 4.611126 1 0.031765 *
## RLMerr 0.033514 1 0.854744
## LMlag 7.855675 1 0.005066 **
## RLMlag 3.278064 1 0.070212 .
## SARMA 7.889190 2 0.019359 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
columbus.lag <- lagsarlm(CRIME ~ INC + HOVAL,data=columbus, col.listw)
summary(columbus.lag)
##
## Call:
## lagsarlm(formula = CRIME ~ INC + HOVAL, data = columbus, listw = col.listw)
##
## Residuals:
## Min 1Q Median 3Q Max
## -37.4497093 -5.4565567 0.0016387 6.7159553 24.7107978
##
## Type: lag
## Coefficients: (asymptotic standard errors)
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 46.851431 7.314754 6.4051 1.503e-10
## INC -1.073533 0.310872 -3.4533 0.0005538
## HOVAL -0.269997 0.090128 -2.9957 0.0027381
##
## Rho: 0.40389, LR test value: 8.4179, p-value: 0.0037154
## Asymptotic standard error: 0.12071
## z-value: 3.3459, p-value: 0.00082027
## Wald statistic: 11.195, p-value: 0.00082027
##
## Log likelihood: -183.1683 for lag model
## ML residual variance (sigma squared): 99.164, (sigma: 9.9581)
## Number of observations: 49
## Number of parameters estimated: 5
## AIC: 376.34, (AIC for lm: 382.75)
## LM test for residual autocorrelation
## test value: 0.19184, p-value: 0.66139
columbus.err <- errorsarlm(CRIME ~ INC + HOVAL,data=columbus,col.listw)
summary(columbus.err)
##
## Call:
## errorsarlm(formula = CRIME ~ INC + HOVAL, data = columbus, listw = col.listw)
##
## Residuals:
## Min 1Q Median 3Q Max
## -34.45950 -6.21730 -0.69775 7.65256 24.23631
##
## Type: error
## Coefficients: (asymptotic standard errors)
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 61.053618 5.314875 11.4873 < 2.2e-16
## INC -0.995473 0.337025 -2.9537 0.0031398
## HOVAL -0.307979 0.092584 -3.3265 0.0008794
##
## Lambda: 0.52089, LR test value: 6.4441, p-value: 0.011132
## Asymptotic standard error: 0.14129
## z-value: 3.6868, p-value: 0.00022713
## Wald statistic: 13.592, p-value: 0.00022713
##
## Log likelihood: -184.1552 for error model
## ML residual variance (sigma squared): 99.98, (sigma: 9.999)
## Number of observations: 49
## Number of parameters estimated: 5
## AIC: 378.31, (AIC for lm: 382.75)
columbus.sarma <- sacsarlm(CRIME ~ INC + HOVAL, data=columbus,col.listw)
summary(columbus.sarma)
##
## Call:
## sacsarlm(formula = CRIME ~ INC + HOVAL, data = columbus, listw = col.listw)
##
## Residuals:
## Min 1Q Median 3Q Max
## -37.1121 -4.6324 -0.3040 7.0306 24.6929
##
## Type: sac
## Coefficients: (asymptotic standard errors)
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 49.051431 10.054986 4.8783 1.07e-06
## INC -1.068781 0.332839 -3.2111 0.001322
## HOVAL -0.283114 0.091526 -3.0933 0.001980
##
## Rho: 0.35326
## Asymptotic standard error: 0.19669
## z-value: 1.796, p-value: 0.072494
## Lambda: 0.13199
## Asymptotic standard error: 0.29905
## z-value: 0.44138, p-value: 0.65894
##
## LR test value: 8.6082, p-value: 0.013513
##
## Log likelihood: -183.0731 for sac model
## ML residual variance (sigma squared): 99.423, (sigma: 9.9711)
## Number of observations: 49
## Number of parameters estimated: 6
## AIC: 378.15, (AIC for lm: 382.75)
Model: \[y = \rho W y + X\beta+ W X \theta + \epsilon \]
columbus.durbin <- lagsarlm(CRIME ~ INC+HOVAL, data=columbus, col.listw, type="mixed");
summary(columbus.durbin)
##
## Call:
## lagsarlm(formula = CRIME ~ INC + HOVAL, data = columbus, listw = col.listw,
## type = "mixed")
##
## Residuals:
## Min 1Q Median 3Q Max
## -37.15904 -6.62594 -0.39823 6.57561 23.62757
##
## Type: mixed
## Coefficients: (asymptotic standard errors)
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 45.592893 13.128679 3.4728 0.0005151
## INC -0.939088 0.338229 -2.7765 0.0054950
## HOVAL -0.299605 0.090843 -3.2980 0.0009736
## lag.INC -0.618375 0.577052 -1.0716 0.2838954
## lag.HOVAL 0.266615 0.183971 1.4492 0.1472760
##
## Rho: 0.38251, LR test value: 4.1648, p-value: 0.041272
## Asymptotic standard error: 0.16237
## z-value: 2.3557, p-value: 0.018488
## Wald statistic: 5.5493, p-value: 0.018488
##
## Log likelihood: -182.0161 for mixed model
## ML residual variance (sigma squared): 95.051, (sigma: 9.7494)
## Number of observations: 49
## Number of parameters estimated: 7
## AIC: 378.03, (AIC for lm: 380.2)
## LM test for residual autocorrelation
## test value: 0.101, p-value: 0.75063
Model: \[y = X\beta+ W X \theta + \epsilon \]
SLX model dapat dilakukan dengan fungsi lmSLX()
pada package spatialreg
. Perhatikan bahwa model SLX adalah bentuk khusus dari model Spatial Durbin.
columbus.SLX <- lmSLX(CRIME ~ INC+HOVAL, data=columbus, col.listw, Durbin = TRUE);
summary(columbus.SLX)
##
## Call:
## lm(formula = formula(paste("y ~ ", paste(colnames(x)[-1], collapse = "+"))),
## data = as.data.frame(x), weights = weights)
##
## Residuals:
## Min 1Q Median 3Q Max
## -36.245 -7.613 0.188 7.863 25.982
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 74.0290 6.7218 11.013 3.13e-14 ***
## INC -1.1081 0.3750 -2.955 0.00501 **
## HOVAL -0.2949 0.1014 -2.910 0.00565 **
## lag.INC -1.3834 0.5592 -2.474 0.01729 *
## lag.HOVAL 0.2262 0.2026 1.116 0.27041
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 10.94 on 44 degrees of freedom
## Multiple R-squared: 0.6085, Adjusted R-squared: 0.5729
## F-statistic: 17.09 on 4 and 44 DF, p-value: 1.581e-08
Dengan mengatur argumen Durbin
pada fungsi tersebut, kita dapat memodifikasi model menjadi model SLX.
columbus.SLX <- lmSLX(CRIME ~ INC+HOVAL, data=columbus, col.listw, Durbin = ~INC);
summary(columbus.SLX)
##
## Call:
## lm(formula = formula(paste("y ~ ", paste(colnames(x)[-1], collapse = "+"))),
## data = as.data.frame(x), weights = weights)
##
## Residuals:
## Min 1Q Median 3Q Max
## -36.708 -7.019 1.415 8.467 26.791
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 77.43354 6.00623 12.892 < 2e-16 ***
## INC -1.18157 0.37018 -3.192 0.00258 **
## HOVAL -0.26925 0.09898 -2.720 0.00924 **
## lag.INC -1.01554 0.45293 -2.242 0.02993 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 10.97 on 45 degrees of freedom
## Multiple R-squared: 0.5974, Adjusted R-squared: 0.5705
## F-statistic: 22.26 on 3 and 45 DF, p-value: 5.495e-09
Model: \[y = X\beta+ W X \theta + u \] dimana : \[u= \lambda Wu \epsilon \]
columbus.errX <- errorsarlm(CRIME ~ INC+HOVAL, data=columbus, col.listw, etype="mixed");
summary(columbus.errX)
##
## Call:
## errorsarlm(formula = CRIME ~ INC + HOVAL, data = columbus, listw = col.listw,
## etype = "mixed")
##
## Residuals:
## Min 1Q Median 3Q Max
## -37.02060 -6.68585 -0.15142 6.51557 24.18199
##
## Type: error
## Coefficients: (asymptotic standard errors)
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 73.258655 8.528043 8.5903 < 2.2e-16
## INC -1.069530 0.324719 -3.2937 0.0009887
## HOVAL -0.280344 0.091809 -3.0535 0.0022615
## lag.INC -1.196774 0.568968 -2.1034 0.0354297
## lag.HOVAL 0.146758 0.200872 0.7306 0.4650196
##
## Lambda: 0.37613, LR test value: 3.7313, p-value: 0.053403
## Asymptotic standard error: 0.16554
## z-value: 2.2721, p-value: 0.023079
## Wald statistic: 5.1626, p-value: 0.023079
##
## Log likelihood: -182.2329 for error model
## ML residual variance (sigma squared): 96.022, (sigma: 9.7991)
## Number of observations: 49
## Number of parameters estimated: 7
## AIC: 378.47, (AIC for lm: 380.2)
Ilustrasi dan penjelasan tentang SLX dan SDEM dirujuk dari Mendez (2020).
Lakukan pemodelan untuk memprediksi GRP pada data yang tersedia pada: https://github.com/raoy/Spatial-Statistics . Data yang digunakan adalah:
China29.zip
Data China (spasial).xlsx
Silahkan coba lakukan pemodelan dependensi spasial pada data Jawa Barat yang tersedia di modul 10. Atau silahkan akses pada: https://github.com/raoy/SpatialReg
Mendez C. (2020). Spatial regression analysis in R. R Studio/RPubs. Available at https://rpubs.com/quarcs-lab/tutorial-spatial-regression
Zhukov, Y. M. (2010, January 19). Applied Spatial Statistics in R, Section 6, Spatial Regression [PDF slides.]. IQSS, Harvard University. http://www.people.fas.harvard.edu/~zhukov/Spatial6.pdf