Marginal Effects

Definisi yang diambil dari materi kuliah yang disusun oleh Dr. Anik Djuraidah menyatakan bahwa efek marginal atau limpahan (spill-over) adalah besarnya dampak perubahan pada peubah dependen pada wilayah-\(i\), akibat perubahan prediktor di wilayah-\(j\).

Efek marginal terdapat pada model dependensi spasial SAR, GSM, SDM, SDEM, dan SLX. Efek ini dapat dibedakan menjadi tiga, yaitu efek langsung (direct effect), efek tidak langsung (indirect effect), dan efek total (total effect).

Illustration on R

rm(list=ls())
library(rgdal);library (spdep); library(spatialreg)
data(columbus)
col.listw <- nb2listw(col.gal.nb)

OLS Regression

columbus.lm<- lm(CRIME ~ INC + HOVAL, data=columbus)
summary(columbus.lm)
## 
## Call:
## lm(formula = CRIME ~ INC + HOVAL, data = columbus)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -34.418  -6.388  -1.580   9.052  28.649 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  68.6190     4.7355  14.490  < 2e-16 ***
## INC          -1.5973     0.3341  -4.780 1.83e-05 ***
## HOVAL        -0.2739     0.1032  -2.654   0.0109 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 11.43 on 46 degrees of freedom
## Multiple R-squared:  0.5524, Adjusted R-squared:  0.5329 
## F-statistic: 28.39 on 2 and 46 DF,  p-value: 9.341e-09

Moran test

col.moran <- lm.morantest(columbus.lm, col.listw)
col.moran
## 
##  Global Moran I for regression residuals
## 
## data:  
## model: lm(formula = CRIME ~ INC + HOVAL, data = columbus)
## weights: col.listw
## 
## Moran I statistic standard deviate = 2.681, p-value = 0.00367
## alternative hypothesis: greater
## sample estimates:
## Observed Moran I      Expectation         Variance 
##      0.212374153     -0.033268284      0.008394853

LM-test

columbus.lagrange <- lm.LMtests(columbus.lm, col.listw, test=c("LMerr","RLMerr","LMlag","RLMlag","SARMA"))
summary(columbus.lagrange)
##  Lagrange multiplier diagnostics for spatial dependence
## data:  
## model: lm(formula = CRIME ~ INC + HOVAL, data = columbus)
## weights: col.listw
##  
##        statistic parameter  p.value   
## LMerr   4.611126         1 0.031765 * 
## RLMerr  0.033514         1 0.854744   
## LMlag   7.855675         1 0.005066 **
## RLMlag  3.278064         1 0.070212 . 
## SARMA   7.889190         2 0.019359 * 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Spatial Lag Model

columbus.lag <- lagsarlm(CRIME ~ INC + HOVAL,data=columbus, col.listw)
summary(columbus.lag)
## 
## Call:
## lagsarlm(formula = CRIME ~ INC + HOVAL, data = columbus, listw = col.listw)
## 
## Residuals:
##         Min          1Q      Median          3Q         Max 
## -37.4497093  -5.4565567   0.0016387   6.7159553  24.7107978 
## 
## Type: lag 
## Coefficients: (asymptotic standard errors) 
##              Estimate Std. Error z value  Pr(>|z|)
## (Intercept) 46.851431   7.314754  6.4051 1.503e-10
## INC         -1.073533   0.310872 -3.4533 0.0005538
## HOVAL       -0.269997   0.090128 -2.9957 0.0027381
## 
## Rho: 0.40389, LR test value: 8.4179, p-value: 0.0037154
## Asymptotic standard error: 0.12071
##     z-value: 3.3459, p-value: 0.00082027
## Wald statistic: 11.195, p-value: 0.00082027
## 
## Log likelihood: -183.1683 for lag model
## ML residual variance (sigma squared): 99.164, (sigma: 9.9581)
## Number of observations: 49 
## Number of parameters estimated: 5 
## AIC: 376.34, (AIC for lm: 382.75)
## LM test for residual autocorrelation
## test value: 0.19184, p-value: 0.66139

Terlihat pada output di atas bahwa koefisien \(\rho\) signifikan pada model SAR. Selanjutnya, marginal effect dapat diperoleh dengan fungsi impacts() seperti pada syntax berikut ini.

Interpretasi Efek Marginal

impacts(columbus.lag, listw = col.listw)
## Impact measures (lag, exact):
##           Direct   Indirect      Total
## INC   -1.1225156 -0.6783818 -1.8008973
## HOVAL -0.2823163 -0.1706152 -0.4529315

Terlihat bahwa pengaruh langsung dari peubah INC adalah sebesar -1.12, artinya jika rata-rata pendapatan rumah tangga di wilayah-\(i\) meningkat $ 1,000, maka rata-rata kejadian kriminal di wilayah tersebut akan berkurang sebesar 11.2 per 100 rumah tangga, jika nilai rumahnya tetap. Sedangkan efek tak langsung dari peubah tersebut bernilai -0.678. Artinya, jika rata-rata pendapatan rumah tangga di wilayah-\(i\) meningkat sebesar $1,000, maka rata-rata kejadian kriminal di wilayah-\(j\) akan berkurang sebesar 6.78 per 100 rumah tangga, jika nilai rumahnya tetap. Interpretasi serupa juga dapat dilakukan terhadap peubah HOVAL.

Ilustration 2

Ilustrasi ini diambil dari materi workshop yang disusun oleh Sarmiento-Barbieri (2016). Data yang digunakan terdapat pada http://www.econ.uiuc.edu/~lab/workshop/foreclosures/. Silahkan download semua data yang terdapat pada link tersebut.

Data Import

Impor data shapefile menggunakan fungsi readOGR() pada package rgdal. Setelah itu, kita dapat menggunakan fungsi str() untuk melihat struktur datanya.

chi.poly<-readOGR(dsn="folder tempat menyimpan shape file", layer="nama shape file")
str(slot(chi.poly,"data"))
## 'data.frame':    897 obs. of  16 variables:
##  $ SP_ID     : chr  "1" "2" "3" "4" ...
##  $ fips      : chr  "17031010100" "17031010200" "17031010300" "17031010400" ...
##  $ est_fcs   : int  43 129 55 21 64 56 107 43 7 51 ...
##  $ est_mtgs  : int  904 2122 1151 574 1427 1241 1959 830 208 928 ...
##  $ est_fcs_rt: num  4.76 6.08 4.78 3.66 4.48 4.51 5.46 5.18 3.37 5.5 ...
##  $ res_addr  : int  2530 3947 3204 2306 5485 2994 3701 1694 443 1552 ...
##  $ est_90d_va: num  12.61 12.36 10.46 5.03 8.44 ...
##  $ bls_unemp : num  8.16 8.16 8.16 8.16 8.16 8.16 8.16 8.16 8.16 8.16 ...
##  $ county    : chr  "Cook County" "Cook County" "Cook County" "Cook County" ...
##  $ fips_num  : num  1.7e+10 1.7e+10 1.7e+10 1.7e+10 1.7e+10 ...
##  $ totpop    : int  5391 10706 6649 5325 10944 7178 10799 5403 1089 3634 ...
##  $ tothu     : int  2557 3981 3281 2464 5843 3136 3875 1768 453 1555 ...
##  $ huage     : int  61 53 56 60 54 58 48 57 61 48 ...
##  $ oomedval  : int  169900 147000 119800 151500 143600 145900 153400 170500 215900 114700 ...
##  $ property  : num  646 914 478 509 641 612 678 332 147 351 ...
##  $ violent   : num  433 421 235 159 240 266 272 146 78 84 ...

Berikut adalah penjelasan mengenai peubah yang ada pada data tersebut:

  • est_fcs: estimated count of foreclosure starts from Jan. 2007 through June 2008

  • est_mtgs: estimated number of active mortgages from Jan. 2007 through June 2008

  • est_fcs_rt: number of foreclosure starts divided by number of mortgages times 100

  • bls_unemp: June 2008 place or county unemployment rate

  • totpop: total population from 2000 Census

  • violent: number of violent crimes reported between Jan. 2007 through December 2008

  • property: number of property crimes reported between Jan. 2007 through December 2008

(Sarmiento-Barbieri, 2016)

Visualisasi Data

plot(chi.poly)

library(leaflet)
leaflet(chi.poly) %>%
  addPolygons(stroke = FALSE, fillOpacity = 0.5, smoothFactor = 0.5) %>%
  addTiles()
require(RColorBrewer)
## Loading required package: RColorBrewer
qpal<-colorQuantile("OrRd", chi.poly@data$violent, n=9) 

leaflet(chi.poly) %>%
  addPolygons(stroke = FALSE, fillOpacity = .8, smoothFactor = 0.2, color = ~qpal(violent)
  ) %>%
  addTiles()

OLS

chi.ols<-lm(violent~est_fcs_rt+bls_unemp, data=chi.poly@data)
summary(chi.ols)
## 
## Call:
## lm(formula = violent ~ est_fcs_rt + bls_unemp, data = chi.poly@data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -892.02  -77.02  -23.73   41.90 1238.22 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  -18.627     45.366  -0.411    0.681    
## est_fcs_rt    28.298      1.435  19.720   <2e-16 ***
## bls_unemp     -0.308      5.770  -0.053    0.957    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 157.3 on 894 degrees of freedom
## Multiple R-squared:  0.3141, Adjusted R-squared:  0.3126 
## F-statistic: 204.7 on 2 and 894 DF,  p-value: < 2.2e-16

Modeling Spatial Dependence

list.queen<-poly2nb(chi.poly, queen=TRUE)
W<-nb2listw(list.queen, style="W", zero.policy=TRUE)
W
## Characteristics of weights list object:
## Neighbour list object:
## Number of regions: 897 
## Number of nonzero links: 6140 
## Percentage nonzero weights: 0.7631036 
## Average number of links: 6.845039 
## 
## Weights style: W 
## Weights constants summary:
##     n     nn  S0       S1       S2
## W 897 804609 897 274.4893 3640.864
coords<-coordinates(chi.poly)
W_dist<-dnearneigh(coords,0,1,longlat = TRUE)

Checking the Spatial Autocorrelation

moran.lm<-lm.morantest(chi.ols, W, alternative="two.sided")
print(moran.lm)
## 
##  Global Moran I for regression residuals
## 
## data:  
## model: lm(formula = violent ~ est_fcs_rt + bls_unemp, data =
## chi.poly@data)
## weights: W
## 
## Moran I statistic standard deviate = 11.785, p-value < 2.2e-16
## alternative hypothesis: two.sided
## sample estimates:
## Observed Moran I      Expectation         Variance 
##     0.2142252370    -0.0020099108     0.0003366648

LM Test

LM<-lm.LMtests(chi.ols, W, test="all")
summary(LM)
##  Lagrange multiplier diagnostics for spatial dependence
## data:  
## model: lm(formula = violent ~ est_fcs_rt + bls_unemp, data =
## chi.poly@data)
## weights: W
##  
##         statistic parameter   p.value    
## LMerr  1.3452e+02         1 < 2.2e-16 ***
## LMlag  1.8218e+02         1 < 2.2e-16 ***
## RLMerr 6.6762e-04         1    0.9794    
## RLMlag 4.7653e+01         1 5.089e-12 ***
## SARMA  1.8218e+02         2 < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Fitting Spatial Regressions

SAR

sar.chi<-lagsarlm(violent~est_fcs_rt+bls_unemp, data=chi.poly@data, W)
summary(sar.chi)
## 
## Call:
## lagsarlm(formula = violent ~ est_fcs_rt + bls_unemp, data = chi.poly@data, 
##     listw = W)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -519.127  -65.003  -15.226   36.423 1184.193 
## 
## Type: lag 
## Coefficients: (asymptotic standard errors) 
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept) -93.7885    41.3162  -2.270  0.02321
## est_fcs_rt   15.6822     1.5600  10.053  < 2e-16
## bls_unemp     8.8949     5.2447   1.696  0.08989
## 
## Rho: 0.49037, LR test value: 141.33, p-value: < 2.22e-16
## Asymptotic standard error: 0.039524
##     z-value: 12.407, p-value: < 2.22e-16
## Wald statistic: 153.93, p-value: < 2.22e-16
## 
## Log likelihood: -5738.047 for lag model
## ML residual variance (sigma squared): 20200, (sigma: 142.13)
## Number of observations: 897 
## Number of parameters estimated: 5 
## AIC: 11486, (AIC for lm: 11625)
## LM test for residual autocorrelation
## test value: 8.1464, p-value: 0.0043146
impacts(sar.chi, listw=W)
## Impact measures (lag, exact):
##               Direct  Indirect    Total
## est_fcs_rt 16.434479 14.336896 30.77137
## bls_unemp   9.321585  8.131842 17.45343

SEM

errorsalm.chi<-errorsarlm(violent~est_fcs_rt+bls_unemp, data=chi.poly@data, W)
summary(errorsalm.chi)
## 
## Call:
## errorsarlm(formula = violent ~ est_fcs_rt + bls_unemp, data = chi.poly@data, 
##     listw = W)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -650.506  -64.355  -22.646   35.461 1206.346 
## 
## Type: error 
## Coefficients: (asymptotic standard errors) 
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept)  -1.2624    43.0509 -0.0293   0.9766
## est_fcs_rt   19.4620     1.9450 10.0062   <2e-16
## bls_unemp     4.0380     5.5134  0.7324   0.4639
## 
## Lambda: 0.52056, LR test value: 109.68, p-value: < 2.22e-16
## Asymptotic standard error: 0.042291
##     z-value: 12.309, p-value: < 2.22e-16
## Wald statistic: 151.51, p-value: < 2.22e-16
## 
## Log likelihood: -5753.875 for error model
## ML residual variance (sigma squared): 20796, (sigma: 144.21)
## Number of observations: 897 
## Number of parameters estimated: 5 
## AIC: 11518, (AIC for lm: 11625)

Excercise (1)

Lakukan pemodelan menggunakan data chi.poly.

  • periksa multikolineritas antar peubah bebas yang digunakan berdasarkan VIF

  • eksplorasi autokorelasi spasial pada model menggunakan jarak W_dist

  • lakukan pemodelan yang menurut Anda paling tepat, interpretasikan.

Exercise (2)

Pelajari artikel yang ditulis oleh Guliyev (2020), yang tersedia pada link berikut: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7139267/. Jika memungkinkan, silahkan lakukan pemodelan regresi spasial dan interpretasikan efek marjinal pada kasus ini, berdasarkan data yang tersedia pada artikel tersebut. Catatan: peta China dapat diakses pada https://data.humdata.org/dataset/china-administrative-boundaries.

References

Guliyev, H. (2020). Determining the spatial effects of COVID-19 using the spatial panel data model. Spatial Statistics, 100443. doi:10.1016/j.spasta.2020.100443. Retrieved from: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7139267/

Sarmiento-Barbieri, I. (2016, April 24). An introduction to spatial econometrics in R. Spatial Econometric Workshop, University of Illinois. Retrieved from: https://www.econ.uiuc.edu/~lab/workshop/Spatial_in_R.html#modeling-spatial-dependence

Zhukov, Y. M. (2010, January 19). Applied Spatial Statistics in R, Section 6, Spatial Regression [PDF slides.]. IQSS, Harvard University. Retrieved from: http://www.people.fas.harvard.edu/~zhukov/Spatial6.pdf