El propósito de esta actividad es poner en práctica lo estudiado hasta ahora en los temas de Análisis Exploratorio de Datos Espaciales, así como Modelos de Regresión Lineal Espaciales.
Primeramente aquí se proporcionan las librerías principales que deberás considerar. De todas maneras si deseas agregar cualquier otra librería lo puedes hacer y recomendaría hacerlo de una vez dentro del siguiente chunk de código:
#install.packages("quantmod")
#install.packages("forecast")
#install.packages("tseries")
#install.packages("corrplot")
#install.packages("leaflet.extras2")
#install.packages("tinytex")
#tinytex::install_tinytex()
library("leaflet.extras2")
library(corrplot)
library(tseries)
library(forecast)
library(quantmod)
library(foreign)
library(dplyr)
library(spdep)
library(tigris)
library(rgeoda)
library(RColorBrewer)
library(viridis)
library(ggplot2)
library(tmap)
library(sf)
library(sp)
library(spatialreg)
library(stargazer)
library(plm)
library(splm)
library(pspatreg)
library(regclass)
library(mctest)
library(lmtest)
library(spData)
library(mapview)
library(naniar)
library(dlookr)
library(caret)
library(e1071)
library(SparseM)
library(Metrics)
library(randomForest)
library(rpart.plot)
library(insight)
library(jtools)
library(xgboost)
library(DiagrammeR)
library(effects)
library(leaflet)
A continuación se carga la base de datos a ser analizada: ‘boston’ de la librería ‘spData’
data(boston, package="spData")
head(boston.c)
## TOWN TOWNNO TRACT LON LAT MEDV CMEDV CRIM ZN INDUS CHAS
## 1 Nahant 0 2011 -70.9550 42.2550 24.0 24.0 0.00632 18 2.31 0
## 2 Swampscott 1 2021 -70.9500 42.2875 21.6 21.6 0.02731 0 7.07 0
## 3 Swampscott 1 2022 -70.9360 42.2830 34.7 34.7 0.02729 0 7.07 0
## 4 Marblehead 2 2031 -70.9280 42.2930 33.4 33.4 0.03237 0 2.18 0
## 5 Marblehead 2 2032 -70.9220 42.2980 36.2 36.2 0.06905 0 2.18 0
## 6 Marblehead 2 2033 -70.9165 42.3040 28.7 28.7 0.02985 0 2.18 0
## NOX RM AGE DIS RAD TAX PTRATIO B LSTAT
## 1 0.538 6.575 65.2 4.0900 1 296 15.3 396.90 4.98
## 2 0.469 6.421 78.9 4.9671 2 242 17.8 396.90 9.14
## 3 0.469 7.185 61.1 4.9671 2 242 17.8 392.83 4.03
## 4 0.458 6.998 45.8 6.0622 3 222 18.7 394.63 2.94
## 5 0.458 7.147 54.2 6.0622 3 222 18.7 396.90 5.33
## 6 0.458 6.430 58.7 6.0622 3 222 18.7 394.12 5.21
glimpse(boston.c)
## Rows: 506
## Columns: 20
## $ TOWN <fct> Nahant, Swampscott, Swampscott, Marblehead, Marblehead, Marble…
## $ TOWNNO <int> 0, 1, 1, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4,…
## $ TRACT <int> 2011, 2021, 2022, 2031, 2032, 2033, 2041, 2042, 2043, 2044, 20…
## $ LON <dbl> -70.9550, -70.9500, -70.9360, -70.9280, -70.9220, -70.9165, -7…
## $ LAT <dbl> 42.2550, 42.2875, 42.2830, 42.2930, 42.2980, 42.3040, 42.2970,…
## $ MEDV <dbl> 24.0, 21.6, 34.7, 33.4, 36.2, 28.7, 22.9, 27.1, 16.5, 18.9, 15…
## $ CMEDV <dbl> 24.0, 21.6, 34.7, 33.4, 36.2, 28.7, 22.9, 22.1, 16.5, 18.9, 15…
## $ CRIM <dbl> 0.00632, 0.02731, 0.02729, 0.03237, 0.06905, 0.02985, 0.08829,…
## $ ZN <dbl> 18.0, 0.0, 0.0, 0.0, 0.0, 0.0, 12.5, 12.5, 12.5, 12.5, 12.5, 1…
## $ INDUS <dbl> 2.31, 7.07, 7.07, 2.18, 2.18, 2.18, 7.87, 7.87, 7.87, 7.87, 7.…
## $ CHAS <fct> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
## $ NOX <dbl> 0.538, 0.469, 0.469, 0.458, 0.458, 0.458, 0.524, 0.524, 0.524,…
## $ RM <dbl> 6.575, 6.421, 7.185, 6.998, 7.147, 6.430, 6.012, 6.172, 5.631,…
## $ AGE <dbl> 65.2, 78.9, 61.1, 45.8, 54.2, 58.7, 66.6, 96.1, 100.0, 85.9, 9…
## $ DIS <dbl> 4.0900, 4.9671, 4.9671, 6.0622, 6.0622, 6.0622, 5.5605, 5.9505…
## $ RAD <int> 1, 2, 2, 3, 3, 3, 5, 5, 5, 5, 5, 5, 5, 4, 4, 4, 4, 4, 4, 4, 4,…
## $ TAX <int> 296, 242, 242, 222, 222, 222, 311, 311, 311, 311, 311, 311, 31…
## $ PTRATIO <dbl> 15.3, 17.8, 17.8, 18.7, 18.7, 18.7, 15.2, 15.2, 15.2, 15.2, 15…
## $ B <dbl> 396.90, 396.90, 392.83, 394.63, 396.90, 394.12, 395.60, 396.90…
## $ LSTAT <dbl> 4.98, 9.14, 4.03, 2.94, 5.33, 5.21, 12.43, 19.15, 29.93, 17.10…
?boston
## starting httpd help server ... done
Seguidamente, se cargan las geocercas asociadas a la data transversal previamente cargada
boston.tr<-st_read(system.file("shapes/boston_tracts.gpkg", package="spData")[1])
## Reading layer `boston_tracts' from data source
## `C:\Users\jmpo2\AppData\Local\R\win-library\4.5\spData\shapes\boston_tracts.gpkg'
## using driver `GPKG'
## Simple feature collection with 506 features and 36 fields
## Geometry type: POLYGON
## Dimension: XY
## Bounding box: xmin: -71.52311 ymin: 42.00305 xmax: -70.63823 ymax: 42.67307
## Geodetic CRS: NAD27
Ejemplo de primer visualización en Mapas utilizando los datos ya cargados y visualizando los valores utilizando una variedad de paletas de colores.
boston.trSP<-as(boston.tr, "Spatial")
boston_nb<-poly2nb(boston.trSP)
mapview(boston.trSP, zcol="MEDV", col.regions = colorRampPalette(c('lightgrey', 'darkblue')))
mapview(boston.trSP, zcol="MEDV", col.regions = brewer.pal(7,'Blues'))
## Warning: Found less unique colors (7) than unique zcol values (229)!
## Interpolating color vector to match number of zcol values.
mapview(boston.trSP, zcol="MEDV", col.regions = viridisLite::magma(20))
## Warning: Found less unique colors (20) than unique zcol values (229)!
## Interpolating color vector to match number of zcol values.
?mapview
?colorRampPalette
?brewer.pal
?viridis
# Carga de geocercas
boston.tr <- st_read(system.file("shapes/boston_tracts.gpkg", package = "spData")[1])
## Reading layer `boston_tracts' from data source
## `C:\Users\jmpo2\AppData\Local\R\win-library\4.5\spData\shapes\boston_tracts.gpkg'
## using driver `GPKG'
## Simple feature collection with 506 features and 36 fields
## Geometry type: POLYGON
## Dimension: XY
## Bounding box: xmin: -71.52311 ymin: 42.00305 xmax: -70.63823 ymax: 42.67307
## Geodetic CRS: NAD27
# Convertir a Spatial
boston.trSP <- as(boston.tr, "Spatial")
# Unir datos boston.c al shapefile
boston.tr <- cbind(boston.tr, boston.c)
# ANÁLISIS EXPLORATORIO DE DATOS TRADICIONAL (EDA)
# --- Histograma de CRIM ---
ggplot(boston.c, aes(x = CRIM)) +
geom_histogram(fill = "steelblue", color = "white", bins = 40) +
labs(title = "Distribución de CRIM", x = "Tasa de Criminalidad", y = "Frecuencia") +
theme_minimal()
# Histograma con transformación log (CRIM es muy sesgada)
ggplot(boston.c, aes(x = log(CRIM + 1))) +
geom_histogram(fill = "tomato", color = "white", bins = 40) +
labs(title = "Distribución de log(CRIM+1)", x = "log(CRIM+1)", y = "Frecuencia") +
theme_minimal()
# Histogramas de variables independientes ---
vars_indep <- c("ZN","INDUS","NOX","RM","AGE","DIS","RAD","TAX","PTRATIO","B","LSTAT")
boston.c %>%
select(all_of(vars_indep)) %>%
tidyr::pivot_longer(everything()) %>%
ggplot(aes(x = value)) +
geom_histogram(fill = "steelblue", color = "white", bins = 30) +
facet_wrap(~name, scales = "free") +
labs(title = "Histogramas de Variables Independientes") +
theme_minimal()
# --- Boxplots ---
boston.c %>%
select(CRIM, all_of(vars_indep)) %>%
tidyr::pivot_longer(-CRIM) %>%
ggplot(aes(x = name, y = value)) +
geom_boxplot(fill = "lightblue") +
facet_wrap(~name, scales = "free") +
labs(title = "Boxplots de Variables") +
theme_minimal()
# Boxplot de CRIM
ggplot(boston.c, aes(y = CRIM)) +
geom_boxplot(fill = "tomato") +
labs(title = "Boxplot de CRIM") +
theme_minimal()
# --- Matriz de Correlaciones ---
boston_num <- boston.c %>%
select(CRIM, ZN, INDUS, NOX, RM, AGE, DIS, RAD, TAX, PTRATIO, B, LSTAT) %>%
mutate(across(everything(), as.numeric))
cor_matrix <- cor(boston_num, use = "complete.obs")
corrplot(cor_matrix, method = "color", type = "upper",
tl.cex = 0.8, addCoef.col = "black", number.cex = 0.6,
title = "Matriz de Correlaciones", mar = c(0,0,1,0))
# ANÁLISIS EXPLORATORIO DE DATOS ESPACIALES (ESDA)
# --- Matriz de conectividad espacial (estructura Reina) ---
boston_nb <- poly2nb(boston.trSP, queen = TRUE) # vecinos Reina
boston_lw <- nb2listw(boston_nb, style = "W", zero.policy = TRUE)
# Graficar la matriz de conectividad
plot(boston.trSP, border = "grey60", main = "Matriz de Conectividad Espacial (Reina)")
plot(boston_nb, coordinates(boston.trSP), add = TRUE, col = "red", lwd = 0.5)
# --- Mapas coloreados ---
# CRIM
mapview(boston.trSP, zcol = "CRIM",
col.regions = viridisLite::magma(20),
main = "Mapa de CRIM")
## Warning: Found less unique colors (20) than unique zcol values (504)!
## Interpolating color vector to match number of zcol values.
# NOX (variable independiente 1)
mapview(boston.trSP, zcol = "NOX",
col.regions = colorRampPalette(c("lightgreen", "darkred"))(20),
main = "Mapa de NOX")
## Warning: Found less unique colors (20) than unique zcol values (81)!
## Interpolating color vector to match number of zcol values.
# LSTAT (variable independiente 2)
mapview(boston.trSP, zcol = "LSTAT",
col.regions = colorRampPalette(c("lightyellow", "darkblue"))(20),
main = "Mapa de LSTAT")
## Warning: Found less unique colors (20) than unique zcol values (455)!
## Interpolating color vector to match number of zcol values.
# --- Lag espacial de CRIM ---
boston.trSP$CRIM_lag <- lag.listw(boston_lw, boston.trSP$CRIM, zero.policy = TRUE)
# Mapa original vs lag espacial
p1 <- mapview(boston.trSP, zcol = "CRIM",
col.regions = viridisLite::magma(20),
layer.name = "CRIM Original")
## Warning: Found less unique colors (20) than unique zcol values (504)!
## Interpolating color vector to match number of zcol values.
p2 <- mapview(boston.trSP, zcol = "CRIM_lag",
col.regions = viridisLite::magma(20),
layer.name = "CRIM Lag Espacial")
## Warning: Found less unique colors (20) than unique zcol values (506)!
## Interpolating color vector to match number of zcol values.
p1 | p2 # visualizar lado a lado
# --- Clusters Espaciales LISA (HotSpots, ColdSpots, HL, LH) ---
# Usando rgeoda
boston_sf <- st_as_sf(boston.trSP)
queen_w <- queen_weights(boston_sf)
lisa <- local_moran(queen_w, boston_sf["CRIM"])
# Extraer clasificación de clusters
boston_sf$lisa_cluster <- as.factor(lisa_clusters(lisa))
boston_sf$lisa_pval <- lisa_pvalues(lisa)
# Etiquetas de clusters
levels(boston_sf$lisa_cluster) <- c("Not Significant","High-High (HotSpot)",
"Low-Low (ColdSpot)","Low-High (Atípico)",
"High-Low (Atípico)","Isolated")
# Mapa de clusters LISA
ggplot(boston_sf) +
geom_sf(aes(fill = lisa_cluster), color = NA) +
scale_fill_manual(values = c("Not Significant" = "grey80",
"High-High (HotSpot)" = "red",
"Low-Low (ColdSpot)" = "blue",
"Low-High (Atípico)" = "lightblue",
"High-Low (Atípico)" = "pink",
"Isolated" = "white")) +
labs(title = "Clusters LISA - CRIM", fill = "Tipo de Cluster") +
theme_minimal()
# --- Índice Global de Moran ---
# CRIM
moran_CRIM <- moran.test(boston.trSP$CRIM, boston_lw, zero.policy = TRUE)
print(moran_CRIM)
##
## Moran I test under randomisation
##
## data: boston.trSP$CRIM
## weights: boston_lw
##
## Moran I statistic standard deviate = 20.398, p-value < 2.2e-16
## alternative hypothesis: greater
## sample estimates:
## Moran I statistic Expectation Variance
## 0.5274633075 -0.0019801980 0.0006736983
moran.plot(boston.trSP$CRIM, boston_lw, zero.policy = TRUE,
main = "Diagrama de Dispersión de Moran - CRIM",
xlab = "CRIM", ylab = "Lag espacial de CRIM")
# NOX
moran_NOX <- moran.test(boston.trSP$NOX, boston_lw, zero.policy = TRUE)
print(moran_NOX)
##
## Moran I test under randomisation
##
## data: boston.trSP$NOX
## weights: boston_lw
##
## Moran I statistic standard deviate = 33.692, p-value < 2.2e-16
## alternative hypothesis: greater
## sample estimates:
## Moran I statistic Expectation Variance
## 0.9065225426 -0.0019801980 0.0007271085
moran.plot(boston.trSP$NOX, boston_lw, zero.policy = TRUE,
main = "Diagrama de Dispersión de Moran - NOX",
xlab = "NOX", ylab = "Lag espacial de NOX")
# LSTAT
moran_LSTAT <- moran.test(boston.trSP$LSTAT, boston_lw, zero.policy = TRUE)
print(moran_LSTAT)
##
## Moran I test under randomisation
##
## data: boston.trSP$LSTAT
## weights: boston_lw
##
## Moran I statistic standard deviate = 25.841, p-value < 2.2e-16
## alternative hypothesis: greater
## sample estimates:
## Moran I statistic Expectation Variance
## 0.6944265724 -0.0019801980 0.0007263073
moran.plot(boston.trSP$LSTAT, boston_lw, zero.policy = TRUE,
main = "Diagrama de Dispersión de Moran - LSTAT",
xlab = "LSTAT", ylab = "Lag espacial de LSTAT")
# MODELOS DE REGRESIÓN
# ============================================================
# Fórmula base (variables con mayor correlación con CRIM)
formula_crim <- CRIM ~ ZN + INDUS + NOX + RM + AGE + DIS + RAD + TAX + PTRATIO + B + LSTAT
# --- 1. Modelo de Regresión Lineal Tradicional (OLS) ---
modelo_ols <- lm(formula_crim, data = boston.c)
summary(modelo_ols)
##
## Call:
## lm(formula = formula_crim, data = boston.c)
##
## Residuals:
## Min 1Q Median 3Q Max
## -10.673 -1.854 -0.329 0.911 77.645
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 9.8100896 6.9788620 1.406 0.160446
## ZN 0.0361869 0.0187438 1.931 0.054104 .
## INDUS -0.0792771 0.0837688 -0.946 0.344418
## NOX -7.1418051 5.2229588 -1.367 0.172126
## RM -0.3555174 0.5723546 -0.621 0.534788
## AGE 0.0003225 0.0180835 0.018 0.985777
## DIS -0.7059136 0.2715103 -2.600 0.009603 **
## RAD 0.5293735 0.0872206 6.069 2.56e-09 ***
## TAX -0.0006638 0.0051165 -0.130 0.896833
## PTRATIO -0.0648908 0.1785141 -0.364 0.716383
## B -0.0098300 0.0036501 -2.693 0.007321 **
## LSTAT 0.2408326 0.0685421 3.514 0.000483 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 6.504 on 494 degrees of freedom
## Multiple R-squared: 0.4406, Adjusted R-squared: 0.4282
## F-statistic: 35.38 on 11 and 494 DF, p-value: < 2.2e-16
# Diagnóstico de multicolinealidad
VIF(modelo_ols)
## ZN INDUS NOX RM AGE DIS RAD TAX
## 2.281047 3.942099 4.372226 1.930358 3.092830 3.901573 6.884474 8.875931
## PTRATIO B LSTAT
## 1.782825 1.325511 2.859635
# Test de Lagrange (para decidir si se necesita modelo espacial)
lm.LMtests(modelo_ols, boston_lw, test = c("LMlag","LMerr","RLMlag","RLMerr"), zero.policy = TRUE)
## Please update scripts to use lm.RStests in place of lm.LMtests
##
## Rao's score (a.k.a Lagrange multiplier) diagnostics for spatial
## dependence
##
## data:
## model: lm(formula = formula_crim, data = boston.c)
## test weights: listw
##
## RSlag = 20.205, df = 1, p-value = 6.957e-06
##
##
## Rao's score (a.k.a Lagrange multiplier) diagnostics for spatial
## dependence
##
## data:
## model: lm(formula = formula_crim, data = boston.c)
## test weights: listw
##
## RSerr = 16.498, df = 1, p-value = 4.869e-05
##
##
## Rao's score (a.k.a Lagrange multiplier) diagnostics for spatial
## dependence
##
## data:
## model: lm(formula = formula_crim, data = boston.c)
## test weights: listw
##
## adjRSlag = 3.7273, df = 1, p-value = 0.05353
##
##
## Rao's score (a.k.a Lagrange multiplier) diagnostics for spatial
## dependence
##
## data:
## model: lm(formula = formula_crim, data = boston.c)
## test weights: listw
##
## adjRSerr = 0.02055, df = 1, p-value = 0.886
# --- 2. Modelo SAR (Spatial AutoRegressive / Spatial Lag) ---
modelo_sar <- lagsarlm(formula_crim, data = boston.c,
listw = boston_lw, zero.policy = TRUE)
summary(modelo_sar)
##
## Call:lagsarlm(formula = formula_crim, data = boston.c, listw = boston_lw,
## zero.policy = TRUE)
##
## Residuals:
## Min 1Q Median 3Q Max
## -11.55301 -1.71272 -0.22048 0.82210 77.35354
##
## Type: lag
## Coefficients: (asymptotic standard errors)
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 11.0867205 6.7318544 1.6469 0.099578
## ZN 0.0287898 0.0180996 1.5906 0.111692
## INDUS -0.0743519 0.0806996 -0.9213 0.356872
## NOX -6.0621729 5.0319625 -1.2047 0.228306
## RM -0.5142855 0.5515005 -0.9325 0.351068
## AGE -0.0010008 0.0174250 -0.0574 0.954198
## DIS -0.6466768 0.2618382 -2.4698 0.013520
## RAD 0.4261737 0.0867539 4.9124 8.995e-07
## TAX -0.0012364 0.0049362 -0.2505 0.802225
## PTRATIO -0.0880795 0.1722070 -0.5115 0.609019
## B -0.0085584 0.0035183 -2.4325 0.014993
## LSTAT 0.1835554 0.0663340 2.7671 0.005655
##
## Rho: 0.26439, LR test value: 18.937, p-value: 1.3512e-05
## Asymptotic standard error: 0.058308
## z-value: 4.5343, p-value: 5.7793e-06
## Wald statistic: 20.56, p-value: 5.7793e-06
##
## Log likelihood: -1649.921 for lag model
## ML residual variance (sigma squared): 39.264, (sigma: 6.2661)
## Number of observations: 506
## Number of parameters estimated: 14
## AIC: 3327.8, (AIC for lm: 3344.8)
## LM test for residual autocorrelation
## test value: 0.041176, p-value: 0.8392
modelo_sem <- errorsarlm(formula_crim, data = boston.c,
listw = boston_lw, zero.policy = TRUE)
summary(modelo_sem)
##
## Call:errorsarlm(formula = formula_crim, data = boston.c, listw = boston_lw,
## zero.policy = TRUE)
##
## Residuals:
## Min 1Q Median 3Q Max
## -10.89696 -1.66243 -0.22936 0.78021 78.15911
##
## Type: error
## Coefficients: (asymptotic standard errors)
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 9.27120290 7.46602456 1.2418 0.21432
## ZN 0.03206513 0.01946820 1.6471 0.09955
## INDUS -0.06346004 0.09089151 -0.6982 0.48505
## NOX -5.33948275 5.91679698 -0.9024 0.36683
## RM -0.59927801 0.56791286 -1.0552 0.29132
## AGE 0.00209713 0.01866801 0.1123 0.91056
## DIS -0.65643626 0.29643072 -2.2145 0.02680
## RAD 0.50521289 0.09622242 5.2505 1.517e-07
## TAX 0.00070805 0.00540410 0.1310 0.89576
## PTRATIO -0.02353069 0.20398776 -0.1154 0.90816
## B -0.00854575 0.00388247 -2.2011 0.02773
## LSTAT 0.16976563 0.07077858 2.3985 0.01646
##
## Lambda: 0.29025, LR test value: 16.846, p-value: 4.0537e-05
## Asymptotic standard error: 0.063387
## z-value: 4.579, p-value: 4.6724e-06
## Wald statistic: 20.967, p-value: 4.6724e-06
##
## Log likelihood: -1650.966 for error model
## ML residual variance (sigma squared): 39.314, (sigma: 6.2701)
## Number of observations: 506
## Number of parameters estimated: 14
## AIC: 3329.9, (AIC for lm: 3344.8)
# --- 4. Modelo Durbin Espacial (SDM) ---
modelo_sdm <- lagsarlm(formula_crim, data = boston.c,
listw = boston_lw, type = "mixed", zero.policy = TRUE)
summary(modelo_sdm)
##
## Call:lagsarlm(formula = formula_crim, data = boston.c, listw = boston_lw,
## type = "mixed", zero.policy = TRUE)
##
## Residuals:
## Min 1Q Median 3Q Max
## -11.75671 -1.78495 -0.35528 0.87761 76.42245
##
## Type: mixed
## Coefficients: (asymptotic standard errors)
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 6.53744429 11.62917678 0.5622 0.574008
## ZN 0.02649404 0.02098596 1.2625 0.206781
## INDUS -0.03660323 0.11243330 -0.3256 0.744761
## NOX -1.69178022 8.04736221 -0.2102 0.833490
## RM -0.86424230 0.57531659 -1.5022 0.133045
## AGE 0.00234894 0.01966913 0.1194 0.904941
## DIS -0.58262333 0.37295224 -1.5622 0.118243
## RAD 0.41962145 0.13281391 3.1595 0.001581
## TAX 0.00183903 0.00603622 0.3047 0.760621
## PTRATIO 0.06059794 0.29149241 0.2079 0.835316
## B -0.00830169 0.00440396 -1.8850 0.059423
## LSTAT 0.10125488 0.07453376 1.3585 0.174302
## lag.ZN 0.00733911 0.03610984 0.2032 0.838944
## lag.INDUS -0.12237649 0.18358249 -0.6666 0.505026
## lag.NOX -6.60851573 10.98116087 -0.6018 0.547304
## lag.RM 1.32930050 1.07446121 1.2372 0.216021
## lag.AGE -0.01564653 0.03448381 -0.4537 0.650019
## lag.DIS -0.12461843 0.56287763 -0.2214 0.824785
## lag.RAD 0.01064504 0.18875071 0.0564 0.955025
## lag.TAX -0.00386754 0.01011986 -0.3822 0.702333
## lag.PTRATIO -0.19623108 0.39148759 -0.5012 0.616199
## lag.B -0.00068754 0.00721380 -0.0953 0.924069
## lag.LSTAT 0.31590878 0.12622062 2.5028 0.012320
##
## Rho: 0.25618, LR test value: 13.725, p-value: 0.00021164
## Asymptotic standard error: 0.064502
## z-value: 3.9718, p-value: 7.1344e-05
## Wald statistic: 15.775, p-value: 7.1344e-05
##
## Log likelihood: -1644.795 for mixed model
## ML residual variance (sigma squared): 38.509, (sigma: 6.2056)
## Number of observations: 506
## Number of parameters estimated: 25
## AIC: 3339.6, (AIC for lm: 3351.3)
## LM test for residual autocorrelation
## test value: 1.7502, p-value: 0.18586
# COMPARATIVA DE MODELOS
# AIC de cada modelo
aic_tabla <- data.frame(
Modelo = c("OLS", "SAR", "SEM", "SDM"),
AIC = c(AIC(modelo_ols), AIC(modelo_sar), AIC(modelo_sem), AIC(modelo_sdm)),
LogLik = c(logLik(modelo_ols), logLik(modelo_sar), logLik(modelo_sem), logLik(modelo_sdm))
)
print(aic_tabla)
## Modelo AIC LogLik
## 1 OLS 3344.778 -1659.389
## 2 SAR 3327.841 -1649.921
## 3 SEM 3329.932 -1650.966
## 4 SDM 3339.591 -1644.795
# Tabla comparativa con stargazer (OLS como base)
stargazer(modelo_ols, type = "text",
title = "Modelo OLS - Variable Dependiente: CRIM")
##
## Modelo OLS - Variable Dependiente: CRIM
## ===============================================
## Dependent variable:
## ---------------------------
## CRIM
## -----------------------------------------------
## ZN 0.036*
## (0.019)
##
## INDUS -0.079
## (0.084)
##
## NOX -7.142
## (5.223)
##
## RM -0.356
## (0.572)
##
## AGE 0.0003
## (0.018)
##
## DIS -0.706***
## (0.272)
##
## RAD 0.529***
## (0.087)
##
## TAX -0.001
## (0.005)
##
## PTRATIO -0.065
## (0.179)
##
## B -0.010***
## (0.004)
##
## LSTAT 0.241***
## (0.069)
##
## Constant 9.810
## (6.979)
##
## -----------------------------------------------
## Observations 506
## R2 0.441
## Adjusted R2 0.428
## Residual Std. Error 6.504 (df = 494)
## F Statistic 35.375*** (df = 11; 494)
## ===============================================
## Note: *p<0.1; **p<0.05; ***p<0.01
# Moran test sobre residuales OLS (confirmar autocorrelación espacial)
moran.test(residuals(modelo_ols), boston_lw, zero.policy = TRUE)
##
## Moran I test under randomisation
##
## data: residuals(modelo_ols)
## weights: boston_lw
##
## Moran I statistic standard deviate = 4.4652, p-value = 4.001e-06
## alternative hypothesis: greater
## sample estimates:
## Moran I statistic Expectation Variance
## 0.1101394239 -0.0019801980 0.0006305078
# Moran test sobre residuales SAR y SEM (deben ser no significativos)
moran.test(residuals(modelo_sar), boston_lw, zero.policy = TRUE)
##
## Moran I test under randomisation
##
## data: residuals(modelo_sar)
## weights: boston_lw
##
## Moran I statistic standard deviate = -0.01728, p-value = 0.5069
## alternative hypothesis: greater
## sample estimates:
## Moran I statistic Expectation Variance
## -0.0024117281 -0.0019801980 0.0006236381
moran.test(residuals(modelo_sem), boston_lw, zero.policy = TRUE)
##
## Moran I test under randomisation
##
## data: residuals(modelo_sem)
## weights: boston_lw
##
## Moran I statistic standard deviate = -0.059786, p-value = 0.5238
## alternative hypothesis: greater
## sample estimates:
## Moran I statistic Expectation Variance
## -0.0034712310 -0.0019801980 0.0006219878
# Impacts del modelo SAR (efectos directos e indirectos)
impacts_sar <- impacts(modelo_sar, listw = boston_lw, R = 500, zero.policy = TRUE)
summary(impacts_sar, zstats = TRUE, short = TRUE)
## Impact measures (lag, exact):
## Direct Indirect Total
## ZN dy/dx 0.029188105 0.0099489387 0.039137044
## INDUS dy/dx -0.075380607 -0.0256939271 -0.101074534
## NOX dy/dx -6.146047006 -2.0949165912 -8.240963598
## RM dy/dx -0.521400950 -0.1777226076 -0.699123557
## AGE dy/dx -0.001014662 -0.0003458536 -0.001360516
## DIS dy/dx -0.655623987 -0.2234733261 -0.879097313
## RAD dy/dx 0.432070059 0.1472736433 0.579343702
## TAX dy/dx -0.001253468 -0.0004272521 -0.001680720
## PTRATIO dy/dx -0.089298102 -0.0304377878 -0.119735889
## B dy/dx -0.008676860 -0.0029575592 -0.011634419
## LSTAT dy/dx 0.186095044 0.0634316000 0.249526644
## ========================================================
## Simulation results ( variance matrix):
## ========================================================
## Simulated standard errors
## Direct Indirect Total
## ZN dy/dx 0.018257285 0.007558002 0.025145490
## INDUS dy/dx 0.081104292 0.031210318 0.110482998
## NOX dy/dx 5.167093102 2.019100184 7.047160387
## RM dy/dx 0.546516417 0.213010713 0.747009828
## AGE dy/dx 0.016781921 0.006115416 0.022700346
## DIS dy/dx 0.261355301 0.117075355 0.358921168
## RAD dy/dx 0.083747484 0.050499993 0.114706907
## TAX dy/dx 0.004960899 0.001873915 0.006778098
## PTRATIO dy/dx 0.172656658 0.066595333 0.236477354
## B dy/dx 0.003737074 0.001692151 0.005181033
## LSTAT dy/dx 0.068658142 0.030562938 0.093271632
##
## Simulated z-values:
## Direct Indirect Total
## ZN dy/dx 1.51652200 1.28751642 1.4880850
## INDUS dy/dx -0.98144166 -0.90718436 -0.9767353
## NOX dy/dx -1.19349027 -1.08208861 -1.1851186
## RM dy/dx -0.93491751 -0.86859526 -0.9316716
## AGE dy/dx -0.08135033 -0.05177071 -0.0740876
## DIS dy/dx -2.52123670 -1.96569388 -2.4770701
## RAD dy/dx 5.21197381 3.00434768 5.1279321
## TAX dy/dx -0.28108575 -0.29363080 -0.2869061
## PTRATIO dy/dx -0.58836603 -0.55565367 -0.5860572
## B dy/dx -2.38692661 -1.85537525 -2.3276624
## LSTAT dy/dx 2.71397423 2.11423406 2.6905676
##
## Simulated p-values:
## Direct Indirect Total
## ZN dy/dx 0.1293874 0.1979144 0.1367285
## INDUS dy/dx 0.3263750 0.3643093 0.3287002
## NOX dy/dx 0.2326774 0.2792132 0.2359706
## RM dy/dx 0.3498308 0.3850686 0.3515063
## AGE dy/dx 0.9351634 0.9587114 0.9409407
## DIS dy/dx 0.0116943 0.0493340 0.0132466
## RAD dy/dx 1.8684e-07 0.0026615 2.9294e-07
## TAX dy/dx 0.7786446 0.7690400 0.7741843
## PTRATIO dy/dx 0.5562866 0.5784476 0.5578371
## B dy/dx 0.0169899 0.0635427 0.0199300
## LSTAT dy/dx 0.0066481 0.0344953 0.0071331
# Impacts del modelo SDM
impacts_sdm <- impacts(modelo_sdm, listw = boston_lw, R = 500, zero.policy = TRUE)
summary(impacts_sdm, zstats = TRUE, short = TRUE)
## Impact measures (mixed, exact):
## Direct Indirect Total
## ZN dy/dx 0.027207013 0.018278938 0.045485951
## INDUS dy/dx -0.043252902 -0.170482559 -0.213735461
## NOX dy/dx -2.047189443 -9.111891505 -11.159080948
## RM dy/dx -0.808326319 1.433559709 0.625233390
## AGE dy/dx 0.001589623 -0.019467158 -0.017877535
## DIS dy/dx -0.596446057 -0.354383656 -0.950829713
## RAD dy/dx 0.425584319 0.152874404 0.578458723
## TAX dy/dx 0.001667608 -0.004394786 -0.002727178
## PTRATIO dy/dx 0.051477579 -0.233825438 -0.182347860
## B dy/dx -0.008443725 -0.003641573 -0.012085298
## LSTAT dy/dx 0.118508149 0.442334868 0.560843017
## ========================================================
## Simulation results ( variance matrix):
## ========================================================
## Simulated standard errors
## Direct Indirect Total
## ZN dy/dx 0.020347464 0.04758671 0.045980766
## INDUS dy/dx 0.102635809 0.21175451 0.182289685
## NOX dy/dx 7.991160256 12.39778953 9.842466219
## RM dy/dx 0.581827292 1.38611763 1.504578175
## AGE dy/dx 0.019892333 0.04376280 0.043839298
## DIS dy/dx 0.371170637 0.68491169 0.574923883
## RAD dy/dx 0.134119919 0.20864481 0.170998141
## TAX dy/dx 0.005683419 0.01197105 0.011161167
## PTRATIO dy/dx 0.278349322 0.43707172 0.342325677
## B dy/dx 0.004402845 0.00856256 0.007658235
## LSTAT dy/dx 0.074188184 0.15210142 0.151342794
##
## Simulated z-values:
## Direct Indirect Total
## ZN dy/dx 1.37381509 0.3631384 0.9837639
## INDUS dy/dx -0.53011894 -0.7618535 -1.1834739
## NOX dy/dx -0.14944953 -0.8041200 -1.1342265
## RM dy/dx -1.42368961 1.0864133 0.4503290
## AGE dy/dx 0.05579434 -0.4961726 -0.4699898
## DIS dy/dx -1.70043264 -0.4784848 -1.6678217
## RAD dy/dx 3.12115466 0.7725775 3.3906994
## TAX dy/dx 0.24503007 -0.3409027 -0.2408668
## PTRATIO dy/dx 0.17087888 -0.5148393 -0.5183884
## B dy/dx -1.94636634 -0.4281371 -1.5976917
## LSTAT dy/dx 1.54923432 2.9870422 3.7614493
##
## Simulated p-values:
## Direct Indirect Total
## ZN dy/dx 0.1694991 0.7165015 0.32523163
## INDUS dy/dx 0.5960295 0.4461475 0.23662137
## NOX dy/dx 0.8811989 0.4213277 0.25669956
## RM dy/dx 0.1545363 0.2772962 0.65247323
## AGE dy/dx 0.9555056 0.6197726 0.63836233
## DIS dy/dx 0.0890496 0.6323052 0.09535113
## RAD dy/dx 0.0018014 0.4397725 0.00069715
## TAX dy/dx 0.8064332 0.7331768 0.80965834
## PTRATIO dy/dx 0.8643190 0.6066653 0.60418728
## B dy/dx 0.0516107 0.6685513 0.11011160
## LSTAT dy/dx 0.1213254 0.0028169 0.00016893
Similar a las exposiciones de las Actividades 01 y 02, esta se llevará a cabo al iniciar la clase del Martes 21 de Abril y para esto pueden utilizar únicamente el archivo HTML kniteado a partir de su código RMD.
Sin embargo, la entrega final deberá incluir tanto tu código completamente ejecutado en archivo HTML, así como una breve presentación en la que se mencionen claramente los principales patrones o insights detectados en su análisis, así como sus conclusiones. Los archivos se deberán subir en Canvas por equipo a más tardar el día Viernes 24 de Abril en la sección habilitada por el profesor para esto, junto con los archivos correspondientes a las actividades 01 y 02.