Estadística espacial aplicada en Agronomía y ciencias afines

Ricardo Alves de Olinda

ricardo.estat@yahoo.com.br

http://lattes.cnpq.br/7767223263366578

Universidad del Estado de Paraíba

http://departamentos.uepb.edu.br/estatistica/corpo-docente/

Universidad de San Carlos de Guatemala

Centro Universitario de Oriente

CUNORI

Use R!

R Markdown

Sobre Programación básica en R, puede ser consultada Aqui

Comienza cargando las librerías y preparando los datos a utilizar.

## funciones para la importación y exportación y la manipulación de mapas y datos espaciales
require(maptools)

## Loading required package: maptools

## Loading required package: sp

## Checking rgeos availability: TRUE

## función para habilitar la licencia de uso
gpclibPermit()

## [1] FALSE

## (clases) para la representación de datos espaciales en el R
require(sp)  
## funciones de análisis de datos de área
require(spdep)

## Loading required package: spdep

## Loading required package: Matrix

## funciones para facilitar la división de datos en clases por varios criterios
require(classInt)

## Loading required package: classInt

## funciones utilizadas para crear paletas de colores en visualización de mapas
require(RColorBrewer)

## Loading required package: RColorBrewer

## funciones para establecer o consultar parámetros gráficos
par.ori <- par(no.readonly=TRUE)

Descripción de los datos

A seguir serán mostrados comandos útiles para el análisis de datos de área.

El Índice de Desarrollo Humano (IDH) se creó para hacer hincapié en que la ampliación de las oportunidades de las personas debería ser el criterio más importante para evaluar los resultados en materia de desarrollo. El crecimiento económico es un medio que contribuye a ese proceso, pero no es un objetivo en sí mismo.

El IDH mide el progreso conseguido por un país en tres dimensiones básicas del desarrollo humano: disfrutar de una vida larga y saludable, acceso a educación y nivel de vida digno.

La base de datos posee informaciones sobre el Índice de Desarrollo Humano (IDH) de 2006, 2011 y 2014, de los departamentos en Guatemala

Análisis exploratório de datos de área

Iniciamos con la lectura del shape del Guatemala por departamentos.

Análisis de datos de área

##--------------------------------------------------------------------------
##
##      Aula práctica: Análisis de datos de área
## 
##  Ricardo Alves de Olinda, Ezequiel López y Byron González
## 
##--------------------------------------------------------------------------
## Activando el paquete para lectura de los shapes.
require(maptools)

Argumentos utilizados por la función require() anterior:

package - maptools (nombre del paquete)

## Cree una pasta de trabajo e informe el directorio al R usando la función setwd(). Recuerde que las barras deben ser invertidas y debe estar entre comillas, por ejemplo: "/Documents/Minha Pasta"
setwd("...")

Argumentos utilizados por la función setwd() anterior:

dir - “.” (directorio para la pasta de trabajo)

Haciendo la lectura del shape por departamento.

ShapeGuate <- readShapePoly("Limites_Departamentales_Guatemala.shp",             
                            IDvar="cod_dep")

## Warning: use rgdal::readOGR or sf::st_read

class(ShapeGuate)

## [1] "SpatialPolygonsDataFrame"
## attr(,"package")
## [1] "sp"

dim(ShapeGuate)

## [1] 22  6

names(ShapeGuate)

## [1] "id"         "objectid"   "departamen" "cod_dep"    "shape_leng"
## [6] "shape_area"

# Nombre de los departamentos de Guatemala
ShapeGuate$departamen

##  [1] Guatemala      El Progreso    Sacatepéquez   Chimaltenango 
##  [5] Escuintla      Santa Rosa     Sololá         Totonicapán   
##  [9] Quetzaltenango Suchitepéquez  Retalhuleu     San Marcos    
## [13] Huehuetenango  Quiché         Baja Verapaz   Alta Verapaz  
## [17] Petén          Izabal         Zacapa         Chiquimula    
## [21] Jalapa         Jutiapa       
## 22 Levels: Alta Verapaz Baja Verapaz Chimaltenango ... Zacapa

plot(ShapeGuate)

Argumentos utilizados por la función readShapePoly() anterior:

fn - Limites_Departamentales_Guatemala.shp (nombre del archivo shape com extensão .shp)

Um plot simples do shape, pode ser obtido com o comando plot.

# aumentando los márgenes para la visualización del gráfico
par(mar=c(0,0,0,0))
plot(ShapeGuate)

#Localizando un determinado departamento en Guatemala
Verapaz <- ShapeGuate[ShapeGuate$departamen == "Alta Verapaz", ]
plot(Verapaz,col="gray90",add=TRUE)

#locator()
text(534362.9, 1738518,col="black", c("Alta Verapaz"),cex=0.8,lwd=3)

par(mar=c(0,0,0,0))
plot(ShapeGuate)

#Localizando un determinado departamento en Guatemala
Chiquimula <- ShapeGuate[ShapeGuate$departamen == "Chiquimula", ]
plot(Chiquimula,col="gray90",add=TRUE)


#locator()
text(616869.1, 1625910, col="black", c("Chiquimula"),cex=0.8,lwd=3)

A seguir, se hará la lectura del archivo que contiene el Índice de Desarrollo Humano (IDH) de 2006, 2011 y 2014, de los departamentos en Guatemala.

IDH <- read.table("IDH_Depertamental.txt",header=TRUE)
dim(IDH)

## [1] 22  6

names(IDH)

## [1] "cod_dep" "IDH_06"  "IDH_11"  "IDH_14"  "Simul"   "mort"

str(IDH)

## 'data.frame':    22 obs. of  6 variables:
##  $ cod_dep: int  16 15 4 20 2 5 1 13 18 21 ...
##  $ IDH_06 : num  0.341 0.395 0.477 0.421 0.49 0.482 0.648 0.377 0.458 0.373 ...
##  $ IDH_11 : num  0.372 0.454 0.459 0.429 0.502 0.528 0.629 0.397 0.471 0.414 ...
##  $ IDH_14 : num  0.37 0.457 0.487 0.408 0.518 0.516 0.614 0.399 0.481 0.426 ...
##  $ Simul  : int  22 46 48 41 92 90 99 21 85 43 ...
##  $ mort   : num  0.98 0.15 0.14 0.12 0.09 0.07 0.02 0.97 0.06 0.25 ...

IDH[,1]

##  [1] 16 15  4 20  2  5  1 13 18 21 22 17  9 14 11  3 12  6  7 10  8 19

summary(IDH[,-1])

##      IDH_06          IDH_11          IDH_14           Simul      
##  Min.   :0.332   Min.   :0.372   Min.   :0.3700   Min.   :20.00  
##  1st Qu.:0.392   1st Qu.:0.411   1st Qu.:0.4368   1st Qu.:43.50  
##  Median :0.433   Median :0.448   Median :0.4640   Median :46.50  
##  Mean   :0.440   Mean   :0.454   Mean   :0.4716   Mean   :55.09  
##  3rd Qu.:0.480   3rd Qu.:0.474   3rd Qu.:0.5050   3rd Qu.:76.50  
##  Max.   :0.648   Max.   :0.629   Max.   :0.6140   Max.   :99.00  
##       mort       
##  Min.   :0.0200  
##  1st Qu.:0.0975  
##  Median :0.1800  
##  Mean   :0.2836  
##  3rd Qu.:0.2775  
##  Max.   :0.9900

source("functions.R")

Argumentos utilizados pela función read.table() anterior:

file - IDH_Depertamental.txt (nombre del archivo a ser leído con extensión .txt) Para adicionar el Índice de Desarrollo Humano (IDH) de 2006, 2011 y 2014 al data.frame del shape de departamentos, utilizaremos la función merge() - unión de dos data.frames.

Ordenando los datos correctamente para compatibilizar el orden de los departamentos en la forma y en la tabla de atributos.

head(ShapeGuate@data)

##                           id objectid    departamen cod_dep shape_leng
## 1  limites_departamentales.7        7     Guatemala       1   310158.8
## 2  limites_departamentales.5        5   El Progreso       2   225159.3
## 3 limites_departamentales.16       16  Sacatepéquez       3   132537.6
## 4  limites_departamentales.3        3 Chimaltenango       4   232605.8
## 5  limites_departamentales.6        6     Escuintla       5   399797.8
## 6 limites_departamentales.18       18    Santa Rosa       6   352212.5
##   shape_area
## 1 2189392863
## 2 1834558688
## 3  535976576
## 4 1863379968
## 5 4504127543
## 6 3159312493

head(IDH)

##   cod_dep IDH_06 IDH_11 IDH_14 Simul mort
## 1      16  0.341  0.372  0.370    22 0.98
## 2      15  0.395  0.454  0.457    46 0.15
## 3       4  0.477  0.459  0.487    48 0.14
## 4      20  0.421  0.429  0.408    41 0.12
## 5       2  0.490  0.502  0.518    92 0.09
## 6       5  0.482  0.528  0.516    90 0.07

ind <- match(ShapeGuate@data$cod_dep, IDH$cod_dep)
ind

##  [1]  7  5 16  3  6 18 19 21 13 20 15 17  8 14  2  1 12  9 22  4 10 11

ShapeGuate@data <- merge(ShapeGuate@data, IDH, by.x="cod_dep",                                     by.y="cod_dep", sort=FALSE)

Argumentos utilizados por la función merge() anterior:

data.frame 1 - ShapeGuate@$data (nombre del data.frame)
data.frame 2 - IDH (nombre del data.frame)
by.x - cod_dep (entre comillas, nombre de la variable en el data.frame 1)
by.y - cod_dep (entre comillas, nombre de la variable en el data.frame 2)
sort - FALSE (el ordenamiento quedará igual al del data.frame 1)

Vea como quedó el data.frame que se encuentra dentro del archivo mapa1 después del merge.

ShapeGuate@data

##    cod_dep                         id objectid     departamen shape_leng
## 1        1  limites_departamentales.7        7      Guatemala   310158.8
## 2        2  limites_departamentales.5        5    El Progreso   225159.3
## 3        3 limites_departamentales.16       16   Sacatepéquez   132537.6
## 4        4  limites_departamentales.3        3  Chimaltenango   232605.8
## 5        5  limites_departamentales.6        6      Escuintla   399797.8
## 6        6 limites_departamentales.18       18     Santa Rosa   352212.5
## 7        7 limites_departamentales.19       19         Sololá   271897.7
## 8        8 limites_departamentales.21       21    Totonicapán   190706.5
## 9        9 limites_departamentales.13       13 Quetzaltenango   334448.4
## 10      10 limites_departamentales.20       20  Suchitepéquez   339187.6
## 11      11 limites_departamentales.15       15     Retalhuleu   290129.6
## 12      12 limites_departamentales.17       17     San Marcos   383408.7
## 13      13  limites_departamentales.8        8  Huehuetenango   456946.4
## 14      14 limites_departamentales.14       14         Quiché   763252.4
## 15      15  limites_departamentales.2        2   Baja Verapaz   324376.1
## 16      16  limites_departamentales.1        1   Alta Verapaz   703968.9
## 17      17 limites_departamentales.12       12          Petén  1095828.4
## 18      18  limites_departamentales.9        9         Izabal   879316.5
## 19      19 limites_departamentales.22       22         Zacapa   314542.0
## 20      20  limites_departamentales.4        4     Chiquimula   276903.3
## 21      21 limites_departamentales.10       10         Jalapa   244343.1
## 22      22 limites_departamentales.11       11        Jutiapa   456577.2
##     shape_area IDH_06 IDH_11 IDH_14 Simul mort
## 1   2189392863  0.648  0.629  0.614    99 0.02
## 2   1834558688  0.490  0.502  0.518    92 0.09
## 3    535976576  0.553  0.547  0.567    93 0.06
## 4   1863379968  0.477  0.459  0.487    48 0.14
## 5   4504127543  0.482  0.528  0.516    90 0.07
## 6   3159312493  0.439  0.446  0.470    47 0.18
## 7   1039559866  0.391  0.428  0.455    45 0.26
## 8   1077120743  0.378  0.409  0.432    43 0.45
## 9   2133140379  0.495  0.479  0.529    95 0.05
## 10  2150327251  0.438  0.450  0.471    47 0.18
## 11  1943747132  0.458  0.437  0.476    47 0.27
## 12  3552843303  0.418  0.410  0.451    45 0.28
## 13  7358367985  0.377  0.397  0.399    21 0.97
## 14  7280695780  0.332  0.376  0.424    20 0.99
## 15  3016711413  0.395  0.454  0.457    46 0.15
## 16 10636673091  0.341  0.372  0.370    22 0.98
## 17 35838748247  0.406  0.410  0.458    46 0.28
## 18  7491017132  0.458  0.471  0.481    85 0.06
## 19  2699668757  0.481  0.475  0.511    51 0.12
## 20  2402286250  0.421  0.429  0.408    41 0.12
## 21  2029688920  0.373  0.414  0.426    43 0.25
## 22  3316155918  0.428  0.466  0.455    46 0.27

Un plot del Índice de Desarrollo Humano (IDH) por departamento en el año de 2006, 2011 y 2014 puede ser obtenido al utilizar los comandos siguientes.

intervalos=quantile(c(ShapeGuate@data$IDH_06, ShapeGuate@data$IDH_11,
                      ShapeGuate@data$IDH_14), 
                    probs = seq(0,1,0.125)) + c(rep(0,8),0.001)
round(intervalos,2)

##    0% 12.5%   25% 37.5%   50% 62.5%   75% 87.5%  100% 
##  0.33  0.39  0.41  0.43  0.45  0.47  0.48  0.52  0.65

Argumentos utilizados por la función quantile() anterior:

x - ShapeGuate@data$IDH_06 (variable de interés)
probs - seq(0,1,0.125)) + c(rep(0,8),0.001) (informando los cuantiles a calcular)
Notem que estamos sumando el valor 0.001 al último valor, pues los intervalos son abiertos a la derecha

spplot(ShapeGuate,c("IDH_06","IDH_11","IDH_14"),names.attr=c("2006","2011","2014"),
                   at=intervalos,col.regions =brewer.pal(9, "Reds"))

Gráfico de la desviación estándar para el IDH de 2006

INT1 <- classIntervals(ShapeGuate$IDH_06, n=5, style="sd")
COLORES1 <- brewer.pal(9, "Reds")
COL1 <-  findColours(INT1, COLORES1)
plot(ShapeGuate, col=COL1)
TB1 <- attr(COL1, "table")
legtext <- paste(names(TB1))
legend("bottomright", fill=attr(COL1, "palette"), legend=legtext, 
       bty="n",cex=0.78)

Será que existe una estructura de dependencia espacial para los datos anteriores?

I de Moran Global

Para verificar la existencia de autocorrelación espacial, utilizaremos el I de Moran.

Inicialmente creamos una estructura de vecindad necesaria con los comandos a seguir. Los siguientes comandos fueron vistos en la primera clase, con excepción del comando nb2listw(), que crea una lista de vecindad con pesos.

Creando una estructura de vecindad con los vecinos de frontera.

require(RANN)

## Loading required package: RANN

nobs <- length(ShapeGuate)
Guate.nb <- poly2nb(ShapeGuate)
coords <- coordinates(ShapeGuate)
par(mar=c(0,0,0,0))
plot(ShapeGuate, border="grey")
plot(Guate.nb, coords, add=TRUE)

vizinhanca = nb2listw(Guate.nb, style="B")
vizinhanca

## Characteristics of weights list object:
## Neighbour list object:
## Number of regions: 22 
## Number of nonzero links: 102 
## Percentage nonzero weights: 21.07438 
## Average number of links: 4.636364 
## 
## Weights style: B 
## Weights constants summary:
##    n  nn  S0  S1   S2
## B 22 484 102 204 2072

Argumentos utilizados por la función nb2listw() anterior:

neighbours - Guate.nb (un objeto de la clase nb) style - “B” (pesos de la matriz de vecindad. Puede asumir los valores “W”,“B”, “C”, “U”, “minimax” y “S”, para mayores detalles ver el help de la función).

A seguir son mostrados los cálculos del índice I de Moran para las variables Índice de Desarrollo Humano en 2006, 2011 y 2014.

Imoran1 = moran.test(ShapeGuate@data$IDH_06, vizinhanca)
Imoran1

## 
##  Moran I test under randomisation
## 
## data:  ShapeGuate@data$IDH_06  
## weights: vizinhanca  
## 
## Moran I statistic standard deviate = 2.5653, p-value = 0.005154
## alternative hypothesis: greater
## sample estimates:
## Moran I statistic       Expectation          Variance 
##        0.24641488       -0.04761905        0.01313719

Imoran2 = moran.test(ShapeGuate@data$IDH_11, vizinhanca)
Imoran2

## 
##  Moran I test under randomisation
## 
## data:  ShapeGuate@data$IDH_11  
## weights: vizinhanca  
## 
## Moran I statistic standard deviate = 3.3303, p-value = 0.0004338
## alternative hypothesis: greater
## sample estimates:
## Moran I statistic       Expectation          Variance 
##        0.33203681       -0.04761905        0.01299627

Imoran3 = moran.test(ShapeGuate@data$IDH_14, vizinhanca)
Imoran3

## 
##  Moran I test under randomisation
## 
## data:  ShapeGuate@data$IDH_14  
## weights: vizinhanca  
## 
## Moran I statistic standard deviate = 2.1285, p-value = 0.01665
## alternative hypothesis: greater
## sample estimates:
## Moran I statistic       Expectation          Variance 
##        0.20374058       -0.04761905        0.01394526

Utilizando simulación Monte Carlo

moran.mc(ShapeGuate@data$IDH_06, vizinhanca, nsim=999)

## 
##  Monte-Carlo simulation of Moran I
## 
## data:  ShapeGuate@data$IDH_06 
## weights: vizinhanca  
## number of simulations + 1: 1000 
## 
## statistic = 0.24641, observed rank = 992, p-value = 0.008
## alternative hypothesis: greater

moran.mc(ShapeGuate@data$IDH_11, vizinhanca, nsim=999)

## 
##  Monte-Carlo simulation of Moran I
## 
## data:  ShapeGuate@data$IDH_11 
## weights: vizinhanca  
## number of simulations + 1: 1000 
## 
## statistic = 0.33204, observed rank = 1000, p-value = 0.001
## alternative hypothesis: greater

moran.mc(ShapeGuate@data$IDH_14, vizinhanca, nsim=999)

## 
##  Monte-Carlo simulation of Moran I
## 
## data:  ShapeGuate@data$IDH_14 
## weights: vizinhanca  
## number of simulations + 1: 1000 
## 
## statistic = 0.20374, observed rank = 975, p-value = 0.025
## alternative hypothesis: greater

Argumentos utilizados por la función moran.test() anterior:

x - ShapeGuate@data$IDH_... (vector con la variable de interés)
listw - vizinhanca (un objeto de la clase nb2listw).

I de Moran Local

ShapeGuate.mloc1 <- localmoran(ShapeGuate$IDH_06, nb2listw(Guate.nb))

round(ShapeGuate.mloc1,2) # La función "round", redondea los valores a dos decimales

##       Ii  E.Ii Var.Ii  Z.Ii Pr(z > 0)
## 1   0.79 -0.05   0.09  2.78      0.00
## 2   0.35 -0.05   0.18  0.93      0.18
## 3   2.22 -0.05   0.25  4.49      0.00
## 4   0.17 -0.05   0.09  0.73      0.23
## 5   0.61 -0.05   0.14  1.77      0.04
## 6  -0.01 -0.05   0.18  0.09      0.46
## 7   0.16 -0.05   0.14  0.56      0.29
## 8   0.52 -0.05   0.18  1.33      0.09
## 9  -0.34 -0.05   0.11 -0.87      0.81
## 10 -0.01 -0.05   0.14  0.11      0.46
## 11  0.04 -0.05   0.25  0.17      0.43
## 12 -0.02 -0.05   0.25  0.06      0.47
## 13  0.44 -0.05   0.18  1.14      0.13
## 14  0.99 -0.05   0.09  3.47      0.00
## 15 -0.20 -0.05   0.11 -0.45      0.67
## 16  0.52 -0.05   0.14  1.51      0.07
## 17  0.44 -0.05   0.25  0.96      0.17
## 18 -0.11 -0.05   0.25 -0.13      0.55
## 19 -0.23 -0.05   0.11 -0.54      0.70
## 20  0.05 -0.05   0.25  0.19      0.42
## 21 -0.61 -0.05   0.11 -1.69      0.95
## 22  0.07 -0.05   0.25  0.23      0.41
## attr(,"call")
## localmoran(x = ShapeGuate$IDH_06, listw = nb2listw(Guate.nb))
## attr(,"class")
## [1] "localmoran" "matrix"

range(ShapeGuate.mloc1[,1]) # amplitud

## [1] -0.6111554  2.2171153

intervalos2=quantile(ShapeGuate.mloc1[,1], probs = seq(0,1,0.125)) + c(rep(0,8),0.001)
round(intervalos2, 2)

##    0% 12.5%   25% 37.5%   50% 62.5%   75% 87.5%  100% 
## -0.61 -0.21 -0.01  0.03  0.12  0.36  0.50  0.68  2.22

INT <- classIntervals(ShapeGuate.mloc1[,1], style="fixed",
                       fixedBreaks=c(-0.61, -0.01, 0.12, 0.50, 2.22))

## Warning in classIntervals(ShapeGuate.mloc1[, 1], style = "fixed",
## fixedBreaks = c(-0.61, : variable range greater than fixedBreaks

COLORES <- brewer.pal(9, "Reds")
COL <- findColours(INT, COLORES)

par(mar=c(0,0,0,0))
plot(ShapeGuate, col=COL)

TB <- attr(COL, "table")
legtext <- paste(names(TB))
legend("bottomright", fill=attr(COL, "palette"), legend=legtext, bty="n", cex=1)

Moran Scatterplot

Otra herramienta importante, el diagrama de esparcimiento de Moran, puede ser obtenida por medio del comando moran.plot(). Solo será mostrado para el IDH en 2014.

El diagrama de esparcimiento de Moran es una manera adicional de visualizar la dependencia espacial. Los cuadrantes (Q1 - superior derecho, Q2 - superior isquierdo, Q3 -inferior isquierdo y Q4 - inferior derecho) pueden ser interpretados como:

Q1 y Q3 - indican puntos de asociación espacial positiva, en el sentido de que una localización posee vecinos con valores semejantes;
Q2 y Q4 - indican puntos de asociación espacial negativa, en el sentido de que una localización posee vecinos con valores distintos, indicando puntos de transición entre diferentes padrones espaciales.

par(mar=c(4,4,1.5,0.5))
moran.plot(ShapeGuate$IDH_14, listw=vizinhanca, zero.policy=TRUE, 
           pch=16, col="black",cex=.5, quiet=TRUE,
           xlim=c(0.33,0.63),ylim=c(1.3,3.5),
           labels=as.character(ShapeGuate$departamen),
           xlab="Índice de Desarrollo Humano de Guatemala", 
           ylab="Índice de Desarrollo Humano (Spatial Lag)")

Test= poly2nb(ShapeGuate)
Guate.nb1.mat <- nb2mat(Test)
Guate_SD <- scale(ShapeGuate$IDH_14)
Guate_W <- Guate.nb1.mat %*% Guate_SD
Q <- ifelse((Guate_SD>=0 & Guate_W >=0), 1, 0)
Q[(Guate_SD<0  & Guate_W < 0)] <- 2
Q[(Guate_SD>=0 & Guate_W < 0)] <- 3
Q[(Guate_SD<0  & Guate_W >= 0)]<- 4
table(Q)

## Q
## 1 2 3 4 
## 6 9 3 4

CORES.5 <- brewer.pal(9, "Reds")
plot(ShapeGuate, col=CORES.5[Q])
title("Mapa LISA")
legend("bottomright", c("Q(+/+)", "Q(-/-)", "Q(+/-)", "Q(-/+)"), 
       fill=CORES.5)

Modelo de Regresión para datos espaciales

2Parte

## Clear the workspace
rm(list=ls())
## Install packages
#setRepositories(graphics = TRUE)

## Load packages
library(maptools)
library(proj4)
library(classInt)
library(spdep)
library(spgwr)

## Loading required package: spData

## To access larger datasets in this package, install the spDataLarge
## package with: `install.packages('spDataLarge')`

## NOTE: This package does not constitute approval of GWR
## as a method of spatial analysis; see example(gwr)

library(gstat)
library(sm)

## Package 'sm', version 2.2-5.4: type help(sm) for summary information

library(png)

## Set working directory (if you already downloaded the files)
#setwd("~/Projetos/Pibic's/Pibic_2016/Rotine_R")
source("functions.R")

A seguir, se hará la lectura del archivo que contiene el Índice de Desarrollo Humano (IDH) de 2006, 2011 y 2014, de los departamentos en Guatemala con dos variables independientes

IDH <- read.table("IDH_Depertamental.txt",header=TRUE)

require(sp)
ShapeGuate <- readShapePoly("Limites_Departamentales_Guatemala.shp")

## Warning: use rgdal::readOGR or sf::st_read

ShapeTest2 <- sp.merge(sp_data=ShapeGuate,tab_data=IDH,
                       id="cod_dep")
names(ShapeTest2)

##  [1] "id"         "objectid"   "departamen" "cod_dep"    "shape_leng"
##  [6] "shape_area" "IDH_06"     "IDH_11"     "IDH_14"     "Simul"     
## [11] "mort"

library(spdep)
data <- ShapeTest2
class(data)

## [1] "SpatialPolygonsDataFrame"
## attr(,"package")
## [1] "sp"

rd <- coordinates(data)

head(rd)

##       [,1]    [,2]
## 0 542777.6 1729593
## 1 512373.4 1671503
## 2 454500.0 1624946
## 3 615684.7 1624317
## 4 544783.8 1648261
## 5 445044.6 1566099

W_cont <- poly2nb(data, queen=T)
W_cont_mat <- nb2listw(W_cont, style="W", zero.policy=TRUE)

Ajuste del modelo de regresión clasico

###############
## OLS Model ##
###############

mod.lm <- lm(IDH_14 ~ Simul+mort, data=IDH)
summary(mod.lm)

## 
## Call:
## lm(formula = IDH_14 ~ Simul + mort, data = IDH)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.04802 -0.02054 -0.00032  0.01640  0.05964 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  0.382323   0.027495  13.905 2.07e-11 ***
## Simul        0.001743   0.000362   4.813 0.000121 ***
## mort        -0.023737   0.030049  -0.790 0.439299    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.02817 on 19 degrees of freedom
## Multiple R-squared:  0.7702, Adjusted R-squared:  0.746 
## F-statistic: 31.83 on 2 and 19 DF,  p-value: 8.578e-07

# Analisis de los resíduos del modelo de regresión clasico

shapiro.test(mod.lm$residuals)

## 
##  Shapiro-Wilk normality test
## 
## data:  mod.lm$residuals
## W = 0.9839, p-value = 0.9655

Ajuste del modelo SAR

###############
## SAR Model ##
###############

mod.sar <- lagsarlm(IDH_14 ~ Simul+mort,
                    data = IDH, listw=W_cont_mat, zero.policy=T,
                    tol.solve=1e-15)
summary(mod.sar)

## 
## Call:
## lagsarlm(formula = IDH_14 ~ Simul + mort, data = IDH, listw = W_cont_mat, 
##     zero.policy = T, tol.solve = 1e-15)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -0.0437205 -0.0210579  0.0014661  0.0149462  0.0598715 
## 
## Type: lag 
## Coefficients: (asymptotic standard errors) 
##                Estimate  Std. Error z value  Pr(>|z|)
## (Intercept)  0.33525770  0.09924776  3.3780 0.0007302
## Simul        0.00170611  0.00033478  5.0962 3.466e-07
## mort        -0.02209012  0.02816518 -0.7843 0.4328606
## 
## Rho: 0.10254, LR test value: 0.29068, p-value: 0.58979
## Asymptotic standard error: 0.20384
##     z-value: 0.50305, p-value: 0.61493
## Wald statistic: 0.25306, p-value: 0.61493
## 
## Log likelihood: 49.06672 for lag model
## ML residual variance (sigma squared): 0.00067499, (sigma: 0.025981)
## Number of observations: 22 
## Number of parameters estimated: 5 
## AIC: -88.133, (AIC for lm: -89.843)
## LM test for residual autocorrelation
## test value: 4.6777, p-value: 0.030556

## Plot residuals
res1 <- mod.sar$residuals

summary(res1)

##      Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
## -0.043720 -0.021058  0.001466  0.000000  0.014946  0.059871

## Plot residuals
res1 <- mod.sar$residuals
cols1 <- c()
cols1[res1<(-0.05)] <- "green4"
cols1[res1>=(-0.05)&res1<(-0.02)] <- "green1"
cols1[res1>=(-0.02)&res1<0.04] <- "white"
cols1[res1<0.04&res1>0.06] <- "red1"
cols1[res1>=0.06] <- "red4"

par(mar=rep(0,4))
plot(data,col=cols1, border="grey")
legend(x="bottom",cex=.8,fill=c("green4","green1","white","red1","red4"),
       bty="n",legend=c("(-0.08,-0.06)","[-0.06, -0.02)","[-0.02, 0.02)",
                        "[0.02,0.06)","[0.06,0.17 )"),ncol=5)

dev.off()

## null device 
##           1

###############
## SEM Model ##
###############

mod.sem <- errorsarlm(IDH_14 ~ Simul+mort,
                       data =IDH, listw=W_cont_mat, zero.policy=T, 
                       tol.solve=1e-15)
summary(mod.sem)

## 
## Call:
## errorsarlm(formula = IDH_14 ~ Simul + mort, data = IDH, listw = W_cont_mat, 
##     zero.policy = T, tol.solve = 1e-15)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -0.0514927 -0.0181049  0.0009782  0.0149607  0.0435258 
## 
## Type: error 
## Coefficients: (asymptotic standard errors) 
##               Estimate Std. Error z value  Pr(>|z|)
## (Intercept)  0.3624735  0.0220120 16.4671 < 2.2e-16
## Simul        0.0020704  0.0002941  7.0400 1.923e-12
## mort        -0.0130525  0.0231131 -0.5647    0.5723
## 
## Lambda: -0.66999, LR test value: 2.474, p-value: 0.11574
## Asymptotic standard error: 0.32354
##     z-value: -2.0708, p-value: 0.03838
## Wald statistic: 4.2881, p-value: 0.03838
## 
## Log likelihood: 50.1584 for error model
## ML residual variance (sigma squared): 0.00055948, (sigma: 0.023653)
## Number of observations: 22 
## Number of parameters estimated: 5 
## AIC: -90.317, (AIC for lm: -89.843)

###############
## SDM Model ##
###############


mod.sdm <- lagsarlm(IDH_14 ~ Simul+mort, 
                     data = IDH, listw=W_cont_mat, zero.policy=T, 
                     type="mixed", tol.solve=1e-15)
summary(mod.sdm)

## 
## Call:
## lagsarlm(formula = IDH_14 ~ Simul + mort, data = IDH, listw = W_cont_mat, 
##     type = "mixed", zero.policy = T, tol.solve = 1e-15)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -0.0264274 -0.0118541 -0.0035307  0.0098520  0.0410739 
## 
## Type: mixed 
## Coefficients: (asymptotic standard errors) 
##                Estimate  Std. Error z value  Pr(>|z|)
## (Intercept)  0.59026639  0.11447993  5.1561 2.522e-07
## Simul        0.00191562  0.00024792  7.7267 1.110e-14
## mort        -0.00201426  0.02037804 -0.0988    0.9213
## lag.Simul    0.00360158  0.00072822  4.9458 7.585e-07
## lag.mort     0.02675884  0.04010901  0.6672    0.5047
## 
## Rho: -0.90519, LR test value: 6.9079, p-value: 0.0085814
## Asymptotic standard error: 0.27746
##     z-value: -3.2624, p-value: 0.0011046
## Wald statistic: 10.643, p-value: 0.0011046
## 
## Log likelihood: 54.72554 for mixed model
## ML residual variance (sigma squared): 0.00034176, (sigma: 0.018487)
## Number of observations: 22 
## Number of parameters estimated: 7 
## AIC: -95.451, (AIC for lm: -90.543)
## LM test for residual autocorrelation
## test value: 2.554, p-value: 0.11001