2. Download the county-level police data for the
state of Mississippi. Compute the Moran’s I and Geary’s C statistics for the
percentage of whites (WHITE) and test for statistical
significance against the null hypothesis of no spatial autocorrelation.
Create a Moran scatter plot. Explain your results.
setwd("C:/Spatial Statistics")
suppressPackageStartupMessages(library(dplyr))
suppressPackageStartupMessages(library(rgdal))
suppressPackageStartupMessages(library(spdep))
suppressPackageStartupMessages(library(ggplot2))
suppressPackageStartupMessages(library(knitr))
download.file(url = "http://myweb.fsu.edu/jelsner/temp/data/police.zip","police.zip")
unzip("police.zip")
podf = readOGR("C:/Spatial Statistics/police/police.shp")
## Warning: OGR support is provided by the sf and terra packages among others
## OGR data source with driver: ESRI Shapefile
## Source: "C:\Spatial Statistics\police\police.shp", layer: "police"
## with 82 features
## It has 21 fields
## Integer64 fields read as strings: CNTY_ CNTY_ID FIPSNO
names(podf)
## [1] "AREA" "PERIMETER" "CNTY_" "CNTY_ID" "NAME"
## [6] "STATE_NAME" "STATE_FIPS" "CNTY_FIPS" "FIPS" "FIPSNO"
## [11] "POLICE" "POP" "TAX" "TRANSFER" "INC"
## [16] "CRIME" "UNEMP" "OWN" "COLLEGE" "WHITE"
## [21] "COMMUTE"
Moran’s I and Geary’s C statistics:
neighbors = poly2nb(podf)
neighbors
## Neighbour list object:
## Number of regions: 82
## Number of nonzero links: 434
## Percentage nonzero weights: 6.454491
## Average number of links: 5.292683
summary(neighbors)
## Neighbour list object:
## Number of regions: 82
## Number of nonzero links: 434
## Percentage nonzero weights: 6.454491
## Average number of links: 5.292683
## Link number distribution:
##
## 3 4 5 6 7 8 9
## 11 14 20 18 16 2 1
## 11 least connected regions:
## 0 1 3 4 20 64 71 72 74 79 81 with 3 links
## 1 most connected region:
## 36 with 9 links
wtsnbs = nb2listw(neighbors)
class(wtsnbs)
## [1] "listw" "nb"
summary(wtsnbs)
## Characteristics of weights list object:
## Neighbour list object:
## Number of regions: 82
## Number of nonzero links: 434
## Percentage nonzero weights: 6.454491
## Average number of links: 5.292683
## Link number distribution:
##
## 3 4 5 6 7 8 9
## 11 14 20 18 16 2 1
## 11 least connected regions:
## 0 1 3 4 20 64 71 72 74 79 81 with 3 links
## 1 most connected region:
## 36 with 9 links
##
## Weights style: W
## Weights constants summary:
## n nn S0 S1 S2
## W 82 6724 82 32.62703 333.0236
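The "Weights style: W" line means the weights are row-standardized: each region's neighbors share a total weight of 1, which is why S0 equals the number of regions (82) in the summary above. A minimal base-R sketch of that idea, assuming a hypothetical 4-region chain (the matrix `A` and the other names are illustrative and are not part of the police data):

```r
# Hypothetical binary adjacency matrix for 4 regions in a chain: 1-2-3-4
A <- matrix(c(0, 1, 0, 0,
              1, 0, 1, 0,
              0, 1, 0, 1,
              0, 0, 1, 0), nrow = 4, byrow = TRUE)

# Row-standardize: divide every row by its number of neighbors
W <- A / rowSums(A)

rowSums(W)    # every row now sums to 1
S0 <- sum(W)  # sum of all weights, equal to the number of regions
S0
```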
m = length(podf$WHITE)
s = Szero(wtsnbs)
moran(podf$WHITE, listw = wtsnbs, n = m, S0= s)
## $I
## [1] 0.5634778
##
## $K
## [1] 2.300738
geary(podf$WHITE, listw = wtsnbs, n = m, S0 = s, n1 = m - 1, zero.policy = NULL)
## $C
## [1] 0.4123818
##
## $K
## [1] 2.300738
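For reference, the two statistics can be written out directly from a weight matrix. The sketch below is a base-R illustration of the standard formulas, not the spdep source; it assumes a hypothetical 4-region chain with row-standardized weights, and all names in it are illustrative. Under the null hypothesis of no spatial autocorrelation, the expected value of Moran's I is -1/(n - 1) and the expected value of Geary's C is 1.

```r
# Hypothetical 4-region chain (1-2-3-4) with row-standardized weights
A <- matrix(c(0, 1, 0, 0,
              1, 0, 1, 0,
              0, 1, 0, 1,
              0, 0, 1, 0), nrow = 4, byrow = TRUE)
W <- A / rowSums(A)

x  <- c(1, 2, 3, 4)   # illustrative values, one per region
n  <- length(x)
S0 <- sum(W)
z  <- x - mean(x)     # deviations from the mean

# Moran's I: weighted cross-products of deviations over total variation
I <- (n / S0) * sum(W * outer(z, z)) / sum(z^2)

# Geary's C: weighted squared differences between neighbors over total variation
C <- ((n - 1) / (2 * S0)) * sum(W * outer(x, x, "-")^2) / sum(z^2)

I  # 0.4, above -1/(n - 1): similar values cluster together
C  # 0.3, below 1: neighbors differ less than expected
```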
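Moran’s I (0.56) is well above its expected value under the null hypothesis of no spatial autocorrelation, -1/(n - 1) ≈ -0.012 for n = 82, and Geary’s C (0.41) is well below its expected value of 1. Both statistics therefore indicate positive spatial autocorrelation in the percentage of white residents across Mississippi counties. Note that moran() and geary() return only the statistics; the p-values needed to formally reject the null hypothesis come from a significance test such as moran.test() or geary.test() in spdep.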
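One way to attach a p-value to Moran's I is a permutation test: shuffle the values across regions many times and count how often the shuffled statistic is at least as large as the observed one. The sketch below does this in base R on a hypothetical 20-region chain with a smooth trend; `moran_i` and the other names are illustrative helpers, not spdep functions. For the police data, spdep offers `moran.test()`, `geary.test()`, and the permutation-based `moran.mc()`, which take the variable and the `listw` object (for example `moran.test(podf$WHITE, listw = wtsnbs)`); their output is not reproduced here because they were not run in this report.

```r
set.seed(1)  # make the permutations reproducible

# Hypothetical chain of 20 regions, each adjacent to its predecessor and successor
n <- 20
A <- matrix(0, n, n)
A[cbind(1:(n - 1), 2:n)] <- 1
A <- A + t(A)            # make adjacency symmetric
W <- A / rowSums(A)      # row-standardized weights

# Illustrative helper: Moran's I from a value vector and a weight matrix
moran_i <- function(x, W) {
  z <- x - mean(x)
  (length(x) / sum(W)) * sum(W * outer(z, z)) / sum(z^2)
}

x   <- 1:n               # smooth trend, so neighbors have similar values
obs <- moran_i(x, W)     # observed statistic

# Null distribution: recompute I after randomly reassigning values to regions
nsim <- 999
sims <- replicate(nsim, moran_i(sample(x), W))

# One-sided pseudo p-value for positive autocorrelation
p_value <- (sum(sims >= obs) + 1) / (nsim + 1)
obs
p_value
```

A small p-value here means that an arrangement as clustered as the observed one rarely arises when values are assigned to regions at random.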
pw = podf$WHITE
lpw = lag.listw(wtsnbs, pw)
data.frame(pw, lpw) %>%
  ggplot(., aes(x = pw, y = lpw)) +
  geom_point() +
  geom_smooth(method = lm) +
  xlab("% White") + ylab("Spatial Lag of % White")
## `geom_smooth()` using formula = 'y ~ x'
Slope of the regression line:
lm(lpw ~ pw)
##
## Call:
## lm(formula = lpw ~ pw)
##
## Coefficients:
## (Intercept) pw
## 27.2333 0.5635
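The scatter plot confirms positive spatial autocorrelation in the percentage of white residents across Mississippi counties: counties with a high percentage of white population tend to be neighbored by counties that also have a high percentage, and counties with a low percentage by counties with a low percentage. The slope of the regression line (0.5635) equals the Moran’s I computed above, as expected with row-standardized weights.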
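The match between the regression slope and Moran's I is not a coincidence: with row-standardized weights, the slope of the spatial lag regressed on the variable is algebraically the same quantity as Moran's I. A minimal base-R check on a hypothetical 5-region chain (the values in `x` and all names are made up for illustration):

```r
# Hypothetical chain of 5 regions with row-standardized weights
n <- 5
A <- matrix(0, n, n)
A[cbind(1:(n - 1), 2:n)] <- 1
A <- A + t(A)
W <- A / rowSums(A)

x   <- c(10, 12, 9, 20, 22)   # illustrative values
lag <- as.vector(W %*% x)     # spatial lag: average of each region's neighbors

# Slope of the Moran scatter plot regression
slope <- coef(lm(lag ~ x))[["x"]]

# Moran's I from its definition
z <- x - mean(x)
I <- (n / sum(W)) * sum(W * outer(z, z)) / sum(z^2)

all.equal(slope, I)  # the two agree up to floating-point error
```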