IS415 Geospatial Analytics and Application

Instructor: Dr. Kam Tin Seong.

Assoc. Professor of Information Systems (Practice)

1. Overview

In this hands-on exercise, we will learn how to compute Spatial Autocorrelation statistics using the spdep package.

1.1 Dataset

  • Hunan county shapefile containing the county’s boundary layer

  • Hunan_2012.csv containing selected Hunan’s local development indicators

2. Load/Install necessary R packages

We will be using these R packages:

  • sf

  • tmap

  • spdep

  • rgdal

  • tidyverse

packages = c('sf','tmap','spdep','rgdal', 'tidyverse')
for (p in packages){
  if(!require(p, character.only = T)){
    install.packages(p)
  }
  library(p, character.only = T)
}

3. Importing Data into R

We import the shapefile using readOGR() function of rgdal package.

Then, we import the csv file using read_csv() function of readr package.

3.1 Importing geospatial data (shapefile)

hunan <- readOGR(dsn = "data/shapefile", layer = "Hunan")
## OGR data source with driver: ESRI Shapefile 
## Source: "D:\SMU Year 3 Sem 1\IS415 Geospatial Analytics and Application\Week 6\Hands-on_Ex06\data\shapefile", layer: "Hunan"
## with 88 features
## It has 7 fields

It has 88 features meaning 88 polygons (counties).

3.2 Importing aspatial data (csv file)

hunan2012 <- read_csv("data/attribute/Hunan_2012.csv")
## Parsed with column specification:
## cols(
##   .default = col_double(),
##   County = col_character(),
##   City = col_character()
## )
## See spec(...) for full column specifications.

3.3 Combining geospatial and aspatial data

We will use left_join() function of dplyr package to combine hunan (SpatialPolygonsDataFrame) and hunan2012 (tbl dataframe)

hunan@data <- left_join(hunan@data, hunan2012)
## Joining, by = "County"
knitr::kable(head(hunan, n=5))
NAME_2 ID_3 NAME_3 ENGTYPE_3 Shape_Leng Shape_Area County City avg_wage deposite FAI Gov_Rev Gov_Exp GDP GDPPC GIO Loan NIPCR Bed Emp EmpR EmpRT Pri_Stu Sec_Stu Household Household_R NOIP Pop_R RSCG Pop_T Agri Service Disp_Inc RORP ROREmp
Changde 21098 Anxiang County 1.869074 0.1005619 Anxiang Changde 31935 5517.2 3541.0 243.64 1779.5 12482.0 23667 5108.9 2806.9 7693.7 1931 336.39 270.5 205.9 19.584 17.819 148.1 135.4 53 346.0 3957.9 528.3 4524.41 14100 16610 0.6549309 0.8041262
Changde 21100 Hanshou County 2.360691 0.1997875 Hanshou Changde 32265 7979.0 8665.0 386.13 2062.4 15788.0 20981 13491.0 4550.0 8269.9 2560 456.78 388.8 246.7 42.097 33.029 240.2 208.7 95 553.2 4460.5 804.6 6545.35 17727 18925 0.6875466 0.8511756
Changde 21101 Jinshi County City 1.425620 0.0530241 Jinshi Changde 28692 4581.7 4777.0 373.31 1148.4 8706.9 34592 10935.0 2242.0 8169.9 848 122.78 82.1 61.7 8.723 7.592 81.9 43.7 77 92.4 3683.0 251.8 2562.46 7525 19498 0.3669579 0.6686757
Changde 21102 Li County 3.474324 0.1890812 Li Changde 32541 13487.0 16066.0 709.61 2459.5 20322.0 24473 18402.0 6748.0 8377.0 2038 513.44 426.8 227.1 38.975 33.938 268.5 256.0 96 539.7 7110.2 832.5 7562.34 53160 18985 0.6482883 0.8312558
Changde 21103 Linli County 2.289506 0.1145036 Linli Changde 32667 564.1 7781.2 336.86 1538.7 10355.0 25554 8214.0 358.0 8143.1 1440 307.36 272.2 100.8 23.286 18.943 129.1 157.2 99 246.6 3604.9 409.3 3583.91 7031 18604 0.6024921 0.8856065

3.4 Visualizing Regional Development Indicator

We use qtm() function from tmap package, it stands for quick thematic map and is an easy way to plot maps.

In this map, we plot the “GPDPC” variable which is the GPD per capita of each county because it relates to the economy and economy relates to the development of the county.

qtm(hunan, "GDPPC")

DIY: With reference to the figure above, define the null (H0) and alternative (H1) hypotheses. Select the confident interval for the statistical test.

H0: There distribution of GDPPC per county are randomly distributed

H1: The distribution of GDPPC per county are not randomly distributed, there are clusters (positive autocorrelation) or outliers (negative autocorrelation)

Significance/confidence interval: 95% or 0.05

4. Modelling Spatial Neighbours

In class, we have learnt about contiguity (common boundary) based neighbours and how to create spatial weights for either distance-based neighbours or adjacency-based neighbours.

In this section, we will use poly2nb() function of spded package to compute contiguity weight matrices for the study area.

This function builds a neighbours list based on regions with contiguous boundaries.

Default ‘queen’ argument is TRUE so if we do not specify, it will return a list of first order neighbours using the Queen criteria

4.1 Creating Queen Contiguity based neighbours

wm_q <- poly2nb(hunan, queen = TRUE)

summary(wm_q)
## Neighbour list object:
## Number of regions: 88 
## Number of nonzero links: 448 
## Percentage nonzero weights: 5.785124 
## Average number of links: 5.090909 
## Link number distribution:
## 
##  1  2  3  4  5  6  7  8  9 11 
##  2  2 12 16 24 14 11  4  2  1 
## 2 least connected regions:
## 29 64 with 1 link
## 1 most connected region:
## 84 with 11 links

From the summary, we can find out that:

  • There are 88 area units/counties in Hunan

  • Most connected area unit has 11 neighbours which is shown by the first row 11 with the next row showing 1 meaning only 1 area unit has 11 neighbours

  • There are 2 area units with only 1 neighbour

wm_q[[1]]
## [1]  2  3  4 57 85

wm_q list all the neighbouring polygons of the ID we call. So for polygon ID 1: 2,3,4,57,85 are all its neighbours.

So, what is polygon ID 1 and which area units are the 5 neighbours?

  • We can find out by calling the NAME_3 variable using $ and subsetting it by calling the ID 1
hunan$NAME_3[1]
## [1] "Anxiang"
hunan$NAME_3[c(2,3,4,57,85)]
## [1] "Hanshou" "Jinshi"  "Li"      "Nan"     "Taoyuan"

Thus, Polygon ID 1 is Anxiang and Anxiang has 5 neighbours, namely Hanshou, Jinshi, Li, Nan, Taoyuan.

str(wm_q)
## List of 88
##  $ : int [1:5] 2 3 4 57 85
##  $ : int [1:5] 1 57 58 78 85
##  $ : int [1:4] 1 4 5 85
##  $ : int [1:4] 1 3 5 6
##  $ : int [1:4] 3 4 6 85
##  $ : int [1:5] 4 5 69 75 85
##  $ : int [1:4] 67 71 74 84
##  $ : int [1:7] 9 46 47 56 78 80 86
##  $ : int [1:6] 8 66 68 78 84 86
##  $ : int [1:8] 16 17 19 20 22 70 72 73
##  $ : int [1:3] 14 17 72
##  $ : int [1:5] 13 60 61 63 83
##  $ : int [1:4] 12 15 60 83
##  $ : int [1:3] 11 15 17
##  $ : int [1:4] 13 14 17 83
##  $ : int [1:5] 10 17 22 72 83
##  $ : int [1:7] 10 11 14 15 16 72 83
##  $ : int [1:5] 20 22 23 77 83
##  $ : int [1:6] 10 20 21 73 74 86
##  $ : int [1:7] 10 18 19 21 22 23 82
##  $ : int [1:5] 19 20 35 82 86
##  $ : int [1:5] 10 16 18 20 83
##  $ : int [1:7] 18 20 38 41 77 79 82
##  $ : int [1:5] 25 28 31 32 54
##  $ : int [1:5] 24 28 31 33 81
##  $ : int [1:4] 27 33 42 81
##  $ : int [1:3] 26 29 42
##  $ : int [1:5] 24 25 33 49 54
##  $ : int [1:3] 27 37 42
##  $ : int 33
##  $ : int [1:8] 24 25 32 36 39 40 56 81
##  $ : int [1:8] 24 31 50 54 55 56 75 85
##  $ : int [1:5] 25 26 28 30 81
##  $ : int [1:3] 36 45 80
##  $ : int [1:6] 21 41 47 80 82 86
##  $ : int [1:6] 31 34 40 45 56 80
##  $ : int [1:4] 29 42 43 44
##  $ : int [1:4] 23 44 77 79
##  $ : int [1:5] 31 40 42 43 81
##  $ : int [1:6] 31 36 39 43 45 79
##  $ : int [1:6] 23 35 45 79 80 82
##  $ : int [1:7] 26 27 29 37 39 43 81
##  $ : int [1:6] 37 39 40 42 44 79
##  $ : int [1:4] 37 38 43 79
##  $ : int [1:6] 34 36 40 41 79 80
##  $ : int [1:3] 8 47 86
##  $ : int [1:5] 8 35 46 80 86
##  $ : int [1:5] 50 51 52 53 55
##  $ : int [1:4] 28 51 52 54
##  $ : int [1:5] 32 48 52 54 55
##  $ : int [1:3] 48 49 52
##  $ : int [1:5] 48 49 50 51 54
##  $ : int [1:3] 48 55 75
##  $ : int [1:6] 24 28 32 49 50 52
##  $ : int [1:5] 32 48 50 53 75
##  $ : int [1:7] 8 31 32 36 78 80 85
##  $ : int [1:6] 1 2 58 64 76 85
##  $ : int [1:5] 2 57 68 76 78
##  $ : int [1:4] 60 61 87 88
##  $ : int [1:4] 12 13 59 61
##  $ : int [1:7] 12 59 60 62 63 77 87
##  $ : int [1:3] 61 77 87
##  $ : int [1:4] 12 61 77 83
##  $ : int [1:2] 57 76
##  $ : int 76
##  $ : int [1:5] 9 67 68 76 84
##  $ : int [1:4] 7 66 76 84
##  $ : int [1:5] 9 58 66 76 78
##  $ : int [1:3] 6 75 85
##  $ : int [1:3] 10 72 73
##  $ : int [1:3] 7 73 74
##  $ : int [1:5] 10 11 16 17 70
##  $ : int [1:5] 10 19 70 71 74
##  $ : int [1:6] 7 19 71 73 84 86
##  $ : int [1:6] 6 32 53 55 69 85
##  $ : int [1:7] 57 58 64 65 66 67 68
##  $ : int [1:7] 18 23 38 61 62 63 83
##  $ : int [1:7] 2 8 9 56 58 68 85
##  $ : int [1:7] 23 38 40 41 43 44 45
##  $ : int [1:8] 8 34 35 36 41 45 47 56
##  $ : int [1:6] 25 26 31 33 39 42
##  $ : int [1:5] 20 21 23 35 41
##  $ : int [1:9] 12 13 15 16 17 18 22 63 77
##  $ : int [1:6] 7 9 66 67 74 86
##  $ : int [1:11] 1 2 3 5 6 32 56 57 69 75 ...
##  $ : int [1:9] 8 9 19 21 35 46 47 74 84
##  $ : int [1:4] 59 61 62 88
##  $ : int [1:2] 59 87
##  - attr(*, "class")= chr "nb"
##  - attr(*, "region.id")= chr [1:88] "0" "1" "2" "3" ...
##  - attr(*, "call")= language poly2nb(pl = hunan, queen = TRUE)
##  - attr(*, "type")= chr "queen"
##  - attr(*, "sym")= logi TRUE

As you can see from above, the ID is not printed so you need to know that the last row which is “int [1:2] 59 87” is the last observation and it has 2 neighbours, 59 and 87.

There are some observations that has only 1 neighbour as shown by “int 76”.

4.2 Creating Rook contiguity based neighbours

wm_r <- poly2nb(hunan, queen = FALSE)

summary(wm_r)
## Neighbour list object:
## Number of regions: 88 
## Number of nonzero links: 440 
## Percentage nonzero weights: 5.681818 
## Average number of links: 5 
## Link number distribution:
## 
##  1  2  3  4  5  6  7  8  9 10 
##  2  2 12 20 21 14 11  3  2  1 
## 2 least connected regions:
## 29 64 with 1 link
## 1 most connected region:
## 84 with 10 links

From the summary, we can find out that:

  • There are 88 area units/counties in Hunan

  • Most connected area unit has 10 neighbours which is shown by the first row 10 with the next row showing 1 meaning only 1 area unit has 10 neighbours

  • There are 2 area units with only 1 neighbour

wm_r[[2]]
## [1] 57 58 78 85

So for polygon ID 2: 57,58, 78, 85 are all its neighbours.

So, what is polygon ID 2 and which area units are the 4 neighbours?

hunan$NAME_3[2]
## [1] "Hanshou"
hunan$NAME_3[c(57,58,78,85)]
## [1] "Nan"       "Yuanjiang" "Taojiang"  "Taoyuan"

Thus, Polygon ID 2 is Hanshou and Hanshou has 4 neighbours, namely Nan, Yuanjiang, Taojiang, Taoyuan.

4.3 Plotting Queen contiguity based neighbours maps

plot(hunan, borders = 'lightgrey') 
plot(wm_q, coordinates(hunan), pch = 19, cex = 0.6, add = TRUE, col = "red")

4.4 Plotting Rook contiguity based neighbours maps

plot(hunan, borders = 'lightgrey') 
plot(wm_r, coordinates(hunan), pch = 19, cex = 0.6, add = TRUE, col = "red")

Plotting the 2 maps next to each other

#par(mfrow = c(1,2))

plot(hunan, borders = 'lightgrey') 
plot(wm_q, coordinates(hunan), pch = 19, cex = 0.6,add = TRUE, col = "red")
plot(wm_r, coordinates(hunan), pch = 19, cex = 0.6, add = TRUE, col = "blue")

As we know that ROOK contiguity neighbours only take the direct shared boundary neighbours so they will have less neighbours edges seen on the map plot. This is shown by the red edges, that are the QUEEN contiguity neighbours, that are not covered in blue meaning that those edges are present in QUEEN but not in ROOK.

4.5 Computing distance based neighbours

We will derive distance-based weight matrices by using dnearneigh() function of spdep package.

The function identifies neighbours of area points by Euclidean distance with a distance range with lower d1 and upper d2 controlled by the bounds= argument.

If unprojected coordinates are used and either specified in the coordinates object x or with x as a two column matrix & longlat = TRUE, great circle distances in km will be calculated assuming the WGS84 reference ellipsoid.

4.5.1 Determine the cut-off distance

To determine the upper limit for distance band:

  • Return a matrix with the indices of points belonging to the set of the k nearest neighbours of each other by using knearneigh() function of spdep package

  • Convert the knn object returned by knearneigh() function into a neighbours list of class nb with a list of integer vector containing neighbour region number ids by using knn2nb() function

  • Return the length of neighbour relationship edges by using nbdlists() function to return the units of the coorinates if the coordinates are projected, in km otherwise

  • Remove the list structure of the returned object by using unlist() function

coords <- coordinates(hunan)
head(coords)
##       [,1]     [,2]
## 0 112.1531 29.44362
## 1 112.0372 28.86489
## 2 111.8917 29.47107
## 3 111.7031 29.74499
## 4 111.6138 29.49258
## 5 111.0341 29.79863
k1 <- knn2nb(knearneigh(coords))
k1dists <- unlist(nbdists(k1, coords, longlat = TRUE))
summary(k1dists)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   24.79   32.57   38.01   39.07   44.52   61.79

The summary shows that the largest first nearest neighbour distance is 61.79 km, so using this as the upper band/threshold gives certainty that all units will have at least 1 neighbour because it’s the max distance.

4.5.2 Computing fixed distance weight matrix

We will derive distance-based weight matrices by using dnearneigh() function of spdep package.

wm_d62 <- dnearneigh(coords, 0 ,62, longlat = TRUE)
wm_d62
## Neighbour list object:
## Number of regions: 88 
## Number of nonzero links: 324 
## Percentage nonzero weights: 4.183884 
## Average number of links: 3.681818

4.5.3 Plotting fixed distance weight matrix

Plotting the base map of hunan, k1 (computed by knearneigh() function which return the links of 1st nearest neighbour) and wm_d62 (the links of neighbours within 62 km) separately to see the difference.

par(mfrow = c(1,2))

plot(hunan, border = "lightgrey")
plot(k1, coords, add = TRUE, col = "red", length = 0.08)
title(main = "Links of 1st nearest neighbour", cex.main = 0.9)
plot(hunan, border = "lightgrey")
plot(wm_d62, coords, add = TRUE, pch = 19, cex = 0.6)
title(main = "Neighbours within 62 km", cex.main = 0.9)

Plotting them together.

plot(hunan, border = 'lightgrey')
plot(wm_d62, coords, add = TRUE)
plot(k1, coords, add = TRUE, col = 'red', length = 0.08)

4.6 Computing adaptive distance weight matrix

Fixed distance weight matrix’s characteristic is that more densely settled areas (urban areas) tend to have more neighbours and the less densely settled areas (rural counties) tend to have lesser neighbours.

It’s possible to control the number of neighbours directly by using k-nearest neighbours knearneigh() function, either accepting asymmetric neighbours or imposing symmetry as shown below.

knn6 <- knn2nb(knearneigh(coords, k =6))
knn6
## Neighbour list object:
## Number of regions: 88 
## Number of nonzero links: 528 
## Percentage nonzero weights: 6.818182 
## Average number of links: 6 
## Non-symmetric neighbours list

Notice that all of the counties have 6 neighbours, not more than 6 or less than 6.

str(knn6)
## List of 88
##  $ : int [1:6] 2 3 4 5 57 64
##  $ : int [1:6] 1 3 57 58 78 85
##  $ : int [1:6] 1 2 4 5 57 85
##  $ : int [1:6] 1 3 5 6 69 85
##  $ : int [1:6] 1 3 4 6 69 85
##  $ : int [1:6] 3 4 5 69 75 85
##  $ : int [1:6] 9 66 67 71 74 84
##  $ : int [1:6] 9 46 47 78 80 86
##  $ : int [1:6] 8 46 66 68 84 86
##  $ : int [1:6] 16 19 22 70 72 73
##  $ : int [1:6] 10 14 16 17 70 72
##  $ : int [1:6] 13 15 60 61 63 83
##  $ : int [1:6] 12 15 60 61 63 83
##  $ : int [1:6] 11 15 16 17 72 83
##  $ : int [1:6] 12 13 14 17 60 83
##  $ : int [1:6] 10 11 17 22 72 83
##  $ : int [1:6] 10 11 14 16 72 83
##  $ : int [1:6] 20 22 23 63 77 83
##  $ : int [1:6] 10 20 21 73 74 82
##  $ : int [1:6] 18 19 21 22 23 82
##  $ : int [1:6] 19 20 35 74 82 86
##  $ : int [1:6] 10 16 18 19 20 83
##  $ : int [1:6] 18 20 41 77 79 82
##  $ : int [1:6] 25 28 31 52 54 81
##  $ : int [1:6] 24 28 31 33 54 81
##  $ : int [1:6] 25 27 29 33 42 81
##  $ : int [1:6] 26 29 30 37 42 81
##  $ : int [1:6] 24 25 33 49 52 54
##  $ : int [1:6] 26 27 37 42 43 81
##  $ : int [1:6] 26 27 28 33 49 81
##  $ : int [1:6] 24 25 36 39 40 54
##  $ : int [1:6] 24 31 50 54 55 56
##  $ : int [1:6] 25 26 28 30 49 81
##  $ : int [1:6] 36 40 41 45 56 80
##  $ : int [1:6] 21 41 46 47 80 82
##  $ : int [1:6] 31 34 40 45 56 80
##  $ : int [1:6] 26 27 29 42 43 44
##  $ : int [1:6] 23 43 44 62 77 79
##  $ : int [1:6] 25 40 42 43 44 81
##  $ : int [1:6] 31 36 39 43 45 79
##  $ : int [1:6] 23 35 45 79 80 82
##  $ : int [1:6] 26 27 37 39 43 81
##  $ : int [1:6] 37 39 40 42 44 79
##  $ : int [1:6] 37 38 39 42 43 79
##  $ : int [1:6] 34 36 40 41 79 80
##  $ : int [1:6] 8 9 35 47 78 86
##  $ : int [1:6] 8 21 35 46 80 86
##  $ : int [1:6] 49 50 51 52 53 55
##  $ : int [1:6] 28 33 48 51 52 54
##  $ : int [1:6] 32 48 51 52 54 55
##  $ : int [1:6] 28 48 49 50 52 54
##  $ : int [1:6] 28 48 49 50 51 54
##  $ : int [1:6] 48 50 51 52 55 75
##  $ : int [1:6] 24 28 49 50 51 52
##  $ : int [1:6] 32 48 50 52 53 75
##  $ : int [1:6] 32 34 36 78 80 85
##  $ : int [1:6] 1 2 3 58 64 68
##  $ : int [1:6] 2 57 64 66 68 78
##  $ : int [1:6] 12 13 60 61 87 88
##  $ : int [1:6] 12 13 59 61 63 87
##  $ : int [1:6] 12 13 60 62 63 87
##  $ : int [1:6] 12 38 61 63 77 87
##  $ : int [1:6] 12 18 60 61 62 83
##  $ : int [1:6] 1 3 57 58 68 76
##  $ : int [1:6] 58 64 66 67 68 76
##  $ : int [1:6] 9 58 67 68 76 84
##  $ : int [1:6] 7 65 66 68 76 84
##  $ : int [1:6] 9 57 58 66 78 84
##  $ : int [1:6] 4 5 6 32 75 85
##  $ : int [1:6] 10 16 19 22 72 73
##  $ : int [1:6] 7 19 73 74 84 86
##  $ : int [1:6] 10 11 14 16 17 70
##  $ : int [1:6] 10 19 21 70 71 74
##  $ : int [1:6] 19 21 71 73 84 86
##  $ : int [1:6] 6 32 50 53 55 69
##  $ : int [1:6] 58 64 65 66 67 68
##  $ : int [1:6] 18 23 38 61 62 63
##  $ : int [1:6] 2 8 9 46 58 68
##  $ : int [1:6] 38 40 41 43 44 45
##  $ : int [1:6] 34 35 36 41 45 47
##  $ : int [1:6] 25 26 28 33 39 42
##  $ : int [1:6] 19 20 21 23 35 41
##  $ : int [1:6] 12 13 15 16 22 63
##  $ : int [1:6] 7 9 66 68 71 74
##  $ : int [1:6] 2 3 4 5 56 69
##  $ : int [1:6] 8 9 21 46 47 74
##  $ : int [1:6] 59 60 61 62 63 88
##  $ : int [1:6] 59 60 61 62 63 87
##  - attr(*, "region.id")= chr [1:88] "1" "2" "3" "4" ...
##  - attr(*, "call")= language knearneigh(x = coords, k = 6)
##  - attr(*, "sym")= logi FALSE
##  - attr(*, "type")= chr "knn"
##  - attr(*, "knn-k")= num 6
##  - attr(*, "class")= chr "nb"

4.6.1 Plotting distance based neighbours (adaptive)

plot(hunan, border = 'lightgrey')
plot(knn6, coords, pch = 19, cex = 0.6, add = TRUE, col = 'red')

4.7 Computing weights based on IDW (Inverse Distance Weights)

We will derive spatial weight matrices based on Inverse Distance method by using nbdists() function of spdep package.

dist <- nbdists(wm_q, coords, longlat = TRUE)
ids <- lapply(dist, function(x) 1/(x))
ids
## [[1]]
## [1] 0.01535405 0.03916350 0.01820896 0.02807922 0.01145113
## 
## [[2]]
## [1] 0.01535405 0.01764308 0.01925924 0.02323898 0.01719350
## 
## [[3]]
## [1] 0.03916350 0.02822040 0.03695795 0.01395765
## 
## [[4]]
## [1] 0.01820896 0.02822040 0.03414741 0.01539065
## 
## [[5]]
## [1] 0.03695795 0.03414741 0.01524598 0.01618354
## 
## [[6]]
## [1] 0.015390649 0.015245977 0.021748129 0.011883901 0.009810297
## 
## [[7]]
## [1] 0.01708612 0.01473997 0.01150924 0.01872915
## 
## [[8]]
## [1] 0.02022144 0.03453056 0.02529256 0.01036340 0.02284457 0.01500600 0.01515314
## 
## [[9]]
## [1] 0.02022144 0.01574888 0.02109502 0.01508028 0.02902705 0.01502980
## 
## [[10]]
## [1] 0.02281552 0.01387777 0.01538326 0.01346650 0.02100510 0.02631658 0.01874863
## [8] 0.01500046
## 
## [[11]]
## [1] 0.01882869 0.02243492 0.02247473
## 
## [[12]]
## [1] 0.02779227 0.02419652 0.02333385 0.02986130 0.02335429
## 
## [[13]]
## [1] 0.02779227 0.02650020 0.02670323 0.01714243
## 
## [[14]]
## [1] 0.01882869 0.01233868 0.02098555
## 
## [[15]]
## [1] 0.02650020 0.01233868 0.01096284 0.01562226
## 
## [[16]]
## [1] 0.02281552 0.02466962 0.02765018 0.01476814 0.01671430
## 
## [[17]]
## [1] 0.01387777 0.02243492 0.02098555 0.01096284 0.02466962 0.01593341 0.01437996
## 
## [[18]]
## [1] 0.02039779 0.02032767 0.01481665 0.01473691 0.01459380
## 
## [[19]]
## [1] 0.01538326 0.01926323 0.02668415 0.02140253 0.01613589 0.01412874
## 
## [[20]]
## [1] 0.01346650 0.02039779 0.01926323 0.01723025 0.02153130 0.01469240 0.02327034
## 
## [[21]]
## [1] 0.02668415 0.01723025 0.01766299 0.02644986 0.02163800
## 
## [[22]]
## [1] 0.02100510 0.02765018 0.02032767 0.02153130 0.01489296
## 
## [[23]]
## [1] 0.01481665 0.01469240 0.01401432 0.02246233 0.01880425 0.01530458 0.01849605
## 
## [[24]]
## [1] 0.02354598 0.01837201 0.02607264 0.01220154 0.02514180
## 
## [[25]]
## [1] 0.02354598 0.02188032 0.01577283 0.01949232 0.02947957
## 
## [[26]]
## [1] 0.02166102 0.01743220 0.02207751 0.02209114
## 
## [[27]]
## [1] 0.02166102 0.02490625 0.01562326
## 
## [[28]]
## [1] 0.01837201 0.02188032 0.02229549 0.03076171 0.02039506
## 
## [[29]]
## [1] 0.02490625 0.01686587 0.01395022
## 
## [[30]]
## [1] 0.02090587
## 
## [[31]]
## [1] 0.02607264 0.01577283 0.01219005 0.01724850 0.01229012 0.01609781 0.01139438
## [8] 0.01150130
## 
## [[32]]
## [1] 0.01220154 0.01219005 0.01712515 0.01340413 0.01280928 0.01198216 0.01053374
## [8] 0.01065655
## 
## [[33]]
## [1] 0.01949232 0.01743220 0.02229549 0.02090587 0.01979045
## 
## [[34]]
## [1] 0.03113041 0.03589551 0.02882915
## 
## [[35]]
## [1] 0.01766299 0.02185795 0.02616766 0.02111721 0.02108253 0.01509020
## 
## [[36]]
## [1] 0.01724850 0.03113041 0.01571707 0.01860991 0.02073549 0.01680129
## 
## [[37]]
## [1] 0.01686587 0.02234793 0.01510990 0.01550676
## 
## [[38]]
## [1] 0.01401432 0.02407426 0.02276151 0.01719415
## 
## [[39]]
## [1] 0.01229012 0.02172543 0.01711924 0.02629732 0.01896385
## 
## [[40]]
## [1] 0.01609781 0.01571707 0.02172543 0.01506473 0.01987922 0.01894207
## 
## [[41]]
## [1] 0.02246233 0.02185795 0.02205991 0.01912542 0.01601083 0.01742892
## 
## [[42]]
## [1] 0.02207751 0.01562326 0.01395022 0.02234793 0.01711924 0.01836831 0.01683518
## 
## [[43]]
## [1] 0.01510990 0.02629732 0.01506473 0.01836831 0.03112027 0.01530782
## 
## [[44]]
## [1] 0.01550676 0.02407426 0.03112027 0.01486508
## 
## [[45]]
## [1] 0.03589551 0.01860991 0.01987922 0.02205991 0.02107101 0.01982700
## 
## [[46]]
## [1] 0.03453056 0.04033752 0.02689769
## 
## [[47]]
## [1] 0.02529256 0.02616766 0.04033752 0.01949145 0.02181458
## 
## [[48]]
## [1] 0.02313819 0.03370576 0.02289485 0.01630057 0.01818085
## 
## [[49]]
## [1] 0.03076171 0.02138091 0.02394529 0.01990000
## 
## [[50]]
## [1] 0.01712515 0.02313819 0.02551427 0.02051530 0.02187179
## 
## [[51]]
## [1] 0.03370576 0.02138091 0.02873854
## 
## [[52]]
## [1] 0.02289485 0.02394529 0.02551427 0.02873854 0.03516672
## 
## [[53]]
## [1] 0.01630057 0.01979945 0.01253977
## 
## [[54]]
## [1] 0.02514180 0.02039506 0.01340413 0.01990000 0.02051530 0.03516672
## 
## [[55]]
## [1] 0.01280928 0.01818085 0.02187179 0.01979945 0.01882298
## 
## [[56]]
## [1] 0.01036340 0.01139438 0.01198216 0.02073549 0.01214479 0.01362855 0.01341697
## 
## [[57]]
## [1] 0.028079221 0.017643082 0.031423501 0.029114131 0.013520292 0.009903702
## 
## [[58]]
## [1] 0.01925924 0.03142350 0.02722997 0.01434859 0.01567192
## 
## [[59]]
## [1] 0.01696711 0.01265572 0.01667105 0.01785036
## 
## [[60]]
## [1] 0.02419652 0.02670323 0.01696711 0.02343040
## 
## [[61]]
## [1] 0.02333385 0.01265572 0.02343040 0.02514093 0.02790764 0.01219751 0.02362452
## 
## [[62]]
## [1] 0.02514093 0.02002219 0.02110260
## 
## [[63]]
## [1] 0.02986130 0.02790764 0.01407043 0.01805987
## 
## [[64]]
## [1] 0.02911413 0.01689892
## 
## [[65]]
## [1] 0.02471705
## 
## [[66]]
## [1] 0.01574888 0.01726461 0.03068853 0.01954805 0.01810569
## 
## [[67]]
## [1] 0.01708612 0.01726461 0.01349843 0.01361172
## 
## [[68]]
## [1] 0.02109502 0.02722997 0.03068853 0.01406357 0.01546511
## 
## [[69]]
## [1] 0.02174813 0.01645838 0.01419926
## 
## [[70]]
## [1] 0.02631658 0.01963168 0.02278487
## 
## [[71]]
## [1] 0.01473997 0.01838483 0.03197403
## 
## [[72]]
## [1] 0.01874863 0.02247473 0.01476814 0.01593341 0.01963168
## 
## [[73]]
## [1] 0.01500046 0.02140253 0.02278487 0.01838483 0.01652709
## 
## [[74]]
## [1] 0.01150924 0.01613589 0.03197403 0.01652709 0.01342099 0.02864567
## 
## [[75]]
## [1] 0.011883901 0.010533736 0.012539774 0.018822977 0.016458383 0.008217581
## 
## [[76]]
## [1] 0.01352029 0.01434859 0.01689892 0.02471705 0.01954805 0.01349843 0.01406357
## 
## [[77]]
## [1] 0.014736909 0.018804247 0.022761507 0.012197506 0.020022195 0.014070428
## [7] 0.008440896
## 
## [[78]]
## [1] 0.02323898 0.02284457 0.01508028 0.01214479 0.01567192 0.01546511 0.01140779
## 
## [[79]]
## [1] 0.01530458 0.01719415 0.01894207 0.01912542 0.01530782 0.01486508 0.02107101
## 
## [[80]]
## [1] 0.01500600 0.02882915 0.02111721 0.01680129 0.01601083 0.01982700 0.01949145
## [8] 0.01362855
## 
## [[81]]
## [1] 0.02947957 0.02209114 0.01150130 0.01979045 0.01896385 0.01683518
## 
## [[82]]
## [1] 0.02327034 0.02644986 0.01849605 0.02108253 0.01742892
## 
## [[83]]
## [1] 0.023354289 0.017142433 0.015622258 0.016714303 0.014379961 0.014593799
## [7] 0.014892965 0.018059871 0.008440896
## 
## [[84]]
## [1] 0.01872915 0.02902705 0.01810569 0.01361172 0.01342099 0.01297994
## 
## [[85]]
##  [1] 0.011451133 0.017193502 0.013957649 0.016183544 0.009810297 0.010656545
##  [7] 0.013416965 0.009903702 0.014199260 0.008217581 0.011407794
## 
## [[86]]
## [1] 0.01515314 0.01502980 0.01412874 0.02163800 0.01509020 0.02689769 0.02181458
## [8] 0.02864567 0.01297994
## 
## [[87]]
## [1] 0.01667105 0.02362452 0.02110260 0.02058034
## 
## [[88]]
## [1] 0.01785036 0.02058034

For IDW, we can see that by inversing the distance, the larger distances get scaled down and the smaller distances get scaled up. All the values are also in decimals because we divide 1 by the value.

4.8 Row-standardised weights matrix

The most intuitive way to summarize the neighbours’ values:

  • assigning the fraction 1/(no. of neighbours) to each neighbouring county

  • summing the weighted income values

However, the polygons along the edges of the study area will base their lagged values on fewer polygona thus potentially over/under-estimating the true nature of the spatial autocorrelation.

So, we will use the nb2listw() function of spdep package which will supplement a neighbours list with spatial weights for the chosen coding scheme (style argument).

rswm_q <- nb2listw(wm_q, style = "W", zero.policy = TRUE)
rswm_q
## Characteristics of weights list object:
## Neighbour list object:
## Number of regions: 88 
## Number of nonzero links: 448 
## Percentage nonzero weights: 5.785124 
## Average number of links: 5.090909 
## 
## Weights style: W 
## Weights constants summary:
##    n   nn S0       S1       S2
## W 88 7744 88 37.86334 365.9147

The ‘zero.policy=TRUE’ argument allows for lists of non-neighbours.

Style=“W” means neighbouring polygon will be assigned equal weight, it’s row standardised (sums over all links to n)

Style=“B” means basic binary coding which is a more robust option.

To see the weight of the first polygon (polygon ID 1)’s four neighbours:

rswm_q$weights[1]
## [[1]]
## [1] 0.2 0.2 0.2 0.2 0.2

Each neighbour is assigned a 0.2 of the total weight. This means that when R computes the average neighbouring income values, each neighbour’s income will be multiplied by 0.2 before being tallied.

We will try to derive the row standardised distance weight matrix using the style B, basic binary coding.

rswm_ids <- nb2listw(wm_q, glist = ids, style = "B", zero.policy = TRUE)
rswm_ids
## Characteristics of weights list object:
## Neighbour list object:
## Number of regions: 88 
## Number of nonzero links: 448 
## Percentage nonzero weights: 5.785124 
## Average number of links: 5.090909 
## 
## Weights style: B 
## Weights constants summary:
##    n   nn       S0        S1       S2
## B 88 7744 8.786712 0.3776401 3.813528

To see the row standardised + inversed distance weight of the first polygon (polygon ID 1)’s four neighbours:

rswm_ids$weights[1]
## [[1]]
## [1] 0.01535405 0.03916350 0.01820896 0.02807922 0.01145113
summary(unlist(rswm_ids$weights))
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
## 0.008218 0.015088 0.018739 0.019613 0.022823 0.040338

5. Application of Spatial Weight Matrix

We will create 4 different spatial lagged variables:

  • spatialy lagged variables with row-standardized weights and sum of neighbouring values

  • spatial window average, and spatial window sum

What is a spatially lagged variable?

With a neighbor structure defined by the non-zero elements of the spatial weights matrix W, a spatially lagged variable is a weighted sum or a weighted average of the neighbouring values for that variable.

5.2 Spatial lag w/ row-standardized weights

We will use lag.listw() or lag() function of spdep package to compute the lag vector var (numeric vector of the same length as the listw object rswm_q)

GDPPC.lag <- lag.listw(rswm_q, hunan$GDPPC)

rswm_q$neighbours[1]
## [[1]]
## [1]  2  3  4 57 85
rswm_q$weights[1]
## [[1]]
## [1] 0.2 0.2 0.2 0.2 0.2
GDPPC.lag[1]
## [1] 24847.2

This means that 24847.20 is the value of tallying the first polygon’s (Anxiang) weighted GDPPC values so it would be (0.2xGDPPC of polygon 2 + 0.2xGDPPC of polygon 3 + 0.2xGDPPC of polygon 4 + 0.2xGDPPC of polygon 57 + 0.2xGDPPC of polygon 85)

We will append the spatially lag GDPPC values into hunan SpatialPolygonDataFrame by subsetting the list and turning it into data frame before finally using left_join() function of dplyr package to add the data to hunan.

lag.list <- list(hunan$NAME_3, lag.listw(rswm_q, hunan$GDPPC))
lag.res <- as.data.frame(lag.list)
colnames(lag.res) <- c("NAME_3", "lag GDPPC (queen)")

head(lag.res)
##    NAME_3 lag GDPPC (queen)
## 1 Anxiang          24847.20
## 2 Hanshou          22724.80
## 3  Jinshi          24143.25
## 4      Li          27737.50
## 5   Linli          27270.25
## 6  Shimen          21248.80
hunan@data <- left_join(hunan@data, lag.res)
## Joining, by = "NAME_3"
head(hunan@data)
##    NAME_2  ID_3  NAME_3   ENGTYPE_3 Shape_Leng Shape_Area  County    City
## 1 Changde 21098 Anxiang      County   1.869074 0.10056190 Anxiang Changde
## 2 Changde 21100 Hanshou      County   2.360691 0.19978745 Hanshou Changde
## 3 Changde 21101  Jinshi County City   1.425620 0.05302413  Jinshi Changde
## 4 Changde 21102      Li      County   3.474325 0.18908121      Li Changde
## 5 Changde 21103   Linli      County   2.289506 0.11450357   Linli Changde
## 6 Changde 21104  Shimen      County   4.171918 0.37194707  Shimen Changde
##   avg_wage deposite     FAI Gov_Rev Gov_Exp     GDP GDPPC     GIO   Loan  NIPCR
## 1    31935   5517.2  3541.0  243.64  1779.5 12482.0 23667  5108.9 2806.9 7693.7
## 2    32265   7979.0  8665.0  386.13  2062.4 15788.0 20981 13491.0 4550.0 8269.9
## 3    28692   4581.7  4777.0  373.31  1148.4  8706.9 34592 10935.0 2242.0 8169.9
## 4    32541  13487.0 16066.0  709.61  2459.5 20322.0 24473 18402.0 6748.0 8377.0
## 5    32667    564.1  7781.2  336.86  1538.7 10355.0 25554  8214.0  358.0 8143.1
## 6    33261   8334.4 10531.0  548.33  2178.8 16293.0 27137 17795.0 6026.5 6156.0
##    Bed    Emp  EmpR EmpRT Pri_Stu Sec_Stu Household Household_R NOIP Pop_R
## 1 1931 336.39 270.5 205.9  19.584  17.819     148.1       135.4   53 346.0
## 2 2560 456.78 388.8 246.7  42.097  33.029     240.2       208.7   95 553.2
## 3  848 122.78  82.1  61.7   8.723   7.592      81.9        43.7   77  92.4
## 4 2038 513.44 426.8 227.1  38.975  33.938     268.5       256.0   96 539.7
## 5 1440 307.36 272.2 100.8  23.286  18.943     129.1       157.2   99 246.6
## 6 2502 392.05 329.6 193.8  29.245  26.104     190.6       184.7  122 399.2
##     RSCG Pop_T    Agri Service Disp_Inc      RORP    ROREmp lag GDPPC (queen)
## 1 3957.9 528.3 4524.41   14100    16610 0.6549309 0.8041262          24847.20
## 2 4460.5 804.6 6545.35   17727    18925 0.6875466 0.8511756          22724.80
## 3 3683.0 251.8 2562.46    7525    19498 0.3669579 0.6686757          24143.25
## 4 7110.2 832.5 7562.34   53160    18985 0.6482883 0.8312558          27737.50
## 5 3604.9 409.3 3583.91    7031    18604 0.6024921 0.8856065          27270.25
## 6 6490.7 600.5 5266.51    6981    19275 0.6647794 0.8407091          21248.80

Next, we will plot both the original GDPPC value and the spatially lagged GDPPC values for comparison.

gdppc <- qtm(hunan, "GDPPC")
lag_gdppc <- qtm(hunan, "lag GDPPC (queen)")
tmap_arrange(gdppc, lag_gdppc, asp = 1, ncol =2)

5.2 Spatial window sum

The spatial window sum uses and includes the diagonal element. We will assign knn6 to a new variable to alter its structure to add the diagonal elements by using the include.self() function of spdep package which

knn6a <- knn6

include.self(knn6a)
## Neighbour list object:
## Number of regions: 88 
## Number of nonzero links: 616 
## Percentage nonzero weights: 7.954545 
## Average number of links: 7 
## Non-symmetric neighbours list

Let’s see the difference.

plot(hunan, border = 'lightgrey')

plot(include.self(knn6a), coords, pch = 19, cex = 0.6, col = 'blue', add = TRUE)
plot(knn6, coords, pch = 19, cex = 0.6, col = 'red', add = TRUE)

The spatial window sum is simply the sum of one area’s GDPPC value and its neighbours’ GDPPC value. When dealing with a binary variable, the spatial window sum corresponds to the number of events (observations with a value of 1) within the window centered on a location (including the value at that location). Thus, we will assign binary weights to the neighbour structure.

binary.knn6 <- lapply(knn6a, function(x) 0*x+1)
binary.knn6[1]
## [[1]]
## [1] 1 1 1 1 1 1

Next, we will use nb2listw() function

wm_knn6 <- nb2listw(knn6a, glist = binary.knn6, style = "B")

Then, we will compute the spatially lagged GDPPC values for knn6 (adaptive weight matrix of maximum 6 neighbours).

lag_knn6.list <- list(hunan$NAME_3, lag.listw(wm_knn6, hunan$GDPPC))
lag_knn6.res <- as.data.frame(lag_knn6.list)
colnames(lag_knn6.res) <- c("NAME_3", "lag GDPPC (adaptive distance)")

head(lag_knn6.res)
##    NAME_3 lag GDPPC (adaptive distance)
## 1 Anxiang                        157324
## 2 Hanshou                        148216
## 3  Jinshi                        138865
## 4      Li                        152543
## 5   Linli                        151462
## 6  Shimen                        140836
hunan@data <- left_join(hunan@data, lag_knn6.res)
## Joining, by = "NAME_3"
head(hunan@data)
##    NAME_2  ID_3  NAME_3   ENGTYPE_3 Shape_Leng Shape_Area  County    City
## 1 Changde 21098 Anxiang      County   1.869074 0.10056190 Anxiang Changde
## 2 Changde 21100 Hanshou      County   2.360691 0.19978745 Hanshou Changde
## 3 Changde 21101  Jinshi County City   1.425620 0.05302413  Jinshi Changde
## 4 Changde 21102      Li      County   3.474325 0.18908121      Li Changde
## 5 Changde 21103   Linli      County   2.289506 0.11450357   Linli Changde
## 6 Changde 21104  Shimen      County   4.171918 0.37194707  Shimen Changde
##   avg_wage deposite     FAI Gov_Rev Gov_Exp     GDP GDPPC     GIO   Loan  NIPCR
## 1    31935   5517.2  3541.0  243.64  1779.5 12482.0 23667  5108.9 2806.9 7693.7
## 2    32265   7979.0  8665.0  386.13  2062.4 15788.0 20981 13491.0 4550.0 8269.9
## 3    28692   4581.7  4777.0  373.31  1148.4  8706.9 34592 10935.0 2242.0 8169.9
## 4    32541  13487.0 16066.0  709.61  2459.5 20322.0 24473 18402.0 6748.0 8377.0
## 5    32667    564.1  7781.2  336.86  1538.7 10355.0 25554  8214.0  358.0 8143.1
## 6    33261   8334.4 10531.0  548.33  2178.8 16293.0 27137 17795.0 6026.5 6156.0
##    Bed    Emp  EmpR EmpRT Pri_Stu Sec_Stu Household Household_R NOIP Pop_R
## 1 1931 336.39 270.5 205.9  19.584  17.819     148.1       135.4   53 346.0
## 2 2560 456.78 388.8 246.7  42.097  33.029     240.2       208.7   95 553.2
## 3  848 122.78  82.1  61.7   8.723   7.592      81.9        43.7   77  92.4
## 4 2038 513.44 426.8 227.1  38.975  33.938     268.5       256.0   96 539.7
## 5 1440 307.36 272.2 100.8  23.286  18.943     129.1       157.2   99 246.6
## 6 2502 392.05 329.6 193.8  29.245  26.104     190.6       184.7  122 399.2
##     RSCG Pop_T    Agri Service Disp_Inc      RORP    ROREmp lag GDPPC (queen)
## 1 3957.9 528.3 4524.41   14100    16610 0.6549309 0.8041262          24847.20
## 2 4460.5 804.6 6545.35   17727    18925 0.6875466 0.8511756          22724.80
## 3 3683.0 251.8 2562.46    7525    19498 0.3669579 0.6686757          24143.25
## 4 7110.2 832.5 7562.34   53160    18985 0.6482883 0.8312558          27737.50
## 5 3604.9 409.3 3583.91    7031    18604 0.6024921 0.8856065          27270.25
## 6 6490.7 600.5 5266.51    6981    19275 0.6647794 0.8407091          21248.80
##   lag GDPPC (adaptive distance)
## 1                        157324
## 2                        148216
## 3                        138865
## 4                        152543
## 5                        151462
## 6                        140836
gdppc1 <- qtm(hunan, "GDPPC")
lag_gdppc1 <- qtm(hunan, "lag GDPPC (adaptive distance)")
tmap_arrange(gdppc1, lag_gdppc1, asp = 1, ncol =2)

6. Measure of Global Spatial Autocorrelation

We will perform Moran’s I statistics testing using moran.test() function of spdep package.

6.1 Moran’s I test

moran.test(hunan$GDPPC, listw = rswm_q, zero.policy = TRUE, na.action = na.omit)
## 
##  Moran I test under randomisation
## 
## data:  hunan$GDPPC  
## weights: rswm_q    
## 
## Moran I statistic standard deviate = 4.7351, p-value = 1.095e-06
## alternative hypothesis: greater
## sample estimates:
## Moran I statistic       Expectation          Variance 
##       0.300749970      -0.011494253       0.004348351

6.1.2 Computing Monte Carlo Moran’s I

Instead of using moran.test() function, we will perform permutation test for Moran’s I statistic by using moran.mc() function of spdep package.

set.seed(1234)
bperm = moran.mc(hunan$GDPPC, listw = rswm_q, nsim = 999, zero.policy = TRUE, na.action = na.omit)
bperm
## 
##  Monte-Carlo simulation of Moran I
## 
## data:  hunan$GDPPC 
## weights: rswm_q  
## number of simulations + 1: 1000 
## 
## statistic = 0.30075, observed rank = 1000, p-value = 0.001
## alternative hypothesis: greater

For both monte carlo simulation and normal Moran’s I statistics, both of the p-value are 0.001 and lower and this means that if confidence interval is 98%, it means that distribution of GDPPC values in the counties are not randomly distributed. Thus, since the Moran I statistics (0.3) is bigger than 0, the spatial pattern is “clustered”.

6.1.3 Visualizing Monte Carlo Moran’s I

summary(bperm$res[1:999])
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
## -0.18339 -0.06168 -0.02125 -0.01505  0.02611  0.27593
hist(bperm$res, freq = TRUE, breaks = 20, xlab = "Simulated Moran's I")

abline(v=0, col="red")

6.2 Geary’s C test

We will perform Geary’s C test for spatial autocorrelation analysis using geary.test() function of spdep package.

geary.test(hunan$GDPPC, listw = rswm_q)
## 
##  Geary C test under randomisation
## 
## data:  hunan$GDPPC 
## weights: rswm_q 
## 
## Geary C statistic standard deviate = 3.6108, p-value = 0.0001526
## alternative hypothesis: Expectation greater than statistic
## sample estimates:
## Geary C statistic       Expectation          Variance 
##         0.6907223         1.0000000         0.0073364

6.2.1 Computing Monte Carlo Geary’s C

We will perform permutation test for Geary’s C statistic by using geary.mc() function of spdep package.

set.seed(1234)
bperm = geary.mc(hunan$GDPPC, listw = rswm_q, nsim = 999)
bperm
## 
##  Monte-Carlo simulation of Geary C
## 
## data:  hunan$GDPPC 
## weights: rswm_q 
## number of simulations + 1: 1000 
## 
## statistic = 0.69072, observed rank = 1, p-value = 0.001
## alternative hypothesis: greater

For both monte carlo simulation and normal Geary’s C statistics, both of the p-value are 0.001 and lower and this means that if confidence interval is 98%, it means that distribution of GDPPC values in the counties are not randomly distributed. Thus, since the c statistics value is smaller than 1, the spatial pattern is “clustered”.

6.2.2 Visualizing Monte Carlo Geary’s C

summary(bperm$res[1:999])
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.7142  0.9502  1.0052  1.0044  1.0595  1.2722
hist(bperm$res, freq = TRUE, breaks = 20, xlab = "Simulated Geary's C")

abline(v=1, col="red")

7. Spatial Correlogram

Spatial correlograms are great to examine patterns of spatial autocorrelation in your data. They show how correlated pairs of spatial observations are when you increase the distance (lag) between them.

7.1 Computing Moran’s I correlogram

MI_corr <- sp.correlogram(wm_q, hunan$GDPPC, order = 6, method = "I", style = "B")
plot(MI_corr)

7.2 Computing Geary’s C correlogram

GC_corr <- sp.correlogram(wm_q, hunan$GDPPC, order = 6, method = "C", style = "W")
plot(GC_corr)

Let’s get the mean distance of the lags order 1-6 to see the distances. We use nblag() function which creates higher order neighbour lists, where higher order neighbours are only lags links from each other on the graph described by the input neighbours list. Credit

nb6 <- nblag(wm_q, 6)
correlogram_bins <- sapply(nb6, function(x) mean(unlist(nbdists(x,coords, longlat = TRUE))))
correlogram_bins
## [1]  55.87101 106.63547 162.68559 219.06365 271.94224 323.42132
summary(correlogram_bins)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   55.87  120.65  190.87  189.94  258.72  323.42
Conclusion:

From the correlograms, we can see that Moran’s I values and Geary’s c values are inversely related. We can also see that the spatial pattern of GDPPC variable is clustered from lag 1 to 4 (from 55.87 km to 219.06 km) and dispersed from lag 4 onwards (from 219.06 km to maximum distance of 323.42 km).