Module 7 Spatial Autocorrelation

Introduction

Welcome to Module 7! In this module, we’ll delve into the concept of spatial autocorrelation, a core idea in spatial data analysis. We’ll explore how to measure and visualize the extent to which values of a variable are correlated with themselves through space. We’ll also look at how to identify spatial clusters and outliers. Understanding these concepts is crucial when dealing with spatial data, as it allows us to see if spatial patterns exist in your data and avoid violating the independence assumption in statistical tests.

Learning Objectives

By the end of this module, you should be able to:

Define spatial autocorrelation and related concepts.
Calculate and interpret the Global Moran’s I statistic.
Conduct hypothesis testing for global spatial autocorrelation.
Visualize spatial autocorrelation using Moran’s I scatterplots.
Calculate and interpret Local Moran’s I statistics and Getis-Ord Gi*.
Identify spatial clusters (high-high, low-low, high-low, low-high) and hot spots/cold spots
Use R packages like spdep, tmap and ggplot2 for spatial autocorrelation and cluster analysis.

Spatial Autocorrelation: Key Concepts

Spatial autocorrelation refers to the degree to which values at one location are similar to values at nearby locations. This concept is closely linked to Tobler’s First Law of Geography: “Everything is related to everything else, but near things are more related than distant things”.

Positive Spatial Autocorrelation: Observations with similar values tend to cluster together. For example, areas with high disease rates are close to other areas with high rates.
Negative Spatial Autocorrelation: Observations with dissimilar values tend to be close together. For example, high-income neighborhoods may be surrounded by low-income neighborhoods.
No Spatial Autocorrelation (Random): There is no spatial pattern; values are distributed randomly in space.

Global Spatial Autocorrelation: Moran’s I

What is Global Moran’s I?

Global Moran’s I is a single index that summarizes the spatial autocorrelation across an entire study area. It quantifies how similar each region is with its neighbors, and averages all these assessments.

Mathematically, Global Moran’s I is calculated as: \[ I = \frac{n \sum_i \sum_j w_{ij} (Y_i - \bar{Y})(Y_j - \bar{Y})}{(\sum_{i \neq j} w_{ij}) \sum_i (Y_i - \bar{Y})^2} \]

Where:

n is the number of regions.
Y_i is the value of the variable in region i.
Ȳ is the mean of all values.
w_ij is the spatial weight between regions i and j. It reflects the spatial proximity between regions.

Spatial Weights

Spatial weights are crucial in calculating Moran’s I. They quantify how near regions are to each other. Common weight schemes include:

Queen Contiguity: Regions that share an edge or a corner are considered neighbors.
Rook Contiguity: Regions that share an edge are considered neighbors.
Distance-Based Weights: Weights are based on the distance between regions.

In this module we will use queen contiguity weights.

Interpretation of Global Moran’s I:

I > E[I] (expected value, approximately -1/(n-1)): Indicates positive spatial autocorrelation; similar values tend to cluster.
I < E[I]: Indicates negative spatial autocorrelation; dissimilar values tend to be near each other.
I ≈ E[I]: Indicates no spatial autocorrelation; values are randomly distributed.

Hypothesis Testing for Spatial Autocorrelation

To test for spatial autocorrelation, we use a hypothesis testing approach

Null Hypothesis (H0): There is no spatial autocorrelation (I ≈ E[I]).
Alternative Hypothesis (H1): There is spatial autocorrelation (I ≠ E[I]).
- The direction of autocorrelation is either positive (I > E[I]) or negative (I<E[I]).

We test this hypothesis by comparing the observed value of Moran’s I to the distribution of values that would be observed if spatial autocorrelation was absent.

The test statistic is calculated as a z-score:

\[ z = \frac{I - E[I]}{\sqrt{Var[I]}} \]

Where:

I is the observed Moran’s I value.
E[I] is the expected Moran’s I value under the null hypothesis.
Var[I] is the variance of Moran’s I under the null hypothesis.

When the number of regions is sufficiently large, the distribution of the test statistic is assumed to be normal with mean 0 and variance 1. An alternative approach to judge significance is Monte Carlo randomization. This method creates random patterns by reassigning the observed values among the areas and calculates the Moran’s I for each of the patterns, providing a randomization distribution for the Moran’s I.

Based on the p-value, we make our decision:

If p-value < α (significance level, usually 0.05): Reject the null hypothesis. There is evidence of significant spatial autocorrelation.
If p-value ≥ α: Fail to reject the null hypothesis. There is no evidence of significant spatial autocorrelation.

Global Moran’s I in R

Let’s use the data from the example codes you provided to see how we can implement the global Moran’s I test in R.

First we load the necessary libraries and data. Note that we have changed the file paths to be reproducible. Make sure you use your local paths.

# Load libraries
library(sf)
library(ggplot2)
library(spdep)
library(tmap)
library(haven)
library(survey)
library(RColorBrewer)

# Load shapefile
filepath<-"gadm41_SOM_1.shp"
somalia<-st_read(filepath)
somalia$NAME_1

Now, let’s load the fuel use data and generate the variable we will use to do the test of spatial autocorrelation.

# Load DHS data 
fueldata <- read_dta("fueluse.dta")
str(fueldata$Region)

# Calculate the mean solid fuel use by region 
options(survey.lonely.psu = "remove")
design <- svydesign(id = ~HV001, strata = ~STRU_CD, weights = ~HV005, data = fueldata,nest=TRUE)
result <- svyby(~dv, ~Region, FUN = svyciprop, design = design, vartype = NULL)
print(result)

# Add region names to the result data frame
labels <- c("Awdal", "Woqooyi Galbeed", "Togdheer", "Sool", "Sanaag", "Bari", "Nugaal", "Mudug", 
            "Galguduud", "Hiiraan", "Shabeellaha Dhexe", "Banaadir",  "Bay", 
            "Bakool", "Gedo",  "Jubbada Hoose")
result$Region <- labels
result <- result[order(result$Region), ]
print(result)

# Merge the shapefile and result data frame
merged_data <- merge(somalia, result, by.x = "NAME_1", by.y = "Region")

Now we create the spatial weights matrix based on queen contiguity and use the moran.test() function of the spdep package to calculate Moran’s I.

# Remove rows with NA values in the `dv` column (mean fuel use)
merged_data <- merged_data[!is.na(merged_data$dv), ]
# Create a neighborhood object based on queen contiguity between regions.
nb <- poly2nb(merged_data, queen = TRUE)
# Create a spatial weights list from the neighborhood object using the 'W' style (row-standardized weights).
nbw <- nb2listw(nb, style = "W")

# Calculate Global Moran's I
gmoran <- moran.test(merged_data$dv, nbw, alternative = "greater")
print(gmoran)
print(gmoran[["estimate"]][["Moran I statistic"]])
print(gmoran[["statistic"]])
print(gmoran[["p.value"]])

The results of the global moran’s I test indicate a positive spatial autocorrelation in the use of solid fuel for cooking. This is because the estimated Moran’s I is higher than the expected value and because the p-value is lower than the significance level of 0.05

We can use the moran.mc() function to check the significance by using a Monte Carlo approach.

# Moran's for MC
gmoranMC <- moran.mc(merged_data$dv, nbw, nsim = 999)
print(gmoranMC)
windows()
hist(gmoranMC$res)
abline(v = gmoranMC$statistic, col = "red")

The results of the Monte Carlo approach also suggest the existence of positive spatial autocorrelation. The observed statistic is in the tails of the distribution.

Moran’s I Scatterplot

A Moran’s I scatterplot helps visualize spatial autocorrelation.

The x-axis represents the variable of interest (e.g., fuel use).
The y-axis represents the spatial lag of the variable (weighted average of the neighbors’ values).
The slope of the regression line corresponds to Moran’s I.
The four quadrants of the plot show how values of a variable are associated with its neighbors:
- High-High: High values surrounded by high values.
- Low-Low: Low values surrounded by low values.
- High-Low: High values surrounded by low values.
- Low-High: Low values surrounded by high values.

Here’s the code to produce a Moran’s I scatterplot using our fuel data:

# Visualize with Moran's I scatterplot
moran.plot(merged_data$dv, nbw)

The Moran’s scatter plot shows a positive trend, indicating again positive spatial autocorrelation.

Local Spatial Autocorrelation: Local Moran’s I

What is Local Moran’s I?

While the Global Moran’s I provides a single index to assess the overall spatial autocorrelation across an entire study region, Local Indicators of Spatial Association (LISAs), such as the Local Moran’s I, allow us to identify areas with significant spatial clustering of similar values or areas that are significantly different from its neighbors. It is a local version of the global Moran’s I, calculated for each region or polygon in the dataset, and defined as: \[ I_i = \frac{n(Y_i - \bar{Y})}{\sum_j(Y_j - \bar{Y})^2}\sum_j w_{ij}(Y_j - \bar{Y}) \] Where:

n is the number of regions.
Y_i is the value of the variable in region i.
Ȳ is the mean of all values.
w_ij is the spatial weight between regions i and j.

Interpretation of Local Moran’s I:

High I_i: Indicates spatial clustering of similar values (high-high or low-low).
Low I_i: Indicates spatial clustering of dissimilar values (high-low or low-high) or spatial outlier. ### Hypothesis Testing for Local Spatial Autocorrelation Similar to the global statistic, local Moran’s I statistics are compared against a reference distribution obtained via Monte Carlo randomization. In this approach, the observed value Y_i at region i is fixed, and the remaining values are randomly reassigned over the other regions. This allows us to find a p-value representing the probability of observing the given local I statistic if the values were randomly distributed.

Local Moran’s I in R

Let’s use the localmoran() function to calculate local Moran’s I statistic.

# Calculate Local Moran's I
lmoran <- localmoran(merged_data$dv, nbw, alternative = "greater")
print(head(lmoran))

# Add Local Moran's results to merged_data
merged_data$lmI <- lmoran[, "Ii"]
merged_data$lmZ <- lmoran[, "Z.Ii"]
merged_data$lmp <- lmoran[, "Pr(z > E(Ii))"]

The localmoran() function returns a matrix with the local Moran’s I statistic, its expectation and variance, and the p-value. We have added the local Moran’s I, the z-scores, and p-values to the dataframe to visualize them in a map.

Cluster Mapping

Mapping local Moran’s I statistic, z-scores, or p-values can help visualize spatial clusters.

The following is the code to produce maps of the local Moran’s I statistics, the z-scores, and p-values. We use tmap to do this.

# Create maps using tmap
tmap_mode("view")

# Create thematic maps using tmap

# Map 1: Fuel Use by region
p1 <- tm_shape(merged_data) +
  tm_polygons(col = "dv", title = "Fuel Use", style = "quantile") +
  tm_layout(legend.outside = TRUE)

# Map 2: Local Moran's I statistic
p2 <- tm_shape(merged_data) +
  tm_polygons(col = "lmI", title = "Local Moran's I",
              style = "quantile") +
  tm_layout(legend.outside = TRUE)

# Map 3: Z-score for Local Moran's I
p3 <- tm_shape(merged_data) +
  tm_polygons(col = "lmZ", title = "Z-score",
              breaks = c(-Inf, 1.65, Inf)) +
  tm_layout(legend.outside = TRUE)

# Map 4: P-value for Local Moran's I
p4 <- tm_shape(merged_data) +
  tm_polygons(col = "lmp", title = "p-value",
              breaks = c(-Inf, 0.05, Inf)) +
  tm_layout(legend.outside = TRUE)


# Arrange the generated tmap objects into a single interactive display
# Arrange maps in a grid with 2 columns

tmap_arrange(p1, p2, p3, p4, ncol = 2)

The maps above help visualizing the spatial pattern of the local Moran’s I statistics and to check which areas are significantly autocorrelated based on the p-values and z-scores.

Getis-Ord Gi*

What is Getis-Ord Gi*?

Getis-Ord Gi* statistic is another measure of local spatial autocorrelation which is especially useful for identifying hot spots or cold spots. Unlike the local Moran’s I, Getis-Ord Gi* focuses on identifying clusters of high and low values.

For a given area i it is defined as: \[ G_i^* = \frac{\sum_j w_{ij} Y_j}{\sum_j Y_j} \]

Where:

Y_j is the value of the variable in region j
w_ij is the spatial weight between regions i and j. A positive z-score for the Getis-Ord Gi* statistic suggests that area i is part of a hot spot, whereas a negative z-score suggests that area i is part of a cold spot.

Interpretation of Getis-Ord Gi*

High positive Gi* scores: Indicate a “hot spot” (cluster of high values).
Low negative Gi* scores: Indicate a “cold spot” (cluster of low values).
Gi* scores around 0: Indicate that there is no spatial clustering.

Getis-Ord Gi* in R

Let’s calculate the Getis-Ord Gi* using the localG() function.

# Calculate Getis-Ord Gi*
local_gi <- localG(merged_data$dv, nbw)
merged_data$gi <- local_gi

# Convert Getis-Ord Gi* results into z-score to use for plotting
z_gi <- (local_gi - mean(local_gi)) / sd(local_gi)
merged_data$zgi <- z_gi

Now, we can add these to our map of the other statistics.

# Create maps using tmap
tmap_mode("view")

# Create thematic maps using tmap
# Map 1: Fuel Use by region
p1 <- tm_shape(merged_data) +
  tm_polygons(col = "dv", title = "Fuel Use", style = "quantile") +
  tm_layout(legend.outside = TRUE)

# Map 2: Local Moran's I statistic
p2 <- tm_shape(merged_data) +
  tm_polygons(col = "lmI", title = "Local Moran's I",
              style = "quantile") +
  tm_layout(legend.outside = TRUE)

# Map 3: Z-score for Local Moran's I
p3 <- tm_shape(merged_data) +
  tm_polygons(col = "lmZ", title = "Z-score",
              breaks = c(-Inf, 1.65, Inf)) +
  tm_layout(legend.outside = TRUE)

# Map 4: P-value for Local Moran's I
p4 <- tm_shape(merged_data) +
  tm_polygons(col = "lmp", title = "p-value",
              breaks = c(-Inf, 0.05, Inf)) +
  tm_layout(legend.outside = TRUE)

# Map 5: Getis-Ord Gi* statistic
p5 <- tm_shape(merged_data) +
  tm_polygons(col = "gi", title = "Getis-Ord Gi*",
              style = "quantile") +
  tm_layout(legend.outside = TRUE)

# Map 6: Z-score for Getis-Ord Gi* statistic
p6 <- tm_shape(merged_data) +
  tm_polygons(col = "zgi", title = "Z-score of Getis-Ord Gi*",
              breaks = c(-Inf, 1.65, Inf))+
  tm_layout(legend.outside = TRUE)

# Arrange the generated tmap objects into a single interactive display
# Arrange maps in a grid with 2 columns

tmap_arrange(p1, p2, p3, p4, p5, p6, ncol = 2)

The maps with the Gi* statistics helps identifying hot spots and cold spots. You can compare them with the results from the local Moran’s I test.

Identifying Cluster Types

We can combine the Moran’s I scatterplot information and local Moran’s I statistic to identify the type of spatial clusters.

# Identifying Cluster Types
mp <- moran.plot(as.vector(scale(merged_data$dv)), nbw)
merged_data$quadrant <- NA
# high-high
merged_data[(mp$x >= 0 & mp$wx >= 0) & (merged_data$lmp <= 0.05), "quadrant"]<- 1
# low-low
merged_data[(mp$x <= 0 & mp$wx <= 0) & (merged_data$lmp <= 0.05), "quadrant"]<- 2
# high-low
merged_data[(mp$x >= 0 & mp$wx <= 0) & (merged_data$lmp <= 0.05), "quadrant"]<- 3
# low-high
merged_data[(mp$x <= 0 & mp$wx >= 0) & (merged_data$lmp <= 0.05), "quadrant"]<- 4
# non-significant
merged_data[(merged_data$lmp > 0.05), "quadrant"] <- 5

tm_shape(merged_data) + tm_fill(col = "quadrant", title = "",
                                breaks = c(1, 2, 3, 4, 5, 6),
                                palette =  c("red", "blue", "lightpink", "skyblue2", "white"),
                                labels = c("High-High", "Low-Low", "High-Low",
                                           "Low-High", "Non-significant")) +
  tm_legend(text.size = 1)  + tm_borders(alpha = 0.5) +
  tm_layout(frame = FALSE,  title = "Clusters")  +
  tm_layout(legend.outside = TRUE)

The map above shows the clusters based on local Moran’s I and quadrants of the Moran’s I scatter plot.

Conclusion

In this module, we have explored how to quantify and visualize spatial autocorrelation using global and local measures. These techniques can be used to find spatial patterns that can be used in further analysis such as Bayesian spatial models, which will be explored in the next module. You should now have the tools to assess and map the spatial structure in your own datasets.

Further Study

Explore other spatial weight matrices (e.g., distance-based weights).
Experiment with other measures of local spatial autocorrelation like the Gi* statistic.
Practice using your own spatial datasets to implement this module.

Appendix

Here is the code with the example of sanitation access for additional practice and reference.

# Load libraries
# `sf`: For handling spatial vector data (shapefiles).
library(sf)
# `ggplot2`: For creating static and publication-quality graphics.
library(ggplot2)
# `spdep`: For spatial analysis and spatial autocorrelation.
library(spdep)
# `tmap`: For creating thematic maps.
library(tmap)
# `haven`: For reading data files from various statistical packages (including .dta).
library(haven)
# `survey`: For complex survey analysis
library(survey)
# `RColorBrewer`: Provides color schemes for maps and charts.
library(RColorBrewer)

# Load shapefile
# Specify the file path to the shapefile representing the administrative boundaries of Somalia.
filepath<-"gadm41_SOM_1.shp"
# Read the shapefile using st_read() from the sf package.
somalia<-st_read(filepath)
# Print the names of the first-level administrative units (regions) in Somalia.
somalia$NAME_1

# Load DHS data 
# Specify the file path to the DHS dataset (.dta file).
sanitationdata <- read_dta("sanitation_access.dta")
# Print the structure of the 'HV024' column, which likely contains region codes.
str(sanitationdata$Region)

# Calculate the mean solid fuel use by region 
# Apply the function to subsets of the survey data using svyby
# Create a survey design object
library(survey)
options(survey.lonely.psu = "remove")
design <- svydesign(id = ~HV001, strata = ~STRU_CD, weights = ~HV005, data = sanitationdata,nest=TRUE)
result <- svyby(~dv, ~Region, FUN = svyciprop, design = design, vartype = NULL)
# Print the resulting data frame showing the mean fuel use per region.
print(result)

# Add region names to the result data frame
# Define a character vector with the names of the regions, ordered to match the codes in the result.
labels <- c("Awdal", "Woqooyi Galbeed", "Togdheer", "Sool", "Sanaag", "Bari", "Nugaal", "Mudug", 
            "Galguduud", "Hiiraan", "Shabeellaha Dhexe", "Banaadir",  "Bay", 
            "Bakool", "Gedo",  "Jubbada Hoose")
# Add a 'Region' column to the result data frame, mapping region names to region codes.
result$Region <- labels
# Sort the result by the 'Region' column
result <- result[order(result$Region), ]
# Print the resulting dataframe that is now labeled and sorted.
print(result)

# Merge the shapefile and result data frame
# Merge the shapefile data (`somalia`) with the calculated mean fuel use data (`result`).
merged_data <- merge(somalia, result, by.x = "NAME_1", by.y = "Region")

# Calculate centroids for label placement
# Calculate the centroids (geometric center points) of each polygon in the merged dataset.
centroid <- st_centroid(merged_data)
# Extract x-coordinates of centroids and add them as column named 'x_position'
merged_data$x_position <- centroid[, 1]
# Extract y-coordinates of centroids and add them as column named 'y_position'
merged_data$y_position <- centroid[, 2]

# Convert SpatialPolygonsDataFrame to sf object
# Convert merged_data which is a SpatialPolygonsDataFrame into an sf object for use in ggplot
merged_data_sf <- st_as_sf(merged_data)
# Define a custom color palette using RColorBrewer
colors <- brewer.pal(9, "YlOrRd")

# Plot the spatial data
# Initiate a ggplot object.
p1 <- ggplot() +
  # Add polygon layer from merged_data_sf, fill based on the `dv` (fuel use) variable.
  geom_sf(data = merged_data_sf, aes(fill = dv)) +
  # Define a color gradient fill scale using the defined color palette.
  scale_fill_gradientn(colors = colors, na.value = "grey") +
  # Customize the legend title.
  guides(fill = guide_legend(title = "Unimproved Sanitation Access")) +
  # Add region name labels at the centroid location of each region.
  geom_sf_text(data = merged_data_sf, aes(label = NAME_1),
               cex = 3, hjust = 0.5, vjust = 0.5, color = "black", fontface = "bold") +
  # Remove y axis label
  ylab(label = "") +
  # Remove x axis label
  xlab(label = "") +
  # Set the plot title.
  ggtitle("Spatial Distribution of Unimproved Sanitatino Access among Households in Somalia") +
  # Customize the plot theme.
  theme(legend.text = element_text(size = 10),
        plot.title = element_text(color = "blue", size = 10, face = "bold"),
        strip.text.x = element_text(size = 10))


# Print the created ggplot object
print(p1)

# Exploratory Spatial Data Analysis
# Spatial Autocorrelation Analysis

# 1. Create spatial weights matrix: Queen contiguity
# Remove rows with NA values in the `dv` column (mean fuel use)
merged_data <- merged_data[!is.na(merged_data$dv), ]
# Create a neighborhood object based on queen contiguity between regions.
nb <- poly2nb(merged_data, queen = TRUE)
# Create a spatial weights list from the neighborhood object using the 'W' style (row-standardized weights).
nbw <- nb2listw(nb, style = "W")

# 2. Calculate Global Moran's I
# Calculate Global Moran's I statistic for the fuel use data.
gmoran <- moran.test(merged_data$dv, nbw, alternative = "greater")
# Print the output of the Moran's I test.
print(gmoran)
# Extract and print the Moran's I statistic.
print(gmoran[["estimate"]][["Moran I statistic"]])
# Extract and print the z-score of the test.
print(gmoran[["statistic"]])
# Extract and print the p-value of the test.
print(gmoran[["p.value"]])

# Interpretation:
# - If p-value < 0.05 (significance level), reject the null hypothesis of no spatial autocorrelation.
# - Conclusion: There is evidence of spatial autocorrelation in fuel use.

# Moran's for MC
# Conduct a Monte Carlo simulation for Moran's I to assess significance.
gmoranMC <- moran.mc(merged_data$dv, nbw, nsim = 999)
# Print the output of the Monte Carlo simulation
print(gmoranMC)
# Open a new graphics window.
windows()
# Create a histogram of simulated Moran's I values.
hist(gmoranMC$res)
# Add a vertical line at the observed Moran's I statistic.
abline(v = gmoranMC$statistic, col = "red")

# 3. Visualize with Moran's I scatterplot
# Generate a Moran's I scatterplot to visualize spatial autocorrelation.
moran.plot(merged_data$dv, nbw)

# 4. Calculate Local Moran's I
# Calculate Local Moran's I for each region.
lmoran <- localmoran(merged_data$dv, nbw,
                     alternative = "greater")
# Print the first few rows of the local Moran's I results
print(head(lmoran))

# 5. Add Local Moran's results to merged_data
# Add the local Moran's I statistic to the merged dataset.
merged_data$lmI <- lmoran[, "Ii"]
# Add the z-scores of local Moran's I to the merged dataset.
merged_data$lmZ <- lmoran[, "Z.Ii"]
# Add p-values of local Moran's I to the merged dataset.
merged_data$lmp <- lmoran[, "Pr(z > E(Ii))"]

# 6. Calculate Getis-Ord Gi*
# Calculate Getis-Ord Gi* statistic for each region.
local_gi <- localG(merged_data$dv, nbw)
# Add Getis-Ord Gi* statistic to the merged dataset
merged_data$gi <- local_gi

# Convert Getis-Ord Gi* results into z-score to use for plotting
# Calculate z-scores for the Getis-Ord Gi* statistic.
z_gi <- (local_gi - mean(local_gi)) / sd(local_gi)
# Add the z-scores of Gi* to the merged dataset.
merged_data$zgi <- z_gi

# Create maps using tmap
# Set tmap mode to "view" for interactive map viewing.
tmap_mode("view")

# Create thematic maps using tmap

# Map 1: Fuel Use by region
p1 <- tm_shape(merged_data) +
  # Plot polygons filled based on `dv` (fuel use) using quantile breaks.
  tm_polygons(col = "dv", title = "Fuel Use", style = "quantile") +
  # Adjust layout so that the legend is outside of the plot.
  tm_layout(legend.outside = TRUE)

# Map 2: Local Moran's I statistic
p2 <- tm_shape(merged_data) +
  # Plot polygons filled based on the `lmI` variable using quantile breaks.
  tm_polygons(col = "lmI", title = "Local Moran's I",
              style = "quantile") +
  # Adjust layout so that the legend is outside of the plot.
  tm_layout(legend.outside = TRUE)

# Map 3: Z-score for Local Moran's I
p3 <- tm_shape(merged_data) +
  # Plot polygons filled based on `lmZ` (z-scores) with specified breaks
  tm_polygons(col = "lmZ", title = "Z-score",
              breaks = c(-Inf, 1.65, Inf)) +
  # Adjust layout so that the legend is outside of the plot.
  tm_layout(legend.outside = TRUE)

# Map 4: P-value for Local Moran's I
p4 <- tm_shape(merged_data) +
  # Plot polygons filled based on `lmp` (p-values) with specified breaks
  tm_polygons(col = "lmp", title = "p-value",
              breaks = c(-Inf, 0.05, Inf)) +
  # Adjust layout so that the legend is outside of the plot.
  tm_layout(legend.outside = TRUE)

# Map 5: Getis-Ord Gi* statistic
p5 <- tm_shape(merged_data) +
  # Plot polygons filled based on `gi` (Getis-Ord Gi*) using quantile breaks
  tm_polygons(col = "gi", title = "Getis-Ord Gi*",
              style = "quantile") +
  # Adjust layout so that the legend is outside of the plot.
  tm_layout(legend.outside = TRUE)

# Map 6: Z-score for Getis-Ord Gi* statistic
p6 <- tm_shape(merged_data) +
  # Plot polygons filled based on `zgi` (z-scores) with specified breaks
  tm_polygons(col = "zgi", title = "Z-score of Getis-Ord Gi*",
              breaks = c(-Inf, 1.65, Inf))+
  # Adjust layout so that the legend is outside of the plot.
  tm_layout(legend.outside = TRUE)

# Arrange the generated tmap objects into a single interactive display
# Arrange maps in a grid with 2 columns

tmap_arrange(p1, p2, p3, p4, p5, p6, ncol = 2)
arranged_map<-tmap_arrange(p1, p2, p3, p4, p5, p6, ncol = 2)
tmap_save(arranged_map, filename = "arranged_map.png", width=10, height=12, units = "in")

The module above explains the methods used to perform the analysis of spatial autocorrelation. It provides all the theory and the codes.

Module 7 Spatial Autocorrelation

Abdisalam Hassan Muse

2024-12-29

Introduction

Learning Objectives

Spatial Autocorrelation: Key Concepts

Global Spatial Autocorrelation: Moran’s I

What is Global Moran’s I?

Spatial Weights

Interpretation of Global Moran’s I:

Hypothesis Testing for Spatial Autocorrelation

Global Moran’s I in R

Moran’s I Scatterplot

Local Spatial Autocorrelation: Local Moran’s I

What is Local Moran’s I?

Interpretation of Local Moran’s I:

Local Moran’s I in R

Cluster Mapping

Getis-Ord Gi*

What is Getis-Ord Gi*?

Interpretation of Getis-Ord Gi*

Getis-Ord Gi* in R

Identifying Cluster Types

Conclusion

Further Study

Appendix