1 Introduction

This analysis examines the spatio-temporal patterns of COVID-19 cases in three states of Central Mexico: Mexico City, Mexico State and Morelos State. Localised spatial statistics methods are used to conduct the analysis.

The data used in this analysis comes from the home page of the Secretary of Health, Government of Mexico.
It is a shapefile containing reported COVID-19 cases and related data at the municipality level, aggregated by epidemiological week (e-week).
The analysis focuses on e-weeks 13 to 32.

2 Install and load packages

# Packages required for this analysis
packages <- c('tidyverse', 'tmap', 'rgdal', 'spdep', 'sf', 'ggpubr')
for (p in packages) {
  # Install the package only if it cannot be loaded
  if (!require(p, character.only = TRUE)) {
    install.packages(p)
  }
  library(p, character.only = TRUE)
}

## Warning: package 'tmap' was built under R version 4.0.3

3 Data import

Import the data and examine the content of the dataset.

The data is first imported as an sf object to facilitate pre-processing and initial analysis, then converted to an sp class later in the analysis.
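The sf-to-sp conversion mentioned above can be sketched on a toy object. This is only an illustration, assuming the sf and sp packages are installed; the single-square polygon and the name toy_sf are made up, and the conversion call used later in the analysis may differ.

```r
library(sf)

# Hypothetical toy geometry: one unit square standing in for a municipality
square <- st_polygon(list(rbind(c(0, 0), c(1, 0), c(1, 1), c(0, 1), c(0, 0))))

# Build a one-row sf object with an attribute column and a geometry column
toy_sf <- st_sf(id = 1L, geometry = st_sfc(square))

# Convert the sf object to its sp equivalent
toy_sp <- as_Spatial(toy_sf)

class(toy_sf)
class(toy_sp)
```

Keeping the data as sf for as long as possible lets the tidyverse verbs work directly on the attribute table; the sp object is only needed for functions that do not accept sf input.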

Import

municipalities <- st_read(dsn = 'data/geospatial',
                          layer = 'municipalities_COVID')

## Reading layer `municipalities_COVID' from data source `C:\Users\Xiao Rong\Desktop\School\Geospatial Analytics and Applications\Assignments\Take-Home Exercise 2\IS415_Take-home_Ex02\data\geospatial' using driver `ESRI Shapefile'
## Simple feature collection with 2465 features and 198 fields
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: 911292 ymin: 319149.1 xmax: 4082997 ymax: 2349615
## projected CRS:  MEXICO_ITRF_2008_LCC

The spatial data is projected using the Lambert Conformal Conic (LCC) projection, with the Mexico ITRF2008 datum.

Glimpse

glimpse(municipalities)

## Rows: 2,465
## Columns: 199
## $ CVEGEO   <chr> "01001", "01002", "01003", "01004", "01005", "01006", "010...
## $ CVE_ENT  <chr> "01", "01", "01", "01", "01", "01", "01", "01", "01", "01"...
## $ CVE_MUN  <chr> "001", "002", "003", "004", "005", "006", "007", "008", "0...
## $ NOMGEO   <chr> "Aguascalientes", "Asientos", "Calvillo", "Cosío", "Jesús ...
## $ Pop2010  <int> 797010, 45492, 54136, 15042, 99590, 41862, 49156, 8443, 19...
## $ Pop2020  <int> 961977, 50864, 60760, 16918, 130184, 50032, 57981, 9661, 2...
## $ new1     <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ new2     <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ new3     <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ new4     <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ new5     <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ new6     <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ new7     <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ new8     <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ new9     <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ new10    <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ new11    <dbl> 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0...
## $ new12    <dbl> 7, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 9, 0, 11, 1, 0, 0, 0, ...
## $ new13    <dbl> 30, 0, 0, 0, 1, 4, 0, 0, 1, 0, 0, 1, 25, 1, 33, 1, 0, 0, 6...
## $ new14    <dbl> 8, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 4, 77, 4, 165, 2, 0, 0, 1...
## $ new15    <dbl> 10, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 3, 121, 14, 184, 5, 0, 1...
## $ new16    <dbl> 51, 0, 0, 0, 6, 0, 1, 0, 0, 0, 0, 18, 230, 19, 288, 4, 2, ...
## $ new17    <dbl> 60, 0, 8, 0, 3, 0, 2, 1, 0, 0, 3, 27, 219, 23, 319, 5, 4, ...
## $ new18    <dbl> 87, 1, 1, 0, 3, 6, 4, 4, 1, 0, 1, 41, 237, 32, 321, 13, 0,...
## $ new19    <dbl> 101, 0, 1, 0, 4, 3, 6, 0, 1, 1, 1, 49, 415, 43, 313, 13, 1...
## $ new20    <dbl> 101, 0, 1, 6, 5, 2, 8, 0, 1, 0, 1, 50, 462, 15, 244, 7, 10...
## $ new21    <dbl> 99, 5, 0, 7, 5, 14, 14, 0, 13, 0, 4, 66, 782, 14, 245, 7, ...
## $ new22    <dbl> 185, 2, 0, 3, 6, 6, 12, 0, 6, 2, 1, 73, 535, 8, 170, 6, 13...
## $ new23    <dbl> 257, 1, 3, 7, 5, 17, 18, 2, 15, 1, 5, 112, 677, 17, 146, 2...
## $ new24    <dbl> 312, 8, 3, 6, 11, 8, 17, 3, 9, 0, 7, 124, 642, 16, 117, 1,...
## $ new25    <dbl> 280, 6, 8, 12, 10, 19, 25, 7, 10, 0, 3, 193, 603, 8, 192, ...
## $ new26    <dbl> 258, 4, 11, 1, 15, 13, 20, 6, 7, 1, 4, 211, 590, 32, 233, ...
## $ new27    <dbl> 234, 9, 8, 0, 14, 7, 29, 1, 6, 1, 5, 208, 534, 10, 281, 22...
## $ new28    <dbl> 295, 11, 20, 2, 15, 8, 42, 5, 11, 2, 2, 200, 442, 18, 268,...
## $ new29    <dbl> 307, 6, 35, 10, 13, 21, 43, 12, 7, 2, 3, 234, 363, 25, 233...
## $ new30    <dbl> 267, 7, 13, 9, 10, 14, 26, 4, 17, 0, 4, 204, 268, 24, 240,...
## $ new31    <dbl> 219, 11, 13, 4, 15, 19, 23, 4, 11, 2, 3, 129, 177, 13, 166...
## $ new32    <dbl> 23, 0, 3, 0, 2, 0, 7, 0, 0, 1, 0, 30, 34, 1, 15, 2, 21, 10...
## $ cumul1   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ cumul2   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ cumul3   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ cumul4   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ cumul5   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ cumul6   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ cumul7   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ cumul8   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ cumul9   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ cumul10  <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ cumul11  <dbl> 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0...
## $ cumul12  <dbl> 8, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 10, 0, 11, 1, 0, 0, 0,...
## $ cumul13  <dbl> 38, 1, 0, 0, 1, 4, 0, 0, 1, 0, 0, 2, 35, 1, 44, 2, 0, 0, 6...
## $ cumul14  <dbl> 46, 1, 0, 0, 1, 5, 1, 0, 1, 0, 0, 6, 112, 5, 209, 4, 0, 0,...
## $ cumul15  <dbl> 56, 1, 0, 0, 1, 5, 2, 0, 1, 1, 0, 9, 233, 19, 393, 9, 0, 1...
## $ cumul16  <dbl> 107, 1, 0, 0, 7, 5, 3, 0, 1, 1, 0, 27, 463, 38, 681, 13, 2...
## $ cumul17  <dbl> 167, 1, 8, 0, 10, 5, 5, 1, 1, 1, 3, 54, 682, 61, 1000, 18,...
## $ cumul18  <dbl> 254, 2, 9, 0, 13, 11, 9, 5, 2, 1, 4, 95, 919, 93, 1321, 31...
## $ cumul19  <dbl> 355, 2, 10, 0, 17, 14, 15, 5, 3, 2, 5, 144, 1334, 136, 163...
## $ cumul20  <dbl> 456, 2, 11, 6, 22, 16, 23, 5, 4, 2, 6, 194, 1796, 151, 187...
## $ cumul21  <dbl> 555, 7, 11, 13, 27, 30, 37, 5, 17, 2, 10, 260, 2578, 165, ...
## $ cumul22  <dbl> 740, 9, 11, 16, 33, 36, 49, 5, 23, 4, 11, 333, 3113, 173, ...
## $ cumul23  <dbl> 997, 10, 14, 23, 38, 53, 67, 7, 38, 5, 16, 445, 3790, 190,...
## $ cumul24  <dbl> 1309, 18, 17, 29, 49, 61, 84, 10, 47, 5, 23, 569, 4432, 20...
## $ cumul25  <dbl> 1589, 24, 25, 41, 59, 80, 109, 17, 57, 5, 26, 762, 5035, 2...
## $ cumul26  <dbl> 1847, 28, 36, 42, 74, 93, 129, 23, 64, 6, 30, 973, 5625, 2...
## $ cumul27  <dbl> 2081, 37, 44, 42, 88, 100, 158, 24, 70, 7, 35, 1181, 6159,...
## $ cumul28  <dbl> 2376, 48, 64, 44, 103, 108, 200, 29, 81, 9, 37, 1381, 6601...
## $ cumul29  <dbl> 2683, 54, 99, 54, 116, 129, 243, 41, 88, 11, 40, 1615, 696...
## $ cumul30  <dbl> 2950, 61, 112, 63, 126, 143, 269, 45, 105, 11, 44, 1819, 7...
## $ cumul31  <dbl> 3169, 72, 125, 67, 141, 162, 292, 49, 116, 13, 47, 1948, 7...
## $ cumul32  <dbl> 3192, 72, 128, 67, 143, 162, 299, 49, 116, 14, 47, 1978, 7...
## $ activ1   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ activ2   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ activ3   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ activ4   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ activ5   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ activ6   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0...
## $ activ7   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0...
## $ activ8   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0...
## $ activ9   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0...
## $ activ10  <dbl> 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0...
## $ activ11  <dbl> 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 5, 0, 8, 0, 0, 0, 0, 1...
## $ activ12  <dbl> 16, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 28, 0, 36, 1, 0, 0, 0...
## $ activ13  <dbl> 45, 1, 0, 0, 1, 5, 0, 0, 1, 0, 0, 2, 68, 1, 134, 2, 0, 0, ...
## $ activ14  <dbl> 53, 1, 0, 0, 1, 5, 1, 0, 1, 1, 0, 9, 176, 13, 332, 6, 0, 1...
## $ activ15  <dbl> 75, 1, 0, 0, 3, 5, 2, 0, 1, 1, 0, 18, 370, 28, 606, 11, 2,...
## $ activ16  <dbl> 139, 1, 0, 0, 7, 5, 3, 1, 1, 1, 1, 42, 576, 47, 921, 17, 6...
## $ activ17  <dbl> 203, 1, 9, 0, 12, 5, 5, 1, 2, 1, 3, 81, 807, 80, 1249, 23,...
## $ activ18  <dbl> 309, 2, 9, 0, 14, 13, 9, 5, 2, 2, 4, 134, 1117, 120, 1579,...
## $ activ19  <dbl> 393, 2, 11, 1, 19, 14, 15, 5, 3, 2, 6, 171, 1561, 142, 183...
## $ activ20  <dbl> 500, 4, 11, 7, 23, 22, 26, 5, 5, 2, 7, 224, 2158, 157, 206...
## $ activ21  <dbl> 622, 9, 11, 14, 28, 34, 41, 5, 18, 2, 11, 297, 2864, 174, ...
## $ activ22  <dbl> 848, 10, 12, 18, 34, 41, 52, 5, 26, 5, 12, 402, 3400, 185,...
## $ activ23  <dbl> 1133, 14, 15, 27, 44, 60, 75, 7, 39, 5, 16, 512, 4097, 204...
## $ activ24  <dbl> 1424, 19, 18, 34, 53, 70, 86, 12, 48, 5, 24, 675, 4732, 21...
## $ activ25  <dbl> 1713, 27, 28, 42, 69, 88, 116, 22, 60, 6, 26, 901, 5361, 2...
## $ activ26  <dbl> 1972, 31, 39, 42, 83, 95, 141, 23, 68, 6, 34, 1104, 5893, ...
## $ activ27  <dbl> 2248, 41, 53, 43, 96, 104, 171, 26, 79, 8, 36, 1320, 6401,...
## $ activ28  <dbl> 2544, 51, 90, 50, 109, 119, 220, 30, 83, 10, 39, 1537, 681...
## $ activ29  <dbl> 2845, 56, 108, 56, 122, 135, 255, 43, 97, 11, 42, 1748, 71...
## $ activ30  <dbl> 3079, 70, 124, 66, 135, 148, 282, 46, 110, 13, 44, 1915, 7...
## $ activ31  <dbl> 3189, 72, 127, 67, 143, 162, 297, 49, 116, 14, 47, 1976, 7...
## $ activ32  <dbl> 3192, 72, 128, 67, 143, 162, 299, 49, 116, 14, 47, 1978, 7...
## $ death1   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ death2   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ death3   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ death4   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ death5   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ death6   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ death7   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ death8   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ death9   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ death10  <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ death11  <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ death12  <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ death13  <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ death14  <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 8, 0, 10, 1, 0, 0, 3, ...
## $ death15  <dbl> 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 15, 2, 25, 0, 0, 0, 1,...
## $ death16  <dbl> 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 20, 2, 69, 3, 0, 0, 2,...
## $ death17  <dbl> 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4, 23, 2, 95, 0, 0, 0, 4,...
## $ death18  <dbl> 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4, 42, 7, 109, 3, 0, 0, 3...
## $ death19  <dbl> 7, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 9, 46, 6, 131, 4, 0, 1, 5...
## $ death20  <dbl> 6, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 9, 89, 8, 109, 1, 2, 0, 3...
## $ death21  <dbl> 4, 0, 0, 0, 2, 1, 0, 0, 0, 0, 0, 10, 130, 7, 88, 2, 0, 1, ...
## $ death22  <dbl> 17, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 12, 110, 3, 79, 2, 0, 1,...
## $ death23  <dbl> 18, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 9, 119, 2, 71, 3, 2, 1, ...
## $ death24  <dbl> 27, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 12, 119, 2, 45, 0, 0, 1,...
## $ death25  <dbl> 25, 0, 0, 0, 0, 1, 1, 0, 0, 0, 2, 17, 93, 2, 40, 0, 6, 0, ...
## $ death26  <dbl> 24, 2, 1, 0, 0, 1, 0, 0, 0, 0, 0, 13, 107, 4, 31, 0, 4, 0,...
## $ death27  <dbl> 19, 0, 0, 0, 0, 0, 3, 0, 1, 0, 0, 17, 100, 2, 50, 0, 0, 2,...
## $ death28  <dbl> 19, 2, 0, 0, 0, 1, 0, 0, 0, 1, 1, 23, 101, 4, 60, 1, 5, 1,...
## $ death29  <dbl> 19, 2, 0, 0, 1, 0, 0, 0, 0, 1, 1, 31, 58, 2, 34, 0, 1, 1, ...
## $ death30  <dbl> 20, 0, 1, 0, 3, 1, 1, 0, 0, 0, 0, 29, 62, 7, 35, 2, 3, 0, ...
## $ death31  <dbl> 13, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 12, 41, 4, 37, 3, 4, 2, ...
## $ death32  <dbl> 7, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 7, 18, 0, 21, 0, 0, 0, 2,...
## $ actvrt1  <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ actvrt2  <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ actvrt3  <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ actvrt4  <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ actvrt5  <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ actvrt6  <dbl> 0.00000000, 0.00000000, 0.00000000, 0.00000000, 0.00000000...
## $ actvrt7  <dbl> 0.00000000, 0.00000000, 0.00000000, 0.00000000, 0.00000000...
## $ actvrt8  <dbl> 0.00000000, 0.00000000, 0.00000000, 0.00000000, 0.00000000...
## $ actvrt9  <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ actvr10  <dbl> 0.1039526, 0.0000000, 0.0000000, 0.0000000, 0.0000000, 0.0...
## $ actvr11  <dbl> 0.4158104, 0.0000000, 0.0000000, 0.0000000, 0.0000000, 0.0...
## $ actvr12  <dbl> 1.6632414, 1.9660271, 0.0000000, 0.0000000, 0.0000000, 0.0...
## $ actvr13  <dbl> 4.5739139, 1.9660271, 0.0000000, 0.0000000, 0.7681436, 9.9...
## $ actvr14  <dbl> 5.0936769, 1.9660271, 0.0000000, 0.0000000, 0.7681436, 9.9...
## $ actvr15  <dbl> 6.133203, 0.000000, 0.000000, 0.000000, 2.304431, 9.993604...
## $ actvr16  <dbl> 9.771543, 0.000000, 0.000000, 0.000000, 4.608861, 0.000000...
## $ actvr17  <dbl> 15.592888, 0.000000, 14.812377, 0.000000, 8.449579, 0.0000...
## $ actvr18  <dbl> 24.324906, 1.966027, 14.812377, 0.000000, 8.449579, 15.989...
## $ actvr19  <dbl> 26.403958, 1.966027, 18.104016, 5.910864, 9.217723, 17.988...
## $ actvr20  <dbl> 30.873919, 5.898081, 3.291639, 41.376049, 8.449579, 33.978...
## $ actvr21  <dbl> 32.537160, 13.762189, 3.291639, 82.752098, 10.754010, 41.9...
## $ actvr22  <dbl> 47.29843, 15.72822, 1.64582, 100.48469, 11.52215, 53.96546...
## $ actvr23  <dbl> 65.801989, 19.660271, 6.583278, 118.217283, 16.131015, 75....
## $ actvr24  <dbl> 83.36998, 19.66027, 11.52074, 118.21728, 19.20359, 71.9539...
## $ actvr25  <dbl> 89.918990, 33.422460, 26.333114, 141.860740, 26.885024, 93...
## $ actvr26  <dbl> 87.216222, 33.422460, 39.499671, 88.662963, 29.957598, 69....
## $ actvr27  <dbl> 85.65693, 43.25260, 57.60369, 53.19778, 33.03017, 67.95651...
## $ actvr28  <dbl> 86.38460, 47.18465, 102.04082, 47.28691, 30.72574, 61.9603...
## $ actvr29  <dbl> 90.75061, 49.15068, 113.56155, 82.75210, 29.95760, 79.9488...
## $ actvr30  <dbl> 86.38460, 57.01478, 116.85319, 135.94988, 29.95760, 87.943...
## $ actvr31  <dbl> 67.04942, 41.28657, 60.89533, 100.48469, 26.11688, 85.9450...
## $ actvr32  <dbl> 36.071548, 31.456433, 32.916392, 65.019506, 16.131015, 53....
## $ dethrt1  <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ dethrt2  <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ dethrt3  <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ dethrt4  <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ dethrt5  <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ dethrt6  <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ dethrt7  <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ dethrt8  <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ dethrt9  <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ dthrt10  <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ dthrt11  <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ dthrt12  <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ dthrt13  <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ dthrt14  <dbl> 0.0000000, 0.0000000, 0.0000000, 0.0000000, 0.0000000, 0.0...
## $ dthrt15  <dbl> 0.1039526, 0.0000000, 0.0000000, 0.0000000, 0.0000000, 0.0...
## $ dthrt16  <dbl> 0.2079052, 0.0000000, 0.0000000, 0.0000000, 0.0000000, 0.0...
## $ dthrt17  <dbl> 0.2079052, 0.0000000, 0.0000000, 0.0000000, 0.0000000, 0.0...
## $ dthrt18  <dbl> 0.3118578, 0.0000000, 0.0000000, 0.0000000, 0.0000000, 0.0...
## $ dthrt19  <dbl> 0.7276681, 0.0000000, 0.0000000, 0.0000000, 0.0000000, 0.0...
## $ dthrt20  <dbl> 0.6237155, 0.0000000, 0.0000000, 0.0000000, 0.0000000, 0.0...
## $ dthrt21  <dbl> 0.4158104, 0.0000000, 0.0000000, 0.0000000, 1.5362871, 1.9...
## $ dthrt22  <dbl> 1.7671940, 0.0000000, 0.0000000, 0.0000000, 0.0000000, 1.9...
## $ dthrt23  <dbl> 1.8711466, 0.0000000, 0.0000000, 0.0000000, 0.0000000, 0.0...
## $ dthrt24  <dbl> 2.8067199, 0.0000000, 0.0000000, 0.0000000, 0.0000000, 1.9...
## $ dthrt25  <dbl> 2.5988147, 0.0000000, 0.0000000, 0.0000000, 0.0000000, 1.9...
## $ dthrt26  <dbl> 2.4948621, 3.9320541, 1.6458196, 0.0000000, 0.0000000, 1.9...
## $ dthrt27  <dbl> 1.975099, 0.000000, 0.000000, 0.000000, 0.000000, 0.000000...
## $ dthrt28  <dbl> 1.9750992, 3.9320541, 0.0000000, 0.0000000, 0.0000000, 1.9...
## $ dthrt29  <dbl> 1.9750992, 3.9320541, 0.0000000, 0.0000000, 0.7681436, 0.0...
## $ dthrt30  <dbl> 2.079052, 0.000000, 1.645820, 0.000000, 2.304431, 1.998721...
## $ dthrt31  <dbl> 1.351384, 0.000000, 0.000000, 0.000000, 0.000000, 0.000000...
## $ dthrt32  <dbl> 0.7276681, 0.0000000, 0.0000000, 0.0000000, 0.0000000, 0.0...
## $ geometry <MULTIPOLYGON [m]> MULTIPOLYGON (((2489073 111..., MULTIPOLYGON ...
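The glimpse output shows that each weekly variable is stored as a prefix followed by the e-week number, and that the two rate variables switch prefix at week 10 (actvrt9 then actvr10, dethrt9 then dthrt10). A minimal sketch for building the column names of the study period (e-weeks 13 to 32) is given below; weekly_cols is a hypothetical helper, not part of the original analysis.

```r
# Hypothetical helper: build the column names of one weekly variable
# for a given range of e-weeks.
weekly_cols <- function(prefix, weeks, prefix_from_10 = prefix) {
  # Use `prefix` for weeks 1-9 and `prefix_from_10` from week 10 onwards
  paste0(ifelse(weeks < 10, prefix, prefix_from_10), weeks)
}

# Study period stated in the introduction
study_weeks <- 13:32

# Count columns keep the same prefix for every week
new_cols <- weekly_cols("new", study_weeks)

# Rate columns use a shorter prefix from week 10 onwards
actvrt_cols <- weekly_cols("actvrt", study_weeks, prefix_from_10 = "actvr")
dethrt_cols <- weekly_cols("dethrt", study_weeks, prefix_from_10 = "dthrt")

head(new_cols, 3)
head(actvrt_cols, 3)
```

The resulting vectors could then be passed to a column selection such as dplyr's all_of() to keep only the study-period columns.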

4 Data wrangling

Pre-process and prepare data for analysis.

4.1 Extract data

As the analysis focuses on Central Mexico (the study area), only the municipalities in Mexico City, Mexico State and Morelos State are extracted.

The corresponding CVE_ENT codes used for the extraction are “09”, “15” and “17” respectively.
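In the rows shown by the glimpse output, CVEGEO is the CVE_ENT code followed by the CVE_MUN code. Assuming this holds for the whole dataset, the state code can also be recovered from CVEGEO, as in the sketch below. The first three keys are copied from the glimpse output; the keys in example_keys other than "09002" and "01001" are made up for illustration.

```r
# Sample values copied from the glimpse output of the full dataset
CVEGEO  <- c("01001", "01002", "01003")
CVE_ENT <- c("01", "01", "01")
CVE_MUN <- c("001", "002", "003")

# Check that the key is the state code followed by the municipality code
key_matches <- all(CVEGEO == paste0(CVE_ENT, CVE_MUN))

# Recover the state code from the first two characters of the key
state_from_key <- substr(CVEGEO, 1, 2)

# Study-area codes stated in the text above
study_states <- c("09", "15", "17")

# Illustrative keys ("15001" and "17001" are hypothetical values)
example_keys <- c("09002", "15001", "17001", "01001")
in_study_area <- substr(example_keys, 1, 2) %in% study_states

key_matches
in_study_area
```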

Extract

municipalities <- municipalities %>%
  filter(CVE_ENT %in% c("09", "15", "17"))

Check extract

unique(municipalities$CVE_ENT)

## [1] "09" "15" "17"
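Beyond listing the unique codes, the extract can be checked by counting rows per state and asserting that no other code remains. The sketch below uses a small made-up data frame named toy in place of the real attribute table, so the counts it produces are not those of the actual data.

```r
# Made-up stand-in for the filtered attribute table
toy <- data.frame(CVE_ENT = c("09", "09", "15", "15", "15", "17"),
                  stringsAsFactors = FALSE)

# Study-area codes used in the filter
study_states <- c("09", "15", "17")

# Every remaining row should carry one of the three study-area codes
only_study_states <- all(toy$CVE_ENT %in% study_states)

# Number of rows (municipalities) per state code
rows_per_state <- table(toy$CVE_ENT)

only_study_states
rows_per_state
```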

glimpse(municipalities)

## Rows: 176
## Columns: 199
## $ CVEGEO   <chr> "09002", "09003", "09004", "09005", "09006", "09007", "090...
## $ CVE_ENT  <chr> "09", "09", "09", "09", "09", "09", "09", "09", "09", "09"...
## $ CVE_MUN  <chr> "002", "003", "004", "005", "006", "007", "008", "009", "0...
## $ NOMGEO   <chr> "Azcapotzalco", "Coyoacán", "Cuajimalpa de Morelos", "Gust...
## $ Pop2010  <int> 414711, 620416, 186391, 1185772, 384326, 1815786, 239086, ...
## $ Pop2020  <int> 408441, 621952, 199809, 1176967, 393821, 1815551, 245147, ...
## $ new1     <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ new2     <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ new3     <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ new4     <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ new5     <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ new6     <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ new7     <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ new8     <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ new9     <dbl> 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0...
## $ new10    <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0...
## $ new11    <dbl> 0, 1, 18, 0, 1, 1, 1, 0, 11, 0, 9, 0, 4, 2, 29, 1, 0, 0, 0...
## $ new12    <dbl> 1, 9, 30, 6, 1, 7, 5, 0, 26, 1, 10, 3, 8, 11, 56, 2, 0, 0,...
## $ new13    <dbl> 18, 15, 18, 21, 9, 32, 9, 0, 18, 2, 22, 8, 11, 20, 25, 6, ...
## $ new14    <dbl> 27, 19, 9, 68, 16, 59, 12, 1, 34, 8, 29, 21, 36, 29, 31, 2...
## $ new15    <dbl> 41, 63, 18, 97, 38, 125, 18, 22, 56, 19, 83, 40, 43, 74, 3...
## $ new16    <dbl> 69, 106, 29, 204, 123, 301, 38, 36, 124, 49, 177, 65, 76, ...
## $ new17    <dbl> 98, 162, 35, 337, 155, 535, 86, 78, 142, 96, 207, 151, 100...
## $ new18    <dbl> 137, 179, 45, 373, 209, 693, 51, 70, 197, 135, 199, 152, 1...
## $ new19    <dbl> 224, 277, 58, 546, 254, 863, 74, 94, 238, 203, 309, 248, 1...
## $ new20    <dbl> 316, 319, 63, 677, 319, 966, 133, 119, 322, 303, 312, 330,...
## $ new21    <dbl> 336, 342, 91, 778, 305, 1076, 144, 195, 385, 352, 311, 343...
## $ new22    <dbl> 323, 374, 159, 735, 276, 873, 142, 163, 387, 233, 402, 364...
## $ new23    <dbl> 349, 389, 141, 831, 283, 884, 126, 183, 383, 303, 367, 427...
## $ new24    <dbl> 310, 299, 137, 692, 220, 818, 122, 223, 458, 237, 434, 436...
## $ new25    <dbl> 345, 382, 140, 611, 245, 747, 181, 170, 435, 279, 425, 323...
## $ new26    <dbl> 328, 272, 126, 525, 206, 665, 179, 137, 367, 277, 498, 341...
## $ new27    <dbl> 316, 382, 168, 573, 165, 762, 208, 176, 388, 295, 509, 288...
## $ new28    <dbl> 301, 313, 159, 512, 172, 670, 229, 151, 419, 312, 437, 340...
## $ new29    <dbl> 300, 490, 171, 625, 186, 744, 345, 204, 564, 351, 501, 441...
## $ new30    <dbl> 332, 427, 220, 663, 233, 728, 247, 197, 492, 326, 523, 377...
## $ new31    <dbl> 271, 396, 114, 505, 196, 620, 277, 145, 425, 285, 446, 336...
## $ new32    <dbl> 24, 30, 7, 59, 62, 135, 12, 11, 62, 38, 65, 52, 21, 34, 31...
## $ cumul1   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ cumul2   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ cumul3   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ cumul4   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ cumul5   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ cumul6   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ cumul7   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ cumul8   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ cumul9   <dbl> 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0...
## $ cumul10  <dbl> 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0...
## $ cumul11  <dbl> 0, 1, 18, 1, 1, 2, 1, 0, 12, 0, 9, 0, 4, 2, 31, 1, 0, 0, 0...
## $ cumul12  <dbl> 1, 10, 48, 7, 2, 9, 6, 0, 38, 1, 19, 3, 12, 13, 87, 3, 0, ...
## $ cumul13  <dbl> 19, 25, 66, 28, 11, 41, 15, 0, 56, 3, 41, 11, 23, 33, 112,...
## $ cumul14  <dbl> 46, 44, 75, 96, 27, 100, 27, 1, 90, 11, 70, 32, 59, 62, 14...
## $ cumul15  <dbl> 87, 107, 93, 193, 65, 225, 45, 23, 146, 30, 153, 72, 102, ...
## $ cumul16  <dbl> 156, 213, 122, 397, 188, 526, 83, 59, 270, 79, 330, 137, 1...
## $ cumul17  <dbl> 254, 375, 157, 734, 343, 1061, 169, 137, 412, 175, 537, 28...
## $ cumul18  <dbl> 391, 554, 202, 1107, 552, 1754, 220, 207, 609, 310, 736, 4...
## $ cumul19  <dbl> 615, 831, 260, 1653, 806, 2617, 294, 301, 847, 513, 1045, ...
## $ cumul20  <dbl> 931, 1150, 323, 2330, 1125, 3583, 427, 420, 1169, 816, 135...
## $ cumul21  <dbl> 1267, 1492, 414, 3108, 1430, 4659, 571, 615, 1554, 1168, 1...
## $ cumul22  <dbl> 1590, 1866, 573, 3843, 1706, 5532, 713, 778, 1941, 1401, 2...
## $ cumul23  <dbl> 1939, 2255, 714, 4674, 1989, 6416, 839, 961, 2324, 1704, 2...
## $ cumul24  <dbl> 2249, 2554, 851, 5366, 2209, 7234, 961, 1184, 2782, 1941, ...
## $ cumul25  <dbl> 2594, 2936, 991, 5977, 2454, 7981, 1142, 1354, 3217, 2220,...
## $ cumul26  <dbl> 2922, 3208, 1117, 6502, 2660, 8646, 1321, 1491, 3584, 2497...
## $ cumul27  <dbl> 3238, 3590, 1285, 7075, 2825, 9408, 1529, 1667, 3972, 2792...
## $ cumul28  <dbl> 3539, 3903, 1444, 7587, 2997, 10078, 1758, 1818, 4391, 310...
## $ cumul29  <dbl> 3839, 4393, 1615, 8212, 3183, 10822, 2103, 2022, 4955, 345...
## $ cumul30  <dbl> 4171, 4820, 1835, 8875, 3416, 11550, 2350, 2219, 5447, 378...
## $ cumul31  <dbl> 4442, 5216, 1949, 9380, 3612, 12170, 2627, 2364, 5872, 406...
## $ cumul32  <dbl> 4466, 5246, 1956, 9439, 3674, 12305, 2639, 2375, 5934, 410...
## $ activ1   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ activ2   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ activ3   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ activ4   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ activ5   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ activ6   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ activ7   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ activ8   <dbl> 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ activ9   <dbl> 0, 0, 0, 2, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0...
## $ activ10  <dbl> 0, 0, 5, 2, 0, 2, 1, 0, 5, 0, 5, 1, 0, 1, 12, 0, 0, 0, 0, ...
## $ activ11  <dbl> 1, 7, 33, 4, 2, 5, 6, 0, 27, 0, 18, 3, 7, 5, 59, 2, 0, 0, ...
## $ activ12  <dbl> 8, 19, 60, 20, 6, 26, 18, 0, 49, 1, 39, 11, 20, 25, 103, 5...
## $ activ13  <dbl> 34, 41, 70, 68, 20, 75, 24, 1, 80, 8, 74, 32, 40, 54, 133,...
## $ activ14  <dbl> 70, 82, 86, 173, 58, 174, 41, 9, 135, 29, 118, 57, 86, 122...
## $ activ15  <dbl> 130, 166, 112, 321, 129, 427, 70, 43, 215, 61, 252, 125, 1...
## $ activ16  <dbl> 199, 321, 140, 571, 286, 854, 113, 107, 368, 142, 483, 227...
## $ activ17  <dbl> 332, 513, 182, 940, 482, 1492, 206, 186, 545, 257, 681, 39...
## $ activ18  <dbl> 516, 731, 237, 1438, 722, 2315, 280, 288, 758, 441, 949, 6...
## $ activ19  <dbl> 776, 1054, 298, 2015, 1008, 3207, 386, 388, 1053, 695, 127...
## $ activ20  <dbl> 1126, 1398, 385, 2755, 1328, 4207, 507, 548, 1403, 1023, 1...
## $ activ21  <dbl> 1456, 1751, 498, 3557, 1618, 5208, 653, 737, 1807, 1317, 1...
## $ activ22  <dbl> 1795, 2117, 660, 4275, 1893, 6095, 786, 921, 2172, 1579, 2...
## $ activ23  <dbl> 2143, 2496, 801, 5115, 2147, 6979, 925, 1129, 2651, 1881, ...
## $ activ24  <dbl> 2451, 2841, 917, 5756, 2394, 7723, 1057, 1304, 3058, 2111,...
## $ activ25  <dbl> 2810, 3169, 1080, 6329, 2591, 8434, 1221, 1481, 3471, 2408...
## $ activ26  <dbl> 3117, 3496, 1181, 6842, 2766, 9100, 1408, 1626, 3812, 2681...
## $ activ27  <dbl> 3444, 3862, 1384, 7459, 2953, 9874, 1626, 1787, 4286, 3016...
## $ activ28  <dbl> 3746, 4227, 1540, 8022, 3137, 10528, 1877, 1964, 4744, 332...
## $ activ29  <dbl> 4034, 4681, 1677, 8636, 3369, 11287, 2170, 2170, 5284, 366...
## $ activ30  <dbl> 4340, 5075, 1885, 9182, 3553, 11950, 2446, 2322, 5705, 394...
## $ activ31  <dbl> 4460, 5245, 1954, 9427, 3666, 12283, 2638, 2375, 5924, 410...
## $ activ32  <dbl> 4466, 5246, 1956, 9439, 3674, 12305, 2639, 2375, 5934, 410...
## $ death1   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ death2   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ death3   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ death4   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ death5   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ death6   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ death7   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ death8   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ death9   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ death10  <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ death11  <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ death12  <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ death13  <dbl> 0, 0, 0, 2, 1, 2, 0, 0, 2, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0...
## $ death14  <dbl> 2, 1, 2, 5, 1, 6, 1, 0, 2, 1, 3, 3, 2, 3, 1, 3, 0, 0, 0, 0...
## $ death15  <dbl> 4, 7, 1, 26, 3, 15, 4, 0, 11, 1, 3, 4, 4, 9, 4, 4, 0, 0, 0...
## $ death16  <dbl> 6, 8, 3, 19, 7, 33, 3, 1, 11, 2, 10, 4, 5, 12, 3, 11, 0, 0...
## $ death17  <dbl> 15, 19, 5, 37, 26, 76, 0, 3, 23, 5, 14, 6, 9, 16, 9, 14, 0...
## $ death18  <dbl> 14, 11, 1, 74, 20, 115, 9, 5, 29, 16, 14, 12, 23, 33, 18, ...
## $ death19  <dbl> 38, 26, 4, 82, 30, 127, 4, 7, 31, 14, 32, 19, 21, 32, 25, ...
## $ death20  <dbl> 33, 30, 11, 98, 46, 174, 16, 4, 50, 14, 22, 23, 24, 37, 12...
## $ death21  <dbl> 44, 36, 8, 117, 49, 176, 12, 3, 40, 22, 45, 24, 22, 52, 26...
## $ death22  <dbl> 47, 34, 9, 114, 37, 127, 5, 7, 60, 17, 26, 29, 12, 50, 25,...
## $ death23  <dbl> 44, 41, 11, 121, 29, 120, 19, 4, 56, 20, 31, 27, 19, 53, 2...
## $ death24  <dbl> 43, 34, 18, 148, 35, 110, 10, 6, 61, 17, 29, 40, 23, 37, 1...
## $ death25  <dbl> 47, 37, 6, 112, 26, 103, 10, 7, 41, 24, 35, 16, 14, 29, 14...
## $ death26  <dbl> 38, 33, 5, 77, 22, 63, 7, 11, 32, 12, 23, 21, 13, 31, 13, ...
## $ death27  <dbl> 27, 31, 5, 72, 19, 67, 12, 4, 39, 11, 29, 19, 6, 26, 16, 1...
## $ death28  <dbl> 19, 25, 9, 56, 25, 65, 8, 2, 26, 8, 15, 15, 7, 33, 12, 16,...
## $ death29  <dbl> 31, 23, 10, 57, 23, 52, 5, 2, 28, 10, 16, 11, 16, 20, 14, ...
## $ death30  <dbl> 20, 20, 7, 52, 17, 29, 5, 0, 20, 3, 13, 7, 9, 26, 9, 14, 0...
## $ death31  <dbl> 19, 16, 8, 33, 12, 29, 6, 2, 24, 8, 13, 7, 9, 16, 5, 14, 0...
## $ death32  <dbl> 8, 6, 1, 18, 3, 17, 3, 0, 8, 2, 10, 4, 2, 10, 10, 6, 0, 0,...
## $ actvrt1  <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ actvrt2  <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ actvrt3  <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ actvrt4  <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ actvrt5  <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ actvrt6  <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ actvrt7  <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ actvrt8  <dbl> 0.0000000, 0.0000000, 0.0000000, 0.0000000, 0.0000000, 0.0...
## $ actvrt9  <dbl> 0.0000000, 0.0000000, 0.0000000, 0.1699283, 0.0000000, 0.0...
## $ actvr10  <dbl> 0.0000000, 0.0000000, 2.5023898, 0.1699283, 0.0000000, 0.1...
## $ actvr11  <dbl> 0.2448334, 1.1254888, 16.5157726, 0.3398566, 0.5078449, 0....
## $ actvr12  <dbl> 1.9586672, 3.0548981, 30.0286774, 1.5293547, 1.5235348, 1....
## $ actvr13  <dbl> 8.3243357, 6.5921486, 32.5310672, 5.6076339, 5.0784493, 4....
## $ actvr14  <dbl> 16.893505, 12.058808, 26.525332, 14.358941, 14.219658, 9.3...
## $ actvr15  <dbl> 29.869675, 23.635264, 26.024854, 25.574209, 31.232463, 22....
## $ actvr16  <dbl> 40.397512, 45.019551, 35.033457, 42.736967, 67.543376, 42....
## $ actvr17  <dbl> 64.146352, 69.297952, 48.045884, 65.167503, 107.663126, 72...
## $ actvr18  <dbl> 94.505694, 90.843023, 62.559745, 94.904955, 150.576023, 10...
## $ actvr19  <dbl> 141.268874, 117.854754, 79.075517, 122.688232, 183.332021,...
## $ actvr20  <dbl> 194.397722, 142.293939, 101.597025, 154.209931, 214.818407...
## $ actvr21  <dbl> 230.143399, 163.999794, 130.624747, 180.039033, 227.514531...
## $ actvr22  <dbl> 249.485238, 170.913511, 181.173020, 192.018978, 224.721384...
## $ actvr23  <dbl> 248.995571, 176.540955, 208.198830, 200.515393, 207.962501...
## $ actvr24  <dbl> 243.609236, 175.254682, 209.700264, 186.836164, 197.043835...
## $ actvr25  <dbl> 248.50590, 169.14489, 210.20074, 174.51636, 177.23788, 128...
## $ actvr26  <dbl> 238.467735, 160.784112, 190.181623, 146.733086, 157.178007...
## $ actvr27  <dbl> 243.11957, 164.16058, 233.72321, 144.69395, 141.94266, 118...
## $ actvr28  <dbl> 229.164065, 170.109590, 230.219860, 143.844305, 138.641667...
## $ actvr29  <dbl> 224.512231, 190.529173, 248.237066, 152.425684, 153.115248...
## $ actvr30  <dbl> 219.370729, 195.031128, 250.739456, 146.393229, 152.353480...
## $ actvr31  <dbl> 174.811050, 163.678226, 207.197874, 119.374630, 134.324985...
## $ actvr32  <dbl> 105.768030, 90.843023, 139.633350, 68.226212, 77.446353, 5...
## $ dethrt1  <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ dethrt2  <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ dethrt3  <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ dethrt4  <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ dethrt5  <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ dethrt6  <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ dethrt7  <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ dethrt8  <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ dethrt9  <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ dthrt10  <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ dthrt11  <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ dthrt12  <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ dthrt13  <dbl> 0.0000000, 0.0000000, 0.0000000, 0.1699283, 0.2539225, 0.1...
## $ dthrt14  <dbl> 0.4896668, 0.1607841, 1.0009559, 0.4248207, 0.2539225, 0.3...
## $ dthrt15  <dbl> 0.9793336, 1.1254888, 0.5004780, 2.2090679, 0.7617674, 0.8...
## $ dthrt16  <dbl> 1.4690004, 1.2862729, 1.5014339, 1.6143188, 1.7774573, 1.8...
## $ dthrt17  <dbl> 3.6725010, 3.0548981, 2.5023898, 3.1436735, 6.6019842, 4.1...
## $ dthrt18  <dbl> 3.4276676, 1.7686252, 0.5004780, 6.2873471, 5.0784493, 6.3...
## $ dthrt19  <dbl> 9.3036693, 4.1803869, 2.0019118, 6.9670602, 7.6176740, 6.9...
## $ dthrt20  <dbl> 8.079502, 4.823523, 5.505258, 8.326487, 11.680433, 9.58386...
## $ dthrt21  <dbl> 10.772670, 5.788228, 4.003824, 9.940805, 12.442201, 9.6940...
## $ dthrt22  <dbl> 11.507170, 5.466660, 4.504302, 9.685913, 9.395131, 6.99512...
## $ dthrt23  <dbl> 10.772670, 6.592149, 5.505258, 10.280662, 7.363752, 6.6095...
## $ dthrt24  <dbl> 10.527836, 5.466660, 9.008603, 12.574694, 8.887286, 6.0587...
## $ dthrt25  <dbl> 11.507170, 5.949012, 3.002868, 9.515985, 6.601984, 5.67320...
## $ dthrt26  <dbl> 9.3036693, 5.3058757, 2.5023898, 6.5422395, 5.5862943, 3.4...
## $ dthrt27  <dbl> 6.610502, 4.984307, 2.502390, 6.117419, 4.824527, 3.690340...
## $ dthrt28  <dbl> 4.651835, 4.019603, 4.504302, 4.757992, 6.348062, 3.580180...
## $ dthrt29  <dbl> 7.589835, 3.698035, 5.004780, 4.842957, 5.840217, 2.864144...
## $ dthrt30  <dbl> 4.8966681, 3.2156822, 3.5033457, 4.4181358, 4.3166819, 1.5...
## $ dthrt31  <dbl> 4.6518347, 2.5725458, 4.0038237, 2.8038169, 3.0470696, 1.5...
## $ dthrt32  <dbl> 1.9586672, 0.9647047, 0.5004780, 1.5293547, 0.7617674, 0.9...
## $ geometry <MULTIPOLYGON [m]> MULTIPOLYGON (((2794860 837..., MULTIPOLYGON ...

4.2 Handle invalid geometries

Ensure that spatial data to be used for analysis has no invalid geometries.

length(which(st_is_valid(municipalities) == FALSE))

## [1] 0

There are no invalid geometries in the data that need to be handled.
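Had the check returned a non-zero count, the affected geometries would need to be repaired before the analysis. The sketch below is illustrative only: it uses a hypothetical self-intersecting "bow-tie" polygon rather than the municipality data, and assumes the installed sf build supports `st_make_valid()`.

```r
library(sf)

# Hypothetical self-intersecting ("bow-tie") polygon, for illustration only
bowtie <- st_sfc(
  st_polygon(list(rbind(c(0, 0), c(1, 1), c(1, 0), c(0, 1), c(0, 0))))
)

# The ring crosses itself, so the geometry is reported as invalid
st_is_valid(bowtie)

# Repair the geometry, then check it again
fixed <- st_make_valid(bowtie)
st_is_valid(fixed)
```

The same pattern could be applied to an sf data frame, with the validity check rerun afterwards to confirm the repair.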

4.3 Handle missing values

Check the data for missing values, as missing values can impact future calculations.

municipalities[rowSums(is.na(municipalities))!=0,]

## Simple feature collection with 0 features and 198 fields
## bbox:           xmin: NA ymin: NA xmax: NA ymax: NA
## projected CRS:  MEXICO_ITRF_2008_LCC
##   [1] CVEGEO   CVE_ENT  CVE_MUN  NOMGEO   Pop2010  Pop2020  new1     new2    
##   [9] new3     new4     new5     new6     new7     new8     new9     new10   
##  [17] new11    new12    new13    new14    new15    new16    new17    new18   
##  [25] new19    new20    new21    new22    new23    new24    new25    new26   
##  [33] new27    new28    new29    new30    new31    new32    cumul1   cumul2  
##  [41] cumul3   cumul4   cumul5   cumul6   cumul7   cumul8   cumul9   cumul10 
##  [49] cumul11  cumul12  cumul13  cumul14  cumul15  cumul16  cumul17  cumul18 
##  [57] cumul19  cumul20  cumul21  cumul22  cumul23  cumul24  cumul25  cumul26 
##  [65] cumul27  cumul28  cumul29  cumul30  cumul31  cumul32  activ1   activ2  
##  [73] activ3   activ4   activ5   activ6   activ7   activ8   activ9   activ10 
##  [81] activ11  activ12  activ13  activ14  activ15  activ16  activ17  activ18 
##  [89] activ19  activ20  activ21  activ22  activ23  activ24  activ25  activ26 
##  [97] activ27  activ28  activ29  activ30  activ31  activ32  death1   death2  
## [105] death3   death4   death5   death6   death7   death8   death9   death10 
## [113] death11  death12  death13  death14  death15  death16  death17  death18 
## [121] death19  death20  death21  death22  death23  death24  death25  death26 
## [129] death27  death28  death29  death30  death31  death32  actvrt1  actvrt2 
## [137] actvrt3  actvrt4  actvrt5  actvrt6  actvrt7  actvrt8  actvrt9  actvr10 
## [145] actvr11  actvr12  actvr13  actvr14  actvr15  actvr16  actvr17  actvr18 
## [153] actvr19  actvr20  actvr21  actvr22  actvr23  actvr24  actvr25  actvr26 
## [161] actvr27  actvr28  actvr29  actvr30  actvr31  actvr32  dethrt1  dethrt2 
## [169] dethrt3  dethrt4  dethrt5  dethrt6  dethrt7  dethrt8  dethrt9  dthrt10 
## [177] dthrt11  dthrt12  dthrt13  dthrt14  dthrt15  dthrt16  dthrt17  dthrt18 
## [185] dthrt19  dthrt20  dthrt21  dthrt22  dthrt23  dthrt24  dthrt25  dthrt26 
## [193] dthrt27  dthrt28  dthrt29  dthrt30  dthrt31  dthrt32  geometry
## <0 rows> (or 0-length row.names)

There are no missing values in the data.
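The row filter above shows which rows are incomplete, but not which columns are responsible. A per-column count can complement it. The sketch below uses a small hypothetical data frame (`toy`, with made-up values) in place of the attribute table.

```r
# Hypothetical toy attribute table, for illustration only
toy <- data.frame(
  id    = c("A", "B", "C"),
  pop   = c(1000, NA, 3000),
  cases = c(5, 8, NA)
)

# Number of missing values in each column
na_per_col <- colSums(is.na(toy))
na_per_col

# Rows that contain at least one missing value
incomplete_rows <- toy[rowSums(is.na(toy)) != 0, ]
incomplete_rows
```

For an sf object, dropping the geometry column first (for example with `st_drop_geometry()`) restricts the count to the attribute columns.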

4.4 Define projection

To prepare the spatial data for geospatial analysis, the coordinate reference system (CRS) has to be defined.

The spatial data of Mexico is utilised in this analysis.
Initial data exploration reveals that the data is projected using the LCC projection, with the Mexico ITRF2008 datum.
The corresponding EPSG code is EPSG:6372.
The CRS of the data will be checked, then assigned accordingly.
The unit of measurement is metres.

4.4.1 Check CRS

st_crs(municipalities)

## Coordinate Reference System:
##   User input: MEXICO_ITRF_2008_LCC 
##   wkt:
## PROJCRS["MEXICO_ITRF_2008_LCC",
##     BASEGEOGCRS["ITRF2008",
##         DATUM["International Terrestrial Reference Frame 2008",
##             ELLIPSOID["GRS 1980",6378137,298.257222101,
##                 LENGTHUNIT["metre",1]],
##             ID["EPSG",1061]],
##         PRIMEM["Greenwich",0,
##             ANGLEUNIT["Degree",0.0174532925199433]]],
##     CONVERSION["unnamed",
##         METHOD["Lambert Conic Conformal (2SP)",
##             ID["EPSG",9802]],
##         PARAMETER["Latitude of false origin",12,
##             ANGLEUNIT["Degree",0.0174532925199433],
##             ID["EPSG",8821]],
##         PARAMETER["Longitude of false origin",-102,
##             ANGLEUNIT["Degree",0.0174532925199433],
##             ID["EPSG",8822]],
##         PARAMETER["Latitude of 1st standard parallel",17.5,
##             ANGLEUNIT["Degree",0.0174532925199433],
##             ID["EPSG",8823]],
##         PARAMETER["Latitude of 2nd standard parallel",29.5,
##             ANGLEUNIT["Degree",0.0174532925199433],
##             ID["EPSG",8824]],
##         PARAMETER["Easting at false origin",2500000,
##             LENGTHUNIT["metre",1],
##             ID["EPSG",8826]],
##         PARAMETER["Northing at false origin",0,
##             LENGTHUNIT["metre",1],
##             ID["EPSG",8827]]],
##     CS[Cartesian,2],
##         AXIS["(E)",east,
##             ORDER[1],
##             LENGTHUNIT["metre",1,
##                 ID["EPSG",9001]]],
##         AXIS["(N)",north,
##             ORDER[2],
##             LENGTHUNIT["metre",1,
##                 ID["EPSG",9001]]]]

4.4.2 Assign CRS

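Assign EPSG:6372 as the CRS for the data. The existing CRS already uses the same projection parameters, so st_set_crs() only replaces the CRS definition and does not reproject any coordinates.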

municipalities <- st_set_crs(municipalities, 6372)
st_crs(municipalities)

## Coordinate Reference System:
##   User input: EPSG:6372 
##   wkt:
## PROJCRS["Mexico ITRF2008 / LCC",
##     BASEGEOGCRS["Mexico ITRF2008",
##         DATUM["Mexico ITRF2008",
##             ELLIPSOID["GRS 1980",6378137,298.257222101,
##                 LENGTHUNIT["metre",1]]],
##         PRIMEM["Greenwich",0,
##             ANGLEUNIT["degree",0.0174532925199433]],
##         ID["EPSG",6365]],
##     CONVERSION["Mexico LCC",
##         METHOD["Lambert Conic Conformal (2SP)",
##             ID["EPSG",9802]],
##         PARAMETER["Latitude of false origin",12,
##             ANGLEUNIT["degree",0.0174532925199433],
##             ID["EPSG",8821]],
##         PARAMETER["Longitude of false origin",-102,
##             ANGLEUNIT["degree",0.0174532925199433],
##             ID["EPSG",8822]],
##         PARAMETER["Latitude of 1st standard parallel",17.5,
##             ANGLEUNIT["degree",0.0174532925199433],
##             ID["EPSG",8823]],
##         PARAMETER["Latitude of 2nd standard parallel",29.5,
##             ANGLEUNIT["degree",0.0174532925199433],
##             ID["EPSG",8824]],
##         PARAMETER["Easting at false origin",2500000,
##             LENGTHUNIT["metre",1],
##             ID["EPSG",8826]],
##         PARAMETER["Northing at false origin",0,
##             LENGTHUNIT["metre",1],
##             ID["EPSG",8827]]],
##     CS[Cartesian,2],
##         AXIS["northing (N)",north,
##             ORDER[1],
##             LENGTHUNIT["metre",1]],
##         AXIS["easting (E)",east,
##             ORDER[2],
##             LENGTHUNIT["metre",1]],
##     USAGE[
##         SCOPE["unknown"],
##         AREA["Mexico"],
##         BBOX[12.1,-122.19,32.72,-84.64]],
##     ID["EPSG",6372]]
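Assigning a CRS and reprojecting are different operations, and the distinction matters whenever the source CRS differs from the target. The sketch below contrasts the two on a hypothetical point near Mexico City (the coordinates are an assumption chosen for illustration, not taken from the data).

```r
library(sf)

# Hypothetical point (longitude, latitude) near Mexico City, in WGS 84
pt <- st_sfc(st_point(c(-99.13, 19.43)), crs = 4326)

# st_set_crs() replaces the CRS definition but leaves the coordinates untouched
# (sf warns that no reprojection takes place, hence suppressWarnings here)
relabelled <- suppressWarnings(st_set_crs(pt, 6372))
st_coordinates(relabelled)

# st_transform() reprojects: coordinates become eastings and northings in metres
reprojected <- st_transform(pt, 6372)
st_coordinates(reprojected)
```

In this analysis the shapefile is already in the Mexico ITRF2008 / LCC projection, which is why assigning the EPSG code is sufficient.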

4.5 Calculate COVID-19 rate

For the analysis, the number of cases per 10,000 population, termed the COVID-19 rate, will be calculated for each municipality for each e-week.

The cumulative number of cases in each e-week will be used for the calculation, to be able to monitor the increases in COVID-19 cases over time.
Latest population data in 2020 (column: Pop2020) will be used for the calculation.

\[\text{COVID-19 rate for each e-week} = \frac{\text{Total cumulative cases in the e-week}}{\text{Total population}} \times 10000\]
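As a worked instance of the formula, the sketch below uses the 2020 population reported for the first municipality in the glimpse output (Azcapotzalco, 408,441). The cumulative case count of 19 is an assumption, chosen because it gives a rate close to the e-week 13 value shown for that municipality.

```r
# Population of Azcapotzalco, as shown in the Pop2020 column
pop2020 <- 408441

# Assumed cumulative case count, for illustration only
cumulative_cases <- 19

# Cases per 10,000 population
covid_rate <- cumulative_cases / pop2020 * 10000
covid_rate
```

With these inputs the rate is roughly 0.465 cases per 10,000 population.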

Compute calculation

Calculate the COVID-19 rate for each municipality for every e-week.
Extract only the relevant columns for the analysis (municipality information, population count, and COVID-19 rates).
As the analysis focuses on e-weeks 13 to 32, only the rates for these e-weeks are kept.

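cmexico_covid <- municipalities %>%
  # Cases per 10,000 population for every cumulative-count column (adds cumul*_rate columns)
  mutate_at(.vars = vars(contains('cumul')),
            .funs = list(rate = ~ (.x / Pop2020) * 10000)) %>%
  # Keep the identifier columns, population, all rate columns and the geometry
  select(1:4, Pop2020, ends_with('rate'), geometry) %>%
  # Keep only the rates for e-weeks 13 to 32 (columns 18 to 37) and the geometry (column 38)
  select(1:5, 18:38)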

Glimpse

glimpse(cmexico_covid)

## Rows: 176
## Columns: 26
## $ CVEGEO       <chr> "09002", "09003", "09004", "09005", "09006", "09007", ...
## $ CVE_ENT      <chr> "09", "09", "09", "09", "09", "09", "09", "09", "09", ...
## $ CVE_MUN      <chr> "002", "003", "004", "005", "006", "007", "008", "009"...
## $ NOMGEO       <chr> "Azcapotzalco", "Coyoacán", "Cuajimalpa de Morelos", "...
## $ Pop2020      <int> 408441, 621952, 199809, 1176967, 393821, 1815551, 2451...
## $ cumul13_rate <dbl> 0.46518347, 0.40196028, 3.30315451, 0.23789962, 0.2793...
## $ cumul14_rate <dbl> 1.12623365, 0.70745009, 3.75358467, 0.81565583, 0.6855...
## $ cumul15_rate <dbl> 2.1300506, 1.7203900, 4.6544450, 1.6398081, 1.6504960,...
## $ cumul16_rate <dbl> 3.8194011, 3.4247016, 6.1058311, 3.3730767, 4.7737424,...
## $ cumul17_rate <dbl> 6.2187684, 6.0294042, 7.8575039, 6.2363686, 8.7095406,...
## $ cumul18_rate <dbl> 9.5729861, 8.9074398, 10.1096547, 9.4055313, 14.016520...
## $ cumul19_rate <dbl> 15.0572543, 13.3611597, 13.0124269, 14.0445739, 20.466...
## $ cumul20_rate <dbl> 22.7939898, 18.4901729, 16.1654380, 19.7966468, 28.566...
## $ cumul21_rate <dbl> 31.020392, 23.988990, 20.719787, 26.406858, 36.310913,...
## $ cumul22_rate <dbl> 38.928511, 30.002315, 28.677387, 32.651723, 43.319173,...
## $ cumul23_rate <dbl> 47.473197, 36.256817, 35.734126, 39.712243, 50.505179,...
## $ cumul24_rate <dbl> 55.063032, 41.064262, 42.590674, 45.591763, 56.091473,...
## $ cumul25_rate <dbl> 63.509785, 47.206215, 49.597365, 50.783072, 62.312573,...
## $ cumul26_rate <dbl> 71.540320, 51.579543, 55.903388, 55.243690, 67.543376,...
## $ cumul27_rate <dbl> 79.277056, 57.721496, 64.311417, 60.112136, 71.733097,...
## $ cumul28_rate <dbl> 86.646541, 62.754039, 72.269017, 64.462300, 76.100563,...
## $ cumul29_rate <dbl> 93.991543, 70.632460, 80.827190, 69.772559, 80.823521,...
## $ cumul30_rate <dbl> 102.120012, 77.497942, 91.837705, 75.405683, 86.739915...
## $ cumul31_rate <dbl> 108.754998, 83.864993, 97.543154, 79.696372, 91.716795...
## $ cumul32_rate <dbl> 109.342598, 84.347345, 97.893488, 80.197661, 93.291114...
## $ geometry     <MULTIPOLYGON [m]> MULTIPOLYGON (((2794860 837..., MULTIPOLY...

4.6 Compute variables for analysis

Compute variables that will be used across the analysis.

title_list: List of title labels for each e-week (e.g. ‘e-week 13’, ‘e-week 14’)
rate_cols: List of column names to access the COVID-19 rate for each week

# List of title labels
title_list <- paste('e-week', 13:32)

# List of column names
rate_cols <- colnames(st_set_geometry(cmexico_covid[6:26], NULL))
rate_cols

##  [1] "cumul13_rate" "cumul14_rate" "cumul15_rate" "cumul16_rate" "cumul17_rate"
##  [6] "cumul18_rate" "cumul19_rate" "cumul20_rate" "cumul21_rate" "cumul22_rate"
## [11] "cumul23_rate" "cumul24_rate" "cumul25_rate" "cumul26_rate" "cumul27_rate"
## [16] "cumul28_rate" "cumul29_rate" "cumul30_rate" "cumul31_rate" "cumul32_rate"
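Because the titles and the column names are later indexed by the same position in each loop, the two vectors need to stay aligned. One way to guard against drift, sketched below, is to derive the labels from the column names themselves. The names `weeks_from_cols` and `titles_from_cols` are hypothetical helpers, not part of the original analysis, and `rate_cols` is recreated here only so that the sketch is self-contained.

```r
# Column names as printed above, recreated so the example runs on its own
rate_cols <- paste0("cumul", 13:32, "_rate")

# Extract the e-week number embedded in each column name
weeks_from_cols <- as.integer(sub("cumul([0-9]+)_rate", "\\1", rate_cols))

# Build the matching title labels from those numbers
titles_from_cols <- paste("e-week", weeks_from_cols)

# One label per rate column, in the same order
data.frame(column = rate_cols, title = titles_from_cols)[1:3, ]
```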

5 Study area

The plot below maps the study area, showing where the municipalities of each state (Mexico City, Mexico State, Morelos State) in Central Mexico are located.

tm_shape(cmexico_covid) +
  tm_fill('CVE_ENT',
          palette = c('mistyrose3', 'slategray3', 'seashell2'),
          labels = c('Mexico City', 'Mexico State', 'Morelos State'),
          title = 'State') +
  tm_borders(lwd = 0.1, alpha = 1) +
  tm_layout(main.title = 'Central Mexico',
            main.title.position = 'center',
            main.title.size = 1,
            legend.outside = TRUE,
            frame = FALSE)

Municipalities in Central Mexico can be observed to have a relatively large variation in polygon size (land area).
The municipalities are also irregularly shaped.

6 Exploratory data analysis

Conduct initial data exploration to better understand the data and the number of COVID-19 cases.

6.1 Summary statistics

summary(cmexico_covid)

##     CVEGEO            CVE_ENT            CVE_MUN             NOMGEO         
##  Length:176         Length:176         Length:176         Length:176        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##     Pop2020         cumul13_rate      cumul14_rate     cumul15_rate   
##  Min.   :   4236   Min.   :0.00000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:  20628   1st Qu.:0.00000   1st Qu.:0.0000   1st Qu.:0.0000  
##  Median :  48678   Median :0.00000   Median :0.0000   Median :0.1438  
##  Mean   : 161878   Mean   :0.13460   Mean   :0.2391   Mean   :0.4419  
##  3rd Qu.: 161528   3rd Qu.:0.09602   3rd Qu.:0.2492   3rd Qu.:0.5548  
##  Max.   :1815551   Max.   :3.30315   Max.   :3.7669   Max.   :4.6544  
##   cumul16_rate     cumul17_rate     cumul18_rate      cumul19_rate   
##  Min.   :0.0000   Min.   :0.0000   Min.   : 0.0000   Min.   : 0.000  
##  1st Qu.:0.0000   1st Qu.:0.3161   1st Qu.: 0.7035   1st Qu.: 1.321  
##  Median :0.4537   Median :1.0178   Median : 1.6731   Median : 2.712  
##  Mean   :0.8922   Mean   :1.6694   Mean   : 2.6274   Mean   : 3.974  
##  3rd Qu.:1.1215   3rd Qu.:2.2477   3rd Qu.: 3.5595   3rd Qu.: 4.751  
##  Max.   :6.2694   Max.   :9.8299   Max.   :14.8524   Max.   :21.597  
##   cumul20_rate     cumul21_rate     cumul22_rate     cumul23_rate   
##  Min.   : 0.000   Min.   : 0.000   Min.   : 0.000   Min.   : 0.000  
##  1st Qu.: 1.722   1st Qu.: 2.475   1st Qu.: 4.098   1st Qu.: 5.262  
##  Median : 3.942   Median : 5.400   Median : 7.200   Median : 8.578  
##  Mean   : 5.667   Mean   : 7.599   Mean   : 9.955   Mean   :12.362  
##  3rd Qu.: 6.403   3rd Qu.: 8.763   3rd Qu.:11.534   3rd Qu.:14.060  
##  Max.   :30.135   Max.   :44.127   Max.   :55.822   Max.   :68.953  
##   cumul24_rate    cumul25_rate      cumul26_rate       cumul27_rate     
##  Min.   : 0.00   Min.   : 0.5329   Min.   :  0.5329   Min.   :  0.7993  
##  1st Qu.: 6.73   1st Qu.: 7.5974   1st Qu.:  9.0733   1st Qu.:  9.9213  
##  Median :10.71   Median :12.7031   Median : 14.1720   Median : 16.7099  
##  Mean   :15.18   Mean   :17.6072   Mean   : 19.8202   Mean   : 22.1463  
##  3rd Qu.:17.33   3rd Qu.:20.1004   3rd Qu.: 22.7162   3rd Qu.: 25.5418  
##  Max.   :84.95   Max.   :97.1508   Max.   :106.9806   Max.   :119.6088  
##   cumul28_rate      cumul29_rate      cumul30_rate      cumul31_rate    
##  Min.   :  1.066   Min.   :  1.066   Min.   :  1.774   Min.   :  1.774  
##  1st Qu.: 10.824   1st Qu.: 11.960   1st Qu.: 13.559   1st Qu.: 14.451  
##  Median : 18.370   Median : 20.343   Median : 22.798   Median : 24.504  
##  Mean   : 24.370   Mean   : 26.699   Mean   : 29.180   Mean   : 30.845  
##  3rd Qu.: 27.914   3rd Qu.: 31.119   3rd Qu.: 33.531   3rd Qu.: 35.605  
##  Max.   :133.483   Max.   :145.080   Max.   :159.215   Max.   :169.619  
##   cumul32_rate              geometry  
##  Min.   :  1.774   MULTIPOLYGON :176  
##  1st Qu.: 14.458   epsg:6372    :  0  
##  Median : 24.504   +proj=lcc ...:  0  
##  Mean   : 31.038                      
##  3rd Qu.: 35.675                      
##  Max.   :170.408

Based on the summary statistics, COVID-19 cases in Central Mexico generally increased with time, as shown by the mean COVID-19 rate rising across the e-weeks.
The range of COVID-19 rates across municipalities also widened, from about 3.3 in e-week 13 to about 169 in e-week 32. This indicates that the difference in COVID-19 rates across municipalities in Central Mexico increased with time, with certain municipalities experiencing larger increases in COVID-19 cases than others.
By e-week 25, all municipalities had reported COVID-19 cases, as the minimum COVID-19 rate is no longer equal to 0 from that week onwards.
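The two observations above (a widening range, and the first week in which no municipality has a rate of 0) can also be computed directly rather than read off the summary table. The sketch below uses a small hypothetical matrix of rates (`rates`, with made-up values and week labels) in place of the real columns.

```r
# Hypothetical rates for four municipalities (rows) over three e-weeks (columns)
rates <- cbind(
  week_a = c(0.0, 0.0, 0.5, 3.0),
  week_b = c(0.0, 1.0, 2.5, 9.0),
  week_c = c(0.5, 2.0, 6.0, 20.0)
)

# Spread between the highest and lowest rate in each week
rate_range <- apply(rates, 2, function(x) diff(range(x)))
rate_range

# First week in which every municipality has a non-zero rate
first_all_nonzero <- names(which(apply(rates, 2, min) > 0))[1]
first_all_nonzero
```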

6.2 Time-series box plot

A time-series box plot of COVID-19 rates is plotted to visualise extreme values.

# List to store ggplot2 plot objects
boxplot_list <- vector(mode = "list", length = length(rate_cols))

# For each e-week, create a boxplot ggplot2 plot object and store it into a list (boxplot_list)
for (i in seq_along(rate_cols)) {
  box_plot <- ggplot(cmexico_covid, aes_string(x = rate_cols[[i]])) +
    geom_boxplot() +
    labs(title = title_list[[i]]) +
    theme_minimal() +
    theme(plot.title = element_text(size = 10),
          axis.title = element_blank(),
          axis.text.y = element_blank(),
          axis.ticks.y = element_blank())
  
  boxplot_list[[i]] <- box_plot
}

# Arrange all ggplot2 plot objects stored in boxplot_list, to visualise the boxplots of COVID-19 rates across the e-weeks
ggarrange(plotlist = boxplot_list,
          ncol = 5,
          nrow = 4)

It can be observed that in every e-week, multiple municipalities are upper outliers. These are the municipalities with extremely high COVID-19 rates compared to the rest of the municipalities.
The threshold above which a COVID-19 rate is considered an outlier rises across the e-weeks, as the number of COVID-19 cases in Central Mexico increases.
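The outlier threshold referred to above follows the usual box plot convention: by default, points lying more than 1.5 times the interquartile range beyond the quartiles are drawn individually. The sketch below works through the upper threshold for a hypothetical vector of rates (made-up values). It uses `quantile()` with its default settings, which is assumed to be close to, though not necessarily identical to, what the plotting function computes.

```r
# Hypothetical rates for eight municipalities in a single e-week
x <- c(0, 0, 0, 0.1, 0.2, 0.3, 0.5, 3.3)

# First and third quartiles
q <- quantile(x, c(0.25, 0.75), names = FALSE)

# Upper fence: third quartile plus 1.5 times the interquartile range
upper_fence <- q[2] + 1.5 * (q[2] - q[1])
upper_fence

# Values above the fence are treated as upper outliers
outliers <- x[x > upper_fence]
outliers
```

As rates grow over the e-weeks, both the quartiles and the interquartile range grow, which is why the threshold moves upwards.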

6.3 Time-series histogram

A time-series histogram of COVID-19 rates is plotted, to visualise the distribution of COVID-19 rates across the e-weeks.

# List to store ggplot2 plot objects
histplot_list <- vector(mode = "list", length = length(rate_cols))

# For each e-week, create a histogram ggplot2 plot object and store it into a list (histplot_list)
for (i in seq_along(rate_cols)) {
  hist_plot <- ggplot(cmexico_covid, aes_string(x = rate_cols[[i]])) +
    geom_histogram(fill = "darksalmon") +
    labs(title = title_list[[i]]) +
    theme_minimal() +
    theme(plot.title = element_text(size = 10),
          axis.title = element_blank())
  
  histplot_list[[i]] <- hist_plot
}

# Arrange all ggplot2 plot objects stored in histplot_list, to visualise the distribution of COVID-19 rates across the e-weeks
ggarrange(plotlist = histplot_list,
          ncol = 5,
          nrow = 4)

It can be observed that the distributions of COVID-19 rates across the e-weeks are right-skewed.
Initially, in e-week 13, the majority of municipalities had a COVID-19 rate of 0. As time passed, the number of municipalities with a COVID-19 rate of 0 decreased. By e-week 25, all municipalities had reported COVID-19 cases, so none had a COVID-19 rate of 0 from then on.

7 Thematic mapping

To better understand the spatio-temporal distribution of COVID-19 at the municipality level, choropleth mapping techniques will be utilised for the analysis.

Choropleth maps visualising the spatial distribution of COVID-19 rates across municipalities in Central Mexico will be plotted across time (e-weeks 13 to 32).
COVID-19 rates will be visualised instead of the number of COVID-19 cases, as the data used for choropleth mapping has to be standardised (to show rate instead of counts). This will enable the comparison of distribution of COVID-19 cases across areas with different population sizes.

Classification Method

To allow for a common basis of comparison of COVID-19 rates across time, a common classification scheme for COVID-19 rates will have to be utilised. Classifying every map using the same class intervals will enable easier comparisons to be made.

An equal interval classification scheme will be utilised for the common classification scheme, for easier interpretation across time.
Custom breaks will have to be specified explicitly to construct the common classification scheme for COVID-19 rates.

To guide the specification of breakpoints, descriptive statistics of COVID-19 rates across the e-weeks are first computed and studied.

Summary statistics for COVID-19 rates in e-week 13

summary(cmexico_covid$cumul13_rate)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## 0.00000 0.00000 0.00000 0.13460 0.09602 3.30315

Summary statistics for COVID-19 rates in e-week 23

summary(cmexico_covid$cumul23_rate)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   0.000   5.262   8.578  12.362  14.060  68.953

Summary statistics for COVID-19 rates in e-week 32

summary(cmexico_covid$cumul32_rate)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   1.774  14.458  24.504  31.038  35.675 170.408

It can be observed that COVID-19 rates range from 0 to about 170 across municipalities, from e-weeks 13 to 32.
Breakpoints can therefore be specified in intervals of 20 from 0 to 180, giving a total of 9 data classes.
It must however be noted that, especially for the earlier e-weeks where COVID-19 rates are generally low (e.g. the maximum COVID-19 rate in e-week 13 is about 3.3, well below 20), breakpoints with intervals of 20 may over-generalise the map data, resulting in the loss of important spatial patterns.
While more classes (narrower intervals between breakpoints) can be used (such as an interval of 10, resulting in 18 data classes), this comes at the expense of legibility and increases the risk of map reading errors, since the use of more colours makes it more difficult for the reader to distinguish between them.

Therefore, for the purpose of this analysis, a common classification scheme will be constructed, in equal intervals of 20, with a range from 0 to 180.
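The break values for the common scheme can be derived from the overall maximum rather than typed by hand. The sketch below uses the maximum rate from the summary output and a few rates taken from the same output. `cut()` is used only to illustrate class assignment, and the mapping package's own handling of values that fall exactly on a class boundary may differ.

```r
# Highest rate observed across e-weeks 13 to 32 (from the summary output)
max_rate <- 170.408
interval <- 20

# Round the upper limit up to the next multiple of the interval
upper <- ceiling(max_rate / interval) * interval
breaks <- seq(0, upper, by = interval)
breaks                 # 0, 20, ..., 180
length(breaks) - 1     # number of data classes

# Assign a few rates from the summary output to classes;
# include.lowest keeps a rate of exactly 0 in the first class
example_rates <- c(0, 3.30315, 68.953, 170.408)
classes <- cut(example_rates, breaks = breaks, include.lowest = TRUE)
classes
```

Note that the e-week 13 maximum (about 3.3) falls in the same class as a rate of 0, which illustrates the over-generalisation concern for the early e-weeks.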

To mitigate the challenge mentioned above, where the common classification scheme could lead to a loss of spatial patterns in certain e-weeks, the Jenks natural breaks classification method (with 5 data classes) will also be utilised for each map across time. This will enable spatial patterns to be revealed for every e-week, as class breaks are defined by minimising within-class variance and maximising between-class differences.

7.1 Time-series choropleth map of COVID-19 rates: common classification

A common custom classification scheme will be utilised for all the maps plotted.
- Breakpoints: 20, 40, 60, 80, 100, 120, 140, 160
- Minimum of 0, and maximum of 180

# List to store map objects
map_list <- vector(mode = "list", length = length(rate_cols))

# For each e-week, create a tmap object and store it into a list (map_list)
for (i in seq_along(rate_cols)) {
  cmap <- tm_shape(cmexico_covid) +
            tm_fill(col = rate_cols[[i]],
                    palette = 'Reds',
                    breaks = c(0, 20, 40, 60, 80, 100, 120, 140, 160, 180)) +
            tm_borders(lwd = 0.1,
                       alpha = 0.3) +
            tm_layout(panel.show = TRUE,
                      panel.labels = title_list[[i]],
                      panel.label.color = 'gray12',
                      panel.label.size = 0.8,
                      legend.show = FALSE)
  map_list[[i]] <- cmap
}

# Arrange all map objects stored in map_list, to create a time-series visualisation of choropleth maps across the e-weeks
tmap_arrange(map_list, ncol = 5)

A darker shade of red indicates a higher COVID-19 rate.
It can be observed that COVID-19 cases seem to originate from Mexico City, spreading radially outwards in all directions to the neighbouring municipalities with time.
By e-week 32, COVID-19 cases were still most concentrated in municipalities around the location of origin at Mexico City (higher COVID-19 rates indicated by darker red areas), seemingly forming a cluster of COVID-19 cases. While other municipalities saw an increase in COVID-19 cases, the highest COVID-19 rates were still found in municipalities around Mexico City (darker red areas).

Consistent with the preliminary discussion, spatial patterns in e-weeks 13 to 18 can hardly be observed. Spatial patterns were likely lost due to the classification scheme utilised. Therefore, a time-series choropleth map will also be visualised using the Jenks natural breaks classification scheme.

7.2 Time-series choropleth map of COVID-19 rates: natural breaks classification

Jenks natural breaks classification scheme will be utilised, with 5 data classes.

# List to store map objects
map_list <- vector(mode = "list", length = length(rate_cols))

# For each e-week, create a tmap object and store it into a list (map_list)
for (i in seq_along(rate_cols)) {
  cmap <- tm_shape(cmexico_covid) +
            tm_fill(col = rate_cols[[i]],
                    palette = 'Reds',
                    style = 'jenks',
                    n = 5) +
            tm_borders(lwd = 0.1,
                       alpha = 0.3) +
            tm_layout(panel.show = TRUE,
                      panel.labels = title_list[[i]],
                      panel.label.color = 'gray12',
                      panel.label.size = 0.8,
                      legend.show = FALSE)
  map_list[[i]] <- cmap
}

# Arrange all map objects stored in map_list, to create a time-series visualisation of choropleth maps across the e-weeks
tmap_arrange(map_list, ncol = 5)
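Since the legends are hidden and the Jenks breaks differ from one e-week to the next, it can help to inspect the break values for a given week directly. The sketch below uses the classInt package on a hypothetical right-skewed vector of rates (made-up values). Whether the mapping function produces exactly the same breaks is an assumption that would need to be checked against the installed package versions.

```r
library(classInt)

# Hypothetical right-skewed rates, for illustration only
rates <- c(0, 0, 0.1, 0.2, 0.4, 0.5, 0.9, 1.2, 1.5, 2.8, 3.3, 6.0)

# Five Jenks classes are delimited by six break values (both end points included)
jenks <- classIntervals(rates, n = 5, style = "jenks")
jenks$brks
```

Because each map has its own breaks, the same shade of red can represent very different rates in different e-weeks, so colours should only be compared within a single map.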

A darker shade of red indicates a higher COVID-19 rate.
Utilising the natural breaks classification, spatial patterns can now be observed for all e-weeks.
Similar to the observations made using the common classification scheme, COVID-19 cases seem to spread radially from municipalities on the east side of Central Mexico, where COVID-19 rates are highest (darker red areas) and decay radially outwards.
However, it can now be observed that COVID-19 cases did not originate only from municipalities in Mexico City. Although more municipalities in Mexico City had high COVID-19 rates (darker red areas), there were also municipalities in Mexico State and Morelos State with high COVID-19 rates, located away from the COVID-19 cluster around Mexico City. This is most obvious in e-week 13, where municipalities in the west and south of Central Mexico had high COVID-19 rates (dark red areas), in addition to those located around Mexico City (east of Central Mexico).
COVID-19 cases remain most concentrated in municipalities around Mexico City, which have higher COVID-19 rates (darker red areas), suggesting a cluster of COVID-19 cases in that area.
A small cluster of COVID-19 cases can also be observed at municipalities located in Morelos State (south of Central Mexico), especially from e-weeks 18 to 26. These municipalities had relatively high COVID-19 rates (indicated by darker red areas).

Based on the time-series choropleth maps of COVID-19 rates, it can generally be observed that with time, COVID-19 cases spread spatially from the location of origin to neighbouring municipalities. This is indicated by COVID-19 rates rising over time in municipalities neighbouring those that had high COVID-19 rates initially.

Municipalities located around the Mexico City region (east side of Central Mexico) are observed to have had higher COVID-19 rates since e-week 13, suggesting that this area is the location of origin and the main cluster of COVID-19 cases in Central Mexico. However, variations in the spatial patterns of COVID-19 rates were also observed with the use of different classification schemes for the choropleth maps.

Hence, while the spread and diffusion of COVID-19 cases and clusters of municipalities with high number of COVID-19 cases can be observed with choropleth mapping, these observations are not conclusive. Although choropleth mapping provides a basic understanding of spatial patterns, it faces several limitations, in particular the variation of spatial patterns observed with the use of different classification schemes.

Preliminary data analysis using choropleth maps therefore reveals limitations in their ability to represent geospatial data in a statistically robust way. Localised geospatial analysis methods will therefore be utilised in the next section, to statistically detect spatial clusters, outliers, hot spots and cold spots of COVID-19 in Central Mexico. The next section will attempt to establish whether there are indeed significant groupings of municipalities with similar COVID-19 rates around a particular location.

8 Localised geospatial statistics

In this section, localised spatial statistical analysis methods will be utilised to analyse the location related tendency in COVID-19 rates in Central Mexico. The following spatial statistics will be utilised for the analysis:

Local Moran’s I: cluster and outlier analysis
Getis-Ord Gi* : hot spot and cold spot area analysis

8.1 Prepare data

Data will be converted to the sp class to facilitate data analysis in this section.

cmexico_covid_sp <- as(cmexico_covid, 'Spatial')
cmexico_covid_sp

## class       : SpatialPolygonsDataFrame 
## features    : 176 
## extent      : 2645804, 2855437, 707708.1, 921773.9  (xmin, xmax, ymin, ymax)
## crs         : +proj=lcc +lat_0=12 +lon_0=-102 +lat_1=17.5 +lat_2=29.5 +x_0=2500000 +y_0=0 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs 
## variables   : 25
## names       : CVEGEO, CVE_ENT, CVE_MUN,                    NOMGEO, Pop2020,     cumul13_rate,     cumul14_rate,    cumul15_rate,     cumul16_rate,     cumul17_rate,     cumul18_rate,     cumul19_rate,     cumul20_rate,     cumul21_rate,     cumul22_rate, ... 
## min values  :  09002,      09,     001, Acambay de Ruíz Castañeda,    4236,                0,                0,               0,                0,                0,                0,                0,                0,                0,                0, ... 
## max values  :  17035,      17,     125,                  Zumpango, 1815551, 3.30315451255949, 3.76688512844288, 4.6544449949702, 6.26936126272312, 9.82987852566172, 14.8524441957079, 21.5970323811984, 30.1353940202768, 44.1268269582625, 55.8222298756556, ...

8.2 Define weight matrix

Before the analysis can be conducted, a weight matrix first has to be defined for the municipalities in Central Mexico. The neighbourhood structure of municipalities will be codified using an appropriate weight matrix, to express the spatial dependency between the municipalities in Central Mexico.

Several methods for defining a weight matrix will be evaluated, to aid in the selection of the final weight matrix to be utilised for the analysis.

8.2.1 Contiguity weight matrices

For this method, neighbours at the municipality level in Central Mexico will be determined based on municipalities with contiguous boundaries. There are two methods in determining contiguity:

Rook contiguity considers areas to be adjacent if they share at least two common boundary points (a segment).
Queen contiguity considers areas to be adjacent as long as they share at least one common boundary point.
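
The difference between the two rules can be illustrated on a hypothetical 3 x 3 grid of square cells using base R only. The grid and all object names below are assumptions made for illustration and are not part of the Mexico data or of spdep.

```r
# Hypothetical 3 x 3 grid of unit squares; cell k sits at (row[k], col[k]).
grid <- expand.grid(row = 1:3, col = 1:3)

# Absolute row and column offsets between every pair of cells.
d_row <- abs(outer(grid$row, grid$row, "-"))
d_col <- abs(outer(grid$col, grid$col, "-"))

# Rook: cells share an edge (exactly one step in a single direction).
rook <- (d_row + d_col) == 1
# Queen: cells share an edge or a corner (at most one step in each direction).
queen <- pmax(d_row, d_col) == 1

# Number of neighbours of every cell under each rule.
rook_n  <- rowSums(rook)
queen_n <- rowSums(queen)

# Compare the centre cell under both rules.
centre <- which(grid$row == 2 & grid$col == 2)
c(rook = rook_n[centre], queen = queen_n[centre])
```

On this regular grid the centre cell has 4 rook neighbours and 8 queen neighbours, because the four diagonal cells touch it at a single corner point only.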

The spatial weight matrices using these two methods will be computed.

Compute rook contiguity weight matrix

wm_rook <- poly2nb(cmexico_covid_sp, queen=FALSE)
summary(wm_rook)

## Neighbour list object:
## Number of regions: 176 
## Number of nonzero links: 946 
## Percentage nonzero weights: 3.053977 
## Average number of links: 5.375 
## Link number distribution:
## 
##  1  2  3  4  5  6  7  8  9 10 14 
##  3  3 18 47 27 31 22 14  6  4  1 
## 3 least connected regions:
## 77 95 121 with 1 link
## 1 most connected region:
## 117 with 14 links

The most connected municipality has 14 neighbours.
There are 3 municipalities which are the least connected with only one neighbour.

Compute queen contiguity weight matrix

wm_queen <- poly2nb(cmexico_covid_sp, queen=TRUE)
summary(wm_queen)

## Neighbour list object:
## Number of regions: 176 
## Number of nonzero links: 962 
## Percentage nonzero weights: 3.10563 
## Average number of links: 5.465909 
## Link number distribution:
## 
##  1  2  3  4  5  6  7  8  9 10 11 14 
##  3  3 18 41 31 31 22 13  9  3  1  1 
## 3 least connected regions:
## 77 95 121 with 1 link
## 1 most connected region:
## 117 with 14 links

Similar to the rook contiguity weight matrix, the most connected municipality has 14 neighbours.
There are 3 municipalities which are the least connected with only one neighbour.

Compute number of regions with no neighbours

print(paste0('Rook contiguity: There are ', length(which(card(wm_rook)==0)), ' municipalities with no neighbours.'))

## [1] "Rook contiguity: There are 0 municipalities with no neighbours."

print(paste0('Queen contiguity: There are ', length(which(card(wm_queen)==0)), ' municipalities with no neighbours.'))

## [1] "Queen contiguity: There are 0 municipalities with no neighbours."

Rook vs Queen contiguity matrices

	Rook	Queen
Average number of links per region	5.3750	5.4659
Number of regions without links	0	0

Results from both contiguity methods were highly similar (946 links for rook versus 962 for queen), which can be attributed to the relatively irregular shapes of the municipality polygons: few pairs of municipalities touch at only a single point.
All municipalities had at least one neighbour.

Distribution of the number of neighbours for municipalities

# Create data frame using neighbour list for queen contiguity weight matrix.
# For each number of neighbour, count the number of municipalities having that number of neighbouring municipalities
queen_df <- data.frame('Neighbours' = card(wm_queen)) %>%
  group_by(Neighbours) %>%
  summarise(Count = n())
queen_df[is.na(queen_df)] = 0

# Create bar chart visualising the distribution of number of neighbours
ggplot(queen_df, aes(x = Neighbours, y = Count)) +
  geom_col(fill = 'slategray3') +
  labs(title = 'Distribution of Number of Neighbours for Municipalities',
       x = 'Number of Neighbours',
       y = 'Number of Municipalities') +
  theme_minimal() +
  theme(axis.title.y = element_text(margin = margin(t = 0, r = 15, b = 0, l = 0)),
        axis.title.x = element_text(margin = margin(t = 10, r = 0, b = 0, l = 0)),
        plot.title = element_text(margin = margin(t = 0, r = 0, b = 20, l = 0)))

Although all municipalities had at least one neighbour, a large number of municipalities still have only a few neighbours (fewer than 6), which might not lead to stable and statistically significant results.

Plot of Queen contiguity neighbour map

plot(cmexico_covid_sp, border="lightgrey")
plot(wm_queen, coordinates(cmexico_covid_sp), pch = 19, cex = 0.6, add = TRUE, col= "red")

From the plot, the use of contiguity-based methods leads to edge effects, where municipalities at the edge of the study area have only a small number of neighbours. This is especially apparent on the west side of Central Mexico, where the municipalities at the edge have only about 2 to 4 neighbours.
Furthermore, the plot shows that large municipalities surrounded by smaller municipalities have a far greater number of neighbours than their neighbouring zones.
This can be problematic for the purpose of analysing COVID-19, which is spread through contact and respiratory droplets. For the municipalities in this analysis, how far apart they are matters more than whether they happen to share a boundary.
Thus, it could be more worthwhile to explore distance-based methods for computing the weight matrix.

8.2.2 Distance-based weight matrices

For this method, neighbours at the municipality level in Central Mexico will be determined based on Euclidean distance. There are two distance-based methods in determining adjacency:

Fixed distance: areas are considered to be adjacent if their centroids fall within a specified Euclidean distance band.
Adaptive distance: each area has k closest areas as neighbours, based on a proximity measure of Euclidean distance.

Obtain centroid of polygons (municipalities) for distance computation

coords <- coordinates(cmexico_covid_sp)

8.2.2.1 Fixed distance

Determine upper limit for distance band

k1 <- knn2nb(knearneigh(coords))
k1dists <- unlist(nbdists(k1, coords))
summary(k1dists)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    2037    5734    8080    8725   11027   20497

Based on computations, the largest distance from a municipality to its first nearest neighbour is 20497m (20.497km).
This distance will be utilised as the upper threshold for the distance band, so that all municipalities will have at least one neighbour.
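
The reasoning behind this threshold can be sketched in base R with five hypothetical centroids (the coordinates and object names are assumptions for illustration, not values from the Mexico data): the largest first-nearest-neighbour distance is the smallest band that leaves no point isolated.

```r
# Five hypothetical centroids (coordinates in metres); not the Mexico data.
pts <- cbind(x = c(0, 1000, 1500, 4000, 9000),
             y = c(0,    0, 1200, 3000, 3000))

# Pairwise Euclidean distances; set self-distances to Inf so they are ignored.
d <- as.matrix(dist(pts))
diag(d) <- Inf

# Distance from each point to its first nearest neighbour.
nn_dist <- apply(d, 1, min)

# Using the largest of these distances as the band gives every point
# at least one neighbour.
band <- max(nn_dist)
n_neighbours <- rowSums(d <= band)

# A slightly smaller band leaves at least one point without neighbours.
n_isolated <- sum(rowSums(d <= 0.99 * band) == 0)
```

The same band also gives points in denser parts of the layout several neighbours, which is the source of the disparity in neighbour counts that a fixed distance band can produce.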

Compute fixed distance weight matrix

wm_fixedd <- dnearneigh(coords, 0, 20497)
summary(wm_fixedd)

## Neighbour list object:
## Number of regions: 176 
## Number of nonzero links: 1710 
## Percentage nonzero weights: 5.520403 
## Average number of links: 9.715909 
## Link number distribution:
## 
##  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 22 
##  7 11  7 16  8  5 10 13 12  3 10 12  6 14 12 15 10  2  2  1 
## 7 least connected regions:
## 87 95 98 121 139 140 166 with 1 link
## 1 most connected region:
## 49 with 22 links

Each municipality has at least one neighbour.
The average number of links using the fixed distance measurement is 9.7159, which is higher than that of the contiguity measurements (5.3750 and 5.4659 for Rook and Queen respectively).
The most connected municipality has 22 neighbours.
There are 7 municipalities which are the least connected with only one neighbour.

Plot of fixed distance neighbour map

plot(cmexico_covid_sp, border="lightgrey")
plot(wm_fixedd, coords, pch = 19, cex = 0.6, add = TRUE, col= "red")

It can be noted from the plot that using the fixed distance method, regions with a higher number of smaller municipalities located in the area have a larger number of neighbours.
The minimum distance required for a relatively isolated municipality to have at least one neighbour is much higher than the distance to the closest neighbour of a municipality located in a denser region (with more municipalities).
As noted from exploratory data analysis, municipalities in Central Mexico are irregularly spaced and there is a large variation in polygon sizes. Hence, using a single fixed distance band may not be ideal, as it results in significant disparities in the number of neighbours, ranging from 1 to 22 (which can be observed from the plot above).
This might produce large estimate variances where data is sparse, and mask subtle local variations where data is dense.

Distribution of the number of neighbours for municipalities

# Create data frame using neighbour list for fixed distance weight matrix.
# For each number of neighbour, count the number of municipalities having that number of neighbouring municipalities
fixed_df <- data.frame('Neighbours' = card(wm_fixedd)) %>%
  group_by(Neighbours) %>%
  summarise(Count = n())
fixed_df[is.na(fixed_df)] = 0

# Create bar chart visualising the distribution of number of neighbours
ggplot(fixed_df, aes(x = Neighbours, y = Count)) +
  geom_col(fill = 'slategray3') +
  labs(title = 'Distribution of Number of Neighbours for Municipalities',
       x = 'Number of Neighbours',
       y = 'Number of Municipalities') +
  theme_minimal() +
  theme(axis.title.y = element_text(margin = margin(t = 0, r = 15, b = 0, l = 0)),
        axis.title.x = element_text(margin = margin(t = 10, r = 0, b = 0, l = 0)),
        plot.title = element_text(margin = margin(t = 0, r = 0, b = 20, l = 0)))

The number of neighbours per municipality ranges from 1 to 22, with about half of the municipalities having 10 or more neighbours (average number of links is 9.7159).
With such a high disparity in neighbour connectivity, the fixed distance matrix may not be suitable for use in this analysis.

8.2.2.2 Adaptive distance

As COVID-19 rates were observed to be skewed in exploratory data analysis (time-series histogram), each municipality should be evaluated in the context of at least 8 neighbours for results to be stable.
The average number of links when each area has at least one neighbour is 9.7159, as computed using the fixed distance method. This average will be used as a reference for the selection of k nearest neighbours in the adaptive distance method.
Noting that with the selection of more neighbours (higher k), spatial patterns become increasingly homogenised, 9 nearest neighbours are chosen for the computation of the adaptive distance matrix.

Compute adaptive distance matrix of 9 nearest neighbours

wm_knn9 <- knn2nb(knearneigh(coords, k=9))
wm_knn9

## Neighbour list object:
## Number of regions: 176 
## Number of nonzero links: 1584 
## Percentage nonzero weights: 5.113636 
## Average number of links: 9 
## Non-symmetric neighbours list

Each municipality has exactly 9 neighbours, so there are no municipalities without neighbours.
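
The "Non-symmetric neighbours list" note in the output reflects a general property of k-nearest-neighbour lists: area A can be among the nearest neighbours of area B without the reverse being true. A minimal base R sketch with three hypothetical points and k = 1 (all names and coordinates are assumptions for illustration):

```r
# Hypothetical centroids on a line: a tight pair and one distant point.
pts <- cbind(x = c(0, 1, 10), y = c(0, 0, 0))
d <- as.matrix(dist(pts))
diag(d) <- Inf

# Index of the single nearest neighbour (k = 1) of each point.
nn <- unname(apply(d, 1, which.min))

# Binary neighbour matrix: knn[i, j] is TRUE if j is the nearest neighbour of i.
knn <- matrix(FALSE, nrow = 3, ncol = 3)
knn[cbind(1:3, nn)] <- TRUE

# Every point has exactly k neighbours, but the relation is one-way:
# point 3 lists point 2, while point 2 lists point 1.
neighbour_counts <- rowSums(knn)
is_symmetric <- identical(knn, t(knn))
```

Here `is_symmetric` is `FALSE` even though every point has the same number of neighbours.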

Summary statistics of neighbour distances with 9 nearest neighbours

k9dists <- unlist(nbdists(wm_knn9, coords))
summary(k9dists)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    2037   10449   14291   15814   19206   57035

Using 9 nearest neighbours, the maximum distance between neighbours is 57035m (57.035km). This is still a reasonable distance for residents to travel (by car) and interact with others, which can lead to the spread of COVID-19.

Therefore, upon evaluation of the various methods for defining a weight matrix, the adaptive distance method seems to be the most suitable.
The problem of a high disparity in the number of neighbours under the fixed distance method is mitigated through the use of the adaptive distance method for weight matrix computation.
Furthermore, using adaptive distance also mitigates the problem of having areas with a very small number of neighbours (a problem faced by both fixed distance and contiguity-based methods).
In the context of COVID-19, a distance-based weight matrix is also the most suitable, since COVID-19 can spread across municipalities as long as there is interaction between residents. This makes the distance between neighbours an important factor in determining whether spread is possible. Using the adaptive distance method with 9 nearest neighbours, the maximum distance between neighbours still makes it possible for residents of different municipalities to interact (such as when residents travel from one place to another), which can lead to the spread of COVID-19.

As such, the adaptive distance weight matrix with 9 nearest neighbours will be utilised for the analysis.

Plot of adaptive distance neighbour map

plot(cmexico_covid_sp, border="lightgrey")
plot(wm_knn9, coords, pch = 19, cex = 0.6, add = TRUE, col= "red")

8.3 Row-standardised weight matrix

A row-standardised weight matrix will be constructed, based on 9 nearest neighbours.

Weights are assigned to each neighbouring municipality.
Standardisation is conducted because, if the weight matrix is not standardised, the degree of connection of a municipality (sum of weights of neighbours) depends on the number of its neighbours. This would create heterogeneity between municipalities.
In a row-standardised weight matrix, each neighbour of a particular municipality is assigned a weight of \(1\ /\ Total\ number\ of\ neighbours\). Each weight can thus be interpreted as the fraction of spatial influence on the particular municipality that is ascribable to the neighbouring municipality.
With weights assigned to each neighbour, the average neighbouring COVID-19 rate is computed by multiplying each neighbour’s COVID-19 rate by its weight (\(1\ /\ Total\ number\ of\ neighbours\)), then summing the results.

\[\sum^n_{j=1}w_{ij}=1 \] The sum of weights of its neighbours for a particular municipality is equal to 1.
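
As a worked illustration of row standardisation, the base R sketch below uses a hypothetical binary neighbour matrix for four areas and made-up rates (none of these values come from the Mexico data).

```r
# Hypothetical binary neighbour matrix for four areas (1 = neighbours).
B <- matrix(c(0, 1, 1, 0,
              1, 0, 1, 1,
              1, 1, 0, 0,
              0, 1, 0, 0), nrow = 4, byrow = TRUE)

# Row-standardise: divide each row by that area's number of neighbours.
W <- B / rowSums(B)

# Hypothetical rates for the four areas.
rate <- c(2, 4, 6, 8)

# Spatial lag: the weighted average of each area's neighbours' rates.
lag_rate <- as.vector(W %*% rate)
```

Each row of `W` sums to 1, and `lag_rate` is 5, 16/3, 3 and 4 for the four areas: for example, area 1 has neighbours 2 and 3, so its lag is (4 + 6) / 2 = 5.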

Compute row-standardised weight matrix

rswm_knn9 <- nb2listw(wm_knn9, style = 'W')
rswm_knn9

## Characteristics of weights list object:
## Neighbour list object:
## Number of regions: 176 
## Number of nonzero links: 1584 
## Percentage nonzero weights: 5.113636 
## Average number of links: 9 
## Non-symmetric neighbours list
## 
## Weights style: W 
## Weights constants summary:
##     n    nn  S0       S1       S2
## W 176 30976 176 35.16049 720.9383

8.4 Local Moran’s I

Local Moran’s I (a local spatial autocorrelation indicator - LISA) will be utilised to detect spatial clusters of municipalities with high or low COVID-19 rates, and detect outliers. The analysis will be conducted across the time period that is being studied: e-week 13 to e-week 32, to better analyse and understand spatial clusters and outliers across time.

Interpretation of local Moran’s I values

\(I_i<0\): indicates negative spatial local autocorrelation, suggesting a combination of dissimilar values (high values surrounded by low values, or low values surrounded by high values)
\(I_i>0\): indicates positive spatial local autocorrelation, suggesting a grouping of similar values (that are higher or lower than average)
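
To make the sign interpretation concrete, the base R sketch below computes a local Moran statistic by hand as the product of an area's deviation from the mean and the weighted average deviation of its neighbours, scaled by the variance. The data are hypothetical, and the scaling constant (an n denominator here) is an assumption: implementations differ in this convention, so the values need not match any particular package exactly.

```r
# Hypothetical rates for four areas arranged on a line.
x <- c(10, 9, 2, 1)

# Binary links between consecutive areas, then row-standardised weights.
B <- matrix(c(0, 1, 0, 0,
              1, 0, 1, 0,
              0, 1, 0, 1,
              0, 0, 1, 0), nrow = 4, byrow = TRUE)
W <- B / rowSums(B)

z     <- x - mean(x)             # deviation of each area from the mean
m2    <- sum(z^2) / length(x)    # variance with an n denominator (assumed)
lag_z <- as.vector(W %*% z)      # weighted average deviation of neighbours

# Local Moran's I: own deviation times neighbours' average deviation.
I_local <- z * lag_z / m2
```

All four values are positive in this example, because high values sit next to high values and low values next to low values. A value becomes negative when an area and its neighbours deviate from the mean in opposite directions.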

8.4.1 Compute Local Moran’s I

Compute Local Moran’s I of COVID-19 rate at the municipality level, for e-week 13 and e-week 32

The local Moran statistic, expectation, variance, standard deviate and p-value of local Moran statistic will be computed for each municipality.

localmoran_13 <- localmoran(cmexico_covid_sp$cumul13_rate, rswm_knn9)
localmoran_32 <- localmoran(cmexico_covid_sp$cumul32_rate, rswm_knn9)

cmexico_localmoran13 <- cbind(cmexico_covid_sp, localmoran_13)
cmexico_localmoran32 <- cbind(cmexico_covid_sp, localmoran_32)

Coefficients for Local Moran’s I for e-week 13 for the first 10 municipalities

Ii: local Moran statistic
E.Ii: expectation of local Moran statistic
Var.Ii: variance of local Moran statistic
Z.Ii: standard deviate of local Moran statistic
Pr(): p-value of local Moran statistic

head(localmoran_13, 10)

##             Ii         E.Ii     Var.Ii        Z.Ii    Pr(z > 0)
## 1   0.95865978 -0.005714286 0.08529692  3.30201356 4.799672e-04
## 2   0.90322512 -0.005714286 0.08529692  3.11220547 9.284760e-04
## 3  12.26403617 -0.005714286 0.08529692 42.01158446 0.000000e+00
## 4   0.26154780 -0.005714286 0.08529692  0.91510449 1.800684e-01
## 5   0.42947255 -0.005714286 0.08529692  1.49007826 6.810183e-02
## 6   0.07973037 -0.005714286 0.08529692  0.29256221 3.849284e-01
## 7   2.70678841 -0.005714286 0.08529692  9.28760014 7.890309e-21
## 8   0.02321467 -0.005714286 0.08529692  0.09905263 4.605482e-01
## 9   3.78215757 -0.005714286 0.08529692 12.96966054 9.091692e-39
## 10  0.00865042 -0.005714286 0.08529692  0.04918470 4.803861e-01

Coefficients for Local Moran’s I for e-week 32 for the first 10 municipalities

head(localmoran_32, 10)

##          Ii         E.Ii    Var.Ii      Z.Ii    Pr(z > 0)
## 1  3.861071 -0.005714286 0.1009115 12.172496 2.177787e-34
## 2  4.145869 -0.005714286 0.1009115 13.069028 2.474681e-39
## 3  2.845183 -0.005714286 0.1009115  8.974519 1.422969e-19
## 4  2.528878 -0.005714286 0.1009115  7.978801 7.388088e-16
## 5  4.060521 -0.005714286 0.1009115 12.800357 8.159912e-38
## 6  2.453082 -0.005714286 0.1009115  7.740201 4.963004e-15
## 7  3.643687 -0.005714286 0.1009115 11.488178 7.563415e-31
## 8  4.138931 -0.005714286 0.1009115 13.047190 3.296731e-39
## 9  2.945759 -0.005714286 0.1009115  9.291129 7.633025e-21
## 10 4.591182 -0.005714286 0.1009115 14.470857 9.257492e-48

8.4.2 Map Local Moran’s I

To enable a better understanding of the calculated scores, local Moran’s I scores for e-week 13 and 32 are visualised on a choropleth map.

Function for constructing choropleth map

localmoran_map <- function(df, eweek) {
  tm_shape(df) +
  tm_fill(col = "Ii", 
          style = "pretty",
          palette = "RdBu",
          title = 'Local Moran Statistics') +
  tm_borders(alpha = 0.5) +
  tm_layout(panel.labels = sprintf('Local Moran\'s I in e-week %s', eweek),
            panel.label.color = 'gray12')
}

Local Moran’s I plot for e-week 13 and 32

moranplot_13 <- localmoran_map(cmexico_localmoran13, 13)
moranplot_32 <- localmoran_map(cmexico_localmoran32, 32)
tmap_arrange(moranplot_13, moranplot_32)

Blue areas (positive local Moran’s I) indicate positive spatial local autocorrelation, while red areas (negative local Moran’s I) indicate negative spatial local autocorrelation.
The choropleth local Moran’s I plot shows evidence of both positive and negative local Moran’s I values at the start and end of the epidemiological time period being studied (e-weeks 13 and 32).
However, it is more useful to consider the local Moran statistics together with their corresponding p-values, to interpret whether the values are statistically significant.

As such, local Moran’s I values and the corresponding p-values will be visualised spatially side-by-side, for e-weeks 13 and 32, to understand which areas have local Moran’s I values that are statistically significant.

Function to construct p-value map

pvalue_map <- function(df, eweek) {
  tm_shape(df) +
  tm_fill(col = "Pr.z...0.", 
          breaks=c(-Inf, 0.001, 0.01, 0.05, 0.1, Inf),
          palette = "-Blues",
          title = 'Local Moran\'s I p-values') +
  tm_borders(alpha = 0.5) +
  tm_layout(panel.labels = sprintf('Local Moran\'s I p-values in e-week %s', eweek),
            panel.label.color = 'gray12')
}

Plot of local Moran’s I and corresponding p-values for e-weeks 13 and 32

tmap_arrange(moranplot_13, pvalue_map(cmexico_localmoran13, 13),
             moranplot_32, pvalue_map(cmexico_localmoran32, 32),
             ncol=2)

E-week 13

It can be observed that in e-week 13, there are relatively high and positive local Moran’s I values (indicated by darker blue areas on the local Moran’s I plot on the left-hand side) at the municipalities bordering Mexico City and Mexico State.
These values are statistically significant at the 99.9% confidence level, with p-values of less than 0.001 (indicated by the corresponding areas in the darkest shade of blue on the p-value plot on the right-hand side).
This means that there is significant and positive local autocorrelation for COVID-19 rate at these bordering municipalities, suggesting a grouping of municipalities with higher-than-average COVID-19 rates at e-week 13.

E-week 32

By e-week 32, municipalities located in Mexico City were the areas with positive and relatively higher Moran’s I values, which are statistically significant at the 99.9% confidence level (p-value less than 0.001).
There is therefore significant and positive local autocorrelation for COVID-19 rate at municipalities of Mexico City at e-week 32, suggesting that this group of municipalities is a cluster for COVID-19 at e-week 32, with COVID-19 rates that are higher than average.

8.4.3 LISA cluster map

For an even more insightful analysis, LISA cluster maps will be created, such that local Moran’s I values are overlaid with the significance levels. The map will visualise statistically significant municipalities, colour coded by the type of spatial autocorrelation. This will enable the identification of clusters and outliers at the municipality level in Central Mexico.

Municipalities that are statistically insignificant are excluded from the analysis, while municipalities with statistically significant LISAs are grouped into four types of spatial groupings.

This analysis will define a statistical significance level of 5% for local Moran’s I. Each municipality will be grouped into one of the five categories:

High-High (HH)
High-Low (HL)
Low-High (LH)
Low-Low (LL)
Insignificant

The LISA spatial classification scheme will enable better interpretation of local Moran’s I results, and can be interpreted as such:

Category	Description	Municipality COVID-19 Rate	Neighbours’ COVID-19 Rate
HH	Municipality with high COVID-19 rate surrounded by neighbours also with high COVID-19 rate	Above average	Above average
HL	Municipality with high COVID-19 rate surrounded by neighbours with low COVID-19 rate	Above average	Below average
LH	Municipality with low COVID-19 rate surrounded by neighbours with high COVID-19 rate	Below average	Above average
LL	Municipality with low COVID-19 rate surrounded by neighbours also with low COVID-19 rate	Below average	Below average

For significant local Moran’s I, two types of associations can be inferred:

Spatial clusters: HH or LL (\(I_i>0\))
Spatial outliers: HL or LH (\(I_i<0\))
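
The four categories in the table above can be derived directly from the signs of a standardised rate and of its spatial lag. The base R sketch below shows this common convention on hypothetical data; it is an illustration under stated assumptions, is independent of the mapping function used in this analysis, and omits the significance filter.

```r
# Hypothetical rates for six areas arranged on a line; not the Mexico data.
x <- c(9, 8, 2, 9, 1, 2)

# Two-way links between consecutive areas, then row-standardised weights.
B <- matrix(0, nrow = 6, ncol = 6)
B[cbind(1:5, 2:6)] <- 1
B <- B + t(B)
W <- B / rowSums(B)

z     <- as.vector(scale(x))   # standardised rate of each area
lag_z <- as.vector(W %*% z)    # average standardised rate of its neighbours

# Quadrant from the sign of the value and the sign of its spatial lag.
# The final branch also catches exact zeros, which do not occur here.
quadrant <- ifelse(z > 0 & lag_z > 0, "HH",
            ifelse(z < 0 & lag_z < 0, "LL",
            ifelse(z > 0 & lag_z < 0, "HL", "LH")))
```

For these six areas the result is HH, HH, LH, HL, LH, LL: the third area, for instance, has a below-average rate but sits between two above-average neighbours.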

8.4.3.1 Moran’s diagram

For a quick reading of the spatial structure of COVID-19 cases at a municipality level in Central Mexico, a Moran’s scatterplot is plotted.

It illustrates the relationship between the COVID-19 rates at each municipality, and the average value of COVID-19 rates at neighbouring locations.
The COVID-19 rate (variable of interest) used in the Moran’s diagram is standardised: it is centred and scaled by subtracting its mean and dividing by the standard deviation.
There are four quadrants, each corresponding to one of the four LISA spatial association categories.
- Top left: LH
- Top right: HH
- Bottom left: LL
- Bottom right: HL
The density of points in each of the quadrants is used to visualise the dominant spatial structure.
Additionally, the direction and magnitude of global autocorrelation can be read from the slope of the regression line in the Moran’s diagram.
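
The link between the slope and global autocorrelation can be checked numerically. With row-standardised weights, the least-squares slope of the spatially lagged variable on the standardised variable equals global Moran's I. The base R sketch below uses hypothetical data and illustrative object names.

```r
# Hypothetical rates for six areas arranged on a line; not the Mexico data.
x <- c(9, 8, 2, 9, 1, 2)

# Two-way links between consecutive areas, then row-standardised weights.
B <- matrix(0, nrow = 6, ncol = 6)
B[cbind(1:5, 2:6)] <- 1
B <- B + t(B)
W <- B / rowSums(B)

z     <- as.vector(scale(x))   # standardised variable (x-axis of the diagram)
lag_z <- as.vector(W %*% z)    # spatially lagged variable (y-axis)

# Slope of the least-squares line through the Moran scatterplot.
slope <- unname(coef(lm(lag_z ~ z))[2])

# Global Moran's I when every row of the weight matrix sums to 1.
moran_I <- sum(z * lag_z) / sum(z^2)
```

The two quantities agree up to floating-point error, so a steeper positive slope corresponds to stronger positive global autocorrelation.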

cmexico_covid_sp$Z_cumul13_rate <- scale(cmexico_covid_sp$cumul13_rate) %>% as.vector 
cmexico_covid_sp$Z_cumul32_rate <- scale(cmexico_covid_sp$cumul32_rate) %>% as.vector 

par(mfrow=c(1,2))
moran.plot(cmexico_covid_sp$Z_cumul13_rate,
           rswm_knn9,
           labels=as.character(cmexico_covid_sp$NOMGEO),
           xlab="COVID-19 Rate (e-week 13)",
           ylab="Spatially Lagged COVID-19 Rate (e-week 13)")

moran.plot(cmexico_covid_sp$Z_cumul32_rate,
           rswm_knn9,
           labels=as.character(cmexico_covid_sp$NOMGEO),
           xlab="COVID-19 Rate (e-week 32)",
           ylab="Spatially Lagged COVID-19 Rate (e-week 32)")

Positive global autocorrelation can be observed for COVID-19 cases in e-weeks 13 and 32.
With time, by e-week 32, there is stronger positive global autocorrelation for COVID-19 rates than in e-week 13. This is indicated by a steeper slope on the Moran’s diagram.
By e-week 32, there are more high-influence municipalities in the HH quadrant than in e-week 13, indicating more municipalities with higher-than-average COVID-19 rates whose neighbours also have higher-than-average rates.

8.4.3.2 Construct LISA cluster map

Function to create LISA cluster map

lisa_map <- function(localmoran, df_var, df_localmoran, sigf, showlegend=FALSE) {
  
  quadrant <- vector(mode="numeric", length=nrow(localmoran))
  
  # Centre variable of interest around its mean
  DV <- df_var - mean(df_var)     
  
  # Centre local Moran's around the mean
  C_mI <- localmoran[,1] - mean(localmoran[,1]) 
  
  # Set statistical significance
  signif <- sigf 
  
  # Define HH, LL, LH, HL categories
  quadrant[DV >0 & C_mI>0] <- 4      
  quadrant[DV <0 & C_mI<0] <- 1      
  quadrant[DV <0 & C_mI>0] <- 2
  quadrant[DV >0 & C_mI<0] <- 3
  
  # Place non significant local Moran's in category 0
  quadrant[localmoran[,5]>signif] <- 0 
  
  # Map customisation
  df_localmoran$quadrant <- quadrant
  colors <- c("#ffffff", "#2c7bb6", "#abd9e9", "#ffcccb", "#d7191c")
  clusters <- c("insignificant", "low-low", "low-high", "high-low", "high-high")
  
  tm_shape(df_localmoran) +
    tm_fill(col = "quadrant",
            style = "cat",
            palette = colors[c(sort(unique(quadrant)))+1],
            labels = clusters[c(sort(unique(quadrant)))+1]) +
    tm_borders(lwd = 0.1,
               alpha=0.3) +
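    tm_layout(panel.show = TRUE,
          legend.show = showlegend,
          # NOTE: title_list and i are not arguments of this function; they are
          # looked up in the global environment, so i must be set before the call
          panel.labels = title_list[[i]],
          panel.label.color = 'gray12',
          panel.label.size = 0.8)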
}

Plot time-series LISA cluster map for e-weeks 13 to 32

# List to store map objects
lisamap_list <- vector(mode = "list", length = length(rate_cols))

# For each e-week, create a tmap object and store it into a list (lisamap_list)
for (i in 1:length(rate_cols)) {
  
  # Compute local Moran's I for the e-week
  localmoran_week <- localmoran(cmexico_covid_sp[[rate_cols[[i]]]], rswm_knn9)
  cmexico_localmoran_week <- cbind(cmexico_covid_sp, localmoran_week)
  
  # Create LISA cluster map for the e-week
  lisacmap <- lisa_map(localmoran_week,
                       cmexico_covid_sp[[rate_cols[[i]]]],
                       cmexico_localmoran_week,
                       0.05) 
  
  # Add LISA cluster map to list
  lisamap_list[[i]] <- lisacmap
}

# Arrange all map objects stored in lisamap_list, to create a time-series visualisation of LISA cluster maps across the e-weeks
tmap_arrange(lisamap_list, ncol = 5)

The LISA cluster maps above reveal interesting spatial relationships across time that are statistically significant at the 95% confidence level:

A statistically significant spatial cluster of municipalities with high-high values (dark red areas) can first be observed in e-week 13. This cluster comprises municipalities located in Mexico City, as well as some municipalities bordering Mexico City and Mexico State. The municipalities in this cluster have high COVID-19 rates and are surrounded by neighbours that also have high COVID-19 rates. From e-week 13 to e-week 32, this spatial cluster grows as more surrounding municipalities also take on high-high values, as indicated by the growing cluster of red areas over time. This suggests that the high COVID-19 rates in the original spatial cluster in e-week 13 spread over time to more neighbouring municipalities.
At e-week 17, two more statistically significant spatial clusters of municipalities with low-low values (dark blue areas) can be observed on the west side of Central Mexico, in Mexico State. The municipalities in these clusters have low COVID-19 rates and are surrounded by neighbours that also have low COVID-19 rates. By e-week 18, these two spatial clusters merged into a single, larger cluster comprising more municipalities. From e-week 18 to e-week 32, the municipalities in the large low-low spatial cluster slowly evolved into low-high outliers. This means that the municipalities with low COVID-19 rates became surrounded by neighbours with high COVID-19 rates, which could indicate that COVID-19 was not contained well in the neighbouring municipalities over time. Eventually, by e-week 32, there are significant outliers of municipalities with low-high values on the west side of Central Mexico in Mexico State. There are also some outliers with low-high values scattered across municipalities located in Morelos State. COVID-19 can be inferred to be contained relatively well in these low-high outlier municipalities, in contrast to their neighbouring municipalities.

It would also be interesting to study the distribution of population in Central Mexico in conjunction with the LISA cluster maps. The LISA cluster map in e-week 32 is plotted alongside a choropleth map visualising population in Central Mexico.

tmap_arrange(lisa_map(localmoran_32, cmexico_covid_sp$cumul32_rate, cmexico_localmoran32, 0.05, showlegend=TRUE),
             tm_shape(cmexico_covid) +
               tm_fill('Pop2020', title = 'Population in 2020') +
               tm_borders(lwd = 0.1,
                          alpha = 0.5))

It can be observed that the spatial cluster of high COVID-19 rates in Central Mexico in the LISA cluster map corresponds (although not completely) with municipalities that have a higher population. This suggests that locations with larger populations have a higher probability of COVID-19 cases and of COVID-19 spreading. This is consistent with current literature on the spread of COVID-19 and the need for physical distancing to limit it.

8.5 Getis-Ord Gi*

To detect hot spot areas with high COVID-19 rates, and cold spot areas with low COVID-19 rates in Central Mexico, Getis-Ord Gi* statistics will be utilised. Time-series analysis will be conducted to understand the evolution of spatial hot spots and cold spots across time.

Interpretation of Gi* values

\(Gi^*>0\): indicates grouping of areas with values higher than average
\(Gi^*<0\): indicates grouping of areas with values lower than average

A larger magnitude represents a greater intensity of grouping.

For significant Gi* statistic values, two spatial associations can be inferred:

Hot spot areas: where \(Gi^*>0\), indicating that a location is associated with relatively high values in the surrounding locations.
Cold spot areas: where \(Gi^*<0\), indicating that a location is associated with relatively low values in the surrounding locations.

8.5.1 Compute and map Gi*

Gi* statistics at a municipality level for each e-week will be computed and visualised spatially on a choropleth map across time, for e-weeks 13 to 32. This will enable a better understanding of spatial hot spot and cold spot areas at the municipality level across time.

Note that self-weights are included in the computation of Gi*.
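
For intuition, the Gi* statistic can be written out by hand as a standardised score of the weighted local sum, with each area included among its own neighbours. The base R sketch below follows the standard Getis-Ord formulation on hypothetical data. It is an illustration under these assumptions and is not guaranteed to reproduce the output of any particular package to the last digit.

```r
# Hypothetical rates for six areas arranged on a line; not the Mexico data.
x <- c(9, 8, 7, 2, 1, 1)
n <- length(x)

# Two-way links between consecutive areas, plus self-links for Gi*.
B <- matrix(0, nrow = n, ncol = n)
B[cbind(1:(n - 1), 2:n)] <- 1
B <- B + t(B)
diag(B) <- 1
W <- B / rowSums(B)   # row-standardised, self-weight included

x_bar <- mean(x)
S     <- sqrt(sum(x^2) / n - x_bar^2)   # standard deviation, n denominator

w_sum  <- rowSums(W)     # sum of weights per area (1 after standardising)
w2_sum <- rowSums(W^2)   # sum of squared weights per area

# Gi*: (weighted local sum - its expectation) / its standard deviation.
g_star <- (as.vector(W %*% x) - x_bar * w_sum) /
  (S * sqrt((n * w2_sum - w_sum^2) / (n - 1)))
```

The first three areas, which have high values and high-valued neighbours, receive positive Gi* values (hot spots), and the last three receive negative values (cold spots).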

# List to store map objects
gimap_list <- vector(mode = "list", length = length(rate_cols))

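# Row-standardised weights with self-weights included, built once because the
# neighbour structure is the same for every e-week
wm_knn9_self <- nb2listw(include.self(wm_knn9), style = 'W')

# For each e-week, create a tmap object and store it into a list (gimap_list)
for (i in seq_along(rate_cols)) {
  
  # Compute Gi* for the e-week
  gi_week <- localG(cmexico_covid_sp[[rate_cols[[i]]]], wm_knn9_self)

  # Attach the Gi* values by name instead of relying on a fixed column position
  cmexico_gi_week <- cmexico_covid_sp
  cmexico_gi_week$gstar <- as.numeric(gi_week)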

  # Create Gi* plot
  gimap <- tm_shape(cmexico_gi_week) +
    tm_fill(col = "gstar", 
            style = "pretty",
            palette = "-RdBu",
            title = "Gi*") +
    tm_borders(lwd = 0.1,
               alpha=0.3) +
    tm_layout(panel.show = TRUE,
          legend.show = TRUE,
          legend.title.size = 0.9,
          legend.text.size = 0.8,
          panel.labels = title_list[[i]],
          panel.label.color = 'gray12',
          panel.label.size = 0.8)

  # Add Gi* tmap object to list
  gimap_list[[i]] <- gimap
}

# Arrange all map objects stored in gimap_list, to create a time-series visualisation of Gi* maps across the e-weeks
tmap_arrange(gimap_list, ncol = 5)

Red areas indicate hot spot areas, where a municipality is associated with relatively high COVID-19 rates in the surrounding municipalities. Blue areas indicate cold spot areas, where a municipality is associated with relatively low COVID-19 rates in the surrounding locations. The darkness of the shade indicates the intensity of the Gi* values.
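The maps above shade every municipality by its Gi* value, regardless of statistical significance. Because `localG()` returns z-values, one possible refinement is to label only values beyond a critical value as hot or cold spots. The sketch below assumes a two-sided 5% cut-off of 1.96 with no multiple-testing correction; the helper name `classify_gi()` and the example values are hypothetical.

```r
# Hypothetical helper: label Gi* z-values as hot spot, cold spot or neither
classify_gi <- function(z, crit = 1.96) {
  label <- ifelse(z >= crit, "Hot spot",
           ifelse(z <= -crit, "Cold spot", "Not significant"))
  factor(label, levels = c("Cold spot", "Not significant", "Hot spot"))
}

# Made-up Gi* values for five municipalities
z_example <- c(-3.2, -1.1, 0.4, 2.5, 1.96)
gi_class <- classify_gi(z_example)
table(gi_class)
```

A factor column like this could be mapped with a categorical palette so that non-significant municipalities are visually muted.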

From the time-series Gi* plot, several interesting observations can be made:

At e-week 13, there was an initial hot spot area (red areas) of high COVID-19 rates at municipalities located around Mexico City. Over time, this hot spot grew as more surrounding municipalities recorded high COVID-19 rates, indicated by an increase in the number of municipalities shaded in red. By e-week 32, the hot spot area was located at Mexico City, a slight downward shift from the original hot spot area. This is consistent with the findings from Local Moran’s I, where a statistically significant cluster of high COVID-19 rates that increased in size over time was observed at Mexico City.
A large cold spot area of low COVID-19 rates can also be observed in Mexico State, in the west of Central Mexico, which grew in intensity particularly from e-weeks 18 to 27.
At e-week 13, there was also another, less intense hot spot area (light red areas) in the west of Central Mexico, in Mexico State. However, by e-week 15, these municipalities were no longer a hot spot area, and from e-week 18 until e-week 30 they formed a cold spot area. This could suggest that the efforts made by these municipalities to contain COVID-19 worked well, and corresponds with the observations made from the Local Moran’s I LISA cluster analysis.
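Observations such as a hot spot growing over time can also be quantified by counting, for each e-week, how many municipalities exceed a chosen Gi* threshold. The sketch below assumes the weekly Gi* values have been collected into a matrix with one row per municipality and one column per e-week; `gi_by_week`, its values and the 1.96 cut-off are all hypothetical.

```r
# Made-up Gi* values: 4 municipalities (rows) by 3 e-weeks (columns)
gi_by_week <- matrix(
  c( 2.4,  2.9,  3.5,
     1.2,  2.1,  2.8,
     0.3, -0.5, -2.2,
    -1.0, -2.0, -2.6),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("muni_", 1:4), c("week13", "week14", "week15"))
)

# Count municipalities beyond the assumed cut-off in each e-week
crit <- 1.96
hot_count  <- colSums(gi_by_week >= crit)
cold_count <- colSums(gi_by_week <= -crit)
rbind(hot = hot_count, cold = cold_count)
```

Plotting these counts against the e-week would give a compact summary to read alongside the map series.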

9 Conclusion

The spatio-temporal analysis of COVID-19 cases at a municipality level in Central Mexico revealed several spatial patterns across time.
Preliminary analysis with thematic mapping suggested a cluster of high COVID-19 cases originating from municipalities located around Mexico City, which grew over time as COVID-19 spread radially to surrounding municipalities.
Further analysis using Local Moran’s I and Getis-Ord Gi* provided statistical support for these preliminary observations.
LISA cluster maps of Local Moran’s I revealed statistically significant clustering of high COVID-19 rates at municipalities in Mexico City. This cluster grew over time to comprise more municipalities, which can be attributed to the spread of COVID-19 to surrounding municipalities. Analysis with Getis-Ord Gi* also points to a hot spot area with high COVID-19 rates at Mexico City that grew over time.
Furthermore, cold spot areas and statistically significant clustering of municipalities with low COVID-19 rates were detected in the west of Central Mexico, in Mexico State.

Detecting Spatio-Temporal Patterns of COVID-19 in Central Mexico

IS415 Geospatial Analytics and Applications | Take-Home Exercise 2

Xiao Rong Wong
