Introduction to Spatial Analysis using R

1.0 Basic Mapping

1.1 Selecting a specific Region

First we need to load the necessary libraries. In the realm of spatial data analysis using R, a powerful suite of packages comes together to form a cohesive workflow—each playing a distinct role in transforming raw geographic data into meaningful insights.

It begins with sf, the backbone of modern vector data handling in R. Short for Simple Features, this package allows users to work seamlessly with spatial objects like points, lines, and polygons. Whether you’re mapping barangay boundaries in Davao or analyzing road networks, sf provides a tidy and efficient way to store, manipulate, and visualize spatial features. Its compatibility with the tidyverse makes it especially appealing for data scientists who prefer clean, pipe-friendly syntax.

Once the spatial data is loaded, ggplot2 steps in as the visualization powerhouse. Known for its elegant grammar of graphics, ggplot2 can render maps with stunning clarity using geom_sf(). Whether you’re plotting rainfall distribution or highlighting flood-prone zones, this package ensures your spatial story is not only accurate but visually compelling.

To prepare and refine the data before mapping, dplyr offers a robust toolkit for data manipulation. Filtering barangays by population density, summarizing rainfall averages, or joining spatial data with census tables becomes intuitive and efficient. Its verbs—like filter(), mutate(), and summarize()—make data wrangling feel like fluent conversation.

When working with place names, labels, or metadata, stringi becomes indispensable. This package specializes in string processing, helping clean and standardize textual data. Whether you’re harmonizing barangay names across datasets or formatting labels for maps, stringi ensures consistency and precision.

Finally, for deeper spatial analysis, spdep introduces tools for understanding spatial dependence and autocorrelation. It allows researchers to explore how geographic features influence each other—such as identifying clusters of dengue cases or rainfall anomalies. With functions for creating neighbor lists and calculating Moran’s I, spdep adds a layer of statistical rigor to spatial investigations.

Together, these packages form a powerful narrative: from loading and cleaning data, to visualizing patterns and uncovering spatial relationships. In a city like Davao, where geography plays a vital role in planning and resilience, this toolkit empowers analysts to turn maps into actionable knowledge.

library(sf)
library(ggplot2)
library(dplyr)
library(stringi)
library(spdep)

To begin any spatial analysis project in R, one of the first steps is to load the geographic data that defines your area of interest. In this case, the focus is on the administrative boundaries of the Philippines, specifically at the provincial level. These boundaries are stored in a GeoJSON file—a lightweight format ideal for web-based mapping and spatial data exchange.

Using the sf package, which is designed for handling vector spatial data, we read the GeoJSON file into R with a simple command. This file, named gadm41_PHL_1.json, contains GADM Level 1 data, which corresponds to the provinces of the Philippines. By executing the st_read() function, we convert the raw geographic data into an sf object called ph_prov. This object now holds all the spatial geometries and associated attributes—such as province names, codes, and hierarchical levels—ready for analysis and visualization.

This step is foundational. Once the data is loaded, it opens the door to a wide range of possibilities: filtering for specific regions like Davao, overlaying with raster data such as rainfall or elevation, and visualizing patterns using ggplot2. It’s the moment when static data becomes dynamic—ready to be explored, mapped, and understood in the context of real-world challenges and opportunities.

# 1. Load Philippine regions (GADM Level 1)
ph_prov <- st_read("C:/Users/User/Desktop/Spatial Analysis/gadm41_PHL_1.json/gadm41_PHL_1.json", layer = "gadm41_PHL_1")

## Reading layer `gadm41_PHL_1' from data source 
##   `C:\Users\User\Desktop\Spatial Analysis\gadm41_PHL_1.json\gadm41_PHL_1.json' 
##   using driver `GeoJSON'
## Simple feature collection with 81 features and 11 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: 116.9283 ymin: 4.5869 xmax: 126.6052 ymax: 21.0699
## Geodetic CRS:  WGS 84

After successfully loading the GADM Level 1 shapefile into R using the sf package, the next logical step is to inspect the dataset to understand its structure—particularly the names of the provinces it contains. This is crucial for filtering, mapping, or joining data later on.

The command below retrieves a list of all unique province names from the NAME_1 column of the ph_prov object. This column typically holds the administrative names at Level 1, which in the case of the Philippines corresponds to provinces and highly urbanized cities.

By running this line, you gain a clear reference of how provinces are labeled in the dataset. This helps avoid mismatches when filtering for specific regions—like the Davao provinces—or when merging with external datasets such as rainfall, population, or health statistics. It also ensures consistency in naming conventions, which is especially important when working with multiple sources of spatial and tabular data.

This step acts as a checkpoint, confirming that the data was read correctly and is ready for targeted analysis.

# Check available province names
unique(ph_prov$NAME_1)  # Verify province names in GADM

##  [1] "Abra"               "AgusandelNorte"     "AgusandelSur"      
##  [4] "Aklan"              "Albay"              "Antique"           
##  [7] "Apayao"             "Aurora"             "Basilan"           
## [10] "Bataan"             "Batanes"            "Batangas"          
## [13] "Benguet"            "Biliran"            "Bohol"             
## [16] "Bukidnon"           "Bulacan"            "Cagayan"           
## [19] "CamarinesNorte"     "CamarinesSur"       "Camiguin"          
## [22] "Capiz"              "Catanduanes"        "Cavite"            
## [25] "Cebu"               "CompostelaValley"   "DavaodelNorte"     
## [28] "DavaodelSur"        "DavaoOriental"      "DinagatIslands"    
## [31] "EasternSamar"       "Guimaras"           "Ifugao"            
## [34] "IlocosNorte"        "IlocosSur"          "Iloilo"            
## [37] "Isabela"            "Kalinga"            "LaUnion"           
## [40] "Laguna"             "LanaodelNorte"      "LanaodelSur"       
## [43] "Leyte"              "Maguindanao"        "Marinduque"        
## [46] "Masbate"            "MetropolitanManila" "MisamisOccidental" 
## [49] "MisamisOriental"    "MountainProvince"   "NegrosOccidental"  
## [52] "NegrosOriental"     "NorthCotabato"      "NorthernSamar"     
## [55] "NuevaEcija"         "NuevaVizcaya"       "OccidentalMindoro" 
## [58] "OrientalMindoro"    "Palawan"            "Pampanga"          
## [61] "Pangasinan"         "Quezon"             "Quirino"           
## [64] "Rizal"              "Romblon"            "Samar"             
## [67] "Sarangani"          "Siquijor"           "Sorsogon"          
## [70] "SouthCotabato"      "SouthernLeyte"      "SultanKudarat"     
## [73] "Sulu"               "SurigaodelNorte"    "SurigaodelSur"     
## [76] "Tarlac"             "Tawi-Tawi"          "Zambales"          
## [79] "ZamboangadelNorte"  "ZamboangadelSur"    "ZamboangaSibugay"

To focus our spatial analysis on the Davao Region, we begin by isolating the provinces that fall under Region XI. This step is essential for narrowing down the geographic scope and ensuring that subsequent visualizations and analyses are region-specific.

Using the dplyr package, we apply the filter() function to the ph_prov object, which contains all provincial boundaries from the GADM dataset. The goal is to extract only those provinces that belong to the Davao Region. These include Davao del Norte, Davao del Sur, Davao Oriental, Compostela Valley, and Davao Occidental.

The filtering is done by matching the NAME_1 column—where province names are stored—against a vector of known Davao Region names. This results in a new spatial object, davao_region, which contains only the geometries and attributes relevant to Region XI.

This refined dataset sets the stage for targeted mapping, thematic overlays (such as rainfall or population), and spatial statistics. By working with a focused subset, we reduce complexity and enhance the clarity of our insights—especially when communicating findings to local stakeholders or decision-makers.

# Correct province names for Davao Region (Region XI)
davao_region <- ph_prov %>% filter(NAME_1 %in% c("DavaodelNorte", "DavaodelSur", "DavaoOriental","CompostelaValley", "DavaoOccidental"))

Note that Davao de Oro is still labeled as Compostella Valley, this is the reason why we need to check how the geographic locations are named within the source file.

After filtering the provinces that belong to Region XI, we move on to visualizing the Davao Region within the broader map of the Philippines. This is where the power of ggplot2 truly shines.

Using the geom_sf() function, we layer two spatial datasets: the full set of Philippine provinces (ph_prov) and the subset representing the Davao Region (davao_region). The base map is rendered in light gray to provide a neutral backdrop, while the Davao Region is highlighted in red to draw immediate attention. This contrast makes it easy to distinguish Region XI from the rest of the country.

The use of theme_minimal() ensures that the map remains clean and uncluttered, focusing the viewer’s attention on the geographic data rather than decorative elements. Finally, the labs() function adds a descriptive title, reinforcing the purpose of the visualization.

This map serves as a compelling visual reference, ideal for presentations, reports, or further spatial analysis. It clearly communicates the geographic scope of the Davao Region and sets the stage for deeper exploration—whether you’re examining rainfall patterns, population distribution, or public health data within these provinces.

# Plot the Philippines with Davao Region highlighted
ggplot() +
  geom_sf(data = ph_prov, fill = "lightgray", color = "black") +  # Other provinces in gray
  geom_sf(data = davao_region, fill = "red", color = "black") +  # Davao Region in red
  theme_minimal() +
  labs(title = "Highlighting Davao Region in the Philippines")

1.2 Plotting regions

To create a map that visually distinguishes all 18 administrative regions of the Philippines, we first need a way to associate each province with its corresponding region. Since the GADM shapefile provides provincial boundaries but not regional groupings, we bridge this gap using a lookup table.

This lookup table is stored in an Excel file named list.xlsx, which contains two key columns: one for province names and another for their corresponding region names or codes. By reading this file into R using the readxl package, we create a data frame called region_lookup. This table acts as a reference, enabling us to merge regional information into our spatial dataset.

The command below loads the Excel file into R, making it ready for a left join with the ph_prov spatial object. Once merged, each province in the shapefile will carry a region label, allowing us to group and color them accordingly in a map. This step is essential for thematic mapping, regional analysis, and for creating visuals that reflect the administrative structure of the country.

With this setup, we can now proceed to merge the data and plot the map using ggplot2, assigning distinct colors to each region for clarity and impact.

# An Excel file for mapping Province → Region
region_lookup <- readxl::read_excel("list.xlsx")

Once the region lookup table has been loaded from the Excel file, the next step is to integrate it with the spatial data from the GADM shapefile. This is done using a left join, which appends the region information to each province in the spatial object.

The key to this merge is the common identifier—province names. In the GADM data, these names are stored in the column, while in the Excel file, they are assumed to be under the column . By matching these two fields, we ensure that each province in the shapefile is correctly assigned to its corresponding region.

The command below creates a new version of ph_prov that now includes a region column. This enriched dataset allows us to group provinces by region, assign distinct colors for mapping, and perform region-level summaries or analyses.

This step is crucial for thematic mapping—especially when visualizing all 18 regions of the Philippines. It transforms the shapefile from a purely geographic dataset into one that reflects the country’s administrative structure, paving the way for more meaningful and organized visualizations.

# Merge into GADM provinces
ph_prov <- ph_prov %>% 
  left_join(region_lookup, by = c("NAME_1" = "Province"))

With the regional information successfully merged into the spatial dataset, we can now create a vibrant map that visually distinguishes all 18 administrative regions of the Philippines. This is achieved using ggplot2, which allows us to assign a unique color to each region based on the Region attribute added from the Excel lookup.

The plotting process begins by calling ggplot(ph_prov), which sets the spatial data as the foundation. The geom_sf() function then renders each province, using aes(fill = Region) to color them according to their region. The black borders and subtle line thickness (size = 0.2) help define provincial boundaries without overwhelming the visual.

To maintain a clean and professional look, theme_minimal() is applied, stripping away unnecessary grid lines and background clutter. Finally, labs() adds a descriptive title and a legend label, making the map informative and presentation-ready.

This map not only showcases the geographic layout of the Philippines but also emphasizes its administrative divisions, making it a powerful tool for regional planning, policy analysis, or educational purposes. Whether you’re highlighting development zones, disaster response areas, or resource allocation, this visualization brings clarity and impact to your spatial narrative.

# Plot with region colors
ggplot(ph_prov) +
  geom_sf(aes(fill = Region), color = "black", size = 0.2) +
  theme_minimal() +
  labs(title = "Philippines: 18 Administrative Regions",
       fill = "Region")

1.3 Baranggay Level Map

To create a barangay-level map of the Philippines, we need to work with GADM Level 3 data, which contains the most granular administrative boundaries—including municipalities and barangays. This level of detail is essential for hyper-local analysis, such as mapping flood-prone zones, health service coverage, or population density at the barangay level.

The code snippet below uses the sf package to load the GADM Level 3 shapefile. This file is in GeoJSON format and contains detailed geometries for each barangay across the country. Once read into R, the object ph_l3 becomes a spatial data frame, ready for filtering, joining, and visualization.

# Load the GADM Level 3 shapefile (Update the file path)
ph_l3<- st_read("C:/Users/User/Desktop/Spatial Analysis/gadm41_PHL_3.json/gadm41_PHL_3.json", layer = "gadm41_PHL_3")

## Reading layer `gadm41_PHL_3' from data source 
##   `C:\Users\User\Desktop\Spatial Analysis\gadm41_PHL_3.json\gadm41_PHL_3.json' 
##   using driver `GeoJSON'
## Simple feature collection with 41948 features and 16 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: 116.9283 ymin: 4.5869 xmax: 126.6053 ymax: 21.0699
## Geodetic CRS:  WGS 84

To visualize Davao City at the barangay level, we begin by filtering the GADM Level 3 shapefile to isolate all barangays within the city. This is done using the filter() function from the dplyr package, targeting the NAME_2 column which typically contains municipality or city names. By selecting only rows where NAME_2 equals “DavaoCity”, we create a new spatial object, davao_barangays, that holds the boundaries and attributes of every barangay in Davao City.

Next, we use ggplot2 to render the map. The geom_sf() function plots the spatial features, filling each barangay with a light gray tone and outlining them in black for clarity. The minimalist theme ensures that the map remains clean and focused, while the title added via labs() clearly communicates the purpose of the visualization.

This barangay-level map provides a detailed view of Davao City’s administrative divisions, making it a valuable tool for local planning, community profiling, or targeted interventions. Whether you’re analyzing health services, infrastructure, or environmental risks, this granular perspective brings local geography into sharper focus.

# Filter for Davao City (all barangays)
davao_barangays <- ph_l3 %>% filter(NAME_2 == "DavaoCity")

# Plot the baranggay level map
ggplot() +
  geom_sf(data = davao_barangays, fill = "lightgray", color = "black") + 
  theme_minimal() +
  labs(title = "Barangay-Level Map of Davao City")

To enhance the barangay-level map of Davao City, it’s helpful to first inspect the names of all barangays included in the dataset. This allows you to verify the data, spot inconsistencies, and prepare for labeling or filtering specific areas.

The command below extracts all distinct barangay names from the NAME_3 column of your davao_barangays object. This column typically holds the third-level administrative names in GADM Level 3 data—barangays in this case.

By reviewing this list, you can: - Confirm that all barangays in Davao City are present - Identify any naming issues (e.g., typos or formatting differences) - Prepare for labeling them directly on the map

Once verified, you can move forward with adding labels to your map using geom_sf_text() or geom_text() to display barangay names visually.

# Check unique barangay names
unique(davao_barangays$NAME_3)

##   [1] "Acacia"             "Agdao"              "Alambre"           
##   [4] "AlejandraNavarro"   "AlfonsoAngliongtoS" "Angalan"           
##   [7] "Atan-Awe"           "Baganihan"          "BagoAplaya"        
##  [10] "BagoGallera"        "BagoOshiro"         "Baguio"            
##  [13] "Balengaeng"         "Baliok"             "BangkasHeights"    
##  [16] "Bantol"             "Baracatan"          "Barangay1-A"       
##  [19] "Barangay10-A"       "Barangay11-B"       "Barangay12-B"      
##  [22] "Barangay13-B"       "Barangay14-B"       "Barangay15-B"      
##  [25] "Barangay16-B"       "Barangay17-B"       "Barangay18-B"      
##  [28] "Barangay19-B"       "Barangay2-A"        "Barangay20-B"      
##  [31] "Barangay21-C"       "Barangay22-C"       "Barangay23-C"      
##  [34] "Barangay24-C"       "Barangay25-C"       "Barangay26-C"      
##  [37] "Barangay27-C"       "Barangay28-C"       "Barangay29-C"      
##  [40] "Barangay3-A"        "Barangay30-C"       "Barangay31-D"      
##  [43] "Barangay32-D"       "Barangay33-D"       "Barangay34-D"      
##  [46] "Barangay35-D"       "Barangay36-D"       "Barangay37-D"      
##  [49] "Barangay38-D"       "Barangay39-D"       "Barangay4-A"       
##  [52] "Barangay40-D"       "Barangay5-A"        "Barangay6-A"       
##  [55] "Barangay7-A"        "Barangay8-A"        "Barangay9-A"       
##  [58] "Bato"               "Bayabas"            "BiaoEscuela"       
##  [61] "BiaoGuianga"        "BiaoJoaquin"        "Binugao"           
##  [64] "Bucana"             "Buda"               "Buhangin"          
##  [67] "Bunawan"            "Cabantian"          "Cadalian"          
##  [70] "Calinan"            "Callawa"            "Camansi"           
##  [73] "Carmen"             "CatalunanGrande"    "CatalunanPequeño"  
##  [76] "Catigan"            "Cawayan"            "Centro"            
##  [79] "Colosas"            "Communal"           "CrossingBayabas"   
##  [82] "Dacudao"            "Dalag"              "Dalagdag"          
##  [85] "Daliao"             "DaliaonPlantation"  "DatuSalumay"       
##  [88] "Dominga"            "Dumoy"              "Eden"              
##  [91] "Fatima"             "Gatungan"           "Gov.PacianoBangoy" 
##  [94] "Gov.VicenteDuterte" "Gumalang"           "Gumitan"           
##  [97] "Ilang"              "Inayangan"          "Indangan"          
## [100] "Kap.TomasMonteverd" "Kilate"             "Lacson"            
## [103] "Lamanan"            "Lampianao"          "Langub"            
## [106] "Lapu-Lapu"          "LeonGarcia,Sr."     "Lizada"            
## [109] "LosAmigos"          "Lubogan"            "Lumiad"            
## [112] "Ma-A"               "Mabuhay"            "Magsaysay"         
## [115] "Magtuod"            "Mahayag"            "Malabog"           
## [118] "Malagos"            "Malamba"            "Manambulan"        
## [121] "Mandug"             "ManuelGuianga"      "Mapula"            
## [124] "Marapangi"          "Marilog"            "MatinaAplaya"      
## [127] "MatinaBiao"         "MatinaCrossing"     "MatinaPangi"       
## [130] "Megkawayan"         "Mintal"             "Mudiang"           
## [133] "Mulig"              "NewCarmen"          "NewValencia"       
## [136] "Pampanga"           "Panacan"            "Panalum"           
## [139] "Pandaitan"          "Pangyan"            "Paquibato"         
## [142] "ParadiseEmbak"      "RafaelCastillo"     "Riverside"         
## [145] "Salapawan"          "Salaysay"           "Saloy"             
## [148] "SanAntonio"         "SanIsidro"          "SantoNiño"         
## [151] "Sasa"               "Sibulan"            "Sirawan"           
## [154] "Sirib"              "Suawan"             "Subasta"           
## [157] "Sumimao"            "Tacunan"            "Tagakpan"          
## [160] "Tagluno"            "Tagurano"           "Talandang"         
## [163] "Talomo"             "TalomoRiver"        "Tamayong"          
## [166] "Tambobong"          "Tamugan"            "Tapak"             
## [169] "Tawan-Tawan"        "Tibuloy"            "Tibungco"          
## [172] "Tigatto"            "Toril"              "Tugbok"            
## [175] "Tungakalan"         "Ubalde"             "Ula"               
## [178] "VicenteHizonSr."    "Waan"               "Wangan"            
## [181] "WilfredoAquino"     "Wines"

To emphasize specific barangays within Davao City—such as Buhangin and Agdao—we refine our spatial visualization by selectively highlighting them on the map. This approach is especially useful when focusing on priority areas for planning, intervention, or analysis.

Using dplyr, we filter the davao_barangays dataset to extract only the rows where NAME_3 matches either “Buhangin” or “Agdao”. This creates a new spatial object, selected_barangays, containing just the geometries and attributes of those two barangays.

In the ggplot2 visualization, we layer the full set of Davao City barangays in light gray to provide context, while rendering the selected barangays in red to make them stand out. The black borders help define each barangay’s shape, and the minimalist theme keeps the map clean and focused.

This targeted map is ideal for presentations or reports that need to draw attention to specific communities—whether for infrastructure development, health services, or disaster preparedness. It transforms geographic data into a clear visual narrative, spotlighting the areas that matter most.

# Select barangays to highlight
selected_barangays <- davao_barangays %>% filter(NAME_3 %in% c("Buhangin", "Agdao"))

# Plot with highlights
ggplot() +
  geom_sf(data = davao_barangays, fill = "lightgray", color = "black") +  # Other barangays in gray
  geom_sf(data = selected_barangays, fill = "red", color = "black") +  # Highlighted barangays in red
  theme_minimal() +
  labs(title = "Highlighted Barangays in Davao City")

1.4 Plotting Population density

To visualize population density across barangays in Davao City, we integrate demographic data with spatial boundaries to produce a thematic map that’s both informative and visually compelling.

It begins with importing a CSV file containing population and area data for each barangay. This dataset is read into R as pop_data, and from it, a new column—pop_density—is calculated by dividing population by land area in square kilometers. This metric provides a standardized way to compare how densely populated each barangay is.

Before merging, we ensure consistency in naming conventions. Barangay names in the spatial dataset (davao_barangays) are standardized using stri_trans_general() from the stringi package, which removes diacritics and harmonizes text encoding. This step is crucial to avoid mismatches when joining datasets.

pop_data <- read.csv("C:/Users/User/Desktop/Spatial Analysis/Popn.csv")

# Standardize both datasets
davao_barangays <- davao_barangays %>%
  mutate(NAME_3 = stri_trans_general(NAME_3, "Latin-ASCII"))

pop_data <- pop_data %>%
  mutate(pop_density = population / area_km2)

With names aligned, we perform a left_join() to merge the population data into the spatial object. The result, davao_map, now contains both geographic shapes and population density values for each barangay.

davao_map <- davao_barangays %>%
  left_join(pop_data, by = "NAME_3")

Using ggplot2, we render the map with geom_sf(), coloring each barangay based on its population density. The scale_fill_viridis_c() function applies a perceptually uniform color scale—here using the “magma” palette in reverse—to highlight density variations. Barangays with higher density appear in warmer tones, making patterns easy to interpret at a glance.

Finally, the map is saved as a high-resolution PDF, suitable for printing or inclusion in reports. This visualization transforms raw numbers into geographic insight, helping planners, researchers, and policymakers understand how people are distributed across Davao City’s barangays.

ggplot(davao_map) +
  geom_sf(aes(fill = pop_density), color = "black") +
  scale_fill_viridis_c(option = "magma",direction=-1, name = "Pop. Density (per km²)") +
  theme_minimal() +
  labs(title = "Population Density of Davao City Barangays")

# Save to PDF
ggsave("population.pdf", width = 8.5, height = 11, units = "in", dpi = 300)

1.5 Plotting Coordinates in a Map

This R script demonstrates a spatial mapping workflow using the sf, dplyr, and ggplot2 packages. It begins by loading a Level 3 administrative boundary shapefile for the Philippines, then filters it to isolate all barangays within Davao City.

A set of sample geographic coordinates—representing landmarks like USeP, the airport, and the Philippine Eagle Center—is defined and converted into spatial features. After ensuring both datasets share the same coordinate reference system (CRS), the script overlays the points onto the barangay map using ggplot2. The result is a clean, informative visualization that highlights the spatial distribution of key locations within Davao City.

# Load libraries
library(sf)
library(dplyr)
library(ggplot2)

# 1. Load GADM Level 3 shapefile (update file path if needed)
ph_l3 <- st_read("C:/Users/User/Desktop/Spatial Analysis/gadm41_PHL_3.json/gadm41_PHL_3.json")

## Reading layer `gadm41_PHL_3' from data source 
##   `C:\Users\User\Desktop\Spatial Analysis\gadm41_PHL_3.json\gadm41_PHL_3.json' 
##   using driver `GeoJSON'
## Simple feature collection with 41948 features and 16 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: 116.9283 ymin: 4.5869 xmax: 126.6053 ymax: 21.0699
## Geodetic CRS:  WGS 84

# 2. Filter for Davao City (all barangays)
davao_barangays <- ph_l3 %>% 
  filter(NAME_2 == "DavaoCity")   # check with unique(ph_l3$NAME_2) if unsure

# 3. Example coordinates (replace with your actual points)
points_df <- data.frame(
  lon = c(125.61, 125.65, 125.47),
  lat = c(7.08, 7.13, 7.18),
  name = c("USeP", "Airport", "Philippine Eagle")
)

# 4. Convert coordinates to sf
points_sf <- st_as_sf(points_df, coords = c("lon", "lat"), crs = 4326)

# 5. Make sure CRS matches
davao_barangays <- st_transform(davao_barangays, crs = 4326)

# 6. Plot barangay map of Davao City with overlaid points
ggplot() +
  geom_sf(data = davao_barangays, fill = "lightgray", color = "black") +
  geom_sf(data = points_sf, aes(color = name), size = 3) +
  coord_sf(xlim = st_bbox(davao_barangays)[c("xmin", "xmax")],
           ylim = st_bbox(davao_barangays)[c("ymin", "ymax")],
           expand = FALSE) +
  theme_minimal() +
  labs(title = "Plotting Coordinates in Davao City Map",
       x = "Longitude", y = "Latitude")

1.6 Mapping Disaster Preparedness

To visualize disaster preparedness across Philippine cities and municipalities, this workflow combines spatial boundaries with custom scoring data to produce a clear, categorized map.

It begins by loading the GADM Level 2 shapefile, which contains municipal boundaries, and a CSV file with preparedness scores. These scores—ranging from 10 to 100—reflect how well each locality is equipped to handle disasters, based on criteria you’ve defined. To ensure a smooth merge, both datasets are cleaned and standardized. Municipality names are trimmed and converted to lowercase, eliminating formatting inconsistencies that could disrupt the join. Once aligned, the spatial data and score data are merged using left_join(), enriching each geographic feature with its corresponding preparedness score.

Next, the scores are classified into three categories: - High (71–100): Strong preparedness - Moderate (41–70): Adequate but improvable - Low (10–40): Vulnerable and in need of support

These categories are then mapped using a custom color palette—dark green for high, yellow for moderate, and red for low—making it easy to identify priority areas at a glance. Municipalities without data are shaded in light gray to maintain visual balance.

The final map, rendered with ggplot2, is clean and informative. It includes a title, legend, and caption, and is saved as a high-resolution PDF for sharing or printing. This visualization transforms raw data into actionable insight, helping planners, responders, and policymakers focus their efforts where they’re needed most.

# Load necessary libraries
library(sf)
library(dplyr)
library(readr)

## Warning: package 'readr' was built under R version 4.4.3

library(ggplot2)

# Step 1: Load the GADM Level 2 shapefile (GeoJSON)
gadm_ph <- st_read("C:/Users/User/Desktop/Spatial Analysis/gadm41_PHL_2.json/gadm41_PHL_2.json")

## Reading layer `gadm41_PHL_2' from data source 
##   `C:\Users\User\Desktop\Spatial Analysis\gadm41_PHL_2.json\gadm41_PHL_2.json' 
##   using driver `GeoJSON'
## Simple feature collection with 1647 features and 13 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: 116.9283 ymin: 4.5869 xmax: 126.6053 ymax: 21.0699
## Geodetic CRS:  WGS 84

# Step 2: Load your Score data
Score <- read_csv("C:/Users/User/Desktop/Spatial Analysis/Score.csv")

## Rows: 1647 Columns: 3
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (2): Province, Muni_City
## dbl (1): Score
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

# Step 3: Clean and standardize name fields
gadm_ph <- gadm_ph %>%
  mutate(NAME_2_clean = tolower(trimws(NAME_2)))

Score <- Score %>%
  mutate(Muni_City_clean = tolower(trimws(Muni_City)))

# Step 4: Join the data
gadm_score <- gadm_ph %>%
  left_join(Score, by = c("NAME_2_clean" = "Muni_City_clean"))

## Warning in sf_column %in% names(g): Detected an unexpected many-to-many relationship between `x` and `y`.
## ℹ Row 7 of `x` matches multiple rows in `y`.
## ℹ Row 8 of `y` matches multiple rows in `x`.
## ℹ If a many-to-many relationship is expected, set `relationship =
##   "many-to-many"` to silence this warning.

# Step 5: Classify scores into categories
gadm_score <- gadm_score %>%
  mutate(score_category = factor(case_when(
    Score >= 10 & Score <= 40 ~ "Low",
    Score > 40 & Score <= 70 ~ "Moderate",
    Score > 70 & Score <= 100 ~ "High",
    TRUE ~ NA_character_
  ), levels = c("High", "Moderate", "Low")))

# Step 6: Blue color palette (dark to light blue)
score_colors <- c(
  "High" = "darkgreen",  # darkest blue
  "Moderate" = "yellow",
  "Low" = "red"
)

# Step 7: Plot the categorized scores
ggplot(gadm_score) +
  geom_sf(aes(fill = score_category), color = "black", size = 0.1) +  # Black borders
  scale_fill_manual(values = score_colors, na.value = "grey90", name = "DPI Level") +
  theme_minimal() +
  labs(
    title = "Disaster Preparedness Index for  Cities and Municipalities",
    subtitle = "Categorized into  Levels",
    caption = "Sources: GADM v4.1 and Custom Score Data"
  ) +
  theme(
    legend.position = "right",
    panel.grid.major = element_blank(),
    panel.grid.minor = element_blank()
  )

# Save to PDF
ggsave("disasterPreparedness.pdf", width = 8.5, height = 11, units = "in", dpi = 300)

2.0 Spatial Autocorrelation

Spatial autocorrelation is a foundational concept in spatial analysis that helps us understand how geographic phenomena are distributed across space. At its core, it asks a simple but powerful question: Are things that are close together also similar?

In traditional statistics, we often assume that observations are independent of one another. But in geography, this assumption rarely holds. Cities near each other may share similar climate, neighboring barangays might have comparable population densities, and adjacent flood zones could exhibit similar risk levels. Spatial autocorrelation captures this idea—the tendency for nearby locations to influence each other.

There are two main types: - Positive spatial autocorrelation: Similar values cluster together. For example, high-income neighborhoods tend to be near other high-income areas. - Negative spatial autocorrelation: Dissimilar values are neighbors. Think of a wealthy district bordering a low-income one. - Zero or random spatial autocorrelation: No discernible pattern; values are scattered without spatial logic.

We measure this using statistics like Global Moran’s I, which gives a single value summarizing the overall spatial pattern, and Local Moran’s I, which zooms in to identify specific clusters or outliers.

In the context of Davao City, spatial autocorrelation allows us to ask: Are barangays with high population density clustered together? Do vulnerable communities form spatial patterns? By answering these questions, we move beyond maps and into spatial reasoning—unlocking insights that can guide planning, policy, and resource allocation.

To explore spatial autocorrelation in Davao City’s population density, we move from mapping to spatial statistics—uncovering patterns that aren’t immediately visible on a map.

2.1 Data Preparation:

Before diving into analysis, we verify that davao_map is still an sf object. This ensures compatibility with spatial functions like neighbor detection and autocorrelation tests. The st_geometry() function confirms that geometries are intact and ready for spatial operations.

# Ensure it's an sf object
class(davao_map)

## [1] "sf"         "data.frame"

st_geometry(davao_map)

## Geometry set for 182 features 
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: 125.2257 ymin: 6.9622 xmax: 125.7175 ymax: 7.5688
## Geodetic CRS:  WGS 84
## First 5 geometries:

## MULTIPOLYGON (((125.5963 7.1906, 125.603 7.1899...

## MULTIPOLYGON (((125.6153 7.0672, 125.6142 7.066...

## MULTIPOLYGON (((125.4728 7.0642, 125.4886 7.049...

## MULTIPOLYGON (((125.6312 7.2614, 125.6588 7.282...

## MULTIPOLYGON (((125.6035 7.0985, 125.6091 7.082...

2.2 Build spatial neighbors (contiguity)

Spatial autocorrelation depends on how geographic units relate to one another. Using Queen’s contiguity, we define neighbors as barangays that share either a border or a corner. This method captures more connections than Rook’s contiguity, which only considers shared edges. - poly2nb() builds the neighbor list. - nb2listw() converts it into a spatial weights matrix, standardizing influence across neighbors. This matrix becomes the backbone for measuring spatial relationships.

nb <- poly2nb(davao_map, queen = TRUE)

# Convert to spatial weights list (row-standardized)
lw <- nb2listw(nb, style = "W", zero.policy = TRUE)

2.3 Run Global Moran’s I

Global Moran’s I provides a single statistic that summarizes whether barangays with similar population densities tend to cluster. • Positive I: High-density barangays are near other high-density ones (and low with low). • Negative I: High-density barangays are surrounded by low-density ones (and vice versa). • Near-zero I: No clear spatial pattern—values are randomly distributed. Running moran.test() gives you the statistic, expected value, variance, and p-value to assess significance

# Moran’s I test
moran_test <- moran.test(davao_map$pop_density, lw, zero.policy = TRUE)
print(moran_test)

## 
##  Moran I test under randomisation
## 
## data:  davao_map$pop_density  
## weights: lw    
## 
## Moran I statistic standard deviate = 0.42265, p-value = 0.3363
## alternative hypothesis: greater
## sample estimates:
## Moran I statistic       Expectation          Variance 
##       0.014974243      -0.005524862       0.002352337

2.4 Local Moran’s I

While Global Moran’s I gives an overall picture, Local Moran’s I zooms in—calculating autocorrelation for each barangay. - localmoran() returns a matrix of values: - Ii: Local Moran’s I - Z.Ii: Z-score (standardized) - P.Ii: p-value (significance) These are added back to the davao_map object, enabling spatial visualization of clusters and outliers.

local_moran <- localmoran(davao_map$pop_density, lw, zero.policy = TRUE)

# Add results back to your sf object
davao_map$Ii <- local_moran[,1]   # Local Moran’s I
davao_map$E.Ii <- local_moran[,2]
davao_map$Var.Ii <- local_moran[,3]
davao_map$Z.Ii <- local_moran[,4] # Z-score
davao_map$P.Ii <- local_moran[,5] # p-value

Using ggplot2, we map the Z-scores from Local Moran’s I: - Red: High positive autocorrelation (hotspots) - Blue: High negative autocorrelation (cold spots or outliers) - White: Near-zero (random or insignificant) This map reveals where population density clusters—guiding urban planning, resource allocation, or disaster preparedness.

ggplot(davao_map) +
  geom_sf(aes(fill = Z.Ii), color = "black") +
  scale_fill_gradient2(low="blue", mid="white", high="red", midpoint=0,
                       name="Local Moran's I Z-score") +
  theme_minimal() +
  labs(title = "Local Spatial Autocorrelation (LISA) - Population Density")