Introduction

This module introduces R packages for accessing, visualizing, and analyzing spatial data, with examples focused on Somalia. These skills matter for students in Applied Statistics and in Medical Statistics & Health Data Science, where they underpin disease mapping and spatial analysis. You will download, visualize, and work with several types of spatial data: administrative boundaries, climate and precipitation, elevation, OpenStreetMap features, World Bank socio-economic indicators, and species occurrences. The module closes with an overview of essential R packages for spatial data handling, manipulation, and analysis.

1. Administrative Boundaries

We’ll use the rnaturalearth package to download administrative boundaries for Somalia.

1.1 Install and load required packages

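First, we install (if needed) and load the required R packages. The medium-scale boundaries used below come from the rnaturalearthdata package; rnaturalearthhires is only needed for scale = "large".

# install.packages(c("rnaturalearth", "rnaturalearthdata", "sf", "ggplot2", "patchwork"))
# Only needed for scale = "large":
# install.packages("devtools")
# devtools::install_github("ropensci/rnaturalearthhires")
library(rnaturalearth)
library(sf)
library(ggplot2)
library(patchwork)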

1.2 Download Somalia’s boundary

Here we download the map of Somalia.

somalia_map <- ne_countries(type = "countries", country = "Somalia", scale = "medium", returnclass = "sf")

1.3 Visualize the map

Now we visualize the map of Somalia.

ggplot(somalia_map) +
  geom_sf() +
  ggtitle("Map of Somalia") +
  theme_minimal()


Note: ne_countries() returns only the national outline. For subnational boundaries (regions or districts), you will likely need a dedicated source such as GADM, which the geodata package can download with gadm().

2. Climatic Data

We’ll use the geodata package to download climate data for Somalia.

library(geodata)
library(terra) # raster functionality

2.1 Download minimum temperature data

Here we download the minimum monthly temperature for Somalia.

somalia_tmin <- worldclim_country(country = "Somalia", var = "tmin", path = tempdir())

2.2 Plot temperature data

Here we plot the mean of the 12 monthly minimum temperature layers (in °C).

plot(mean(somalia_tmin), plg = list(title = "Mean Min. Temp"), main = "Mean Minimum Temperature in Somalia")
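
To see what mean() does to a multi-layer raster such as somalia_tmin, the following minimal sketch builds a small synthetic 12-layer SpatRaster and averages it across layers. The extent and values are invented for illustration and are not WorldClim data; the object names toy and toy_mean are hypothetical.

```r
library(terra)

# Synthetic 2 x 2 raster with 12 layers standing in for monthly values.
# The extent roughly frames Somalia; the cell values are invented.
toy <- rast(nrows = 2, ncols = 2, nlyrs = 12,
            xmin = 41, xmax = 51, ymin = -2, ymax = 12)

# Layer i holds the constant value i (one matrix column per layer).
values(toy) <- matrix(rep(1:12, each = ncell(toy)), ncol = 12)

# mean() on a SpatRaster averages across layers, cell by cell,
# and returns a single-layer raster.
toy_mean <- mean(toy)

nlyr(toy)       # number of input layers
nlyr(toy_mean)  # number of layers after averaging
```

The same call applied to somalia_tmin collapses the monthly layers into one annual summary layer, which is what the plot above displays.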

3. Precipitation Data

Using the chirps package, we will download precipitation data for Mogadishu, Somalia.

library(chirps)

3.1 Define coordinates for Mogadishu

Here we provide coordinates for the city of Mogadishu.

mogadishu_coords <- data.frame(long = 45.3254, lat = 2.0469)

3.2 Download daily precipitation data

Here we download daily precipitation data for Mogadishu for two years (2021 and 2022).

mogadishu_precip <- get_chirps(mogadishu_coords, dates = c("2021-01-01", "2022-12-31"), server = "ClimateSERV")

3.3 Visualize precipitation

Here we plot the daily precipitation using a line graph.

ggplot(mogadishu_precip, aes(x = date, y = chirps)) +
  geom_line() +
  labs(y = "Precipitation (mm)", title = "Daily Precipitation in Mogadishu") +
  theme_minimal()
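
Daily series are noisy, so monthly totals are often easier to read. The sketch below shows one way to aggregate, assuming a data frame with the same date and chirps columns as mogadishu_precip. The data frame daily is a hypothetical stand-in with an invented constant rainfall of 1 mm per day, so no download is needed.

```r
# Hypothetical stand-in for mogadishu_precip: 1 mm of rain every day.
daily <- data.frame(
  date = seq(as.Date("2021-01-01"), as.Date("2021-03-31"), by = "day")
)
daily$chirps <- 1

# Label each day with its year-month, then sum rainfall within each month.
daily$month <- format(daily$date, "%Y-%m")
monthly <- aggregate(chirps ~ month, data = daily, FUN = sum)

monthly
```

With real data, replacing daily by mogadishu_precip should give one row per month that can be plotted with geom_col().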

4. Elevation Data

Let’s obtain elevation data for Somalia using the elevatr package.

library(elevatr)

4.1 Download elevation data

We use the previously downloaded Somalia map to download the elevation data.

somalia_elev <- get_elev_raster(locations = somalia_map, z = 7, clip = "locations")

4.2 Visualize elevation

Here we visualize the elevation using the terra package.

plot(rast(somalia_elev), plg = list(title = "Elevation (m)"), main = "Elevation in Somalia")

5. OpenStreetMap Data

We’ll use the osmdata package to retrieve OSM data for Mogadishu, Somalia.

library(osmdata)
library(leaflet)

5.1 Define bounding box for Mogadishu

Here we define a bounding box for Mogadishu.

mogadishu_bb <- getbb("Mogadishu, Somalia")
mogadishu_bb
##         min       max
## x 45.181918 45.501918
## y  1.874931  2.194931

5.2 Retrieve hospitals

Here we download the hospitals in Mogadishu area.

mogadishu_hospitals <- mogadishu_bb %>% 
  opq() %>%
  add_osm_feature(key = "amenity", value = "hospital") %>%
  osmdata_sf()

5.3 Retrieve main roads

Here we download main roads (motorways) in Mogadishu area.

mogadishu_motorways <- mogadishu_bb %>%
  opq() %>%
  add_osm_feature(key = "highway", value = "motorway") %>%
  osmdata_sf()
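
The leaflet package loaded above can display such features on an interactive map. osmdata_sf() returns an object whose osm_points, osm_lines, and osm_polygons elements hold the downloaded features as sf objects. The sketch below avoids a download by using two invented point locations; the names facilities and hospital_map, and the coordinates, are hypothetical.

```r
library(sf)
library(leaflet)

# Hypothetical facilities inside the Mogadishu bounding box; in practice
# these would come from mogadishu_hospitals$osm_points.
facilities <- st_as_sf(
  data.frame(name = c("Facility A", "Facility B"),
             lon  = c(45.32, 45.35),
             lat  = c(2.04, 2.06)),
  coords = c("lon", "lat"), crs = 4326
)

# Build the interactive map: base tiles plus one marker per facility.
hospital_map <- leaflet(facilities) %>%
  addTiles() %>%
  addCircleMarkers(popup = ~name, radius = 5)

# Print hospital_map at the console to render it in RStudio or a browser.
```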

6. World Bank Data

We’ll use the wbstats package to retrieve socio-economic data for Somalia.

library(wbstats)

6.1 Search for relevant indicators

We search for relevant indicators about health and education.

indicators <- wb_search(pattern = "health|education")
# View(indicators)  # Uncomment to view the retrieved indicators

6.2 Download data for Human Development Index

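Here we request the Human Development index series (indicator code MO.INDEX.HDEV.XQ) for Somalia for 2019. The country argument restricts the query to Somalia; without it, wb_data() requests the indicator for all countries. The call returned an empty tibble, which indicates that the API supplied no values for this indicator and year; wb_search() can help locate an indicator with available data.

somalia_hdi <- wb_data(indicator = "MO.INDEX.HDEV.XQ", country = "SOM", start_date = 2019, end_date = 2019)
print(head(somalia_hdi))
## # A tibble: 0 × 0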

7. Species Occurrence Data

Using the spocc package, we obtain occurrence data for a species.

library(spocc)
library(sf)

7.1 Download species occurrence data

Here we download occurrence records of the African wild dog (Lycaon pictus) in Somalia from GBIF, keeping only records that have coordinates.

somalia_wild_dogs <- occ(query = "Lycaon pictus", from = "gbif",
                         date = c("2000-01-01", "2023-12-31"),
                         gbifopts = list(country = "SO"),
                         has_coords = TRUE, limit = 1000)
d_dogs <- occ2df(somalia_wild_dogs)
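
To map or analyze these records with sf, the coordinate columns need to become point geometries. The sketch below assumes a data frame with the longitude and latitude columns that occ2df() returns. The data frame records and its values are invented for illustration, so the code runs without querying GBIF.

```r
library(sf)

# Hypothetical records with the coordinate columns returned by occ2df();
# the values are invented for illustration.
records <- data.frame(
  name      = rep("Lycaon pictus", 3),
  longitude = c(42.5, 43.1, 44.0),
  latitude  = c(0.5, 1.2, 2.3)
)

# Drop rows without coordinates, then build point geometries in WGS84.
has_xy <- !is.na(records$longitude) & !is.na(records$latitude)
records_sf <- st_as_sf(records[has_xy, ],
                       coords = c("longitude", "latitude"), crs = 4326)

records_sf
```

Replacing records with d_dogs should give an sf object that can be added to the Somalia map with geom_sf().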

8. Essential R Packages for Spatial Analysis

In addition to the packages used above for data access and visualization, several R packages are essential for spatial analysis. These packages provide tools for spatial statistics, spatial econometrics, and point pattern analysis. Here’s an overview of some key packages:

8.1 spdep

  • Definition: The spdep package provides tools for spatial dependence analysis, including spatial autocorrelation measures, spatial weights matrices, and spatial regression diagnostics.
  • Key Functions:
    • poly2nb(): Creates neighborhood lists from polygon data.
    • nb2listw(): Creates spatial weights lists from neighborhood lists.
    • moran.test(): Performs Moran’s I test for spatial autocorrelation.
    • lm.morantest(): Performs Moran’s I test for linear model residuals.
  • Use Case: Analyzing spatial autocorrelation in areal data, creating spatial weights matrices for spatial regression models.
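
The functions listed above chain together naturally: polygons, then neighbours, then weights, then a test. As a minimal sketch, the code below runs that chain on a synthetic 5 x 5 grid. The grid, the rate variable, and all object names are invented for illustration and do not represent Somali data.

```r
library(sf)
library(spdep)

# 5 x 5 grid of square polygons standing in for administrative areas.
bbox <- st_as_sfc(st_bbox(c(xmin = 0, ymin = 0, xmax = 5, ymax = 5)))
grid <- st_sf(geometry = st_make_grid(bbox, n = c(5, 5)))

# Invented "rate" that increases from west to east, so neighbours are alike.
grid$rate <- st_coordinates(st_centroid(st_geometry(grid)))[, "X"]

nb <- poly2nb(grid, queen = TRUE)  # neighbours share an edge or a corner
lw <- nb2listw(nb, style = "W")    # row-standardised spatial weights

# Moran's I test for spatial autocorrelation in the invented rate.
moran_result <- moran.test(grid$rate, lw)
moran_result
```

Because the invented rate varies smoothly across the grid, the estimated Moran's I is positive; real disease rates may show weaker or no autocorrelation.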

8.2 spatialreg

  • Definition: The spatialreg package provides functions for fitting spatial regression models, including spatial lag, spatial error, and spatial Durbin models.
  • Key Functions:
    • lagsarlm(): Fits spatial lag models.
    • errorsarlm(): Fits spatial error models.
    • impacts(): Computes direct, indirect, and total impacts for fitted spatial lag models.
  • Use Case: Modeling spatial dependence in regression analysis, accounting for spatial autocorrelation in the outcome variable or error term.
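
As a minimal sketch of fitting a spatial lag model, the code below simulates data on a synthetic 6 x 6 grid and passes it to lagsarlm(). The grid, the variables x and y, and the object names are hypothetical. The data are simulated independently across cells, so the estimated spatial parameter carries no substantive meaning here.

```r
library(sf)
library(spdep)
library(spatialreg)

set.seed(42)

# Synthetic 6 x 6 grid of polygons and row-standardised weights.
bbox <- st_as_sfc(st_bbox(c(xmin = 0, ymin = 0, xmax = 6, ymax = 6)))
grid <- st_sf(geometry = st_make_grid(bbox, n = c(6, 6)))
lw <- nb2listw(poly2nb(grid), style = "W")

# Invented covariate and outcome (no real spatial dependence built in).
grid$x <- rnorm(nrow(grid))
grid$y <- 1 + 2 * grid$x + rnorm(nrow(grid))

# Spatial lag model: y depends on x and on the average y of neighbouring cells.
lag_fit <- lagsarlm(y ~ x, data = st_drop_geometry(grid), listw = lw)
summary(lag_fit)
```

The summary reports the spatial autoregressive parameter (rho) alongside the usual regression coefficients; swapping lagsarlm() for errorsarlm() fits the spatial error counterpart with the same arguments.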

8.3 spatstat

  • Definition: The spatstat package is a comprehensive suite of tools for analyzing spatial point patterns, including point process models, intensity estimation, and spatial clustering.
  • Key Functions:
    • ppp(): Creates a point pattern object.
    • density(): Estimates the intensity of a point pattern by kernel smoothing (through the density.ppp() method).
    • Kest(): Calculates the K-function for point pattern analysis.
    • ppm(): Fits point process models.
  • Use Case: Analyzing the spatial distribution of disease cases, identifying clusters of events, and modeling point patterns.
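
The sketch below strings the first three functions together on an invented point pattern: 50 locations scattered uniformly in a unit square. The object names and the data are hypothetical; real case locations would need a window that matches the study area.

```r
library(spatstat)

set.seed(1)

# 50 invented case locations, uniformly scattered in the unit square.
cases <- ppp(x = runif(50), y = runif(50),
             window = owin(c(0, 1), c(0, 1)))

# Kernel intensity estimate; density() dispatches to density.ppp().
intensity <- density(cases)

# Ripley's K-function, used to assess clustering or regularity.
k_fun <- Kest(cases)

npoints(cases)
```

Calling plot(intensity) and plot(k_fun) displays the smoothed intensity surface and the estimated K-function against its theoretical value under complete spatial randomness.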

8.4 sf (already used)

  • Definition: The sf package provides a unified way to work with spatial vector data, including reading, writing, manipulating, and visualizing spatial data.
  • Key Functions:
    • st_read(): Reads spatial data from various formats.
    • st_write(): Writes spatial data to various formats.
    • st_transform(): Transforms spatial data to a different coordinate reference system.
    • st_join(): Joins two spatial layers based on their spatial relationship (for example, points within polygons).
  • Use Case: Handling spatial data in various formats, performing spatial operations, and preparing data for analysis.
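
A spatial join with st_join() can be sketched as follows, using two invented square "districts" and three invented point locations. All names and coordinates are hypothetical; the helper square() exists only for this illustration.

```r
library(sf)

# Helper for this example only: a unit square with lower-left corner (x0, 0).
square <- function(x0) {
  st_polygon(list(rbind(
    c(x0, 0), c(x0 + 1, 0), c(x0 + 1, 1), c(x0, 1), c(x0, 0)
  )))
}

# Two invented districts side by side.
districts <- st_sf(district = c("West", "East"),
                   geometry = st_sfc(square(0), square(1)))

# Three invented clinic locations.
clinics <- st_sf(clinic = c("A", "B", "C"),
                 geometry = st_sfc(st_point(c(0.2, 0.5)),
                                   st_point(c(0.7, 0.1)),
                                   st_point(c(1.5, 0.5))))

# Attach to each point the attributes of the polygon that contains it.
clinics_joined <- st_join(clinics, districts, join = st_within)
clinics_joined$district
```

Clinics A and B fall in the western square and clinic C in the eastern one, so the joined table carries the district name for each point.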

9. Data Handling, Manipulation, and Analysis

These packages provide functions for data handling, manipulation, and analysis:

9.1 Data Import and Export

  • sf: Use st_read() to import spatial data from various formats (e.g., shapefiles, GeoJSON) and st_write() to export spatial data.
  • raster and terra: Use raster() or rast() to import raster data and writeRaster() to export raster data.
  • read.csv() and write.csv(): Use these functions to import and export attribute data.

9.2 Spatial Data Manipulation

  • sf: Use functions like st_transform() for coordinate transformations, st_buffer() for creating buffers, st_intersection() for spatial intersections, and st_union() for spatial unions.
  • dplyr: Use functions like filter(), select(), mutate(), and group_by() for data manipulation.
  • terra: Use functions like crop(), resample(), and aggregate() for raster data manipulation.
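
As a short sketch of two of the sf operations above, the code below projects the Mogadishu coordinates from section 3.1 and draws a 5 km buffer around them. The choice of UTM zone 38N (EPSG:32638) is an assumption made for this example so that distances are in metres; another projected CRS may suit your analysis better.

```r
library(sf)

# Mogadishu coordinates from section 3.1, as a WGS84 point.
mogadishu <- st_sfc(st_point(c(45.3254, 2.0469)), crs = 4326)

# Project to UTM zone 38N (assumed choice) so that units are metres.
mogadishu_utm <- st_transform(mogadishu, 32638)

# 5 km buffer around the point.
buffer_5km <- st_buffer(mogadishu_utm, dist = 5000)

# Buffer area in square kilometres; close to pi * 5^2.
buffer_area_km2 <- as.numeric(st_area(buffer_5km)) / 1e6
buffer_area_km2
```

The area comes out slightly below that of a perfect circle because st_buffer() approximates the circle with a polygon.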

9.3 Spatial Analysis

  • spdep: Use functions like moran.test() and localmoran() to assess spatial autocorrelation.
  • spatialreg: Use functions like lagsarlm() and errorsarlm() to fit spatial regression models.
  • spatstat: Use functions like density(), Kest(), and ppm() to analyze point patterns.
  • raster and terra: Use functions for raster calculations, such as overlay() in raster or lapp() in terra, and distance().

9.4 Hands-on Data Analysis

The examples in this module demonstrate how to use these packages for hands-on data analysis. For instance, you can:

  • Calculate spatial autocorrelation of disease rates using spdep.
  • Fit spatial regression models to explore the relationship between disease and risk factors using spatialreg.
  • Analyze the spatial distribution of disease cases using spatstat.
  • Combine spatial data with survey data for thematic mapping.
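
The last item, combining spatial data with survey data, can be sketched as an attribute join followed by a choropleth map. Everything in the example is invented: the two square regions, the region codes, and the prevalence values are hypothetical placeholders for real boundaries and survey estimates.

```r
library(sf)
library(ggplot2)

# Helper for this example only: a unit square with lower-left corner (x0, 0).
square <- function(x0) {
  st_polygon(list(rbind(
    c(x0, 0), c(x0 + 1, 0), c(x0 + 1, 1), c(x0, 1), c(x0, 0)
  )))
}

# Invented regions and an invented survey table sharing a "region" key.
regions <- st_sf(region = c("R1", "R2"),
                 geometry = st_sfc(square(0), square(1)))
survey <- data.frame(region = c("R1", "R2"),
                     prevalence = c(0.12, 0.30))

# Attribute join on the shared key; the result is still an sf object.
regions_survey <- merge(regions, survey, by = "region")

# Thematic (choropleth) map of the invented prevalence values.
thematic_map <- ggplot(regions_survey) +
  geom_sf(aes(fill = prevalence)) +
  labs(fill = "Prevalence", title = "Illustrative thematic map") +
  theme_minimal()
```

Printing thematic_map draws the map. With real data, the key column must match exactly between the boundary file and the survey table, so it is worth checking for unmatched regions after the join.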

Conclusion

This module introduced several R packages for accessing and working with spatial data. You can apply these skills to disease mapping and spatial analysis, which is particularly valuable in settings such as Somalia, where data can be hard to obtain. It also surveyed essential R packages for spatial data handling, manipulation, and analysis.

Further Exploration

This module provides a foundation for spatial data analysis; students are strongly encouraged to explore these methods further and build on them.

References