1. Introduction

This is an R Markdown notebook which illustrates how to make thematic maps representing Sustainable Development Goals (SDG) Indicators.

Our interest here is SDG 6, “Ensure access to water and sanitation for all”. In particular, we want to map Indicator 6.2.1, which is the “proportion of population using (a) safely managed sanitation services and (b) a hand-washing facility with soap and water” in the UN SDG framework.

As you will see, we are going to make a map without having to download any data beforehand. The necessary data downloads will be done inside this R notebook.

2. Setup

First of all, it is convenient to clear the R environment:

rm(list=ls())

Then, we need to install several libraries. Make sure you install them from the R console.

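# DO NOT RUN THIS CHUNK FROM HERE
# DO IT FROM THE R CONSOLE
#install.packages("owidR")
#install.packages("dplyr")
#install.packages("ggplot2")
#install.packages("rnaturalearth")
#install.packages("tidyr")
#install.packages("sf")
#install.packages("RColorBrewer")
#install.packages("leaflet")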
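
To avoid reinstalling packages that are already present, one option is to check first which ones are missing. The following is a minimal sketch, not part of the original notebook; `missing_pkgs` is a hypothetical helper name introduced here for illustration.

```r
# Hypothetical helper (an assumption, not part of the notebook):
# return the packages from `pkgs` that are not installed.
missing_pkgs <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Packages loaded later in this notebook
needed <- c("owidR", "dplyr", "ggplot2", "rnaturalearth",
            "tidyr", "sf", "RColorBrewer", "leaflet")

to_install <- missing_pkgs(needed)
print(to_install)

# From the R console, one could then run:
# install.packages(to_install)
```

The `install.packages()` call is left commented out, following the notebook's advice to install from the console.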

Now, we load the libraries:

# Load the required libraries
library(owidR)
library(dplyr)
library(ggplot2)
library(rnaturalearth)
library(tidyr)
library(sf)
library(RColorBrewer)
library(leaflet)

3. The OWID dataset

Recall that Our World in Data (OWID) provides access to official data across all available SDG indicators.

We will use the owidR functions to get global OWID data on access to safely managed sanitation. First, we search the OWID charts for the keyword “sanitation”:

owid_search("sanitation")
##       chart_id                                                                                       
##  [1,] "prevalence-of-stunting-vs-improved-sanitation-facilities"                                     
##  [2,] "mortality-rate-attributable-to-wash"                                                          
##  [3,] "death-rate-from-unsafe-sanitation"                                                            
##  [4,] "deaths-due-to-unsafe-sanitation"                                                              
##  [5,] "incidence-of-diarrheal-episodes-vs-access-to-improved-sanitation"                             
##  [6,] "sdg-target-for-access-to-sanitation"                                                          
##  [7,] "rural-without-improved-sanitation"                                                            
##  [8,] "number-without-access-to-improved-sanitation"                                                 
##  [9,] "safe-sanitation-without"                                                                      
## [10,] "death-rate-from-unsafe-sanitation-gbd"                                                        
## [11,] "sanitation-facilities-coverage"                                                               
## [12,] "sanitation-facilities-coverage-in-rural-areas"                                                
## [13,] "sanitation-facilities-coverage-in-urban-areas"                                                
## [14,] "share-deaths-unsafe-sanitation"                                                               
## [15,] "share-using-at-least-basic-sanitation"                                                        
## [16,] "improved-sanitation-facilities-vs-gdp-per-capita"                                             
## [17,] "share-without-improved-sanitation"                                                            
## [18,] "share-of-population-with-improved-sanitation-faciltities"                                     
## [19,] "share-using-safely-managed-sanitation"                                                        
## [20,] "share-of-the-population-with-access-to-sanitation-facilities"                                 
## [21,] "share-of-rural-population-with-improved-sanitation-faciltities"                               
## [22,] "share-of-urban-population-with-improved-sanitation-facilities"                                
## [23,] "urban-vs-rural-population-using-at-least-basic-sanitation"                                    
## [24,] "urban-vs-rural-population-safely-managed-sanitation"                                          
## [25,] "total-oda-for-water-supply-and-sanitation-by-recipient"                                       
## [26,] "number-using-at-least-basic-sanitation"                                                       
## [27,] "death-rates-from-unsafe-sanitation-vs-gdp-per-capita"                                         
## [28,] "share-of-schools-with-access-to-single-sex-basic-sanitation"                                  
## [29,] "share-of-countries-with-procedures-for-community-participation-in-water-sanitation-management"
##       title                                                                                              
##  [1,] "Prevalence of stunting vs. improved sanitation facilities"                                        
##  [2,] "Death rate attributable to unsafe water, sanitation, and hygiene"                                 
##  [3,] "Death rate from unsafe sanitation"                                                                
##  [4,] "Deaths attributed to unsafe sanitation"                                                           
##  [5,] "Diarrheal disease episodes vs. safely managed sanitation"                                         
##  [6,] "Has country already reached SDG target for usage of improved sanitation facilities?"              
##  [7,] "People in rural areas not using improved sanitation facilities"                                   
##  [8,] "People not using improved sanitation facilities"                                                  
##  [9,] "People not using to safely managed sanitation"                                                    
## [10,] "Rate of deaths attributed to unsafe sanitation"                                                   
## [11,] "Sanitation facilities usage"                                                                      
## [12,] "Sanitation facilities usage in rural areas"                                                       
## [13,] "Sanitation facilities usage in urban areas"                                                       
## [14,] "Share of deaths attributed to unsafe sanitation"                                                  
## [15,] "Share of population using at least basic sanitation facilities"                                   
## [16,] "Share of population with improved sanitation vs. GDP per capita"                                  
## [17,] "Share of the population not using improved sanitation"                                            
## [18,] "Share of the population using at least basic sanitation services"                                 
## [19,] "Share of the population using safely managed sanitation facilities"                               
## [20,] "Share of the population using sanitation facilities"                                              
## [21,] "Share of the rural population using at least basic sanitation services"                           
## [22,] "Share of urban population using at least basic sanitation services"                               
## [23,] "Share of urban vs. rural population using at least basic sanitation"                              
## [24,] "Share of urban vs. rural population using safely managed sanitation facilities"                   
## [25,] "Total official financial flows for water supply and sanitation, by recipient"                     
## [26,] "Usage of at least basic sanitation facilities"                                                    
## [27,] "Rate of deaths attributed to unsafe sanitation vs. GDP per capita"                                
## [28,] "Share of schools with access to single-sex basic sanitation"                                      
## [29,] "Share of countries with procedures for community participation in water and sanitation management"

Now we download the dataset whose chart_id is share-using-safely-managed-sanitation:

safe_sanit <- owid("share-using-safely-managed-sanitation", rename = NULL, tidy.date = TRUE)

Let’s check what we got:

safe_sanit

4. Data processing

View the table and confirm that it contains Indicator 6.2.1 data for different years. Can you tell which time period these data cover?
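
One way to answer the question about the time period is to look at the range of the year column. The sketch below uses a toy data frame with made-up values as a stand-in for safe_sanit; the same two calls could be applied to safe_sanit$year, assuming that column holds plain years.

```r
# Toy stand-in for safe_sanit (values are made up for illustration)
toy <- data.frame(
  entity = c("A", "A", "B", "B"),
  year   = c(2000, 2022, 2005, 2020),
  value  = c(10, 20, 30, 40)
)

# Earliest and latest year present in the table
range(toy$year)

# Number of distinct years
length(unique(toy$year))
```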

It is very important to know the column names as the software “sees” them, since they may differ from what we see. Let’s find out:

names(safe_sanit)
## [1] "entity"                                                                                                                 
## [2] "code"                                                                                                                   
## [3] "year"                                                                                                                   
## [4] "6.2.1 - Proportion of population using safely managed sanitation services, by urban/rural (%) - SH_SAN_SAFE - All areas"

Now, rename any column whose name contains spaces or other “noisy” characters:

safe_sanit %>%  dplyr::rename('ind621' = '6.2.1 - Proportion of population using safely managed sanitation services, by urban/rural (%) - SH_SAN_SAFE - All areas') -> new_sanit
head(new_sanit)
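
When a column name is very long, typing it in full is error-prone. As an alternative sketch, dplyr::rename() also accepts a column position. The toy data frame below is an assumption made up for illustration; in the notebook the long name sits in the fourth column, as the names() output shows.

```r
library(dplyr)

# Toy data frame with a long, noisy name in column 4 (made-up values)
toy <- data.frame(entity = "A", code = "AAA", year = 2022, x = 12.3)
names(toy)[4] <- "6.2.1 - A long, noisy column name (%)"

# Rename by position instead of retyping the full name
renamed <- toy %>% rename(ind621 = 4)
names(renamed)
```

Renaming by position is shorter but breaks silently if the column order changes, so renaming by full name, as done above, is the safer choice for a published notebook.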

Let’s assume that we are interested in only a few countries. We filter the selected countries:

(nsa_sanit = dplyr::filter(new_sanit, entity %in% c("Colombia", "Venezuela", "Ecuador",
                                                     "Brazil", "Peru", "Bolivia")))

5. Exploratory analysis

It is advisable to visualize temporal changes in the selected indicator:

nsa_sanit %>%
  filter(entity %in% c("Ecuador", "Colombia", "Peru", "Brazil")) %>%
  ggplot(aes(x = year, y = ind621, group = entity, colour = entity)) +
  geom_line()

6. Thematic maps

Before making a map, we need to complete several tasks:

  • Obtain polygon data outlining country boundaries;
  • Convert attribute data from long to wide format;
  • Join attribute data to polygon data.

Getting global polygons:

countries <- ne_countries(returnclass = "sf") %>% select(name, pop_est, geometry)

Checking what we got:

head(countries)

Filtering selected countries:

countries %>%
  filter(name %in% c("Ecuador", "Colombia", "Peru", "Brazil")) -> nsa_countries

Checking what we got:

nsa_countries
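
Because the join performed later matches rows on country names, it can help to check that every requested name actually exists in the polygon data. The following base R sketch uses a made-up vector as a stand-in for countries$name.

```r
# Names we want to map
wanted <- c("Ecuador", "Colombia", "Peru", "Brazil")

# Stand-in for countries$name (an assumption for illustration)
available <- c("Brazil", "Colombia", "Ecuador", "Peru", "Bolivia")

# Names requested but not found; an empty result means all names match
setdiff(wanted, available)

# A misspelled name shows up immediately
setdiff(c("Peru", "Venezuels"), available)
```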

Checking type of object:

class(nsa_countries)
## [1] "sf"         "data.frame"

Converting the attribute data from long to wide format:

wide = nsa_sanit %>% 
  tidyr::spread(year, ind621)
wide
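
The tidyr documentation marks spread() as superseded by pivot_wider(). A minimal sketch of the equivalent call is shown below on a toy long table with made-up values; the column names year and ind621 follow the notebook.

```r
library(tidyr)

# Toy long table: one row per entity and year (made-up values)
toy_long <- data.frame(
  entity = c("A", "A", "B", "B"),
  year   = c(2020, 2022, 2020, 2022),
  ind621 = c(10, 12, 30, 33)
)

# One row per entity, one column per year
toy_wide <- pivot_wider(toy_long, names_from = year, values_from = ind621)
toy_wide
```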

Checking the output:

class(wide)
## [1] "data.frame"

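Now, before joining the attribute data and the polygon data, let’s understand what a “left_join” is. A left join keeps every row of the left table (here, the polygons) and adds the matching columns from the right table (here, the wide attribute table). Left rows without a match get NA values, and right rows without a match are dropped.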
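
The behaviour of a left join can be seen on two tiny tables. This is an illustrative sketch with made-up names and values, not the notebook's data.

```r
library(dplyr)

# Left table: stand-in for the polygon layer
left_tbl <- data.frame(name = c("A", "B", "C"))

# Right table: stand-in for the attribute table (made-up values)
right_tbl <- data.frame(entity = c("A", "B", "D"),
                        Y2022  = c(10, 20, 40))

# Keep all rows of left_tbl; match on name == entity
joined <- left_join(left_tbl, right_tbl, by = c("name" = "entity"))
joined
```

Row "C" is kept with NA in Y2022 because it has no match, while "D" is dropped because it appears only in the right table.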

Let’s join the two objects:

nsa_countries_sanit = left_join(nsa_countries, wide, by=c("name"= "entity"))

Checking the output:

nsa_countries_sanit

Finally, it is mapping time. Let’s try a choropleth map.

Renaming the input attribute (a column name that starts with a digit, such as 2022, is not a syntactic name in R, so we give it a valid one):

nsa_countries_sanit %>%  dplyr::rename(
         'Y2022' = '2022') -> joined_sanit

Creating a leaflet map:

bins <- c(0, 10, 15, 20, 25, 30, 35, 40, 50, 60)
pal <- colorBin("YlGn", domain = joined_sanit$Y2022, bins = bins)

mapa <- leaflet(data = joined_sanit) %>%
  addTiles() %>%
  addPolygons(label = ~Y2022,
              popup = ~name,
              fillColor = ~pal(Y2022),
              color = "#444444",
              weight = 1,
              smoothFactor = 0.5,
              opacity = 1.0,
              fillOpacity = 0.5,
              highlightOptions = highlightOptions(color = "white", weight = 2, bringToFront = TRUE)
              ) %>%
  addProviderTiles(providers$OpenStreetMap) %>%
  addLegend("bottomright", pal = pal, values = ~Y2022,
    title = "SDG 6.2.1 - Safe sanitation [%] (2022)",
    opacity = 1
  )
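
To see how the colour palette assigns classes, the function returned by colorBin() can be called directly on a few values. This sketch reuses the bins from the chunk above; the fixed domain c(0, 60) and the test values are assumptions for illustration.

```r
library(leaflet)

# Same class breaks as in the map
bins <- c(0, 10, 15, 20, 25, 30, 35, 40, 50, 60)

# Fixed domain used here only for illustration
pal <- colorBin("YlGn", domain = c(0, 60), bins = bins)

# 5 and 9 fall in the same class; 12 and 55 fall in different ones
pal(c(5, 9, 12, 55))
```

Values that share a class receive the same hexadecimal colour, which is what produces the stepped look of a choropleth map.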

Plotting the map:

mapa

7. Saving the joined data

Saving the data in GeoPackage format to the data folder of the working directory:

joined_sanit %>%
  st_write("data/nsa_sanitation.gpkg",
           layer = "safe_sanitation",
           append = FALSE)
## Deleting layer `safe_sanitation' using driver `GPKG'
## Writing layer `safe_sanitation' to data source 
##   `data/nsa_sanitation.gpkg' using driver `GPKG'
## Writing 4 features with 26 fields and geometry type Multi Polygon.
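
Writing to data/nsa_sanitation.gpkg presupposes that the data folder already exists. A minimal sketch for creating a folder when it is missing is shown below; it uses a temporary directory as a stand-in so that it can be run anywhere, whereas in the notebook the path would simply be "data".

```r
# Stand-in for the notebook's "data" folder (assumption for illustration)
out_dir <- file.path(tempdir(), "data")

# Create the folder only if it does not exist yet
if (!dir.exists(out_dir)) {
  dir.create(out_dir, recursive = TRUE)
}

dir.exists(out_dir)
```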

Reading the sanitation GeoPackage:

(tmp =  st_read("data/nsa_sanitation.gpkg"))
## Reading layer `safe_sanitation' from data source 
##   `/Users/ials/Documents/unal/G4D/official/labs/Lab4/data/nsa_sanitation.gpkg' 
##   using driver `GPKG'
## Simple feature collection with 4 features and 26 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: -81.41094 ymin: -33.76838 xmax: -34.72999 ymax: 12.4373
## Geodetic CRS:  WGS 84
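
A GeoPackage can hold several layers, so it may be useful to list them before reading. The sketch below writes a small made-up point layer to a temporary file and then inspects it with st_layers(); the file name, layer name and coordinates are assumptions for illustration.

```r
library(sf)

# Two made-up points in WGS 84 (EPSG:4326)
pts <- st_as_sf(
  data.frame(id = 1:2, x = c(-74, -78), y = c(4, -1)),
  coords = c("x", "y"), crs = 4326
)

# Write to a temporary GeoPackage
gpkg <- tempfile(fileext = ".gpkg")
st_write(pts, gpkg, layer = "demo_layer", quiet = TRUE)

# List the layers stored in the file, then read one by name
st_layers(gpkg)$name
back <- st_read(gpkg, layer = "demo_layer", quiet = TRUE)
nrow(back)
```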

8. Homework (optional)

Write and publish a new notebook to map a different SDG indicator. You can produce either a global map or a regional map.

9. Additional resources

You may find the following links useful:

10. Citation

If you reuse this code please cite it as follows: Lizarazo, I. 2024. Making SDG thematic maps in R. Available at: https://rpubs.com/ials2un/sdg_choropleth_maps

11. The environment

sessionInfo()
## R version 4.3.2 (2023-10-31)
## Platform: x86_64-apple-darwin20 (64-bit)
## Running under: macOS Sonoma 14.2.1
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRblas.0.dylib 
## LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib;  LAPACK version 3.11.0
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## time zone: America/Bogota
## tzcode source: internal
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] leaflet_2.2.1       RColorBrewer_1.1-3  sf_1.0-15          
## [4] tidyr_1.3.0         rnaturalearth_1.0.1 ggplot2_3.4.4      
## [7] dplyr_1.1.3         owidR_1.4.2        
## 
## loaded via a namespace (and not attached):
##  [1] sass_0.4.8              utf8_1.2.4              generics_0.1.3         
##  [4] xml2_1.3.4              class_7.3-22            KernSmooth_2.23-22     
##  [7] stringi_1.8.3           digest_0.6.34           magrittr_2.0.3         
## [10] evaluate_0.23           grid_4.3.2              fastmap_1.1.1          
## [13] jsonlite_1.8.8          e1071_1.7-14            DBI_1.2.1              
## [16] rvest_1.0.3             httr_1.4.7              selectr_0.4-2          
## [19] purrr_1.0.2             fansi_1.0.6             crosstalk_1.2.1        
## [22] scales_1.3.0            codetools_0.2-19        jquerylib_0.1.4        
## [25] cli_3.6.2               rlang_1.1.3             units_0.8-5            
## [28] ellipsis_0.3.2          munsell_0.5.0           withr_3.0.0            
## [31] cachem_1.0.8            yaml_2.3.8              tools_4.3.2            
## [34] colorspace_2.1-0        curl_5.0.2              vctrs_0.6.5            
## [37] R6_2.5.1                proxy_0.4-27            lifecycle_1.0.4        
## [40] classInt_0.4-10         stringr_1.5.1           leaflet.providers_2.0.0
## [43] htmlwidgets_1.6.4       pkgconfig_2.0.3         terra_1.7-65           
## [46] pillar_1.9.0            bslib_0.6.1             gtable_0.3.4           
## [49] data.table_1.14.8       glue_1.7.0              Rcpp_1.0.12            
## [52] highr_0.10              xfun_0.41               tibble_3.2.1           
## [55] tidyselect_1.2.0        rstudioapi_0.15.0       knitr_1.45             
## [58] farver_2.1.1            htmltools_0.5.7         labeling_0.4.3         
## [61] rmarkdown_2.25          compiler_4.3.2