Choropleth Mapping with R

IS415 Geospatial Analytics and Application

Instructor: Dr. Kam Tin Seong.

Assoc. Professor of Information Systems (Practice)

1.Getting Started

packages = c('sf','tidyverse','tmap')
for (p in packages){
  if(!require(p, character.only = T)){
    install.packages(p)
  }
  library(p, character.only = T)
}

2.Importing Data into R

st_read() function is from the sf package

read_csv() function is from the readr package

mpsz <- st_read("data/geospatial", layer= "MP14_SUBZONE_WEB_PL")
## Reading layer `MP14_SUBZONE_WEB_PL' from data source `D:\SMU Year 3 Sem 1\IS415 Geospatial Analytics and Application\Week 3\Hands-on_Ex03\data\geospatial' using driver `ESRI Shapefile'
## Simple feature collection with 323 features and 15 fields
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: 2667.538 ymin: 15748.72 xmax: 56396.44 ymax: 50256.33
## projected CRS:  SVY21
popagsex <- read_csv("data/aspatial/respopagsex2000to2018.csv")
## Parsed with column specification:
## cols(
##   PA = col_character(),
##   SZ = col_character(),
##   AG = col_character(),
##   Sex = col_character(),
##   Pop = col_double(),
##   Time = col_double()
## )

3.Data Wrangling

Before creating the thematic map, we need to fo the following:

  • extract 2018 records
  • extract males records
  • derive 3 new variables, namely: Young, Economically Active, Aged

We will use these functions:

  • spread() from the tidyr package
  • mutate(), filter(), select() from the dplyr package
  • mutate_at() in this code chunk modifies the exisitng PA and SZ column and apply fun(toupper) to it, changing all the observations to UPPER CASE
popagsex2018_male <- popagsex %>%
  filter(Sex == 'Males') %>%
  filter(Time == 2018) %>%
  spread(AG, Pop) %>%
  mutate(`YOUNG` = `0_to_4`+`5_to_9`+`10_to_14`+
`15_to_19`+`20_to_24`) %>%
  mutate(`ECONOMICALLY ACTIVE` = rowSums(.[9:13])+rowSums(.[15:17]))%>%
  mutate(`AGED`= rowSums(.[18:22])) %>%
  mutate(`TOTAL` = rowSums(.[5:22])) %>%
  mutate(`DEPENDENCY` = (`YOUNG` + `AGED`)/ `ECONOMICALLY ACTIVE`)%>%
  mutate_at(.vars = vars(PA, SZ), .funs = funs(toupper))%>%
  select(`PA`, `SZ`, `YOUNG`, `ECONOMICALLY ACTIVE`, `AGED`, `TOTAL`, `DEPENDENCY` ) %>%
  filter(`ECONOMICALLY ACTIVE`> 0)

mpsz_agemale2018 <- left_join(mpsz, popagsex2018_male, by= c("SUBZONE_N"="SZ"))

4.Choropleth Mapping using tmap

two approaches to prepare thematic map using tmap:

  • using qtm()
  • using tmap elements for highly customisable thematic map
qtm(mpsz_agemale2018, fill ='DEPENDENCY')

The basic building block of tmap are:

  • tm_shape()
  • tm_fill()
  • tm_polygons()
tm_shape(mpsz_agemale2018)+
  tm_polygons()

tm_shape(mpsz_agemale2018)+
  tm_polygons("DEPENDENCY")

for tm_polygons():

  • by default, missing value will be shaded in grey
  • default colour scheme used is “YIOrRd” of ColorBrewer
  • default interval binning used is called “pretty”
  • tm_polygons() is a wrapper of tm_fill() and tm_borders(). tm_fill() gives the default colour scheme and tm_border() adds borders of shapefile
tm_shape(mpsz_agemale2018)+
  tm_fill("DEPENDENCY") +
  tm_borders(lwd = 0.1,  alpha = 1)

for tm_borders(), arguments are:

  • alpha -> transparency of the borders from 0, totally transparent to 1, not transparent/opaque
  • col -> border colour
  • lwd -> border line width, default is 1
  • lty -> border line type, default is ‘solid’

5.Data Classification methods of tmap

Classification -> take a large number of observations and group them into data rangers or classes

tmap has 10 data classification methods:

  • fixed
  • sd
  • equal
  • pretty (default)
  • quantile
  • kmeans
  • hclust
  • bclust
  • fisher
  • jenks or natural breaks
tm_shape(mpsz_agemale2018)+
  tm_fill("DEPENDENCY", n=8, style = "jenks") +
  tm_borders(alpha = 0.5)

tm_shape(mpsz_agemale2018)+
  tm_fill("DEPENDENCY", n=5, style = "equal") +
  tm_borders(alpha = 0.5)

To implement custom breaks

  • breakpoints can be explicitly set through breaks argument in the tm_fill()
  • tmap breaks include min and max so if we want n categories, n+1 elements must be specified in increasing order
  • always good to observe the descriptive statistics first before setting break points
summary(mpsz_agemale2018$DEPENDENCY)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##  0.0000  0.6480  0.6847  0.6708  0.7393  0.9559      90
tm_shape(mpsz_agemale2018)+
  tm_fill("DEPENDENCY", breaks = c(0,0.6,0.7,0.8,0.9,1.0))+
  tm_borders(alpha = 0.5)

6.Colour Scheme

tm_shape(mpsz_agemale2018)+
  tm_fill("DEPENDENCY",
          n = 6,
          style = "quantile",
          palette = "Blues") +
  tm_borders(alpha = 0.5)

To reverse the colour shading, add a “-” prefix

tm_shape(mpsz_agemale2018)+
  tm_fill("DEPENDENCY",
          style = "quantile",
          palette = "-Greens") +
  tm_borders(alpha = 0.5)

7.Map Layouts

Map layout elements:

  • objects to be mapped
  • title : tm_layout(main.title)
  • scale bar : tm_scale_bar()
  • compass : tm_compass()
  • grid : tm_grid()
  • margins and aspect ratios
tm_shape(mpsz_agemale2018)+
  tm_fill("DEPENDENCY", 
          style = "jenks", 
          palette = "Blues", 
          legend.hist = TRUE, 
          legend.is.portrait = TRUE,
          legend.hist.z = 0.1) +
  tm_layout(main.title = "Distribution of Dependency Ratio by planning subzone \n(Jenks classification)",
            main.title.position = "center",
            main.title.size = 1,
            legend.height = 0.45, 
            legend.width = 0.35,
            legend.outside = FALSE,
            legend.position = c("right", "bottom"),
            frame = FALSE) +
  tm_borders(alpha = 0.5)

tm_shape(mpsz_agemale2018)+
  tm_fill("DEPENDENCY", 
          style = "quantile", 
          palette = "-Greens") +
  tm_borders(alpha = 0.5) +
  tmap_style("classic")
## tmap style set to "classic"
## other available styles are: "white", "gray", "natural", "cobalt", "col_blind", "albatross", "beaver", "bw", "watercolor"

tm_shape(mpsz_agemale2018)+
  tm_fill("DEPENDENCY", 
          style = "quantile", 
          palette = "Blues",
          title = "No. of persons") +
  tm_layout(main.title = "Distribution of Dependency Ratio \nby planning subzone",
            main.title.position = "center",
            main.title.size = 1.2,
            legend.height = 0.45, 
            legend.width = 0.35,
            frame = TRUE) +
  tm_borders(alpha = 0.5) +
  tm_compass(type="8star", size = 2) +
  tm_scale_bar(width = 0.15) +
  tm_grid(lwd = 0.1, alpha = 0.2) +
  tm_credits("Source: Planning Sub-zone boundary from Urban Redevelopment Authorithy (URA)\n and Population data from Department of Statistics DOS", 
             position = c("left", "bottom"))

8.Small Multiple Choropleth Maps

Also called facet maps, stacked horizontally or vertically sometimes. Small multiple maps can be plotted in 3 ways:

  1. assigning multiple values to at least one of the aesthetic arguments
  2. defining a group-by variable in tm_facets()
  3. creating multiple stand-alone maps with tmap_arrange()

assigning multiple values to at least one of the aesthetic arguments

tm_shape(mpsz_agemale2018)+
  tm_fill(c("YOUNG","AGED"), style = 'equal', palette = 'Blues')+
  tm_layout(legend.position = c("right", "bottom"))+
  tm_borders(alpha=0.5)+
  tmap_style('white')
## tmap style set to "white"
## other available styles are: "gray", "natural", "cobalt", "col_blind", "albatross", "beaver", "bw", "classic", "watercolor"

tm_shape(mpsz_agemale2018)+
  tm_polygons(c("DEPENDENCY", "AGED"), style = c("equal", "quantile"), palette = list("Blues", "Greens"))+
  tm_layout(legend.position = c("right", "bottom"))

defining a group-by variable in tm_facets()

tm_shape(mpsz_agemale2018)+
  tm_fill("DEPENDENCY", style='quantile', palette = "Blues", thres.poly = 0)+
  tm_facets(by="REGION_N", free.coords = TRUE, drop.shapes = TRUE)+
  tm_layout(legend.show= FALSE, title.position = c('center', 'center'), title.size = 20)+
  tm_borders(alpha = 0.5)

creating multiple stand-alone maps with tmap_arrange()

youngmap <- tm_shape(mpsz_agemale2018)+
  tm_polygons("YOUNG", style='quantile', palette = "Blues")

agedmap <- tm_shape(mpsz_agemale2018) +
  tm_polygons("AGED", style = "quantile", palette = "Blues")

tmap_arrange(youngmap, agedmap, asp = 1, ncol = 2)

#### mapping spatial object using selection criterion

tm_shape(mpsz_agemale2018)+
  tm_fill("DEPENDENCY", style='quantile', palette = "Blues", legend.hist = TRUE, legend.is.portrait = TRUE, legend.hist.z = 0.1)+
  tm_layout(legend.outside = TRUE, legend.height = 0.45, legend.width = 5.0, legend.position = c("right", "bottom"), frame = FALSE)+
  tm_borders(alpha = 0.5)

Reference

Rpubs link