Basic Choropleth Mapping with GISTools

Overview

GISTools is a set of utilities for creating maps and manipulating spatial data in R. It is very useful as a quick way to get websites with maps published in them, or to make reproducible documents containing maps.

For example, this web page was created from an R Markdown document. Markdown is a simple formatting syntax for authoring web pages. (If you use RStudio then open an R Markdown document and click the MD toolbar button for help on Markdown - when you click the Knit HTML button a web page will be generated that includes both content as well as the output of any embedded R code chunks within the document).

You will need to have installed the GISTools package and its dependencies (maptools, RColorBrewer, rgeos, MASS) before you can work through the example below. If do not already have this package, the installion is probably most easily done by checking the 'install dependencies' option when installing via the GUI, or by using

install.packages('GISTools',depend=T)

from the command line. The package requires a version of R no older than 2.15.0.

Getting started with GISTools

Initially, load the package and select the georgia data set:

library(GISTools)
data(georgia)

georgia is a SpatialPolygonsDataFrame containing various information about counties in the state of Georgia, from the 1990 US Census. One variable record is the median income per county (MedInc):

hist(georgia$MedInc, col = "red")

plot of chunk unnamed-chunk-2


With GISTools you can also create choropleth maps quickly:

choropleth(georgia, georgia$MedInc)

plot of chunk unnamed-chunk-3


That was a very quick but not very helpful map. It is useful for exploration of spatial pattern if you know more about the magnitude of the income values, the name of the area you are examining and so on, but for a wider audience you should add more detail to the map, such as a legend, a north arrow and some explanatory text. This is possible with a few more GISTools function calls:

# Compute a rate for mapping and put it in a variable - re-scale to avoid
# a profusion of zeros
incomes <- georgia$MedInc/1000
# Create a shading scheme
shades <- auto.shading(incomes, n = 6, cutter = rangeCuts, cols = brewer.pal(6, 
    "Greens"))
# Draw a map based on this scheme
choropleth(georgia, incomes, shades)
# Add a legend based on this scheme
choro.legend(-82, 34.87, shades, fmt = "%4.1f", title = "Median Income (1000$'s)")
# Add a title to the map
title("Median Income in Georgia (1990)")
# ... and a little more explanatory text
text(-85.3, 30.4, "Source: US Census")
# ... and a north-point arrow
north.arrow(-80.9, 30.4, 0.1)
# and draw a box around it
box(which = "outer")

plot of chunk unnamed-chunk-4


A Little More Explanation

The choropleth command above pretty much does as the name suggests. Given a SpatialPolygons or SpatialPolygonsDataFrame object and a numeric vector with as many elements as there are polygons, it draws a choropleth map. As a default it automatically provides a shading colour scheme and a set of intervals. However these schemes are only automatic, not necessarily optimal - and in some cases not even appropriate! The intention was just to provide a 'quick look' option for users with some prior knowledge of the variable being mapped. This is very much for private rather than public use.

However, if a shading parameter is also given, more control is offered. This is what happened in the second map example above. The shading parameter is an object of class shading - this contains information about the two key characteristics of choropleth map shading - the colours used and the numerical intervals into which the data are divided. These are passed to the choropleth function, to actually create the map, and also to the choro.legend function, to create the legend. Obviously, for the map to make sense, the same shading object should be used for the map and the legend.

One way of creating a shading object is through the auto.shading command. As the name suggests, this isn't a completely manual approach – in fact this is what the default approach in choropleth uses – but by overriding defaults, quite a few characteristics can be changed. In particular n controls the number of classes, cols specify the shading colours, and cutter is an R function to choose a set of interval cut-points given the data to be plotted. rangeCuts and quantileCuts are two ready-made such functions, and respectively provide equal interval and quantile-based categorisations.

Another way of creating a shading is by using the shading command. This simply takes a list of numerical cut values, and a corresponding list of colours, to create the shading object. This provides full manual control, if none of the automated methods work well. Note that there must be one more colour than there are break values. If the cut points list is c(x1,x2,...,xn) then this implies intervals of c(-Inf,x1),c(x1,x2), … , c(xn,Inf) - so that the number of classes is one more than the number of cut points provided.

As an example, here is a choropleth map based on round numbers of income:

# Create a different shading scheme - this time based on multiples of 10k
shades2 <- shading(c(30, 40, 50, 60, 70), cols = brewer.pal(6, "Blues"))
# Draw a map based on this scheme
choropleth(georgia, incomes, shades2)
# Add a legend based on this scheme
choro.legend(-82, 34.87, shades2, fmt = "%4.1f", title = "Median Income (1000$'s)")
# add the embellishments
title("Median Income in Georgia (1990)")
text(-85.3, 30.4, "Source: US Census")
north.arrow(-80.9, 30.4, 0.1)
box(which = "outer")

plot of chunk unnamed-chunk-5

a final thing to note is that the colour list can consist of elements of any kind of valid colour specification in R. Here, brewer.pal is used to generate lists of palettes from Cynthia Brewer's ColorBrewer map colouring system.


brewer.pal can be used to produce divergent colour schemes, where hue is used to indicate whether mapped values are above or below a datum, and intensity is used to show the amount above or below. A typical datum might be zero, where maps show whether values are positive or negative. The following example, based on the income data, uses the mean value as a datum.

# Function to make symmetric equal size shading intervals centred on the
# mean
symRangeCuts <- function(x, n, params = NA) {
    x.bar <- mean(x)
    cuts.list <- rangeCuts(c(-x + x.bar, x - x.bar), n) + x.bar
    return(cuts.list)
}
# Create a shading scheme using this function
shades3 <- auto.shading(incomes, n = 9, cutter = symRangeCuts, cols = brewer.pal(9, 
    "PuOr"))
# Proceed as before - this time the box is thicker...
choropleth(georgia, incomes, shades3)
choro.legend(-82, 34.87, shades3, fmt = "%3.0f", title = "Median Income (1000$'s)", 
    bty = "n")
title("Median Income in Georgia (1990)", cex.main = 2)
text(-85.3, 30.4, "Source: US Census")
north.arrow(-80.9, 30.4, 0.1)
box(which = "outer", lwd = 8)