Introduction

Howdy y’all! This is a bare-bones work-through, but I hope it’s helpful for everyone. The “intro to R” section is just going to be a live code-through, so refer to the video for that portion.

I’m splitting this tutorial into four sections:

  • Reading in data with sf

  • Making a static map with tmap

  • Making an interactive map with tmap

  • Resources

You’ll get your feet wet with a few basic maps and I will provide you with resources to continue learning. I am assuming that everyone has a basic knowledge of GIS, including what vector data is, what raster data is, and what a projection is. If you need to get up to speed, check out the wonderful (free and online) book Geocomputation with R. It gives a basic overview of what you need to know and goes further down the rabbit hole of GIS in R.

Let’s get started!

Reading in data

Before we do anything, let’s load our packages. We are loading the sf package for working with vector data, the dplyr package for data wrangling, the tmap package for mapping, and the here package for file path management. Make sure to install these packages on your home system before loading them with library().

library(sf)
library(tmap)
library(dplyr)
library(here)
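If you’re not sure which of these packages are already installed, a quick check like the following may help. It’s only a sketch: required_pkgs and missing_pkgs are names I made up, and the install line is left commented out so nothing gets installed unless you choose to run it.

```r
# Packages used in this tutorial
required_pkgs <- c("sf", "tmap", "dplyr", "here")

# TRUE for each package that can be loaded, FALSE otherwise
is_installed <- vapply(required_pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- required_pkgs[!is_installed]

if (length(missing_pkgs) > 0) {
  message("Install these first: ", paste(missing_pkgs, collapse = ", "))
  # install.packages(missing_pkgs)
}
```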

Today, we’re visualizing a data set of murder rates and urban population percentages from 1973 for the contiguous United States (49 features in total). We’re using this data because it was easily available from another R package I have.

To read in vector data, you need the read_sf function. read_sf is smart enough to know what type of file you have, based on the extension! read_sf can read in any common vector data type and at least some obscure data types. I haven’t encountered a situation where read_sf couldn’t read in my file.

usa_murder <- read_sf(here("data", "usa_murder.geojson"))
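If you don’t have the workshop’s GeoJSON file on hand, here is a minimal sketch of the same call on a different file format. It assumes only that sf is installed and uses the North Carolina shapefile that ships with the package, which is not part of this workshop’s data; nc and nc_path are just illustrative names.

```r
library(sf)

# Path to a shapefile bundled with sf (a stand-in for your own file)
nc_path <- system.file("shape/nc.shp", package = "sf")

# Same function as above; the file type is worked out from the .shp extension
nc <- read_sf(nc_path)

class(nc)  # an sf object that is also a tibble
nrow(nc)   # one row per North Carolina county
```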

A quick peek at usa_murder tells us this is a simple feature collection, with 49 features (observations) and 3 fields (variables). sf objects store your vector data as data frames, with the geometry getting its own column. The geometry column is a list-column holding one geometry per observation, printed in a format that resembles well-known text (WKT), and it represents the vectors we’re interested in plotting.

At the top of the output, there is some important information about the object. The geometry type of our object is a MULTIPOLYGON, which means that multiple polygons can represent each observation (e.g. Florida has a mainland and islands, so the “Florida” observation is represented by multiple polygons). The dimension is XY, so we know that we’re working in two dimensions rather than XYZ, which is three dimensions. The bbox specifies the edges of the object, which are the max/min x coordinates and max/min y coordinates. Finally, the CRS is the coordinate reference system, which in this case is EPSG 4326 (WGS 84). This can also be represented by the proj4string if there is no EPSG for your CRS.

usa_murder
## Simple feature collection with 49 features and 3 fields
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: -124.7346 ymin: 24.54255 xmax: -66.89192 ymax: 49.36949
## CRS:            4326
## # A tibble: 49 x 4
##    name      urban_pop murder                                           geometry
##    <chr>         <int>  <dbl>                                 <MULTIPOLYGON [°]>
##  1 Minnesota        66    2.7 (((-95.16057 49.36949, -95.10282 49.35394, -94.98…
##  2 Washingt…        73    4   (((-122.6533 48.99251, -122.4334 48.99251, -122.2…
##  3 Idaho            54    2.6 (((-117.0382 48.99251, -116.9382 48.99251, -116.7…
##  4 Montana          53    6   (((-116.0482 48.99251, -115.8391 48.99251, -115.6…
##  5 North Da…        44    0.8 (((-104.0476 48.99262, -103.9695 48.99262, -103.7…
##  6 Michigan         74   12.1 (((-84.4913 46.45749, -84.4783 46.46647, -84.4654…
##  7 Maine            51    2.1 (((-69.22197 47.45969, -69.06966 47.43189, -69.05…
##  8 Ohio             75    7.3 (((-80.52023 41.98446, -80.52023 41.90489, -80.52…
##  9 New Hamp…        56    2.1 (((-71.50585 45.01373, -71.50408 45.01374, -71.50…
## 10 New York         86   11.1 (((-74.71296 44.99925, -74.69588 44.99803, -74.59…
## # … with 39 more rows
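Each piece of that header can also be pulled out on its own. Here is a sketch using the North Carolina shapefile bundled with sf as a stand-in (an assumption on my part, so the example runs without the workshop file); swapping in usa_murder should work the same way.

```r
library(sf)

nc <- read_sf(system.file("shape/nc.shp", package = "sf"))

st_geometry_type(nc, by_geometry = FALSE)  # overall geometry type
st_bbox(nc)                                # xmin, ymin, xmax, ymax
st_crs(nc)                                 # coordinate reference system

# The geometry column is a list-column of class "sfc"
geom <- st_geometry(nc)
class(geom)

# First geometry as well-known text, truncated to 80 characters
substr(st_as_text(geom[[1]]), 1, 80)
```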

Making a static map

Let’s start with a quick example. Here is a map of urban population as a percentage of total population across the continental USA in 1973. We’re using all defaults and it still looks pretty good! There’s a reasonable color palette, aesthetically pleasing background, and a legend. All this with two lines of code is pretty sweet. But what does the code mean?

tm_shape(usa_murder) +
  tm_polygons("urban_pop")

The first thing you have to do with any map is to specify the object with tm_shape. This function creates a spatial data object of your data that you can then add layers to. Layers are added with the + operator. Since our object contains polygons, we’re using the tm_polygons function to add a polygon layer. To specify the variable/field we want to map the polygon fill colors to, we wrap the variable name in quotation marks. There are many layer types you can add, which can be explored with ?'tmap-element'. Let’s add the state names to this layer!

Note about workshop error

After some digging, I found the cause of the problem we ran into during the workshop. One of the upstream packages that tmap relies on (the tibble package) introduced an incompatibility in its latest release, which came out a couple of weeks ago. I have an older version of tibble on my PC, but RStudio Cloud installed the latest release, which is why the code worked on my computer but not on the cloud. The issue has since been fixed in tmap, but you need to install the development version of the package to get the fix.

# if you don't have the remotes package, install it first, with 
install.packages("remotes")

# to install the development version of tmap, do this
remotes::install_github("mtennekes/tmap")  
remotes::install_github("mtennekes/tmaptools")

Now we know what the states’ names are! tmap even cleverly colors the text to maximize readability. But, this is super cluttered. How do we fix that?

tm_shape(usa_murder) +
  tm_polygons("urban_pop") +
  tm_text("name")

Each tm_ layer has multiple options for customization. To see the options, use ?. For example, to see the options for tm_text, use ?tm_text. tm_text has a convenient option to shrink the size of the text proportional to the size of the polygon, set with size = "AREA". It definitely reduces the clutter of the plot, but the small states are left out :(.

tm_shape(usa_murder) +
  tm_polygons("urban_pop") +
  tm_text("name", size = "AREA")

tm_text has a lower bound on text size that can be printed, which you can change with the size.lowerbound argument. Or, you can override any bounds with the print.tiny argument! Now, all of the labels get printed. It adds a bit of clutter to the map, but at least everything is on now. There are methods for removing this clutter, like abbreviating the state names or displacing the labels, but that is outside the scope of this tutorial. Going forward, I’m going to assume the viewer knows the names of the small states and remove the print.tiny = TRUE argument.

tm_shape(usa_murder) +
  tm_polygons("urban_pop") +
  tm_text("name", 
          size = "AREA", 
          print.tiny = TRUE)
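For anyone who wants a head start on the abbreviation idea, it can be sketched with base R’s built-in state.name and state.abb vectors. This is an illustration on a small made-up table rather than the workshop data, and the abbr column name is my own. The built-in vectors cover the 50 states only, so any other name (the District of Columbia, for example) would come back as NA.

```r
library(dplyr)

# Small stand-in table; with the real data you would mutate usa_murder instead
states <- tibble(name = c("Minnesota", "New Hampshire", "Rhode Island"))

# Look up each full name in state.name and take the matching abbreviation
states <- states %>%
  mutate(abbr = state.abb[match(name, state.name)])

states$abbr
```

The new column could then be passed to tm_text("abbr") in place of "name".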

The color of this plot is nice and everything, but I think it could use some more pizzazz. The viridis palette is all the rage these days, so let’s try that out. To change the color palette for a layer, we use the palette argument. There are many different palette choices that you can specify! To explore your options, check out the tmaptools::palette_explorer() function.

tm_shape(usa_murder) +
  tm_polygons("urban_pop",
              palette = "viridis") +
  tm_text("name", 
          size = "AREA")

Now that the map is looking nice, let’s take care of that legend. We can change the title to something more informative using the title argument of tm_polygons. I would also like it to be in the bottom right portion of the map, which we can specify within the tm_legend function, using the legend.position argument. We need to provide a vector, where the first element is the x-axis position (left, center, or right) and the second element is the y-axis position (top, center, bottom). I think this looks much better!

tm_shape(usa_murder) +
  tm_polygons("urban_pop",
              palette = "viridis",
              title = "% Urban") +
  tm_text("name", 
          size = "AREA") +
  tm_legend(legend.position = c("right", "bottom"))

What if you wanted to visualize two variables side-by-side to see if there are obvious spatial relationships between the two? To do this, we can use tmap_arrange to arrange map panels. First, we need to assign our “Percent urban population” map to its own object. We want to focus on the spatial distribution of each variable, so let’s remove the text labels for now. We’ll call it urban_map.

urban_map <- tm_shape(usa_murder) +
  tm_polygons("urban_pop",
              palette = "viridis",
              title = "% Urban") +
  tm_legend(legend.position = c("right", "bottom")) 

Next, let’s visualize the murder rate per state and assign it to its own variable. We will use the same parameters and change the title to something more relevant. We’ll call it murder_map. It doesn’t look like there is a strong correlation between urban population and murder rate, but let’s plot them side-by-side to make sure.

murder_map <- tm_shape(usa_murder) +
  tm_polygons("murder",
              palette = "viridis",
              title = "Murder rate") +
  tm_legend(legend.position = c("right", "bottom")) 

murder_map

To plot them side-by-side, we’re using tmap_arrange! We just need to supply it with the maps we want to include, separated by commas. This will automatically arrange them according to the plotting screen (in this case, vertically), but what if we want them to arrange in a specific fashion every time?

tmap_arrange(urban_map, murder_map)

To arrange them how we want, we supply the ncol and nrow arguments. Let’s use the ncol argument to arrange the plots horizontally. The legend placement gets a little wonky here, but dive into the tm_legend, tm_layout and tmap_arrange docs to figure out how to get everything just right.

tmap_arrange(urban_map, murder_map, ncol = 2)
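Comparing maps is only an eyeball test, so a quick numeric check can help. The sketch below uses R’s built-in USArrests data set, which records 1973 murder arrests per 100,000 residents and percent urban population for the 50 states. I’m assuming it is close to the workshop file, but it is not the same object (it includes Alaska and Hawaii, for one).

```r
# Built-in data frame: one row per state, columns Murder, Assault, UrbanPop, Rape
data("USArrests")

# Pearson correlation between percent urban population and murder arrest rate
urban_murder_cor <- cor(USArrests$UrbanPop, USArrests$Murder)
urban_murder_cor
```

A value near zero would agree with the visual impression that there is no strong linear relationship.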

There is a lot more functionality and customization options for static maps, but we have to stop here for this workshop! I’ll provide some resources below for you to take your maps further.

Now onto a (very short) intro to interactive mapping!

Making an interactive map

Interactive maps are very effective for sharing and exploring data. They allow you to zoom in on features, explore anomalies, investigate patterns further, and much more. tmap makes creating interactive maps incredibly easy: it takes a single command, tmap_mode.

To switch on “interactive mode”, all you need to do is preface your map(s) with tmap_mode("view").

tmap_mode("view")

urban_map

Whoa! Some of the base features include hover text (in this case we get the state names, so yay for not cluttering the plot!), zooming (see top left), a moving legend, and multiple base map options (the three-sheet icon in the top left). There are many ways to customize interactive maps, but I’ll leave that up to y’all to explore (time is short!).

To switch back to “static mode”, you just need to use tmap_mode("plot").

tmap_mode("plot")
## tmap mode set to plotting
urban_map
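Here is a small sketch of switching modes without losing track of where you started. It relies on tmap_mode() invisibly returning the mode that was active before the call, which is how I understand the function to behave; previous_mode is just an illustrative name.

```r
library(tmap)

tmap_mode("plot")                   # start from static mode
previous_mode <- tmap_mode("view")  # switch to interactive, remember the old mode

# ... print your interactive maps here ...

tmap_mode(previous_mode)            # restore whatever was active before
```

tmap also has a ttm() shortcut that toggles between the two modes.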

Neat! The final lesson is going to be how to save your maps. We do this using the tmap_save function. Static maps can be saved in a variety of image formats. You just need to specify the path and the extension. One of my favorites is .png, so we’ll do that here.

tmap_save(urban_map, here("urban_map.png"))

To save your map as an interactive map, you need to use a .html extension.

tmap_save(urban_map, here("urban_map.html"))
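tmap_save also accepts size and resolution arguments for static output. Here is a self-contained sketch that writes to a temporary folder; the North Carolina data bundled with sf and the names demo_map and out_file are stand-ins of mine, not part of the workshop.

```r
library(sf)
library(tmap)

# Stand-in data: North Carolina counties shipped with sf
nc <- read_sf(system.file("shape/nc.shp", package = "sf"))

demo_map <- tm_shape(nc) +
  tm_polygons("BIR74")  # BIR74 is a births column in the nc data

# Write to a temporary folder so nothing lands in your project
out_file <- file.path(tempdir(), "demo_map.png")

tmap_save(demo_map, filename = out_file,
          width = 7, height = 4, units = "in", dpi = 300)

file.exists(out_file)
```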

And that’s all we have time for! I hope this got everyone stoked on how to make maps in R! Keep reading for more resources to take your map-making and GIS skills further.

Resources

Tutorials

To get more of an overview on what tmap can do, I would first recommend checking out their get started vignette.

To go further, I highly recommend Zev Ross’s workshop Creating beautiful demographic maps in R with the tidycensus and tmap packages. If processing census data isn’t your thing, you can skip to the mapping part with minimal headache. There’s a nice little section on working with raster data, which we didn’t cover today.

Books

First and foremost, I recommend the free, online book Geocomputation with R for mapping and performing basic GIS analyses in R. I use it as a reference all the time!

For a slightly more involved look at analyzing spatial data in R, I recommend the regularly updated free, online book Spatial data science with applications in R. If you want to go beyond a surface level of understanding of GIS operations, mapping, and analyses, definitely read this.

Blogs/Twitter

r-spatial.org is a great blog to keep up with, and is especially useful for their list of spatial R packages. My favorite series of blog posts begins with this one: Drawing beautiful maps programmatically with R, sf and ggplot2.

There are many great spatial Twitter folks to follow. Here’s a list of a few I recommend:

edzerpebesma
CivicAngela
robinlovelace
kyle_e_walker