Introduction

There are a variety of powerful mapping packages for R that can be used to generate informative and eye-catching graphics. These tools can be used to produce static images for print or interactive visuals that can be uploaded to the internet or distributed as an html file. The examples below are commonly used packages and techniques used for plotting spatial data in R.

Choropleth Map

Choropleth maps are thematic maps that display the range of a variable over geographic regions with a color gradient. Using choropleth maps to display spatial data creates an easy-to-follow representation of values that allow an audience to quickly determine the ranking of values and if a specific geographic region has a lower value, a higher value, or a value near the middle of the range. Choropleth maps frequently display data on a continuous scale or with defined classes like Jenks natural breaks or quantiles, quintiles, etc.

The cartography package includes a function (choroLayer), that makes it easy to plot a choropleth map using a spatial dataset. The map created below using this function is choropleth map that displays 5 classes of natural Jenks for state populations in the US, using dating from the 2019 ACS.

# Libraries used for this portion of the tutorial
library(pacman)

p_load(tidyverse,   # Variety of packages for data manipulation
       tidycensus,  # Package for downloading census data
       tigris,      # Package for downloading census shapefiles
       cartography) # Package for plotting maps

# Download total populations for each state (from 2019 ACS)
state_populations <- get_acs(geography = "state", variables = "DP05_0001E") %>%
  rename(Population = "estimate") %>%
  select("GEOID", "Population")

# Download shapefiles for all states (and Puerto Rico)
# shift_geometry() moves and rescales non-contiguous states 
state_shapes <- states(cb = TRUE, resolution = "20m") %>%
  shift_geometry()

# Join state populations to state shapefiles
map_populations <- left_join(state_shapes, state_populations, by = "GEOID") %>%
  mutate(Pop_100K = Population / 100000)

# Plot choropleth map using cartography package
choroLayer(x = map_populations,
           var = "Pop_100K",
           method = "jenks",
           nclass = 5,
           legend.pos = "topleft",
           legend.title.txt = "Population\n(in 100,000)")

# Add layout elements to map
layoutLayer(title = "US Statewide Populations (2019)",
            frame = FALSE)

Hatched Feature

One technique to add another variable–preferably a binary variable–to a choropleth map is to add hatching to a feature. Hatching is when a pattern is applied to a specified shape, which can preserve the color of that shape while adding another visual dimension. The cartography package also includes a function (hatchedLayer) to create these patterned shapes. Here is a useful guide for creating hatched layers in R, and an accompanying guide for creating a legend for the hatched layer. The map below adds a pattern to states with a median age over 40 years over the choropleth map for state population.

# Libraries used for this portion of the tutorial
library(pacman)

p_load(tidyverse,   # Variety of packages for data manipulation
       tidycensus,  # Package for downloading census data
       kableExtra,  # For creating tables
       cartography) # Package for plotting maps

# Download median age
state_median_ages <- get_acs(geography = "state", variables = "DP05_0018E") %>%
  rename(Median_Age = "estimate") %>%
  select("GEOID", "NAME", "Median_Age")

# Select GEOID for states with median age over 40 years
older_states <- state_median_ages %>%
  filter(Median_Age > 40)

# Display data for states with median age > 40
older_states %>%
  arrange(desc(Median_Age)) %>%
  kbl(caption = "States with Median Age > 40 in 2019 ACS", 
      align = "r", row.names = FALSE) %>%
  kable_minimal("hover")
States with Median Age > 40 in 2019 ACS
GEOID NAME Median_Age
23 Maine 44.7
33 New Hampshire 42.9
50 Vermont 42.9
54 West Virginia 42.5
12 Florida 42.0
72 Puerto Rico 41.7
09 Connecticut 41.0
42 Pennsylvania 40.8
10 Delaware 40.6
# Create hatched layers using "hatchedLayer()" from cartography
highlight_states <- hatchedLayer(
  state_shapes[state_shapes$GEOID %in% older_states$GEOID,],
  mode = "sfc",
  pattern = "grid",
  density = 2)

# Plot choropleth map using cartography package
choroLayer(x = map_populations,
           var = "Pop_100K",
           method = "jenks",
           nclass = 5,
           legend.pos = "topleft",
           legend.title.txt = "Population\n(in 100,000)")

# Add hatched layer highlights
plot(highlight_states, col = "yellow", lwd = 1, add = TRUE)
legendHatched(pos = "left",
              title.txt = "",
              categ = "Median\nAge > 40",
              patterns = "grid",
              density = 1,
              col = "yellow",
              ptrn.bg = "black")

# Add layout elements to map
layoutLayer(title = "US Statewide Populations (2019; States with Median Age > 40 Highlighted)",
            frame = FALSE)

Categorical Map

Maps can also be used to display categorical variables that exist without a ranking. Avoid using color gradients for maps like these, which suggests a hierarchial scale to the viewer. Instead, use colors that are clearly different from each other, and try to limit the number of categories (and therefore colors) shown at once to limit potential confusion. The website COLORBREWER 2.0 provides suggestions for colors to use in maps, based on the number of categories, the type of data, and considerations for colorblind users and printer and photocopy safe color schemes. For a categorical map, use the qualitative color scales for ideas.

The map below uses the tmap package to present the majority gender for each state, using the male to female ratio from the 2019 ACS.

Note: be mindful with the colors you choose for demographic variables.

# Libraries used for this portion of the tutorial
library(pacman)

p_load(tidyverse,   # Variety of packages for data manipulation
       tidycensus,  # Package for downloading census data
       tmap)        # For creating categorical maps with tmap

# Download total populations for each state (from 2019 ACS)
state_gender_ratio <- get_acs(geography = "state", variables = "DP05_0004E") %>%
  mutate(Gender = ifelse(estimate > 100, "Male", "Female")) %>%
  select("GEOID", "Gender")

# Join state gender ratio to state shapefiles
map_gender_ratio <- left_join(state_shapes, state_gender_ratio, by = "GEOID")

# Set mode to "plot" to create a static map
tmap_mode("plot")

# Plot with tmap
tm_shape(map_gender_ratio) +
  
  # Add counties with vacancy rate values
  tm_polygons(
    
    # Options for visual display
    title = "Statewide Majority\nGender (2019)",
    "Gender",
    palette = "Spectral",
    lwd = 1,
    border.col = "white",
    legend.hist = TRUE) +
  
  # Layout specifications
  tm_layout(
    frame = FALSE,
    legend.outside = TRUE,
    legend.hist.width = .6)

Adding Features

It may be helpful to add geographic features to a map, either to enhance the visuals, or, if they provide a meaningful addition, for understanding the displayed data.

Below are two common features, provided by the census bureau, that can be downloaded and added to a map for additional context.

Add Roadways

# Libraries used for this portion of the tutorial
library(pacman)

p_load(tidyverse,   # Variety of packages for data manipulation
       tidycensus)  # Package for downloading census data

# Download shapefile for Allegheny County, Pennsylvania
allegheny_county <- counties(state = "PA") %>%
  filter(NAME == "Allegheny")

# Plot Allegheny County, Pennsylvania shapefile
plot(allegheny_county$geometry)

# Download lines for roads in Allegheny County, Pennsylvania
# Note: the variable "RTTYP" indicates the type of road
interstate_highways <- roads(state = "PA", county = "Allegheny", class = "sf") %>%
  filter(RTTYP == "I")

us_highways <- roads(state = "PA", county = "Allegheny", class = "sf") %>%
  filter(RTTYP == "U")

# Download lines for railways and then clip to just Allegheny, County, Pennsylvania
us_rails <- rails()
allegheny_rails <- st_intersection(us_rails, allegheny_county)

# Add roads and rails to county plot
par(mar = c(4,1,3,1))
plot(allegheny_county$geometry,
     main = "Allegheny County, Pennsylvania\nwith Major Transportation Routes")
plot(interstate_highways$geometry, col = "red", add = TRUE)
plot(us_highways$geometry, col = "orange", add = TRUE)
plot(allegheny_rails$geometry, col = "black", lty = "dashed", add = TRUE)
legend("bottom", legend = c("Interstate Highways", "US Highways", "Railways"),
       col = c("red", "orange", "black"),
       lty = c(1, 1, 2),
       box.lty = 0,
       cex = 0.8,
       xpd = TRUE,
       bty = 'n',
       inset = c(0, -.25),
       ncol = 2)

Add Water (Hydropgraphy)

# Download hydrography shapefile for Allegheny County, Pennsylvania
allegheny_water <- area_water(state = "PA", county = "Allegheny", class = "sf")

# Add hydropgraphy shapefile to county plot
plot(allegheny_county$geometry,
     main = "Allegheny County, Pennsylvania\nwith Perennial and Intermittent Water Features")
plot(allegheny_water$geometry, col = "blue", border = "blue", add = TRUE)

Combine Features

Some spatial features or spatial datasets can only be downloaded for one geographic level (e.g. hydrography for counties), so combining those features while mapping out a larger region requires an extra step. The map_df function from the purrr package (part of Tidyverse) allows for repeated downloading of data with a specified input going through a vector of arguments. The results are then binded into a single data frame.

The hydropgrahy shapefile above, which is only available to download by county, is downloaded below for every county in Pennsylvania, and then bound into a single spatial data frame as an example.

# Download shapefile for for all Pennsylvania counties
pa_counties <- counties(state = "PA")

# Plot Pennsylvania counties shapefile
plot(pa_counties$geometry,
     main = "Pennsylvania Counties")

# Download hydrography shapefile for all Pennsylvania counties using "map_df()" from tidyverse
pa_water <- map_df(pa_counties$COUNTYFP, function(x){
  area_water(state = 42,
             county = x,
             class = "sf")})

# Add statewide hydropgraphy shapefile to Pennsylvania counties plot
plot(pa_counties$geometry,
     main = "Pennsylvania Counties with Water Features")
plot(pa_water$geometry,
     border = "blue",
     col = "blue",
     add = TRUE)

Add Specific Locations

Locations with geographic coordinates can also be easily added to maps. If the data only includes street addresses, then coordinates can be identified through geocoding. Below is a simple map of Allegheny County, Pennsylvania with markers to identify each of the UPMC Hospitals in the region. See the page linked above to see how those addresses were geocoded.

# Libraries used for this portion of the tutorial
library(pacman)

p_load(sf)   # Variety of functions for manipulating spatial data

# Convert UPMC Hospital data to spatial points
upmc_points <- st_as_sf(x = upmc_hospitals,
                 coords = c("Longitude", "Latitude"),
                 crs = "+proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0")

# Plot UPMC Hospitals in Southwest Pennsylvania
par(mar = c(4,1,3,1))
plot(allegheny_county$geometry, main = "Allegheny County, Pennsylvania")
plot(allegheny_water$geometry, col = "blue", border = "blue", add = TRUE)
plot(interstate_highways$geometry, col = "red", add = TRUE)
plot(us_highways$geometry, col = "orange", add = TRUE)
plot(allegheny_rails$geometry, col = "black", lty = "dashed", add = TRUE)
plot(upmc_points$geometry, pch = 21, cex = 1.5, col = "black", bg = "purple", add = TRUE)
legend("bottom",
       legend = "UPMC Hospitals",
       cex = .8,
       pch = 21,
       pt.cex = 1.5,
       pt.bg = "purple",
       xpd = TRUE,
       bty = 'n',
       inset = c(0, -.15))

Plot Interactive Map

R also has a variety of packages for creating interactive maps. These maps can pull in backgrounds and icons from online collections, and can be published directly to a website or saved as an html file that can be shared.

Using Leaflet

Leaflet is a popular JavaScript library that allows users to easily build interactive, visually appealing maps that are meant to be shared online. The library was adapted to R, and is one of the most commonly used R packages for displaying geospatial information. Below is an interactive example showing the locations of the UPMC Hospitals from above over a basemap.

# Libraries used for this portion of the tutorial
library(pacman)

p_load(tidyverse, # Variety of packages for data manipulation
       leaflet)   # For creating interactive maps

# Plot UPMC Hospital locations over detailed basemap with leaflet
leaflet() %>%
  
  # Add background map
  addTiles() %>%
  
  # Add markers for hospitals
  addCircleMarkers(lng = upmc_hospitals$Longitude,
                   lat = upmc_hospitals$Latitude,
                   radius = 10,
                   fillOpacity = .75,
                   
                   # Add popup with relevant information
                   popup = paste0("<b>",upmc_hospitals$Location,
                                  "</b><br>",upmc_hospitals$Address),
                   
                   # Add clustering option for nearby locations
                   clusterOptions = markerClusterOptions())

Leaflet can also be used to display geographic polygon data (e.g. county proportions, census tract counts, etc.). It might make more sense to use a less detailed background map while presenting that data. The basemap for Leaflet can be modified with the addProviderTiles function. Leaflet’s R tutorial page has a guide for using basemap options and plugins.

Here is an example using countywide data from the Census Bureau to plot an interactive choropleth map in leaflet.

# Libraries used for this portion of the tutorial
library(pacman)

p_load(tidyverse, # Variety of packages for data manipulation
       stringr,   # For removing ", Pennsylvania" from county names
       leaflet,   # For creating interactive maps
       sf)        # For transforming coordinates

# Download Pennsylvania counties with homeowner vacancy rate
pa_vacancy_rate <- get_acs(geography = "county",
                           variables = "DP04_0004E",
                           state = "PA",
                           geometry = TRUE) %>%
  rename(Homeowner_Vacancy_Rate = estimate) %>%
  mutate(County = str_remove_all(NAME, ", Pennsylvania")) %>%
  select(GEOID, County, Homeowner_Vacancy_Rate)

# Create a color palette for homeowner vacancy rate
palette <- colorNumeric(palette = "viridis",
                        domain = pa_vacancy_rate$Homeowner_Vacancy_Rate)

# Plot counties with homeowner vacancy rates
pa_vacancy_rate %>%
  
  # Transform coordinates
  st_transform(crs = "+init=epsg:4326") %>%
  
  # Plot with leaflet
  leaflet(width = "100%") %>%
  
  # Add background map
  addProviderTiles(provider = "CartoDB.Positron") %>%
  
  # Add countyies with vacancy rate values
  addPolygons(
    
    # Options for visual display
    stroke = TRUE,
    weight = 1,
    color = "white",
    smoothFactor = 0,
    fillOpacity = 0.7,
    fillColor = ~ palette(Homeowner_Vacancy_Rate),
    
    # Add popup with relevant information
    popup = paste0("<b>", pa_vacancy_rate$County,
                   "</b><br> FIPS Code: ", pa_vacancy_rate$GEOID,
                   "<br>", pa_vacancy_rate$Homeowner_Vacancy_Rate,
                   "% homes vacant"),
    
    # Add label while hovering over county
    label = paste0(pa_vacancy_rate$County),
    
    # Highlight county while hovering over it
    highlightOptions = highlightOptions(
            weight = 5,
            color = "black",
            fillOpacity = 0.7,
            bringToFront = TRUE)) %>%
  
  # Add a legend
  addLegend("topright",
            pal = palette,
            values = ~Homeowner_Vacancy_Rate,
            title = "Homeowner<br>Vacancy Rate (%)",
            opacity = 1)

Controlling map elements

Leaflet also allows for the audience to control which layers, legends, backgrounds, etc. are displayed. In the example below, the map created from above has groups added to each element, which allows the final portion of the code, addLayersControl, to toggle on and off those elements.

# Plot counties with homeowner vacancy rates
pa_vacancy_rate %>%
  st_transform(crs = "+init=epsg:4326") %>%
  leaflet(width = "100%") %>%
  
  # Add three backgrounds and specify their groups
  addProviderTiles(provider = "CartoDB.Positron",
                   group = "Positron (Default)") %>%
  addProviderTiles(provider = "Stamen.TonerLite",
                   group = "Toner Lite") %>%
  addTiles(group = "OpenStreetMap") %>%
  
  # Add counties with vacancy rate values as a group
  addPolygons(stroke = TRUE,
              weight = 1,
              color = "white",
              smoothFactor = 0,
              fillOpacity = 0.7,
              fillColor = ~ palette(Homeowner_Vacancy_Rate),
              popup = paste0("<b>", pa_vacancy_rate$County,
                             "</b><br> FIPS Code: ", 
                             pa_vacancy_rate$GEOID,
                             "<br>",
                             pa_vacancy_rate$Homeowner_Vacancy_Rate,
                             "% homes vacant"),
              label = paste0(pa_vacancy_rate$County),
              highlightOptions = highlightOptions(
                weight = 5,
                color = "black",
                fillOpacity = 0.7,
                bringToFront = TRUE),
              group = "Vacancy Rate") %>%

  # Add legend with vacancy rate group
  addLegend("topright",
            values = ~Homeowner_Vacancy_Rate,
            pal = palette,
            opacity = 1,
            title = "Homeowner<br>Vacancy Rate (%)",
            group = "Vacancy Rate") %>%
  
  # Add layers control to allow audience to toggle elements
  addLayersControl(options = layersControlOptions(collapsed = FALSE),
                   position = "topleft",
                   baseGroups = c("Positron (default)",
                                  "Toner Lite",
                                  "OpenStreetMap"),
                   overlayGroups = c("Vacancy Rate"))

Using tmap

The R library tmap can replicate many of Leaflet’s options for map design while specifying “view” mode. Below is an example showing how tmap can be used to recreate the Leaflet graphic above.

# Libraries used in this portion of the tutorial
library(pacman)

p_load(tmap)  # For creating interactive maps with tmap

# Set mode to "view" to create an interactive map
tmap_mode("view")

# Add background map
tm_basemap(leaflet::providers$CartoDB.Positron) +
  
# Plot with tmap
tm_shape(pa_vacancy_rate) +
  
# Add counties with vacancy rate values
tm_polygons(

  # Options for visual display
  "Homeowner_Vacancy_Rate",
  palette = "viridis",
  alpha = 0.7,
  lwd = 1,
  border.col = "white",
  title = "Homeowner<br>Vacancy Rate (%)",
  id = "County",
  group = "Vacancy Rate"
  
  # Add popup with relevant information
  popup.vars = c("Fips Code" = "GEOID",
                 "Homes Vacant (%)" = "Homeowner_Vacancy_Rate"))

Note: tmap’s popup window width can be buggy and does not allow for the same level of customization as Leaflet’s popup windows. This could be an important reason to prefer Leaflet over tmap.

Static version with tmap

tmap can also be used to create static images that are better suited for print as a figure in an article. Below is the same graphic as above, but instead displayed as a static figure, rather than an interactive map. The figure also adds a distribution plot of the categories to enhance the audience’s understanding of the data.

# Set mode to "plot" to create a static map
tmap_mode("plot")

# Plot with tmap
tm_shape(pa_vacancy_rate) +
  
# Add counties with vacancy rate values
tm_polygons(

  # Options for visual display
  title = "Homeowner\nVacancy Rate (%)",
  var = "Homeowner_Vacancy_Rate",
  palette = "viridis",
  alpha = 0.7,
  lwd = 1,
  border.col = "white") +

# Layout specifications
tm_layout(
  frame = FALSE,
  legend.outside = TRUE,
  legend.hist.width = 5)