There are a variety of powerful mapping packages for R that can be used to generate informative and eye-catching graphics. These tools can be used to produce static images for print or interactive visuals that can be uploaded to the internet or distributed as an html file. The examples below are commonly used packages and techniques used for plotting spatial data in R.
Choropleth maps are thematic maps that display the range of a variable over geographic regions with a color gradient. Using choropleth maps to display spatial data creates an easy-to-follow representation of values that allow an audience to quickly determine the ranking of values and if a specific geographic region has a lower value, a higher value, or a value near the middle of the range. Choropleth maps frequently display data on a continuous scale or with defined classes like Jenks natural breaks or quantiles, quintiles, etc.
The cartography package includes a function (choroLayer), that makes it easy to plot a choropleth map using a spatial dataset. The map created below using this function is choropleth map that displays 5 classes of natural Jenks for state populations in the US, using dating from the 2019 ACS.
# Libraries used for this portion of the tutorial
library(pacman)
p_load(tidyverse, # Variety of packages for data manipulation
tidycensus, # Package for downloading census data
tigris, # Package for downloading census shapefiles
cartography) # Package for plotting maps
# Download total populations for each state (from 2019 ACS)
state_populations <- get_acs(geography = "state", variables = "DP05_0001E") %>%
rename(Population = "estimate") %>%
select("GEOID", "Population")
# Download shapefiles for all states (and Puerto Rico)
# shift_geometry() moves and rescales non-contiguous states
state_shapes <- states(cb = TRUE, resolution = "20m") %>%
shift_geometry()
# Join state populations to state shapefiles
map_populations <- left_join(state_shapes, state_populations, by = "GEOID") %>%
mutate(Pop_100K = Population / 100000)
# Plot choropleth map using cartography package
choroLayer(x = map_populations,
var = "Pop_100K",
method = "jenks",
nclass = 5,
legend.pos = "topleft",
legend.title.txt = "Population\n(in 100,000)")
# Add layout elements to map
layoutLayer(title = "US Statewide Populations (2019)",
frame = FALSE)
One technique to add another variable–preferably a binary variable–to a choropleth map is to add hatching to a feature. Hatching is when a pattern is applied to a specified shape, which can preserve the color of that shape while adding another visual dimension. The cartography package also includes a function (hatchedLayer) to create these patterned shapes. Here is a useful guide for creating hatched layers in R, and an accompanying guide for creating a legend for the hatched layer. The map below adds a pattern to states with a median age over 40 years over the choropleth map for state population.
# Libraries used for this portion of the tutorial
library(pacman)
p_load(tidyverse, # Variety of packages for data manipulation
tidycensus, # Package for downloading census data
kableExtra, # For creating tables
cartography) # Package for plotting maps
# Download median age
state_median_ages <- get_acs(geography = "state", variables = "DP05_0018E") %>%
rename(Median_Age = "estimate") %>%
select("GEOID", "NAME", "Median_Age")
# Select GEOID for states with median age over 40 years
older_states <- state_median_ages %>%
filter(Median_Age > 40)
# Display data for states with median age > 40
older_states %>%
arrange(desc(Median_Age)) %>%
kbl(caption = "States with Median Age > 40 in 2019 ACS",
align = "r", row.names = FALSE) %>%
kable_minimal("hover")
| GEOID | NAME | Median_Age |
|---|---|---|
| 23 | Maine | 44.7 |
| 33 | New Hampshire | 42.9 |
| 50 | Vermont | 42.9 |
| 54 | West Virginia | 42.5 |
| 12 | Florida | 42.0 |
| 72 | Puerto Rico | 41.7 |
| 09 | Connecticut | 41.0 |
| 42 | Pennsylvania | 40.8 |
| 10 | Delaware | 40.6 |
# Create hatched layers using "hatchedLayer()" from cartography
highlight_states <- hatchedLayer(
state_shapes[state_shapes$GEOID %in% older_states$GEOID,],
mode = "sfc",
pattern = "grid",
density = 2)
# Plot choropleth map using cartography package
choroLayer(x = map_populations,
var = "Pop_100K",
method = "jenks",
nclass = 5,
legend.pos = "topleft",
legend.title.txt = "Population\n(in 100,000)")
# Add hatched layer highlights
plot(highlight_states, col = "yellow", lwd = 1, add = TRUE)
legendHatched(pos = "left",
title.txt = "",
categ = "Median\nAge > 40",
patterns = "grid",
density = 1,
col = "yellow",
ptrn.bg = "black")
# Add layout elements to map
layoutLayer(title = "US Statewide Populations (2019; States with Median Age > 40 Highlighted)",
frame = FALSE)
Maps can also be used to display categorical variables that exist without a ranking. Avoid using color gradients for maps like these, which suggests a hierarchial scale to the viewer. Instead, use colors that are clearly different from each other, and try to limit the number of categories (and therefore colors) shown at once to limit potential confusion. The website COLORBREWER 2.0 provides suggestions for colors to use in maps, based on the number of categories, the type of data, and considerations for colorblind users and printer and photocopy safe color schemes. For a categorical map, use the qualitative color scales for ideas.
The map below uses the tmap package to present the majority gender for each state, using the male to female ratio from the 2019 ACS.
Note: be mindful with the colors you choose for demographic variables.
# Libraries used for this portion of the tutorial
library(pacman)
p_load(tidyverse, # Variety of packages for data manipulation
tidycensus, # Package for downloading census data
tmap) # For creating categorical maps with tmap
# Download total populations for each state (from 2019 ACS)
state_gender_ratio <- get_acs(geography = "state", variables = "DP05_0004E") %>%
mutate(Gender = ifelse(estimate > 100, "Male", "Female")) %>%
select("GEOID", "Gender")
# Join state gender ratio to state shapefiles
map_gender_ratio <- left_join(state_shapes, state_gender_ratio, by = "GEOID")
# Set mode to "plot" to create a static map
tmap_mode("plot")
# Plot with tmap
tm_shape(map_gender_ratio) +
# Add counties with vacancy rate values
tm_polygons(
# Options for visual display
title = "Statewide Majority\nGender (2019)",
"Gender",
palette = "Spectral",
lwd = 1,
border.col = "white",
legend.hist = TRUE) +
# Layout specifications
tm_layout(
frame = FALSE,
legend.outside = TRUE,
legend.hist.width = .6)
It may be helpful to add geographic features to a map, either to enhance the visuals, or, if they provide a meaningful addition, for understanding the displayed data.
Below are two common features, provided by the census bureau, that can be downloaded and added to a map for additional context.
# Libraries used for this portion of the tutorial
library(pacman)
p_load(tidyverse, # Variety of packages for data manipulation
tidycensus) # Package for downloading census data
# Download shapefile for Allegheny County, Pennsylvania
allegheny_county <- counties(state = "PA") %>%
filter(NAME == "Allegheny")
# Plot Allegheny County, Pennsylvania shapefile
plot(allegheny_county$geometry)
# Download lines for roads in Allegheny County, Pennsylvania
# Note: the variable "RTTYP" indicates the type of road
interstate_highways <- roads(state = "PA", county = "Allegheny", class = "sf") %>%
filter(RTTYP == "I")
us_highways <- roads(state = "PA", county = "Allegheny", class = "sf") %>%
filter(RTTYP == "U")
# Download lines for railways and then clip to just Allegheny, County, Pennsylvania
us_rails <- rails()
allegheny_rails <- st_intersection(us_rails, allegheny_county)
# Add roads and rails to county plot
par(mar = c(4,1,3,1))
plot(allegheny_county$geometry,
main = "Allegheny County, Pennsylvania\nwith Major Transportation Routes")
plot(interstate_highways$geometry, col = "red", add = TRUE)
plot(us_highways$geometry, col = "orange", add = TRUE)
plot(allegheny_rails$geometry, col = "black", lty = "dashed", add = TRUE)
legend("bottom", legend = c("Interstate Highways", "US Highways", "Railways"),
col = c("red", "orange", "black"),
lty = c(1, 1, 2),
box.lty = 0,
cex = 0.8,
xpd = TRUE,
bty = 'n',
inset = c(0, -.25),
ncol = 2)
# Download hydrography shapefile for Allegheny County, Pennsylvania
allegheny_water <- area_water(state = "PA", county = "Allegheny", class = "sf")
# Add hydropgraphy shapefile to county plot
plot(allegheny_county$geometry,
main = "Allegheny County, Pennsylvania\nwith Perennial and Intermittent Water Features")
plot(allegheny_water$geometry, col = "blue", border = "blue", add = TRUE)
Some spatial features or spatial datasets can only be downloaded for one geographic level (e.g. hydrography for counties), so combining those features while mapping out a larger region requires an extra step. The map_df function from the purrr package (part of Tidyverse) allows for repeated downloading of data with a specified input going through a vector of arguments. The results are then binded into a single data frame.
The hydropgrahy shapefile above, which is only available to download by county, is downloaded below for every county in Pennsylvania, and then bound into a single spatial data frame as an example.
# Download shapefile for for all Pennsylvania counties
pa_counties <- counties(state = "PA")
# Plot Pennsylvania counties shapefile
plot(pa_counties$geometry,
main = "Pennsylvania Counties")
# Download hydrography shapefile for all Pennsylvania counties using "map_df()" from tidyverse
pa_water <- map_df(pa_counties$COUNTYFP, function(x){
area_water(state = 42,
county = x,
class = "sf")})
# Add statewide hydropgraphy shapefile to Pennsylvania counties plot
plot(pa_counties$geometry,
main = "Pennsylvania Counties with Water Features")
plot(pa_water$geometry,
border = "blue",
col = "blue",
add = TRUE)
Locations with geographic coordinates can also be easily added to maps. If the data only includes street addresses, then coordinates can be identified through geocoding. Below is a simple map of Allegheny County, Pennsylvania with markers to identify each of the UPMC Hospitals in the region. See the page linked above to see how those addresses were geocoded.
# Libraries used for this portion of the tutorial
library(pacman)
p_load(sf) # Variety of functions for manipulating spatial data
# Convert UPMC Hospital data to spatial points
upmc_points <- st_as_sf(x = upmc_hospitals,
coords = c("Longitude", "Latitude"),
crs = "+proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0")
# Plot UPMC Hospitals in Southwest Pennsylvania
par(mar = c(4,1,3,1))
plot(allegheny_county$geometry, main = "Allegheny County, Pennsylvania")
plot(allegheny_water$geometry, col = "blue", border = "blue", add = TRUE)
plot(interstate_highways$geometry, col = "red", add = TRUE)
plot(us_highways$geometry, col = "orange", add = TRUE)
plot(allegheny_rails$geometry, col = "black", lty = "dashed", add = TRUE)
plot(upmc_points$geometry, pch = 21, cex = 1.5, col = "black", bg = "purple", add = TRUE)
legend("bottom",
legend = "UPMC Hospitals",
cex = .8,
pch = 21,
pt.cex = 1.5,
pt.bg = "purple",
xpd = TRUE,
bty = 'n',
inset = c(0, -.15))
R also has a variety of packages for creating interactive maps. These maps can pull in backgrounds and icons from online collections, and can be published directly to a website or saved as an html file that can be shared.
Leaflet is a popular JavaScript library that allows users to easily build interactive, visually appealing maps that are meant to be shared online. The library was adapted to R, and is one of the most commonly used R packages for displaying geospatial information. Below is an interactive example showing the locations of the UPMC Hospitals from above over a basemap.
# Libraries used for this portion of the tutorial
library(pacman)
p_load(tidyverse, # Variety of packages for data manipulation
leaflet) # For creating interactive maps
# Plot UPMC Hospital locations over detailed basemap with leaflet
leaflet() %>%
# Add background map
addTiles() %>%
# Add markers for hospitals
addCircleMarkers(lng = upmc_hospitals$Longitude,
lat = upmc_hospitals$Latitude,
radius = 10,
fillOpacity = .75,
# Add popup with relevant information
popup = paste0("<b>",upmc_hospitals$Location,
"</b><br>",upmc_hospitals$Address),
# Add clustering option for nearby locations
clusterOptions = markerClusterOptions())
Leaflet can also be used to display geographic polygon data (e.g. county proportions, census tract counts, etc.). It might make more sense to use a less detailed background map while presenting that data. The basemap for Leaflet can be modified with the addProviderTiles function. Leaflet’s R tutorial page has a guide for using basemap options and plugins.
Here is an example using countywide data from the Census Bureau to plot an interactive choropleth map in leaflet.
# Libraries used for this portion of the tutorial
library(pacman)
p_load(tidyverse, # Variety of packages for data manipulation
stringr, # For removing ", Pennsylvania" from county names
leaflet, # For creating interactive maps
sf) # For transforming coordinates
# Download Pennsylvania counties with homeowner vacancy rate
pa_vacancy_rate <- get_acs(geography = "county",
variables = "DP04_0004E",
state = "PA",
geometry = TRUE) %>%
rename(Homeowner_Vacancy_Rate = estimate) %>%
mutate(County = str_remove_all(NAME, ", Pennsylvania")) %>%
select(GEOID, County, Homeowner_Vacancy_Rate)
# Create a color palette for homeowner vacancy rate
palette <- colorNumeric(palette = "viridis",
domain = pa_vacancy_rate$Homeowner_Vacancy_Rate)
# Plot counties with homeowner vacancy rates
pa_vacancy_rate %>%
# Transform coordinates
st_transform(crs = "+init=epsg:4326") %>%
# Plot with leaflet
leaflet(width = "100%") %>%
# Add background map
addProviderTiles(provider = "CartoDB.Positron") %>%
# Add countyies with vacancy rate values
addPolygons(
# Options for visual display
stroke = TRUE,
weight = 1,
color = "white",
smoothFactor = 0,
fillOpacity = 0.7,
fillColor = ~ palette(Homeowner_Vacancy_Rate),
# Add popup with relevant information
popup = paste0("<b>", pa_vacancy_rate$County,
"</b><br> FIPS Code: ", pa_vacancy_rate$GEOID,
"<br>", pa_vacancy_rate$Homeowner_Vacancy_Rate,
"% homes vacant"),
# Add label while hovering over county
label = paste0(pa_vacancy_rate$County),
# Highlight county while hovering over it
highlightOptions = highlightOptions(
weight = 5,
color = "black",
fillOpacity = 0.7,
bringToFront = TRUE)) %>%
# Add a legend
addLegend("topright",
pal = palette,
values = ~Homeowner_Vacancy_Rate,
title = "Homeowner<br>Vacancy Rate (%)",
opacity = 1)
Leaflet also allows for the audience to control which layers, legends, backgrounds, etc. are displayed. In the example below, the map created from above has groups added to each element, which allows the final portion of the code, addLayersControl, to toggle on and off those elements.
# Plot counties with homeowner vacancy rates
pa_vacancy_rate %>%
st_transform(crs = "+init=epsg:4326") %>%
leaflet(width = "100%") %>%
# Add three backgrounds and specify their groups
addProviderTiles(provider = "CartoDB.Positron",
group = "Positron (Default)") %>%
addProviderTiles(provider = "Stamen.TonerLite",
group = "Toner Lite") %>%
addTiles(group = "OpenStreetMap") %>%
# Add counties with vacancy rate values as a group
addPolygons(stroke = TRUE,
weight = 1,
color = "white",
smoothFactor = 0,
fillOpacity = 0.7,
fillColor = ~ palette(Homeowner_Vacancy_Rate),
popup = paste0("<b>", pa_vacancy_rate$County,
"</b><br> FIPS Code: ",
pa_vacancy_rate$GEOID,
"<br>",
pa_vacancy_rate$Homeowner_Vacancy_Rate,
"% homes vacant"),
label = paste0(pa_vacancy_rate$County),
highlightOptions = highlightOptions(
weight = 5,
color = "black",
fillOpacity = 0.7,
bringToFront = TRUE),
group = "Vacancy Rate") %>%
# Add legend with vacancy rate group
addLegend("topright",
values = ~Homeowner_Vacancy_Rate,
pal = palette,
opacity = 1,
title = "Homeowner<br>Vacancy Rate (%)",
group = "Vacancy Rate") %>%
# Add layers control to allow audience to toggle elements
addLayersControl(options = layersControlOptions(collapsed = FALSE),
position = "topleft",
baseGroups = c("Positron (default)",
"Toner Lite",
"OpenStreetMap"),
overlayGroups = c("Vacancy Rate"))
The R library tmap can replicate many of Leaflet’s options for map design while specifying “view” mode. Below is an example showing how tmap can be used to recreate the Leaflet graphic above.
# Libraries used in this portion of the tutorial
library(pacman)
p_load(tmap) # For creating interactive maps with tmap
# Set mode to "view" to create an interactive map
tmap_mode("view")
# Add background map
tm_basemap(leaflet::providers$CartoDB.Positron) +
# Plot with tmap
tm_shape(pa_vacancy_rate) +
# Add counties with vacancy rate values
tm_polygons(
# Options for visual display
"Homeowner_Vacancy_Rate",
palette = "viridis",
alpha = 0.7,
lwd = 1,
border.col = "white",
title = "Homeowner<br>Vacancy Rate (%)",
id = "County",
group = "Vacancy Rate"
# Add popup with relevant information
popup.vars = c("Fips Code" = "GEOID",
"Homes Vacant (%)" = "Homeowner_Vacancy_Rate"))
Note: tmap’s popup window width can be buggy and does not allow for the same level of customization as Leaflet’s popup windows. This could be an important reason to prefer Leaflet over tmap.
tmap can also be used to create static images that are better suited for print as a figure in an article. Below is the same graphic as above, but instead displayed as a static figure, rather than an interactive map. The figure also adds a distribution plot of the categories to enhance the audience’s understanding of the data.
# Set mode to "plot" to create a static map
tmap_mode("plot")
# Plot with tmap
tm_shape(pa_vacancy_rate) +
# Add counties with vacancy rate values
tm_polygons(
# Options for visual display
title = "Homeowner\nVacancy Rate (%)",
var = "Homeowner_Vacancy_Rate",
palette = "viridis",
alpha = 0.7,
lwd = 1,
border.col = "white") +
# Layout specifications
tm_layout(
frame = FALSE,
legend.outside = TRUE,
legend.hist.width = 5)