This file will support R users with the basics of mapping and spatial analysis with the U.S. Census data using the tidycensus() package. The slides for this file are available here.

1 Data

For mapping tasks, we use the American Community Survey (ACS). This is an annual survey that covers topics not available in the decennial US Census data, such as income, education, etc. Estimates are available for both 1- and 5-year periods. The default for the getacs() function is the 5-year ACS estimates.

To change to the 1-year estimates add survey = "acs1" below the geometry = T input line. Walker recommends, however, that you stick with the 5-year ACS file to get spatial coherence and coherence.

The data are delivered as estimates characterized by margins of error.

1.1 tidycensus()

The tidycensus() package created by Walker “automatically downloads and merges Censsus geometries to data for mapping [and] includes a variety of analytic tools to support common Census workflows.”

The states and counties can be requested by name, so there is no need to look up FIPS codes1.

1.2 Mapping tools

There are multiple packages that can be used for cartography. Some of the popular packages that can be used are ggplot2, tmap, and mapsf. Walker, however, built a new package called mapgl.

1.2.1 mapgl

mapgl is a new R package written by Walker for high-performance interactive mapping.


Start by installing/loading packages (as needed) and libraries:

# Install packages as needed
# install.packages(c("tidycensus", "tidyverse", "mapview", "mapgl", "quarto"))

# Load necessary libraries
library(tidycensus)
library(dplyr)
library(ggplot2)
library(sf)

1.3 Gathering data

We’ll start by gathering the data using the get_acs() function.

1.3.1 DC

Given the size of DC, we will use geography = "tract".

dc_income <- get_acs(
  geography = "tract",
  variables = "B19013_001",
  state = "DC",
  year = 2023
)

dc_income
## # A tibble: 206 × 5
##    GEOID       NAME                                      variable estimate   moe
##    <chr>       <chr>                                     <chr>       <dbl> <dbl>
##  1 11001000101 Census Tract 1.01; District of Columbia;… B19013_…   135708 43153
##  2 11001000102 Census Tract 1.02; District of Columbia;… B19013_…   159583 70430
##  3 11001000201 Census Tract 2.01; District of Columbia;… B19013_…       NA    NA
##  4 11001000202 Census Tract 2.02; District of Columbia;… B19013_…   152059 60482
##  5 11001000300 Census Tract 3; District of Columbia; Di… B19013_…   174470 21374
##  6 11001000400 Census Tract 4; District of Columbia; Di… B19013_…   188929 71412
##  7 11001000501 Census Tract 5.01; District of Columbia;… B19013_…   109116 18172
##  8 11001000502 Census Tract 5.02; District of Columbia;… B19013_…   157344 32376
##  9 11001000600 Census Tract 6; District of Columbia; Di… B19013_…   183421 33830
## 10 11001000702 Census Tract 7.02; District of Columbia;… B19013_…    87750 16226
## # ℹ 196 more rows

1.3.2 NC

For NC, we’ll use geography = "county".

nc_income <- get_acs(
  geography = "county",
  variables = "B19013_001",
  state = "NC",
  year = 2023
)

nc_income
## # A tibble: 100 × 5
##    GEOID NAME                             variable   estimate   moe
##    <chr> <chr>                            <chr>         <dbl> <dbl>
##  1 37001 Alamance County, North Carolina  B19013_001    64445  2226
##  2 37003 Alexander County, North Carolina B19013_001    65268  5441
##  3 37005 Alleghany County, North Carolina B19013_001    44272  3954
##  4 37007 Anson County, North Carolina     B19013_001    44245  3895
##  5 37009 Ashe County, North Carolina      B19013_001    50827  3707
##  6 37011 Avery County, North Carolina     B19013_001    57657  4054
##  7 37013 Beaufort County, North Carolina  B19013_001    57997  3520
##  8 37015 Bertie County, North Carolina    B19013_001    45931  6859
##  9 37017 Bladen County, North Carolina    B19013_001    44528  4772
## 10 37019 Brunswick County, North Carolina B19013_001    74034  2420
## # ℹ 90 more rows

1.4 Spatial data

We’ll now add the spatial data to our income data using the simple features sf conditions via geometry = T. Take note that we’ll use the same code above but simply add a new last line of code.

We’ll use geometry=T and update the name of our df.

1.5 DC spatial data

Gathering spatial data for DC at the tract level.

dc_income_sf <- get_acs(
  geography = "tract",
  variables = "B19013_001",
  state = "DC",
  year = 2023,
  geometry = T
)
dc_income_sf
## Simple feature collection with 206 features and 5 fields
## Geometry type: POLYGON
## Dimension:     XY
## Bounding box:  xmin: -77.11976 ymin: 38.79165 xmax: -76.9094 ymax: 38.99511
## Geodetic CRS:  NAD83
## First 10 features:
##          GEOID                                                           NAME
## 1  11001000201  Census Tract 2.01; District of Columbia; District of Columbia
## 2  11001010300   Census Tract 103; District of Columbia; District of Columbia
## 3  11001002801 Census Tract 28.01; District of Columbia; District of Columbia
## 4  11001004002 Census Tract 40.02; District of Columbia; District of Columbia
## 5  11001006700    Census Tract 67; District of Columbia; District of Columbia
## 6  11001007707 Census Tract 77.07; District of Columbia; District of Columbia
## 7  11001008803 Census Tract 88.03; District of Columbia; District of Columbia
## 8  11001009302 Census Tract 93.02; District of Columbia; District of Columbia
## 9  11001009509 Census Tract 95.09; District of Columbia; District of Columbia
## 10 11001002802 Census Tract 28.02; District of Columbia; District of Columbia
##      variable estimate   moe                       geometry
## 1  B19013_001       NA    NA POLYGON ((-77.07902 38.9126...
## 2  B19013_001   112778 40979 POLYGON ((-77.03636 38.9748...
## 3  B19013_001    86029 16177 POLYGON ((-77.03645 38.9349...
## 4  B19013_001   167102 45557 POLYGON ((-77.04627 38.9166...
## 5  B19013_001   166250 29702 POLYGON ((-76.99496 38.8898...
## 6  B19013_001    59512 32728 POLYGON ((-76.94486 38.8790...
## 7  B19013_001    91317 14597 POLYGON ((-77.00173 38.9099...
## 8  B19013_001   106364 29915 POLYGON ((-76.99494 38.9239...
## 9  B19013_001   116613 28202 POLYGON ((-77.00201 38.9510...
## 10 B19013_001    82675 14933 POLYGON ((-77.03671 38.9271...

1.6 NC spatial data

Gathering spatial data for NC at the county level.

nc_income_sf <- get_acs(
  geography = "county",
  variables = "B19013_001",
  state = "NC",
  year = 2023,
  geometry = T
)

2 Mapping

We can use our spatial data frames to generate a plot using the plot() function.

2.1 DC

We’ll use the DC income data to plot. Recall that we neeed to use our sf df.

plot(dc_income_sf)

Note here that we get multiple maps, which is not likely what you want. Each of the maps contains information for the variables in our data set. We’ll need to specify that we want the estimates.

plot(dc_income_sf['estimate'])

2.2 NC

We’ll do the same for NC.

plot(nc_income_sf['estimate'])

At the state level, it is very clear what the purpose of our analysis would be. We’ll likely want to follow a specific research question from this point.

I will modify the code to look at a specific county in NC, Mecklenburg County.

Take note of the new rows added: - geography = tract - state = NC - county = “Mecklenburg”

nc_mecklenburg_income_sf <- get_acs(
  geography = "tract",
  state = "NC",
  county = "Mecklenburg",
  variables = "B19013_001",
  year = 2023,
  geometry = T
)

We then view the data.

nc_mecklenburg_income_sf
## Simple feature collection with 305 features and 5 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: -81.05803 ymin: 35.00162 xmax: -80.55035 ymax: 35.51479
## Geodetic CRS:  NAD83
## First 10 features:
##          GEOID                                                   NAME
## 1  37119005200    Census Tract 52; Mecklenburg County; North Carolina
## 2  37119004800    Census Tract 48; Mecklenburg County; North Carolina
## 3  37119001507 Census Tract 15.07; Mecklenburg County; North Carolina
## 4  37119004400    Census Tract 44; Mecklenburg County; North Carolina
## 5  37119003201 Census Tract 32.01; Mecklenburg County; North Carolina
## 6  37119001605 Census Tract 16.05; Mecklenburg County; North Carolina
## 7  37119003108 Census Tract 31.08; Mecklenburg County; North Carolina
## 8  37119005713 Census Tract 57.13; Mecklenburg County; North Carolina
## 9  37119005915 Census Tract 59.15; Mecklenburg County; North Carolina
## 10 37119003203 Census Tract 32.03; Mecklenburg County; North Carolina
##      variable estimate   moe                       geometry
## 1  B19013_001    39688 18884 MULTIPOLYGON (((-80.84089 3...
## 2  B19013_001    33725 10600 MULTIPOLYGON (((-80.85648 3...
## 3  B19013_001    53383  5797 MULTIPOLYGON (((-80.75249 3...
## 4  B19013_001    46768  8288 MULTIPOLYGON (((-80.89605 3...
## 5  B19013_001    79335 22713 MULTIPOLYGON (((-80.87936 3...
## 6  B19013_001    45307 10145 MULTIPOLYGON (((-80.78052 3...
## 7  B19013_001    48762 11738 MULTIPOLYGON (((-80.87627 3...
## 8  B19013_001   123438 15165 MULTIPOLYGON (((-80.68104 3...
## 9  B19013_001    79338  9571 MULTIPOLYGON (((-80.9407 35...
## 10 B19013_001    52359 17710 MULTIPOLYGON (((-80.86911 3...

We then plot the data.

plot(nc_mecklenburg_income_sf['estimate'])

This graph is at a more focused geographic level, similar to the DC map.

3 Improving your map

We can also modify more visually appealing maps.

Let’s add the vidris library to enhance the visualization.

library(viridis)

Then we’ll produce the map using ggplot2.

ggplot(nc_mecklenburg_income_sf) +
  geom_sf(aes(fill = estimate), color = "white", size = 0.1) +
  scale_fill_viridis(option = "magma", name = "Median Household Income") +
  labs(
    title = "Median Household Income in Mecklenburg County, NC",
    subtitle = "By Census Tract (2023)",
    caption = "Data: American Community Survey 5-Year Estimates"
  ) +
  theme_minimal() +
  theme(
    plot.title = element_text(face = "bold", size = 16),
    plot.subtitle = element_text(size = 12),
    axis.title = element_blank(),
    legend.position = "right"
  )

4 Interactive maps

We can also make interactive maps with various specifications.

We’ll first need to load the mapview library to make an interactive map.

library(mapview)

Then we’ll call both our data frame with the sf option and specify that we want the estimate.

mapview(
  dc_income_sf, 
  zcol = "estimate"
)

We’ll do the same for NC.

mapview(
  nc_income_sf, 
  zcol = "estimate"
)

Then finally for Mecklenburg County, NC.

mapview(
  nc_mecklenburg_income_sf, 
  zcol = "estimate"
)

Walker continues from here to explore various advanced options with mapping.

This file may be updated in the future; else, please see the Walker files.


  1. Walker made a note about using FIPS codes based on a change made in 2022. It will be important to review the FIPS codes as some counties have been changed (names) and may be included twice, so be aware of your goals when using the FIPS codes.↩︎