tidycensus

The tidycensus package in R allows for quick and easy retrieval and cleaning of US Census data. In this lab, I will show some of the functionality of the package, specifically the “get_acs()” function.

First, I will load the necessary R packages:

library(tidycensus)
library(tidyverse)
library(scales)
library(plotly)
library(ggiraph)
library(htmlwidgets)
library(mapview)
library(sf)

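For my first example, I will examine the percentage of Michigan residents aged 25 and over who hold a graduate or professional degree, by county. The data come from the 2017-2021 five-year American Community Survey (ACS), which is the dataset “get_acs()” returns by default when the year is set to 2021.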

mi_grad_degree <- get_acs(
  geography = "county",
  state = "MI",
  variables = "DP02_0066P",
  year = 2021
)

To examine the counties with the highest and lowest rates of graduate degree attainment, I will use the arrange function.

# 10 counties with the highest rates
arrange(mi_grad_degree, desc(estimate))
## # A tibble: 83 × 5
##    GEOID NAME                            variable   estimate   moe
##    <chr> <chr>                           <chr>         <dbl> <dbl>
##  1 26161 Washtenaw County, Michigan      DP02_0066P     30.3   0.7
##  2 26125 Oakland County, Michigan        DP02_0066P     21.3   0.3
##  3 26089 Leelanau County, Michigan       DP02_0066P     19.6   1.6
##  4 26065 Ingham County, Michigan         DP02_0066P     18.4   0.6
##  5 26083 Keweenaw County, Michigan       DP02_0066P     16.1   3.3
##  6 26077 Kalamazoo County, Michigan      DP02_0066P     15.8   0.6
##  7 26047 Emmet County, Michigan          DP02_0066P     15.6   1.4
##  8 26055 Grand Traverse County, Michigan DP02_0066P     14.8   1  
##  9 26061 Houghton County, Michigan       DP02_0066P     14.8   1.4
## 10 26111 Midland County, Michigan        DP02_0066P     14.7   1.1
## # ℹ 73 more rows
# 10 counties with the lowest rates
arrange(mi_grad_degree, estimate)
## # A tibble: 83 × 5
##    GEOID NAME                       variable   estimate   moe
##    <chr> <chr>                      <chr>         <dbl> <dbl>
##  1 26085 Lake County, Michigan      DP02_0066P      3.7   0.7
##  2 26117 Montcalm County, Michigan  DP02_0066P      4.3   0.6
##  3 26067 Ionia County, Michigan     DP02_0066P      4.4   0.6
##  4 26113 Missaukee County, Michigan DP02_0066P      4.4   0.8
##  5 26131 Ontonagon County, Michigan DP02_0066P      4.4   0.7
##  6 26157 Tuscola County, Michigan   DP02_0066P      4.4   0.6
##  7 26023 Branch County, Michigan    DP02_0066P      4.5   0.6
##  8 26129 Ogemaw County, Michigan    DP02_0066P      4.5   0.8
##  9 26135 Oscoda County, Michigan    DP02_0066P      4.6   1.3
## 10 26011 Arenac County, Michigan    DP02_0066P      4.7   0.7
## # ℹ 73 more rows
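
Because arrange() sorts all 83 rows and relies on the tibble printer to cut the display off at ten, dplyr's slice_max() and slice_min() may be a more explicit way to keep only the highest or lowest rows. The sketch below uses a small hand-typed tibble (grad_sample, a name chosen here for illustration) with values copied from the output above, so it runs without a Census API call.

```r
library(dplyr)

# Stand-in for mi_grad_degree: a few rows copied from the printed output above
grad_sample <- tibble(
  NAME = c("Washtenaw County, Michigan", "Oakland County, Michigan",
           "Leelanau County, Michigan", "Lake County, Michigan",
           "Montcalm County, Michigan"),
  estimate = c(30.3, 21.3, 19.6, 3.7, 4.3),
  moe = c(0.7, 0.3, 1.6, 0.7, 0.6)
)

# Keep only the n rows with the largest / smallest estimates
top_two <- slice_max(grad_sample, estimate, n = 2)
bottom_two <- slice_min(grad_sample, estimate, n = 2)

top_two
bottom_two
```

With the full data, n = 10 would correspond to the two lists above. Both functions keep ties by default, so more than n rows can come back unless with_ties = FALSE is set.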

To better visualize this data, I will create a margin of error plot. Because Michigan has 83 counties, this kind of plot can become cluttered and hard to read.

mi_grad_plot <- ggplot(mi_grad_degree, aes(x = estimate,
                               y = reorder(NAME, estimate))) +
  geom_point(color = "darkred", size = 1) +
  scale_y_discrete(labels = function(x) str_remove(x, " County, Michigan")) +
  scale_x_continuous(labels = label_percent(scale = 1, suffix = "%")) +
  geom_errorbar(aes(xmin = estimate - moe, xmax = estimate + moe),
                width = 0.5, linewidth = 0.5) +
  labs(title = "Percentage of residents with a graduate degree, 2017-2021 ACS",
       subtitle = "Counties in Michigan",
       caption = "Data acquired with R and tidycensus. Error bars represent margin of error around estimate",
       x = "ACS estimate",
       y = "") +
  theme_minimal(base_size = 5)

mi_grad_plot
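
The error bars matter when comparing counties. ACS margins of error are published at the 90 percent confidence level, so two estimates whose bars overlap heavily may not be statistically distinguishable. The helper below, differs_significantly(), is a hypothetical name of my own and not a tidycensus function. It is a minimal sketch of the usual test: each margin of error is converted to a standard error by dividing by 1.645, and the difference between the estimates is compared with the combined standard error. The inputs are taken from the tables above.

```r
# Hypothetical helper (not part of tidycensus): tests whether two ACS
# estimates differ at the 90 percent confidence level.
differs_significantly <- function(est1, moe1, est2, moe2, z_crit = 1.645) {
  se1 <- moe1 / 1.645  # ACS margins of error are published at the 90 percent level
  se2 <- moe2 / 1.645
  z <- abs(est1 - est2) / sqrt(se1^2 + se2^2)
  z > z_crit
}

# Washtenaw (30.3 +/- 0.7) vs. Oakland (21.3 +/- 0.3)
washtenaw_vs_oakland <- differs_significantly(30.3, 0.7, 21.3, 0.3)

# Keweenaw (16.1 +/- 3.3) vs. Kalamazoo (15.8 +/- 0.6)
keweenaw_vs_kalamazoo <- differs_significantly(16.1, 3.3, 15.8, 0.6)

washtenaw_vs_oakland   # TRUE
keweenaw_vs_kalamazoo  # FALSE
```

Under these assumptions, Washtenaw's lead over Oakland is clear, while the ordering of Keweenaw and Kalamazoo could easily be an artifact of sampling error.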

Using the ggiraph package, we can turn this static chart into an interactive plot.

mi_plot_ggiraph <- ggplot(mi_grad_degree, aes(x = estimate,
                                       y = reorder(NAME, estimate),
                                       tooltip = estimate,
                                       data_id = GEOID)) +
  geom_errorbar(aes(xmin = estimate - moe, xmax = estimate + moe),
                width = 0.25, linewidth = 0.25) +
  geom_point_interactive(color = "darkred", size = 0.75) +
  scale_y_discrete(labels = function(x) str_remove(x, " County, Michigan")) +
  scale_x_continuous(labels = label_percent(scale = 1, suffix = "%")) +
  labs(title = "Percentage of residents with a graduate degree, 2017-2021 ACS",
       subtitle = "Counties in Michigan",
       caption = "Data acquired with R and tidycensus. Error bars represent margin of error around estimate",
       x = "ACS estimate",
       y = "") +
  theme_minimal(base_size = 5)

girafe(ggobj = mi_plot_ggiraph) %>%
  girafe_options(opts_hover(css = "fill:cyan;"))
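
An interactive chart like this is an htmlwidget, so the htmlwidgets package loaded earlier can write it to an HTML file for sharing. The sketch below is one possible way to do that, shown on a three-row stand-in with values copied from the table above. The names grad_sample, widget, and out_file are illustrative, and the tooltip here combines the county name with the estimate rather than showing the estimate alone.

```r
library(ggplot2)
library(ggiraph)
library(htmlwidgets)

# Stand-in for mi_grad_degree: three rows copied from the output above
grad_sample <- data.frame(
  GEOID = c("26161", "26125", "26089"),
  NAME = c("Washtenaw", "Oakland", "Leelanau"),
  estimate = c(30.3, 21.3, 19.6)
)

p <- ggplot(grad_sample, aes(x = estimate,
                             y = reorder(NAME, estimate),
                             tooltip = paste0(NAME, ": ", estimate, "%"),
                             data_id = GEOID)) +
  geom_point_interactive(color = "darkred")

widget <- girafe(ggobj = p)

# selfcontained = FALSE avoids the pandoc dependency; the JavaScript
# dependencies are written to a folder next to the HTML file.
out_file <- file.path(tempdir(), "mi_grad_plot.html")
saveWidget(widget, out_file, selfcontained = FALSE)
```

Setting selfcontained = TRUE instead bundles everything into a single file, which is easier to email but requires pandoc to be available.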

tidycensus also allows users to perform spatial analysis. To find a variable of interest, I will use the “load_variables()” function to see what data is available.

vars <- load_variables(2021, "acs5")
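
The table returned by load_variables() has name, label, and concept columns and contains many thousands of rows, so filtering it by keyword is usually faster than scrolling. The sketch below shows one way to do that with dplyr and stringr. To stay runnable without an API call it uses vars_sample, a hand-typed stand-in whose label and concept text is approximate rather than copied from the Census Bureau.

```r
library(dplyr)
library(stringr)

# Hand-typed stand-in for load_variables(2021, "acs5");
# label and concept wording is approximate.
vars_sample <- tibble(
  name = c("B07001_001", "B07001_017", "B19013_001"),
  label = c("Estimate!!Total:",
            "Estimate!!Total:!!Same house 1 year ago:",
            "Estimate!!Median household income in the past 12 months"),
  concept = c("GEOGRAPHICAL MOBILITY IN THE PAST YEAR",
              "GEOGRAPHICAL MOBILITY IN THE PAST YEAR",
              "MEDIAN HOUSEHOLD INCOME IN THE PAST 12 MONTHS")
)

# Narrow by topic first, then by the wording of the label
mobility_vars <- vars_sample %>%
  filter(str_detect(concept, regex("mobility", ignore_case = TRUE)),
         str_detect(label, fixed("Same house")))

mobility_vars$name
```

The same filter() call can be applied to the real vars object; in RStudio, View(vars) offers an interactive search box as an alternative.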

For this example, I will use “B07001_017”, which estimates the number of people who were living in the same house one year earlier. This can be a good indicator of economic and geographic mobility.

mi_mobility <- get_acs(
  geography = "county",
  state = "MI",
  variables = c(num_in_home_1_year_ago = "B07001_017"),
  year = 2021,
  geometry = TRUE # return geometry to enable spatial analysis
)

The mapview package offers a quick way of interactively viewing this data on a map.

mapview(mi_mobility, zcol = "estimate")
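
Because “B07001_017” is a raw count, a map of it largely mirrors county population. One option is to request the table total, “B07001_001”, as a denominator through the summary_var argument of get_acs(), which adds summary_est and summary_moe columns, and then map a percentage instead. The sketch below assumes that approach. add_share() and pct_same_home are names invented here, and the API call only runs when a Census API key is available in the CENSUS_API_KEY environment variable.

```r
library(dplyr)

# Convert a count and its denominator (the summary_est column that
# get_acs() adds when summary_var is supplied) into a percentage.
add_share <- function(df) {
  mutate(df, pct_same_home = 100 * estimate / summary_est)
}

# Only call the Census API when a key is configured in this session
if (nzchar(Sys.getenv("CENSUS_API_KEY")) &&
    requireNamespace("tidycensus", quietly = TRUE)) {
  mi_mobility_share <- tidycensus::get_acs(
    geography = "county",
    state = "MI",
    variables = c(num_in_home_1_year_ago = "B07001_017"),
    summary_var = "B07001_001",  # table total, used as the denominator
    year = 2021
  ) %>%
    add_share()
}
```

This sketch does not compute a margin of error for the derived percentage; tidycensus provides moe_prop() for that purpose.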

Additionally, I can use skills learned from last week’s lab to create a graduated symbol map in ggplot.

# create county centroids
mi_centroids <- st_centroid(mi_mobility)
## Warning: st_centroid assumes attributes are constant over geometries
st_geometry_type(mi_centroids, by_geometry = FALSE)
## [1] POINT
## 18 Levels: GEOMETRY POINT LINESTRING POLYGON MULTIPOINT ... TRIANGLE
ggplot() +
  geom_sf(data = mi_centroids, aes(size = estimate)) +
  geom_sf(data = mi_mobility, fill = NA) +
  scale_size_continuous(name = "number of people in current home for at least one year", range = c(1,10)) +
  theme_minimal() +
  labs(title = "Mobility of Michiganders, 2017-2021 ACS",
       caption = "Data acquired with R and tidycensus.")

Hopefully these simple examples show the power of tidycensus!