Mapping Race in R with the 2020 Census Redistricting File (PL)

Kara Joyner

Pre-Assignment Assessment: What is this showing?

Mark all that apply

  • a. The percent of each state’s population comprised of non-Hispanic whites

  • b. The distribution of the non-Hispanic white population across states

  • c. The percent of the non-Hispanic white population that resides in different states

  • d. The rate at which people in each state identify as non-Hispanic white

Learning Goals

  • You will see examples of how the racial and ethnic makeup of the US is depicted in maps.

  • You will learn how to sign up for a U.S. Census Data API Key.

  • You will learn how to read US Census data directly into R.

  • You will learn different ways to map the racial/ethnic makeup of a county.

  • You will learn to publish your document on Quarto Pub.

Maps of Race/Ethnic Composition

Geography in tidycensus

Today’s Data Visualizations

  • Most prevalent racial/ethnic group in each census tract of Bexar County

  • Percent of population in each tract of Bexar County identifying with a particular racial/ethnic group

  • Number of people from each racial/ethnic group across Bexar County tracts

  • Distribution of Hispanics across states

Mapping Resources

Listing Libraries

library(haven)
library(janitor)
library(plyr)    # load before dplyr so dplyr's mutate() and summarise() are not masked
library(dplyr)
library(terra)
library(tmap)
library(tidycensus)
library(utils)
library(mapview)
library(ggplot2)
library(posterior)
library(sf)
options(tigris_use_cache = TRUE)

Using the Census Data API Key

  • The API key gives you access to raw data from the US Census

  • Obtain a key here: Request a U.S. Census Data API Key

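  • Use the code below to set the key for the current session. To store it for future sessions, run the commented line once with install = TRUE; add overwrite = TRUE to that call to replace a key that is already installed. Replace YOUR_API_KEY with your own key, and avoid publishing the key in a shared document.

census_api_key(key = "YOUR_API_KEY", overwrite = TRUE)
# First time only: census_api_key(key = "YOUR_API_KEY", install = TRUE)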
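
tidycensus reads the key from the CENSUS_API_KEY environment variable, so a quick check before requesting data can save a confusing error later. The following is a minimal sketch using only base R; the object name key_is_set and the messages are illustrative, not part of tidycensus.

```r
# tidycensus looks for the key in the CENSUS_API_KEY environment variable.
key <- Sys.getenv("CENSUS_API_KEY")

# nzchar() is TRUE when the variable holds a non-empty string.
key_is_set <- nzchar(key)

if (key_is_set) {
  message("A Census API key is available in this session.")
} else {
  message("No key found; run census_api_key() first.")
}
```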

Scanning Variables in the 2020 Census Redistricting Data

  • This code creates an object (vars) that lists the name, label, and concept of every variable.
vars <- load_variables(year=2020, dataset = "pl")
head(vars)
# A tibble: 6 × 3
  name    label                                             concept         
  <chr>   <chr>                                             <chr>           
1 H1_001N " !!Total:"                                       OCCUPANCY STATUS
2 H1_002N " !!Total:!!Occupied"                             OCCUPANCY STATUS
3 H1_003N " !!Total:!!Vacant"                               OCCUPANCY STATUS
4 P1_001N " !!Total:"                                       RACE            
5 P1_002N " !!Total:!!Population of one race:"              RACE            
6 P1_003N " !!Total:!!Population of one race:!!White alone" RACE            
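
Because load_variables() returns an ordinary table, it can be searched like any other data frame. The sketch below rebuilds the six rows printed above by hand, so it runs without an API call, and keeps only the race variables. The names vars_demo and race_vars are hypothetical and used only for illustration.

```r
# Hand-built copy of the six rows shown above (not a live API result).
vars_demo <- data.frame(
  name = c("H1_001N", "H1_002N", "H1_003N", "P1_001N", "P1_002N", "P1_003N"),
  label = c(" !!Total:", " !!Total:!!Occupied", " !!Total:!!Vacant",
            " !!Total:", " !!Total:!!Population of one race:",
            " !!Total:!!Population of one race:!!White alone"),
  concept = c(rep("OCCUPANCY STATUS", 3), rep("RACE", 3))
)

# Keep variables whose name starts with "P1_" (the RACE table).
race_vars <- vars_demo[grepl("^P1_", vars_demo$name), ]

# Replace the "!!" separators with spaces to make the labels easier to read.
race_vars$label <- trimws(gsub("!!", " ", race_vars$label, fixed = TRUE))
race_vars
```

The same pattern applied to the real vars object with "^P2_" should narrow the list to the table used in the next step.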

Reading the 2020 Census Redistricting Data into R

  • This code is explained on page 129 of Kyle Walker’s book.
bexar_race <- get_decennial(
  geography = "tract",
  state = "TX",
  county = "Bexar",
  variables = c(
    Hispanic = "P2_002N",
    White = "P2_005N",
    Black = "P2_006N",
    Native = "P2_007N",
    Asian = "P2_008N"
  ),
  summary_var = "P2_001N",
  year = 2020,
  geometry = TRUE) %>%
  mutate(percent = 100 * (value / summary_value))
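
The summary_var argument attaches each tract's total population (P2_001N) to every row as summary_value, which is why the percent can be computed in a single mutate() step. The sketch below repeats that calculation for one tract, with counts typed in by hand from this document's printed output; tract_demo is a hypothetical name, and no API call or geometry is involved.

```r
library(dplyr)

# Counts for one tract, typed in by hand (no API call, no geometry).
tract_demo <- tibble(
  variable      = c("Hispanic", "White", "Black", "Native", "Asian"),
  value         = c(3862, 183, 150, 17, 17),
  summary_value = 4284   # tract total, repeated on every row
)

# Same calculation as in the pipeline above.
tract_demo <- tract_demo %>%
  mutate(percent = 100 * (value / summary_value))

tract_demo

# The five groups need not sum to 100%: people in categories that were
# not requested are counted in the total only.
sum(tract_demo$percent)
```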

Becoming Acquainted with the Data

  • What is this showing?
nrow(bexar_race)
[1] 1875
head(bexar_race)
Simple feature collection with 6 features and 6 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: -98.62401 ymin: 29.40333 xmax: -98.55362 ymax: 29.49398
Geodetic CRS:  NAD83
# A tibble: 6 × 7
  GEOID     NAME  variable value summary_value                  geometry percent
  <chr>     <chr> <chr>    <dbl>         <dbl>        <MULTIPOLYGON [°]>   <dbl>
1 48029171… Cens… Hispanic  3862          4284 (((-98.62398 29.41355, -…  90.1  
2 48029171… Cens… White      183          4284 (((-98.62398 29.41355, -…   4.27 
3 48029171… Cens… Black      150          4284 (((-98.62398 29.41355, -…   3.50 
4 48029171… Cens… Native      17          4284 (((-98.62398 29.41355, -…   0.397
5 48029171… Cens… Asian       17          4284 (((-98.62398 29.41355, -…   0.397
6 48029180… Cens… Hispanic  4306          5617 (((-98.58517 29.47894, -…  76.7  

Filtering the Data

  • Above we see that for each tract there are 5 rows that correspond to 5 different racial/ethnic groups.
  • For each tract, the code below keeps only the row with the largest count (value).
  • This enables us to produce a map that shows the most common race/ethnicity in each tract.
bexar_new <- bexar_race %>%
  group_by(GEOID) %>%
  slice(which.max(value))
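
To see what the group_by() and slice(which.max()) pattern does, it can help to run it on a table small enough to check by eye. This sketch uses two hypothetical tracts with made-up GEOIDs and counts; the ungroup() at the end is an addition for tidiness and is not in the code above.

```r
library(dplyr)

# Two hypothetical tracts in long format (GEOIDs and counts are made up).
toy <- tibble(
  GEOID    = c("A", "A", "A", "B", "B", "B"),
  variable = c("Hispanic", "White", "Black", "Hispanic", "White", "Black"),
  value    = c(510, 490, 100, 200, 900, 50)
)

# Same pattern as above: one row per tract, the group with the largest count.
toy_top <- toy %>%
  group_by(GEOID) %>%
  slice(which.max(value)) %>%
  ungroup()

toy_top
```

Note that which.max() returns the position of the first maximum, so if two groups were exactly tied in a tract, the one listed first would be kept.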

Viewing the Filtered Data

nrow(bexar_new) 
[1] 375
head(bexar_new)
Simple feature collection with 6 features and 6 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: -98.5152 ymin: 29.40677 xmax: -98.46093 ymax: 29.44854
Geodetic CRS:  NAD83
# A tibble: 6 × 7
# Groups:   GEOID [6]
  GEOID     NAME  variable value summary_value                  geometry percent
  <chr>     <chr> <chr>    <dbl>         <dbl>        <MULTIPOLYGON [°]>   <dbl>
1 48029110… Cens… Hispanic  1758          3812 (((-98.50131 29.4258, -9…    46.1
2 48029110… Cens… Hispanic  1589          2401 (((-98.48895 29.41608, -…    66.2
3 48029110… Cens… Hispanic  1982          2369 (((-98.51479 29.41527, -…    83.7
4 48029110… Cens… Hispanic  4763          7284 (((-98.51358 29.42691, -…    65.4
5 48029110… Cens… Hispanic   933          1210 (((-98.51508 29.44412, -…    77.1
6 48029111… Cens… Hispanic  1403          2356 (((-98.47861 29.43792, -…    59.6

Creating a Map Using the Filtered Data

  • What is a limitation of this map?
mapview(bexar_new, zcol = "variable")

Creating a Map for Each Racial/Ethnic Group

  • This code uses the full data set created earlier (bexar_race), not the filtered one.
  • The ggplot() call produces one map per racial/ethnic group, showing the percent of each tract's population in that group.
faceted_choro <- ggplot(bexar_race, aes(fill = percent)) + 
  geom_sf(color = NA) + 
  theme_void() + 
  scale_fill_viridis_c(option = "rocket") + 
  facet_wrap(~variable) + 
  labs(title = "Race / ethnicity by Census tract",
       subtitle = "Bexar County, Texas",
       fill = "Census value (%)",
       caption = "2020 Census Redistricting Data")

Displaying the Maps

  • What is a limitation of this way of showing racial/ethnic concentration?
faceted_choro

Preparing the Data for a Dot Map

  • This code creates dots that represent the number of people in each tract who identify with each racial/ethnic group.
  • You must specify the number of people that each dot represents.
bexar_dots <-bexar_race %>%
  as_dot_density(
    value = "value",
    values_per_dot = 100,
    group = "variable"
  )
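
To get a feel for values_per_dot, it helps to work out roughly how many dots a tract will receive. The sketch below assumes each group gets one dot for every whole multiple of values_per_dot in its count; the exact rounding rule inside as_dot_density() is not verified here, so treat the results as approximate. The counts are typed in by hand from the output shown earlier.

```r
# Counts for one tract, typed in by hand from the output shown earlier.
value <- c(Hispanic = 3862, White = 183, Black = 150, Native = 17, Asian = 17)
values_per_dot <- 100

# Assumed rule: whole dots only (integer division).
approx_dots <- value %/% values_per_dot
approx_dots
```

Under this assumption, a group with fewer than values_per_dot people in a tract gets no dot there. A smaller values_per_dot would show such groups, at the cost of many more dots to draw.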

Creating a Dot Map

  • This code produces a dot map.
dot_density_map <- ggplot() + 
  geom_sf(data = bexar_race, color = "lightgrey", fill = "white") + 
  geom_sf(data = bexar_dots, aes(color = variable), size = 0.01) + 
  scale_color_brewer(palette = "Set1") + 
  guides(color = guide_legend(override.aes = list(size = 3))) + 
  theme_void() + 
  labs(color = "Race / ethnicity",
       caption = "2020 Redistricting Data | 1 dot = approximately 100 people")

Displaying the Dot Map

  • Where are the dots placed?
dot_density_map

Re-reading the 2020 Census Redistricting Data into R

  • This code is similar but requests counts for the nation as a whole and for each state, then divides each state's count by the national count.
Hispanics_all <- get_decennial(   
  geography = "us",   
  variables = c("P2_002N"),   
  year = 2020) 

Hispanics_state <- get_decennial(   
  geography = "state",   
  variables = c(     
  Hispanic = "P2_002N"),   
  year = 2020)

Hispanics_state$proportion<-(Hispanics_state$value)/(Hispanics_all$value)
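
The choice of denominator determines what a map of proportion means. The sketch below contrasts two measures that are easy to confuse, using made-up numbers for three hypothetical states (these are not Census figures). Here the national total is simply the sum of the three states.

```r
# Hypothetical counts (not Census figures).
states <- data.frame(
  state_name = c("State A", "State B", "State C"),
  hispanic   = c(8000, 1500, 500),
  total_pop  = c(40000, 3000, 10000)
)

# Share of all Hispanics who live in each state
# (same kind of ratio as the proportion column above).
states$share_of_hispanics <- states$hispanic / sum(states$hispanic)

# Percent of each state's own population that is Hispanic (a different measure).
states$pct_of_state <- 100 * states$hispanic / states$total_pop

states
```

In this toy table, State A holds most of the Hispanic population, while State B has the highest percent Hispanic. The two measures can rank states very differently.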

Joining to Spatial Data

library(urbnmapr)  
library(urbnthemes)  
library(janitor)  
library(sf)    

Hispanics_state$state_name<-Hispanics_state$NAME

spatial_data <- left_join(Hispanics_state, get_urbn_map(map = "states", sf = TRUE), by = "state_name")
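
A left join keeps every row of the left table, so any state name without a match in the map data ends up with missing geometry. A quick way to check for that is anti_join(). The sketch below uses small hypothetical tables (the names, values, and the text stand-ins for geometry are all made up) rather than the real Census and urbnmapr objects.

```r
library(dplyr)

# Hypothetical stand-ins for the two tables being joined.
counts <- tibble(state_name = c("Texas", "Ohio", "Elsewhere"),
                 value      = c(100, 50, 25))
shapes <- tibble(state_name = c("Texas", "Ohio"),
                 geometry   = c("polygon 1", "polygon 2"))

joined <- left_join(counts, shapes, by = "state_name")

# Rows of the left table with no match in the right table.
unmatched <- anti_join(counts, shapes, by = "state_name")
unmatched$state_name
```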

Creating a Map

  • Here is the code for producing a map based on the Hispanic data.
Hispanics <- ggplot() + 
  geom_sf(data = spatial_data, 
          mapping = aes(fill = proportion, geometry = geometry), 
          color = "#ffffff", size = 0.25) + 
  scale_fill_gradientn(labels = scales::percent) + 
  coord_sf(datum = NA) + 
  labs(fill = "Percent",
       title = "Distribution of Hispanics across States",
       caption = "Source: 2020 Redistricting File")

Displaying the Map

  • What is this showing?
Hispanics

Publishing Your Document in Quarto Pub

  • Render your document to find any problems.

  • See the instructions here: Quarto Pub

  • Sign up here: Publish and Share

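  • In RStudio, set the working directory to the source file's location (Session > Set Working Directory > To Source File Location), then open a new terminal (Tools > Terminal > New Terminal).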

  • In the terminal, type: quarto publish quarto-pub name.qmd (replace name.qmd with the name of your file)

  • Choose yes when prompted.

Post-Assignment Assessment: What is this showing?

Mark all that apply

  • a. The percent of each county’s population comprised of non-Hispanic whites

  • b. The distribution of the non-Hispanic white population across counties

  • c. The percent of the non-Hispanic white population that resides in different counties

  • d. The rate at which people in each county identify as non-Hispanic white