Mapping Ancestry in R with 2020 Census Data (DDHCA)

Kara Joyner

Learning Goals

  • You will see examples of how ancestry is depicted in maps.

  • You will work with a recently released 2020 Census file (DDHCA).

  • You will learn different ways to map the concentration of an ancestry group in a county.

  • You will learn the challenges of measuring ancestral groups.

Maps of Ancestry

Today’s Data Visualizations

  • Number and percent of population in each tract identifying as Chinese

  • Number of people identifying as Chinese across Bexar County

  • Distribution of people identifying as Chinese across states

Resources

Listing Libraries

library(haven)
library(janitor)
library(plyr)   # load plyr before dplyr so dplyr's functions are not masked
library(dplyr)
library(terra)
library(tmap)
library(tidycensus)
library(utils)
library(mapview)
library(ggplot2)
library(posterior)
library(sf)
options(tigris_use_cache = TRUE)

Using the Census Data API Key

  • The API key gives you access to raw data from the US Census

  • Obtain a key here: Request a U.S. Census Data API Key

  • Use the code below to install (first time) or overwrite (subsequent times) the key.

census_api_key(key = "YOUR_CENSUS_API_KEY", install = TRUE)  # first time: saves the key to .Renviron
# census_api_key(key = "YOUR_CENSUS_API_KEY", overwrite = TRUE, install = TRUE)  # later: replaces the saved key

Scanning Variables in the 2020 Census DHC File

  • This code creates an object that includes all the variable names.
vars <- load_variables(year=2020, dataset = "dhc") 
head(vars)
# A tibble: 6 × 3
  name     label                                                         concept
  <chr>    <chr>                                                         <chr>  
1 H10_001N " !!Total:"                                                   TENURE…
2 H10_002N " !!Total:!!Owner occupied:"                                  TENURE…
3 H10_003N " !!Total:!!Owner occupied:!!Householder who is White alone"  TENURE…
4 H10_004N " !!Total:!!Owner occupied:!!Householder who is Black or Afr… TENURE…
5 H10_005N " !!Total:!!Owner occupied:!!Householder who is American Ind… TENURE…
6 H10_006N " !!Total:!!Owner occupied:!!Householder who is Asian alone"  TENURE…
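
Because the variable table is an ordinary data frame, it can be searched by label. The sketch below is only an illustration: vars_demo is a hand-built stand-in holding the four untruncated rows from the output above, and find_vars is a hypothetical helper, not part of tidycensus.

```r
# Stand-in for the first rows of `vars`; only labels printed in full above are copied.
vars_demo <- data.frame(
  name = c("H10_001N", "H10_002N", "H10_003N", "H10_006N"),
  label = c(" !!Total:",
            " !!Total:!!Owner occupied:",
            " !!Total:!!Owner occupied:!!Householder who is White alone",
            " !!Total:!!Owner occupied:!!Householder who is Asian alone"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep rows whose label matches a pattern, ignoring case.
find_vars <- function(tbl, pattern) {
  tbl[grepl(pattern, tbl$label, ignore.case = TRUE), , drop = FALSE]
}

asian_rows <- find_vars(vars_demo, "asian")
print(asian_rows$name)
```

The same helper could be applied to the full vars object once it has been loaded.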

Reading in Tract Population Counts from the 2020 Census (DHC)

bexar_all <- get_decennial(
  geography = "tract",
  state = "TX",
  county = "Bexar",
  variables ="P10_001N",
  year = 2020,
  sumfile = "dhc",
  geometry = TRUE)

Viewing the New Data

  • How can you find the codes for different ancestry groups?
bexar_all
Simple feature collection with 375 features and 4 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: -98.80655 ymin: 29.11444 xmax: -98.1169 ymax: 29.76071
Geodetic CRS:  NAD83
# A tibble: 375 × 5
   GEOID       NAME                     variable value                  geometry
   <chr>       <chr>                    <chr>    <dbl>        <MULTIPOLYGON [°]>
 1 48029171601 Census Tract 1716.01; B… P10_001N  2912 (((-98.62398 29.41355, -…
 2 48029180604 Census Tract 1806.04; B… P10_001N  4521 (((-98.58517 29.47894, -…
 3 48029151600 Census Tract 1516; Bexa… P10_001N  5467 (((-98.50388 29.34895, -…
 4 48029150502 Census Tract 1505.02; B… P10_001N  2766 (((-98.53415 29.37502, -…
 5 48029181712 Census Tract 1817.12; B… P10_001N  2897 (((-98.66604 29.48512, -…
 6 48029180102 Census Tract 1801.02; B… P10_001N  1549 (((-98.54874 29.46871, -…
 7 48029121508 Census Tract 1215.08; B… P10_001N  3772 (((-98.37162 29.5104, -9…
 8 48029121404 Census Tract 1214.04; B… P10_001N  3749 (((-98.4041 29.47818, -9…
 9 48029191505 Census Tract 1915.05; B… P10_001N  1761 (((-98.56706 29.55566, -…
10 48029190504 Census Tract 1905.04; B… P10_001N  1940 (((-98.516 29.46644, -98…
# ℹ 365 more rows
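
One possible way to look up a population group code is to search a table of codes and labels. The sketch below runs on a small made-up table: the column names pop_group and pop_group_label, the two filler codes, and all three labels are assumptions for illustration, and find_group is a hypothetical helper. The commented lines show how the same search might be run on the table that tidycensus returns from get_pop_groups(), which requires an API key.

```r
# Made-up stand-in for a code/label table; only the code "3822" comes from this lesson.
groups_demo <- data.frame(
  pop_group = c("0001", "3822", "9999"),
  pop_group_label = c("Example group A", "Chinese (label assumed)", "Example group B"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return rows whose label matches a pattern, ignoring case.
find_group <- function(groups, pattern) {
  groups[grepl(pattern, groups$pop_group_label, ignore.case = TRUE), , drop = FALSE]
}

chinese_code <- find_group(groups_demo, "chinese")$pop_group
print(chinese_code)

# With a Census API key installed, the real table could be searched the same way
# (column names assumed to match the stand-in above):
# groups <- tidycensus::get_pop_groups(year = 2020, sumfile = "ddhca")
# find_group(groups, "chinese")
```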

Reading in the Data for People Identifying as Chinese

  • Specify the variable name and group number.
bexar_chinese <- get_decennial(
  geography = "tract",
  variables = "T01001_001N",
  state = "TX",
  county = "Bexar",
  year = 2020,
  sumfile = "ddhca",
  pop_group = "3822",
  pop_group_label = TRUE,
  geometry = TRUE)

Creating a Map

  • Which tracts have no color?
mapview(bexar_chinese, zcol = "value")

Calculating Percentages

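  • The data set above contains only the number of people in each tract who identify as Chinese.
  • To obtain a percent, we need to join it with the earlier file of tract population totals.
# Keep tract IDs and population totals from the earlier file.
bexar_all$total <- bexar_all$value
bexar_all <- subset(bexar_all, select = c("GEOID", "total"))
# Keep tract IDs and Chinese counts; drop geometry so the join uses attributes only.
bexar_chinese_new <- subset(bexar_chinese, select = c("GEOID", "value"))
bexar_chinese_new <- st_drop_geometry(bexar_chinese_new)
combined <- left_join(bexar_all, bexar_chinese_new, by = "GEOID")
# Tracts absent from the DDHCA file are NA after the join; treat them as 0 here.
combined$value[is.na(combined$value)] <- 0
# Percent on a 0-100 scale; tracts with a zero total are set to 0.
combined$percent <- ifelse(combined$total > 0, 100 * combined$value / combined$total, 0)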
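
To see why missing values appear after the join, it may help to run the same pattern on a toy table. The sketch below uses made-up tract IDs and counts (totals and counts are hypothetical data frames) and assumes dplyr is installed; the percent is computed on a 0-100 scale.

```r
library(dplyr)

# Made-up totals for four tracts, one of them with no residents.
totals <- data.frame(GEOID = c("A", "B", "C", "D"),
                     total = c(2000, 4000, 0, 1000),
                     stringsAsFactors = FALSE)
# Made-up group counts; tracts B and C are absent from this table.
counts <- data.frame(GEOID = c("A", "D"),
                     value = c(50, 10),
                     stringsAsFactors = FALSE)

# A left join keeps every tract in `totals`; unmatched tracts get NA for `value`.
joined <- left_join(totals, counts, by = "GEOID")
n_missing <- sum(is.na(joined$value))

# Treat unmatched tracts as 0 and guard against dividing by a zero total.
joined$value[is.na(joined$value)] <- 0
joined$percent <- ifelse(joined$total > 0, 100 * joined$value / joined$total, 0)
print(joined)
```

Setting unmatched tracts to 0 is a simplifying choice: a tract can be absent from the group file because no count was published for it, not only because the count is zero.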

Creating a Map

  • What is this showing?
mapview(combined, zcol = "percent")

Preparing the Data for a Dot Map

  • This code creates dots that represent the number of people in each tract who identify as Chinese.
  • It specifies that 1 dot represents 100 people.
bexar_dots <- as_dot_density(
  bexar_chinese,
  value = "value",
  values_per_dot = 100)

Producing a Dot Map

mapview(bexar_dots, cex = 0.01, layer.name = "Chinese population 1 dot = 100 people",
col.regions = "navy", color = "navy")

Re-reading the Data into R

  • This code pulls counts for the nation as a whole and for each state, then computes each state's share of the national total.
Chinese_all <- get_decennial(
  geography = "us",
  variables = "T01001_001N",
  year = 2020,
  sumfile = "ddhca",
  pop_group = "3822",
  pop_group_label = TRUE)
Chinese_state <- get_decennial(
  geography = "state",
  variables = "T01001_001N",
  year = 2020,
  sumfile = "ddhca",
  pop_group = "3822",
  pop_group_label = TRUE)
Chinese_state$proportion<-(Chinese_state$value)/(Chinese_all$value)
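
The division above relies on the national table having exactly one row, so its single value is recycled across all states. A minimal sketch with made-up numbers (nation, states, and the state names are hypothetical) shows the calculation and a check on that assumption:

```r
# Made-up national and state counts.
nation <- data.frame(NAME = "United States", value = 1000)
states <- data.frame(NAME = c("State A", "State B", "State C"),
                     value = c(600, 300, 100),
                     stringsAsFactors = FALSE)

# The national table must have one row for the recycling to be meaningful.
stopifnot(nrow(nation) == 1)

# Each state's share of the national count.
states$proportion <- states$value / nation$value

# Formatted as percentages for display.
share_label <- sprintf("%.1f%%", 100 * states$proportion)
print(data.frame(states, share_label))
```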

Joining to Spatial Data

library(urbnmapr)
library(urbnthemes)
Chinese_state$state_name <- Chinese_state$NAME
# Put the sf object on the left so the joined result keeps its sf class and geometry.
spatial_data <- left_join(get_urbn_map(map = "states", sf = TRUE), Chinese_state, by = "state_name")

Creating a Map

  • Here is the code for mapping each state's share of the national Chinese population.
Chinese <- ggplot() +
  geom_sf(data = spatial_data,
          mapping = aes(fill = proportion, geometry = geometry),
          color = "#ffffff", size = 0.25) +
  scale_fill_gradientn(labels = scales::percent) +
  coord_sf(datum = NA) +
  labs(fill = "Percent", title = "Chinese", caption = "Source: 2020 Census DDHCA file")

Displaying the Map

  • What is this showing?
Chinese