1 Abstract

We examine various correlates of community resilience estimates (CREs) to understand and assess their comparative performance across different quantitative models. The CRE datasets were developed by the U.S. Census Bureau to assess the social vulnerability and resilience of neighborhoods across the United States in response to disasters, such as COVID-19, extreme weather events, and economic shocks. These estimates measure the capacity of individuals and households within a community to absorb, endure, and recover from external stresses. The CREs combine granular data from the American Community Survey (ACS) and the Population Estimates Program (PEP) to identify social and economic vulnerabilities at a detailed geographic level.

In this case analysis, we explore relationships in DC and NC.

knitr::opts_chunk$set(echo = TRUE)

library(tidyverse)
library(tidycensus)
library(sf)
library(tidyverse)
library(ggplot2)
library(weights)
library(dplyr)
library(stringr)

2 Data

Information on the community resilience esitmates datasets can be found online here. Update (Feb 6 2025), the CRE data was removed from the Census site, so we will not have access to 2023 data for now; modifying the code below to provide access to the 2022 data.

2.1 Community Resilience Estimates

Pulling in 2022 CRE data directly from GitHub.

# I am adding a link to the raw data on Github
link <- "https://raw.githubusercontent.com/quant-shop/census/refs/heads/main/data/cre.2022.tract.csv"
df <- read.csv(link)
str(df)

## 'data.frame':    84415 obs. of  19 variables:
##  $ GEO_ID     : chr  "1400000US01001020100" "1400000US01001020200" "1400000US01001020300" "1400000US01001020400" ...
##  $ STATE      : int  1 1 1 1 1 1 1 1 1 1 ...
##  $ COUNTY     : int  1 1 1 1 1 1 1 1 1 1 ...
##  $ TRACT      : int  20100 20200 20300 20400 20501 20502 20503 20600 20700 20801 ...
##  $ NAME       : chr  "Census Tract 201, Autauga County, Alabama" "Census Tract 202, Autauga County, Alabama" "Census Tract 203, Autauga County, Alabama" "Census Tract 204, Autauga County, Alabama" ...
##  $ WATER_TRACT: int  NA NA NA NA NA NA NA NA NA NA ...
##  $ POPUNI     : int  1797 1984 3278 4297 4393 3364 3680 3807 3470 3163 ...
##  $ PRED0_E    : int  545 751 1283 1836 2002 1268 1586 1787 1399 927 ...
##  $ PRED0_M    : num  218 259 425 501 504 ...
##  $ PRED0_PE   : num  30.3 37.9 39.1 42.7 45.6 ...
##  $ PRED0_PM   : num  12.1 13.1 13 11.7 11.5 ...
##  $ PRED12_E   : int  912 850 1327 1736 1622 1578 1444 1088 1162 1393 ...
##  $ PRED12_M   : num  230 266 424 499 500 ...
##  $ PRED12_PE  : num  50.8 42.8 40.5 40.4 36.9 ...
##  $ PRED12_PM  : num  12.8 13.4 12.9 11.6 11.4 ...
##  $ PRED3_E    : int  340 383 668 725 769 518 650 932 909 843 ...
##  $ PRED3_M    : num  199 222 338 395 389 ...
##  $ PRED3_PE   : num  18.9 19.3 20.4 16.9 17.5 ...
##  $ PRED3_PM   : num  11.1 11.2 10.31 9.2 8.86 ...

2.1.1 Filter by state

FIPS codes by state.

State Abbreviations and FIPS Codes
AL (01)	HI (15)	MI (26)	NC (37)	UT (49)
AK (02)	ID (16)	MN (27)	ND (38)	VT (50)
AZ (04)	IL (17)	MS (28)	OH (39)	VA (51)
AR (05)	IN (18)	MO (29)	OK (40)	WA (53)
CA (06)	IA (19)	MT (30)	OR (41)	WV (54)
CO (08)	KS (20)	NE (31)	PA (42)	WI (55)
CT (09)	KY (21)	NV (32)	RI (44)	WY (56)
DE (10)	LA (22)	NH (33)	SC (45)
DC (11)	ME (23)	NJ (34)	SD (46)
FL (12)	MD (24)	NM (35)	TN (47)
GA (13)	MA (25)	NY (36)	TX (48)

Adding the state codes for DC and NC. You should change this to match your state of analysis.

# CRE data for DC
cre.2022.dc <- df %>%
  filter(STATE == "11") # FIPS code for DC

# CRE data for NC
cre.2022.nc <- df %>%
  filter(STATE == "37") # FIPS code for NC

2.1.2 Filter by county or census tract

2.1.2.1 DC

Given the size of DC, we will analyze all of DC and the Anacostia neighborhood in Southeast (SE).

# CRE data for Anacostia, DC
cre.2022.anacostia <- df %>%
  filter(STATE == "11" & COUNTY == "001" & TRACT %in% c("7503", "7504", "7903"))

To find census tract information for any state, you can use the U.S. Census Bureau’s official website (census.gov). Specifically, their “Geography” section provides access to TIGER/Line Shapefiles and Reference Maps, which contain detailed census tract data for all states and can be downloaded or viewed online.

2.1.2.2 NC

In the case of NC, we will focus on Mecklenburg county.

# CRE data for Mecklenburg County, NC
cre.2022.mecklenburg <- df %>%
  filter(STATE == "37" & COUNTY == "119") # FIPS code for NC and Mecklenburg County

You will need to View the state-level df to examine county-level info. The county numbers in FIPS codes are derived from the alphabetical order of counties within each state. County FIPS codes are assigned in alphabetic order, counting by odd numbers. The first county alphabetically in a state is always 001, the second is 003, the third is 005, and so on. This method leaves space between each county for insertion of renamed or newly created counties while preserving the alphabetic ordering.

2.2 Census data

2.2.1 CRE correlates

I am now adding a draft list of potential census variables that we would like to assess in relation to the CRE.

2.2.1.1 DC

# draft list of CRE correlates
cre_correlates_dc <- get_acs(
  geography = "tract",
  state = "DC",
  year = 2023,
  survey = "acs5",
  variables = c(
    median_income = "B19013_001",     # Median household income in the past 12 months
    poverty_rate = "B17001_002",      # Number of people below poverty level
    unemployment_rate = "B23025_005", # Number of civilians (16 years and over) unemployed
    no_health_insurance = "B27010_033", # Number of people with no health insurance coverage
    disability_status = "B18101_001", # Total civilian non-institutionalized population for whom disability status is determined
    education_less_than_hs = "B15003_002", # Population 25 years and over with less than 9th grade education
    median_age = "B01002_001",        # Median age
    housing_cost_burden = "B25070_010", # Housing units spending 50% or more of income on rent
    no_vehicle = "B08201_002",        # Households with no vehicle available
    black_population = "B02001_003",  # Black or African American alone population
    median_rent = "B25058_001"        # Median contract rent
  ),
  summary_var = "B02001_001",         # Total population (for calculating proportions)
  output = "wide",
  geometry = FALSE # set to FALSE since CRE data doesn't have geometry
)

## Getting data from the 2019-2023 5-year ACS

# Calculate proportion of Black population
cre_correlates_dc <- cre_correlates_dc %>%
  mutate(prop_black = black_populationE / summary_est)

# Print first few rows
head(cre_correlates_dc)

## # A tibble: 6 × 27
##   GEOID       NAME     median_incomeE median_incomeM poverty_rateE poverty_rateM
##   <chr>       <chr>             <dbl>          <dbl>         <dbl>         <dbl>
## 1 11001000101 Census …         135708          43153            37            41
## 2 11001000102 Census …         159583          70430           108           109
## 3 11001000201 Census …             NA             NA            64             5
## 4 11001000202 Census …         152059          60482           465           180
## 5 11001000300 Census …         174470          21374           372           173
## 6 11001000400 Census …         188929          71412           242           115
## # ℹ 21 more variables: unemployment_rateE <dbl>, unemployment_rateM <dbl>,
## #   no_health_insuranceE <dbl>, no_health_insuranceM <dbl>,
## #   disability_statusE <dbl>, disability_statusM <dbl>,
## #   education_less_than_hsE <dbl>, education_less_than_hsM <dbl>,
## #   median_ageE <dbl>, median_ageM <dbl>, housing_cost_burdenE <dbl>,
## #   housing_cost_burdenM <dbl>, no_vehicleE <dbl>, no_vehicleM <dbl>,
## #   black_populationE <dbl>, black_populationM <dbl>, median_rentE <dbl>, …

2.2.1.2 Mecklenburg County, NC

# draft list of CRE correlates for Mecklenburg County, NC
cre_correlates_mecklenburg <- get_acs(
  geography = "tract",
  state = "NC",
  county = "Mecklenburg",
  year = 2023,
  survey = "acs5",
  variables = c(
    median_income = "B19013_001",     # Median household income in the past 12 months
    poverty_rate = "B17001_002",      # Number of people below poverty level
    unemployment_rate = "B23025_005", # Number of civilians (16 years and over) unemployed
    no_health_insurance = "B27010_033", # Number of people with no health insurance coverage
    disability_status = "B18101_001", # Total civilian non-institutionalized population for whom disability status is determined
    education_less_than_hs = "B15003_002", # Population 25 years and over with less than 9th grade education
    median_age = "B01002_001",        # Median age
    housing_cost_burden = "B25070_010", # Housing units spending 50% or more of income on rent
    no_vehicle = "B08201_002",        # Households with no vehicle available
    black_population = "B02001_003",  # Black or African American alone population
    median_rent = "B25058_001"        # Median contract rent
  ),
  summary_var = "B02001_001",         # Total population (for calculating proportions)
  output = "wide",
  geometry = FALSE # set to FALSE since CRE data doesn't have geometry
)

## Getting data from the 2019-2023 5-year ACS

# Calculate proportion of Black population
cre_correlates_mecklenburg <- cre_correlates_mecklenburg %>%
  mutate(prop_black = black_populationE / summary_est)

# Print first few rows
head(cre_correlates_mecklenburg)

## # A tibble: 6 × 27
##   GEOID       NAME     median_incomeE median_incomeM poverty_rateE poverty_rateM
##   <chr>       <chr>             <dbl>          <dbl>         <dbl>         <dbl>
## 1 37119000101 Census …         106601          16694           170           155
## 2 37119000102 Census …         129693          16100           195           153
## 3 37119000103 Census …         135533          25153             0            14
## 4 37119000104 Census …         109805           8019           158            94
## 5 37119000301 Census …          97411          16618            69            34
## 6 37119000302 Census …          78341          13225           110            70
## # ℹ 21 more variables: unemployment_rateE <dbl>, unemployment_rateM <dbl>,
## #   no_health_insuranceE <dbl>, no_health_insuranceM <dbl>,
## #   disability_statusE <dbl>, disability_statusM <dbl>,
## #   education_less_than_hsE <dbl>, education_less_than_hsM <dbl>,
## #   median_ageE <dbl>, median_ageM <dbl>, housing_cost_burdenE <dbl>,
## #   housing_cost_burdenM <dbl>, no_vehicleE <dbl>, no_vehicleM <dbl>,
## #   black_populationE <dbl>, black_populationM <dbl>, median_rentE <dbl>, …

2.3 Merge data

We then merge the CRE estimates with the CRE correlates, which will provide us with some talking points as we consider the technical documentation.

2.3.1 Modify data for DC

We first need to modify the GEO_ID in the CRE data by extracting some of the characters.

# Modify the GEO_ID in cre.2022.dc
cre.2022.dc_modified <- cre.2022.dc %>%
  mutate(GEOID_modified = str_extract(GEO_ID, "\\d+$")) %>%
  mutate(GEOID_modified = str_sub(GEOID_modified, -11)) %>% 
  relocate(GEOID_modified)

2.3.1.1 Join data for DC

We then join the modified data with the correlates.

# Join data
merged_dc <- cre_correlates_dc %>%
  left_join(cre.2022.dc_modified, by = c("GEOID" = "GEOID_modified"))

# Print the first few rows of the merged dataset
head(merged_dc)

## # A tibble: 6 × 46
##   GEOID       NAME.x   median_incomeE median_incomeM poverty_rateE poverty_rateM
##   <chr>       <chr>             <dbl>          <dbl>         <dbl>         <dbl>
## 1 11001000101 Census …         135708          43153            37            41
## 2 11001000102 Census …         159583          70430           108           109
## 3 11001000201 Census …             NA             NA            64             5
## 4 11001000202 Census …         152059          60482           465           180
## 5 11001000300 Census …         174470          21374           372           173
## 6 11001000400 Census …         188929          71412           242           115
## # ℹ 40 more variables: unemployment_rateE <dbl>, unemployment_rateM <dbl>,
## #   no_health_insuranceE <dbl>, no_health_insuranceM <dbl>,
## #   disability_statusE <dbl>, disability_statusM <dbl>,
## #   education_less_than_hsE <dbl>, education_less_than_hsM <dbl>,
## #   median_ageE <dbl>, median_ageM <dbl>, housing_cost_burdenE <dbl>,
## #   housing_cost_burdenM <dbl>, no_vehicleE <dbl>, no_vehicleM <dbl>,
## #   black_populationE <dbl>, black_populationM <dbl>, median_rentE <dbl>, …

2.3.2 Modify data for Mecklenburg County, NC

We first need to modify the GEO_ID in the CRE data by extracting some of the characters.

# Modify the GEO_ID in cre.2022.nc
cre.2022.nc.mecklenburg_modified <- cre.2022.mecklenburg %>%
  mutate(GEOID_modified = str_extract(GEO_ID, "\\d+$")) %>%
  mutate(GEOID_modified = str_sub(GEOID_modified, -11)) %>% 
  relocate(GEOID_modified)

2.3.2.1 Join data for Mecklenburg County, NC

We then join the modified data with the correlates.

# Join data
merged_nc_mecklenburg <- cre_correlates_mecklenburg %>%
  left_join(cre.2022.nc.mecklenburg_modified, by = c("GEOID" = "GEOID_modified"))

# Print the first few rows of the merged dataset
head(merged_nc_mecklenburg)

## # A tibble: 6 × 46
##   GEOID       NAME.x   median_incomeE median_incomeM poverty_rateE poverty_rateM
##   <chr>       <chr>             <dbl>          <dbl>         <dbl>         <dbl>
## 1 37119000101 Census …         106601          16694           170           155
## 2 37119000102 Census …         129693          16100           195           153
## 3 37119000103 Census …         135533          25153             0            14
## 4 37119000104 Census …         109805           8019           158            94
## 5 37119000301 Census …          97411          16618            69            34
## 6 37119000302 Census …          78341          13225           110            70
## # ℹ 40 more variables: unemployment_rateE <dbl>, unemployment_rateM <dbl>,
## #   no_health_insuranceE <dbl>, no_health_insuranceM <dbl>,
## #   disability_statusE <dbl>, disability_statusM <dbl>,
## #   education_less_than_hsE <dbl>, education_less_than_hsM <dbl>,
## #   median_ageE <dbl>, median_ageM <dbl>, housing_cost_burdenE <dbl>,
## #   housing_cost_burdenM <dbl>, no_vehicleE <dbl>, no_vehicleM <dbl>,
## #   black_populationE <dbl>, black_populationM <dbl>, median_rentE <dbl>, …

3 Visualization

We explore some initial visualizations to test initial hypotheses.

3.1 Median income

3.1.1 Median income in DC

ggplot(merged_dc, aes(x = median_incomeE, y = PRED3_PE)) +
  geom_point(alpha = 0.5) +
  geom_smooth(method = "lm", aes(color = "Linear"), se = FALSE) +
  geom_smooth(method = "loess", aes(color = "Non-linear"), se = FALSE) +
  labs(title = "DC: Median Income vs. Community Resilience Estimate",
       x = "Median Income",
       y = "% Population with 3+ Risk Factors",
       color = "Fit Type") +
  scale_color_manual(values = c("Linear" = "red", "Non-linear" = "blue")) +
  theme_minimal()

3.1.2 Median income in Mecklenburg County, NC

ggplot(merged_nc_mecklenburg, aes(x = median_incomeE, y = PRED3_PE)) +
  geom_point(alpha = 0.5) +
  geom_smooth(method = "lm", aes(color = "Linear"), se = FALSE) +
  geom_smooth(method = "loess", aes(color = "Non-linear"), se = FALSE) +
  labs(title = "Mecklenburg County, NC: Median Income vs. Community Resilience Estimate",
       x = "Median Income",
       y = "% Population with 3+ Risk Factors",
       color = "Fit Type") +
  scale_color_manual(values = c("Linear" = "red", "Non-linear" = "blue")) +
  theme_minimal()

3.2 Median rent

3.2.1 Median rent in DC

# For DC
ggplot(merged_dc, aes(x = median_rentE, y = PRED3_PE)) +
  geom_point(alpha = 0.5) +
  geom_smooth(method = "lm", aes(color = "Linear"), se = FALSE) +
  geom_smooth(method = "loess", aes(color = "Non-linear"), se = FALSE) +
  labs(title = "DC: Median Rent vs. Community Resilience Estimate",
       x = "Median Rent Estimate",
       y = "% Population with 3+ Risk Factors",
       color = "Fit Type") +
  scale_color_manual(values = c("Linear" = "red", "Non-linear" = "blue")) +
  theme_minimal()

3.2.2 Median rent in Meckelenburg County, NC

# For Mecklenburg County, NC
ggplot(merged_nc_mecklenburg, aes(x = median_rentE, y = PRED3_PE)) +
  geom_point(alpha = 0.5) +
  geom_smooth(method = "lm", aes(color = "Linear"), se = FALSE) +
  geom_smooth(method = "loess", aes(color = "Non-linear"), se = FALSE) +
  labs(title = "Mecklenburg County, NC: Median Rent vs. Community Resilience Estimate",
       x = "Median Rent Estimate",
       y = "% Population with 3+ Risk Factors",
       color = "Fit Type") +
  scale_color_manual(values = c("Linear" = "red", "Non-linear" = "blue")) +
  theme_minimal()

3.3 Black population

3.3.1 Black population in DC

# For DC
ggplot(merged_dc, aes(x = prop_black, y = PRED3_PE)) +
  geom_point(alpha = 0.5) +
  geom_smooth(method = "lm", aes(color = "Linear"), se = FALSE) +
  geom_smooth(method = "loess", aes(color = "Non-linear"), se = FALSE) +
  labs(title = "DC: Proportion of Black Population vs. Community Resilience Estimate",
       x = "Proportion of Black Population",
       y = "% Population with 3+ Risk Factors",
       color = "Fit Type") +
  scale_color_manual(values = c("Linear" = "red", "Non-linear" = "blue")) +
  theme_minimal()

3.3.2 Black population in Meckelenburg County, NC

# For Mecklenburg County, NC
ggplot(merged_nc_mecklenburg, aes(x = prop_black, y = PRED3_PE)) +
  geom_point(alpha = 0.5) +
  geom_smooth(method = "lm", aes(color = "Linear"), se = FALSE) +
  geom_smooth(method = "loess", aes(color = "Non-linear"), se = FALSE) +
  labs(title = "Mecklenburg County, NC: Proportion of Black Population vs. Community Resilience Estimate",
       x = "Proportion of Black Population",
       y = "% Population with 3+ Risk Factors",
       color = "Fit Type") +
  scale_color_manual(values = c("Linear" = "red", "Non-linear" = "blue")) +
  theme_minimal()

4 Analysis

Forthcoming.

Evaluating Social Vulnerability Through Census Community Resilience Estimates

Nathan Alexander, PhD

Center for Applied Data Science and Analytics

Howard University