1 Abstract

We examine various correlates of community resilience estimates (CREs) to understand and assess their comparative performance across different quantitative models. The CRE data sets were developed by the U.S. Census Bureau to assess the social vulnerability and resilience of neighborhoods across the United States in response to disasters, such as COVID-19, extreme weather events, and economic shocks. These estimates measure the capacity of individuals and households within a community to absorb, endure, and recover from external stresses. The CREs combine granular data from the American Community Survey (ACS) and the Population Estimates Program (PEP) to identify social and economic vulnerabilities at a detailed geographic level. We argue, through the concept of the proxy war that estimates such as the CREs provide useful tools to help add substantive insights over the use of single indicators that are largely used to split individuals into a small set of discrete groupings. We model, however, the importance of maintaining the critical language and analyses that serve the historical linkages of ideas to their maturation in the research literature, such as would be found in qualitative studies. We assess a set of models of social vulnerability and examine their relationships to a series of single indicators.

In this draft case analysis, we explore relationships in DC and NC.

knitr::opts_chunk$set(echo = TRUE)

library(tidyverse)
library(tidycensus)
library(sf)
library(tidyverse)
library(ggplot2)
library(weights)
library(dplyr)
library(stringr)

2 Data

Information on the community resilience esitmates datasets can be found online here. Update (Feb 6 2025), the CRE data was removed from the Census site, so we will not have access to 2023 data for now; modifying the code below to provide access to the 2022 data.

2.1 Community Resilience Estimates

Pulling in 2022 CRE data directly from GitHub.

# I am adding a link to the raw data on Github
link <- "https://raw.githubusercontent.com/quant-shop/census/refs/heads/main/data/cre.2022.tract.csv"
df <- read.csv(link)
str(df)

2.1.1 Filter by state

FIPS codes by state.

## Warning: 'xfun::attr()' is deprecated.
## Use 'xfun::attr2()' instead.
## See help("Deprecated")

State Abbreviations and FIPS Codes
AL (01)	HI (15)	MI (26)	NC (37)	UT (49)
AK (02)	ID (16)	MN (27)	ND (38)	VT (50)
AZ (04)	IL (17)	MS (28)	OH (39)	VA (51)
AR (05)	IN (18)	MO (29)	OK (40)	WA (53)
CA (06)	IA (19)	MT (30)	OR (41)	WV (54)
CO (08)	KS (20)	NE (31)	PA (42)	WI (55)
CT (09)	KY (21)	NV (32)	RI (44)	WY (56)
DE (10)	LA (22)	NH (33)	SC (45)
DC (11)	ME (23)	NJ (34)	SD (46)
FL (12)	MD (24)	NM (35)	TN (47)
GA (13)	MA (25)	NY (36)	TX (48)

Adding the state codes for DC and NC. You should change this to match your state of analysis.

# CRE data for DC
cre.2022.dc <- df %>%
  filter(STATE == "11") # FIPS code for DC

# CRE data for NC
cre.2022.nc <- df %>%
  filter(STATE == "37") # FIPS code for NC

2.1.2 Filter by county or census tract

2.1.2.1 DC

Given the size of DC, we will analyze all of DC and the Anacostia neighborhood in Southeast (SE).

# CRE data for Anacostia, DC
cre.2022.anacostia <- df %>%
  filter(STATE == "11" & COUNTY == "001" & TRACT %in% c("7503", "7504", "7903"))

To find census tract information for any state, you can use the U.S. Census Bureau’s official website (census.gov). Specifically, their “Geography” section provides access to TIGER/Line Shapefiles and Reference Maps, which contain detailed census tract data for all states and can be downloaded or viewed online.

2.1.2.2 NC

In the case of NC, we will focus on Mecklenburg County.

# CRE data for Mecklenburg County, NC
cre.2022.mecklenburg <- df %>%
  filter(STATE == "37" & COUNTY == "119") # FIPS code for NC and Mecklenburg County

You will need to View the state-level df to examine county-level info. The county numbers in FIPS codes are derived from the alphabetical order of counties within each state. County FIPS codes are assigned in alphabetic order, counting by odd numbers. The first county alphabetically in a state is always 001, the second is 003, the third is 005, and so on. This method leaves space between each county for insertion of renamed or newly created counties while preserving the alphabetic ordering.

2.2 Census data

2.2.1 CRE correlates

I am now adding a draft list of potential census variables that we would like to assess in relation to the CRE.

2.2.1.1 DC

# draft list of CRE correlates
cre_correlates_dc <- get_acs(
  geography = "tract",
  state = "DC",
  year = 2023,
  survey = "acs5",
  variables = c(
    median_income = "B19013_001",     # Median household income in the past 12 months
    poverty_rate = "B17001_002",      # Number of people below poverty level
    unemployment_rate = "B23025_005", # Number of civilians (16 years and over) unemployed
    no_health_insurance = "B27010_033", # Number of people with no health insurance coverage
    disability_status = "B18101_001", # Total civilian non-institutionalized population for whom disability status is determined
    education_less_than_hs = "B15003_002", # Population 25 years and over with less than 9th grade education
    median_age = "B01002_001",        # Median age
    housing_cost_burden = "B25070_010", # Housing units spending 50% or more of income on rent
    no_vehicle = "B08201_002",        # Households with no vehicle available
    black_population = "B02001_003",  # Black or African American alone population
    median_rent = "B25058_001"        # Median contract rent
  ),
  summary_var = "B02001_001",         # Total population (for calculating proportions)
  output = "wide",
  geometry = FALSE # set to FALSE since CRE data doesn't have geometry
)

## Getting data from the 2019-2023 5-year ACS

## Warning: • You have not set a Census API key. Users without a key are limited to 500
## queries per day and may experience performance limitations.
## ℹ For best results, get a Census API key at
## http://api.census.gov/data/key_signup.html and then supply the key to the
## `census_api_key()` function to use it throughout your tidycensus session.
## This warning is displayed once per session.

# Calculate proportion of Black population
cre_correlates_dc <- cre_correlates_dc %>%
  mutate(prop_black = black_populationE / summary_est)

# Print first few rows
head(cre_correlates_dc)

## # A tibble: 6 × 27
##   GEOID       NAME     median_incomeE median_incomeM poverty_rateE poverty_rateM
##   <chr>       <chr>             <dbl>          <dbl>         <dbl>         <dbl>
## 1 11001000101 Census …         135708          43153            37            41
## 2 11001000102 Census …         159583          70430           108           109
## 3 11001000201 Census …             NA             NA            64             5
## 4 11001000202 Census …         152059          60482           465           180
## 5 11001000300 Census …         174470          21374           372           173
## 6 11001000400 Census …         188929          71412           242           115
## # ℹ 21 more variables: unemployment_rateE <dbl>, unemployment_rateM <dbl>,
## #   no_health_insuranceE <dbl>, no_health_insuranceM <dbl>,
## #   disability_statusE <dbl>, disability_statusM <dbl>,
## #   education_less_than_hsE <dbl>, education_less_than_hsM <dbl>,
## #   median_ageE <dbl>, median_ageM <dbl>, housing_cost_burdenE <dbl>,
## #   housing_cost_burdenM <dbl>, no_vehicleE <dbl>, no_vehicleM <dbl>,
## #   black_populationE <dbl>, black_populationM <dbl>, median_rentE <dbl>, …

2.2.1.2 Mecklenburg County, NC

# draft list of CRE correlates for Mecklenburg County, NC
cre_correlates_mecklenburg <- get_acs(
  geography = "tract",
  state = "NC",
  county = "Mecklenburg",
  year = 2023,
  survey = "acs5",
  variables = c(
    median_income = "B19013_001",     # Median household income in the past 12 months
    poverty_rate = "B17001_002",      # Number of people below poverty level
    unemployment_rate = "B23025_005", # Number of civilians (16 years and over) unemployed
    no_health_insurance = "B27010_033", # Number of people with no health insurance coverage
    disability_status = "B18101_001", # Total civilian non-institutionalized population for whom disability status is determined
    education_less_than_hs = "B15003_002", # Population 25 years and over with less than 9th grade education
    median_age = "B01002_001",        # Median age
    housing_cost_burden = "B25070_010", # Housing units spending 50% or more of income on rent
    no_vehicle = "B08201_002",        # Households with no vehicle available
    black_population = "B02001_003",  # Black or African American alone population
    median_rent = "B25058_001"        # Median contract rent
  ),
  summary_var = "B02001_001",         # Total population (for calculating proportions)
  output = "wide",
  geometry = FALSE # set to FALSE since CRE data doesn't have geometry
)

## Getting data from the 2019-2023 5-year ACS

# Calculate proportion of Black population
cre_correlates_mecklenburg <- cre_correlates_mecklenburg %>%
  mutate(prop_black = black_populationE / summary_est)

# Print first few rows
head(cre_correlates_mecklenburg)

## # A tibble: 6 × 27
##   GEOID       NAME     median_incomeE median_incomeM poverty_rateE poverty_rateM
##   <chr>       <chr>             <dbl>          <dbl>         <dbl>         <dbl>
## 1 37119000101 Census …         106601          16694           170           155
## 2 37119000102 Census …         129693          16100           195           153
## 3 37119000103 Census …         135533          25153             0            14
## 4 37119000104 Census …         109805           8019           158            94
## 5 37119000301 Census …          97411          16618            69            34
## 6 37119000302 Census …          78341          13225           110            70
## # ℹ 21 more variables: unemployment_rateE <dbl>, unemployment_rateM <dbl>,
## #   no_health_insuranceE <dbl>, no_health_insuranceM <dbl>,
## #   disability_statusE <dbl>, disability_statusM <dbl>,
## #   education_less_than_hsE <dbl>, education_less_than_hsM <dbl>,
## #   median_ageE <dbl>, median_ageM <dbl>, housing_cost_burdenE <dbl>,
## #   housing_cost_burdenM <dbl>, no_vehicleE <dbl>, no_vehicleM <dbl>,
## #   black_populationE <dbl>, black_populationM <dbl>, median_rentE <dbl>, …

2.3 Merge data

We then merge the CRE estimates with the CRE correlates, which will provide us with some talking points as we consider the technical documentation.

2.3.1 Modify data for DC

We first need to modify the GEO_ID in the CRE data by extracting some of the characters.

# Modify the GEO_ID in cre.2022.dc
cre.2022.dc_modified <- cre.2022.dc %>%
  mutate(GEOID_modified = str_extract(GEO_ID, "\\d+$")) %>%
  mutate(GEOID_modified = str_sub(GEOID_modified, -11)) %>% 
  relocate(GEOID_modified)

2.3.1.1 Join data for DC

We then join the modified data with the correlates.

# Join data
merged_dc <- cre_correlates_dc %>%
  left_join(cre.2022.dc_modified, by = c("GEOID" = "GEOID_modified"))

# Print the first few rows of the merged dataset
head(merged_dc)

## # A tibble: 6 × 46
##   GEOID       NAME.x   median_incomeE median_incomeM poverty_rateE poverty_rateM
##   <chr>       <chr>             <dbl>          <dbl>         <dbl>         <dbl>
## 1 11001000101 Census …         135708          43153            37            41
## 2 11001000102 Census …         159583          70430           108           109
## 3 11001000201 Census …             NA             NA            64             5
## 4 11001000202 Census …         152059          60482           465           180
## 5 11001000300 Census …         174470          21374           372           173
## 6 11001000400 Census …         188929          71412           242           115
## # ℹ 40 more variables: unemployment_rateE <dbl>, unemployment_rateM <dbl>,
## #   no_health_insuranceE <dbl>, no_health_insuranceM <dbl>,
## #   disability_statusE <dbl>, disability_statusM <dbl>,
## #   education_less_than_hsE <dbl>, education_less_than_hsM <dbl>,
## #   median_ageE <dbl>, median_ageM <dbl>, housing_cost_burdenE <dbl>,
## #   housing_cost_burdenM <dbl>, no_vehicleE <dbl>, no_vehicleM <dbl>,
## #   black_populationE <dbl>, black_populationM <dbl>, median_rentE <dbl>, …

2.3.2 Modify data for Mecklenburg County, NC

We first need to modify the GEO_ID in the CRE data by extracting some of the characters.

# Modify the GEO_ID in cre.2022.nc
cre.2022.nc.mecklenburg_modified <- cre.2022.mecklenburg %>%
  mutate(GEOID_modified = str_extract(GEO_ID, "\\d+$")) %>%
  mutate(GEOID_modified = str_sub(GEOID_modified, -11)) %>% 
  relocate(GEOID_modified)

2.3.2.1 Join data for Mecklenburg County, NC

We then join the modified data with the correlates.

# Join data
merged_nc_mecklenburg <- cre_correlates_mecklenburg %>%
  left_join(cre.2022.nc.mecklenburg_modified, by = c("GEOID" = "GEOID_modified"))

# Print the first few rows of the merged dataset
head(merged_nc_mecklenburg)

## # A tibble: 6 × 46
##   GEOID       NAME.x   median_incomeE median_incomeM poverty_rateE poverty_rateM
##   <chr>       <chr>             <dbl>          <dbl>         <dbl>         <dbl>
## 1 37119000101 Census …         106601          16694           170           155
## 2 37119000102 Census …         129693          16100           195           153
## 3 37119000103 Census …         135533          25153             0            14
## 4 37119000104 Census …         109805           8019           158            94
## 5 37119000301 Census …          97411          16618            69            34
## 6 37119000302 Census …          78341          13225           110            70
## # ℹ 40 more variables: unemployment_rateE <dbl>, unemployment_rateM <dbl>,
## #   no_health_insuranceE <dbl>, no_health_insuranceM <dbl>,
## #   disability_statusE <dbl>, disability_statusM <dbl>,
## #   education_less_than_hsE <dbl>, education_less_than_hsM <dbl>,
## #   median_ageE <dbl>, median_ageM <dbl>, housing_cost_burdenE <dbl>,
## #   housing_cost_burdenM <dbl>, no_vehicleE <dbl>, no_vehicleM <dbl>,
## #   black_populationE <dbl>, black_populationM <dbl>, median_rentE <dbl>, …

3 Visualization

We explore some initial visualizations to test initial hypotheses.

3.1 Median income

3.1.1 Median income in DC

ggplot(merged_dc, aes(x = median_incomeE, y = PRED3_PE)) +
  geom_point(alpha = 0.5) +
  geom_smooth(method = "lm", aes(color = "Linear"), se = FALSE) +
  geom_smooth(method = "loess", aes(color = "Non-linear"), se = FALSE) +
  labs(title = "DC: Median Income vs. Community Resilience Estimate",
       x = "Median Income",
       y = "% Population with 3+ Risk Factors",
       color = "Fit Type") +
  scale_color_manual(values = c("Linear" = "red", "Non-linear" = "blue")) +
  theme_minimal()

3.1.2 Median income in Mecklenburg County, NC

ggplot(merged_nc_mecklenburg, aes(x = median_incomeE, y = PRED3_PE)) +
  geom_point(alpha = 0.5) +
  geom_smooth(method = "lm", aes(color = "Linear"), se = FALSE) +
  geom_smooth(method = "loess", aes(color = "Non-linear"), se = FALSE) +
  labs(title = "Mecklenburg County, NC: Median Income vs. Community Resilience Estimate",
       x = "Median Income",
       y = "% Population with 3+ Risk Factors",
       color = "Fit Type") +
  scale_color_manual(values = c("Linear" = "red", "Non-linear" = "blue")) +
  theme_minimal()

3.2 Median rent

3.2.1 Median rent in DC

# For DC
ggplot(merged_dc, aes(x = median_rentE, y = PRED3_PE)) +
  geom_point(alpha = 0.5) +
  geom_smooth(method = "lm", aes(color = "Linear"), se = FALSE) +
  geom_smooth(method = "loess", aes(color = "Non-linear"), se = FALSE) +
  labs(title = "DC: Median Rent vs. Community Resilience Estimate",
       x = "Median Rent Estimate",
       y = "% Population with 3+ Risk Factors",
       color = "Fit Type") +
  scale_color_manual(values = c("Linear" = "red", "Non-linear" = "blue")) +
  theme_minimal()

3.2.2 Median rent in Meckelenburg County, NC

# For Mecklenburg County, NC
ggplot(merged_nc_mecklenburg, aes(x = median_rentE, y = PRED3_PE)) +
  geom_point(alpha = 0.5) +
  geom_smooth(method = "lm", aes(color = "Linear"), se = FALSE) +
  geom_smooth(method = "loess", aes(color = "Non-linear"), se = FALSE) +
  labs(title = "Mecklenburg County, NC: Median Rent vs. Community Resilience Estimate",
       x = "Median Rent Estimate",
       y = "% Population with 3+ Risk Factors",
       color = "Fit Type") +
  scale_color_manual(values = c("Linear" = "red", "Non-linear" = "blue")) +
  theme_minimal()

3.3 Black population

3.3.1 Black population in DC

# For DC
ggplot(merged_dc, aes(x = prop_black, y = PRED3_PE)) +
  geom_point(alpha = 0.5) +
  geom_smooth(method = "lm", aes(color = "Linear"), se = FALSE) +
  geom_smooth(method = "loess", aes(color = "Non-linear"), se = FALSE) +
  labs(title = "DC: Proportion of Black Population vs. Community Resilience Estimate",
       x = "Proportion of Black Population",
       y = "% Population with 3+ Risk Factors",
       color = "Fit Type") +
  scale_color_manual(values = c("Linear" = "red", "Non-linear" = "blue")) +
  theme_minimal()

3.3.2 Black population in Meckelenburg County, NC

# For Mecklenburg County, NC
ggplot(merged_nc_mecklenburg, aes(x = prop_black, y = PRED3_PE)) +
  geom_point(alpha = 0.5) +
  geom_smooth(method = "lm", aes(color = "Linear"), se = FALSE) +
  geom_smooth(method = "loess", aes(color = "Non-linear"), se = FALSE) +
  labs(title = "Mecklenburg County, NC: Proportion of Black Population vs. Community Resilience Estimate",
       x = "Proportion of Black Population",
       y = "% Population with 3+ Risk Factors",
       color = "Fit Type") +
  scale_color_manual(values = c("Linear" = "red", "Non-linear" = "blue")) +
  theme_minimal()

4 Analysis

ggplot(df, aes(x = median_incomeE)) +
  geom_point(aes(y = PRED3_PE, color = "CRE"), alpha = 0.5) +
  geom_point(aes(y = percent_black, color = "Black"), alpha = 0.5) +
  geom_smooth(aes(y = PRED3_PE, color = "CRE"), method = "lm", se = FALSE) +
  geom_smooth(aes(y = percent_black, color = "Black"), method = "lm", se = FALSE) +
  scale_color_manual(values = c("CRE" = "red", "Black" = "blue")) +
  labs(title = "Median Income vs. CRE and % Black Population in DC",
       x = "Median Income Estimate",
       y = "Percentage",
       color = "Variable") +
  theme_minimal()

Non-linear relationship between the variables.

ggplot(df, aes(x = estimate)) +
  geom_point(aes(y = PRED3_PE, color = "CRE"), alpha = 0.5) +
  geom_point(aes(y = percent_black, color = "Black"), alpha = 0.5) +
  geom_smooth(aes(y = PRED3_PE, color = "CRE"), method = "loess", se = FALSE) +
  geom_smooth(aes(y = percent_black, color = "% Black"), method = "loess", se = FALSE) +
  scale_color_manual(values = c("CRE" = "red", "Black" = "blue")) +
  labs(title = "Median Rent vs. CRE and % Black Population in DC (Non-linear)",
       x = "Median Rent Estimate",
       y = "Percentage",
       color = "Variable") +
  theme_minimal()

Evaluating Social Vulnerability in Census Community Resilience Estimates

Nathan Alexander, PhD

Center for Applied Data Science and Analytics

Howard University