Mapping Multidimensionality: Using Census Data to Understand Neighborhood Communities

Quantitative Histories Workshop

N. Alexander

Howard University

B. Onabajo

Howard University

J. Gupta

Howard University

K. Nichols-Smith

Morehouse College

H. Jang

Morehouse College

Abstract

We explore how multidimensional measures of local communities – such as those provided by the Census Community Resilience Estimates (CREs) – can be used to frame and model dynamic changes in neighborhood communities using an intersectional lens.

We build on prior research inspired by work on the United States as a “Patchwork Nation.”

Quantitative Histories Workshop: a curriculum and software development collective and research lab

Project background: Information and spatial segregation

Information and theory

Information theory is a branch of applied mathematics and computer science that deals with the quantification, storage, transmission, and manipulation of information.

We take an abstract approach to our study of information.

  • Information theory seeks to measure the amount of information contained in a message or signal and how efficiently it can be transmitted or stored.

  • This project seeks to define information using a critical computational perspective.

  • Namely, how might we leverage computation and quantification to transmit information efficiently while maintaining the roots of complex theories and histories?
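As one concrete instance of measuring information, Shannon entropy quantifies how evenly a population is spread across categories, and it underlies entropy-based segregation indices. The sketch below is illustrative only: the function name `shannon_entropy` and the two toy tract compositions are assumptions, not project data.

```r
# Shannon entropy (in bits) of a vector of category counts or proportions.
# Empty categories contribute nothing, by the convention 0 * log(0) = 0.
shannon_entropy <- function(x) {
  p <- x / sum(x)      # normalize counts to proportions
  p <- p[p > 0]        # drop empty categories
  -sum(p * log2(p))
}

# Hypothetical tract compositions across four groups (toy numbers)
tract_even   <- c(250, 250, 250, 250)  # evenly mixed
tract_uneven <- c(970, 10, 10, 10)     # dominated by one group

shannon_entropy(tract_even)    # 2 bits, the maximum for four categories
shannon_entropy(tract_uneven)  # much lower: the composition is highly predictable
```

A single number like this transmits a tract's composition efficiently, but it discards which groups are present, which is the tension the question above raises.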

Mapping Single Dimensions

Theoretical framework: A Patchwork Nation

“If you pay attention to the complexity of the USA, its diversity and differences you soon realize that the ways we try to understand it – red and blue, Northeast and Midwest – are too simplistic. They are inadequate and misleading.” – Patchwork Nation Project

The Patchwork Nation typology classifies places into community types, including:

  • Boom Towns: Rapidly expanding communities

  • Campus and Careers: Areas with a significant presence of higher education institutions

  • Immigration Nation: Areas with high concentrations of immigrant populations

  • Industrial Metropolis: Large urban areas with a strong industrial base

  • Emptying Nests: Communities with an aging population

  • Minority Central: Areas with large minority populations

  • Monied Burbs: Affluent suburban areas

Analytic framework: Community Resilience Estimates

  • The Census Community Resilience Estimates (CRE) data sets were developed to assess the social vulnerability and resilience of neighborhoods in response to disasters or shocks. They are built from ten household- and individual-level risk factors:

    – Households with an income-to-poverty ratio less than 130%

    – Single- or zero-caregiver household: at most one individual living in the household is aged 18–64

    – Household crowding, defined as more than 0.75 persons per room

    – Households with limited education

    – No one in the household is employed full-time year-round

    – Individual with a disability posing a constraint to significant life activity

    – Individual with no health insurance

    – Individual aged 65 or older

    – Households without a vehicle

    – Households without broadband internet access
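The Census Bureau reports CRE results as the share of people with zero, one to two, or three or more of these risk factors. The sketch below only illustrates that binning step on made-up indicator data. The published estimates are model-based, and the data frame `risk_flags` and its column names are hypothetical.

```r
# Toy individual-level indicators (1 = risk factor present); values are hypothetical.
risk_flags <- data.frame(
  low_income   = c(0, 1, 1, 0),
  crowding     = c(0, 0, 1, 0),
  no_insurance = c(0, 1, 1, 0),
  age_65_plus  = c(0, 0, 1, 1),
  no_vehicle   = c(0, 0, 0, 0)
)

# Count risk factors per person, then bin into the three reporting groups.
n_risks <- rowSums(risk_flags)
risk_group <- cut(
  n_risks,
  breaks = c(-Inf, 0, 2, Inf),
  labels = c("0 risk factors", "1-2 risk factors", "3+ risk factors")
)

table(risk_group)
```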

  • CRE estimates are a measure of the capacity of individuals and households within a community to absorb, endure, and recover from external stresses.

  • The CRE data combine American Community Survey (ACS) and the Population Estimates Program (PEP) data to identify social and economic vulnerabilities by geography.

  • The Census Bureau's CRE Interactive Tool gives a quick overview of local contexts.

library(tidycensus)  # provides get_acs(); requires a Census API key (see census_api_key())

# Most variables below are raw counts despite their *_rate names;
# divide by an appropriate total to obtain proportions.
cre_correlates_dc <- get_acs(
  geography = "tract", state = "DC", year = 2023, survey = "acs5",
  variables = c(
    median_income = "B19013_001",       # Median household income in the past 12 months
    poverty_rate = "B17001_002",        # Count of people with income below the poverty level
    unemployment_rate = "B23025_005",   # Count of unemployed civilians (16 years and over)
    no_health_insurance = "B27010_033", # Count of people aged 19 to 34 with no health insurance coverage
    educ_less_than_hs = "B15003_002",   # Count of people 25 years and over with no schooling completed
    median_age = "B01002_001",          # Median age
    housing_cost_burden = "B25070_010", # Renter households spending 50% or more of income on gross rent
    no_vehicle = "B08201_002",          # Households with no vehicle available
    black_population = "B02001_003",    # Black or African American alone population
    median_rent = "B25058_001"),        # Median contract rent
  summary_var = "B02001_001",           # Total population (for calculating proportions)
  output = "wide", geometry = FALSE)
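Because several of the requested variables are counts, they need a denominator before tracts can be compared. The sketch below shows that step on a stand-in data frame rather than a live API result. It assumes the wide output names estimate columns `<name>E` and the summary variable `summary_est`, and all numbers are made up.

```r
# Stand-in for the wide get_acs() result; column names follow the assumed
# <name>E / summary_est pattern, and the values are hypothetical.
toy_tracts <- data.frame(
  GEOID             = c("tract_a", "tract_b"),
  poverty_rateE     = c(400, 150),
  black_populationE = c(1200, 300),
  summary_est       = c(2000, 3000)
)

# Divide each count by the tract's total population to get a proportion.
# (Strictly, the poverty universe B17001_001 is the better denominator
# for the poverty count; total population is used here for simplicity.)
count_cols <- c("poverty_rateE", "black_populationE")
props <- toy_tracts[count_cols] / toy_tracts$summary_est
names(props) <- sub("E$", "_prop", count_cols)

cbind(toy_tracts["GEOID"], props)
```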

Dimensionality in Spatial Models

Model development

– Base spatial model formulation: \[ \boldsymbol{y} = \boldsymbol{X}\boldsymbol{\beta} + \boldsymbol{\tau} + \boldsymbol{\epsilon} \]

  • \(\boldsymbol{y}\) is an \(n \times 1\) response vector

  • \(\boldsymbol{X}\) is an \(n \times (p + 1)\) design matrix that contains the explanatory variables

  • \(\boldsymbol{\beta}\) is the \((p + 1) \times 1\) vector of fixed-effect coefficients

  • \(\boldsymbol{\tau}\) is an \(n \times 1\) vector of spatially dependent random errors

  • \(\boldsymbol{\epsilon}\) is an \(n \times 1\) vector of independent random errors
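To make the roles of the terms concrete, the following sketch simulates one draw from a model of this form in base R. Everything specific here is an assumption for illustration: the coordinates, the coefficient values, and the choice of an exponential covariance for the spatially dependent errors.

```r
set.seed(1)
n <- 50

# Hypothetical neighborhood centroids on the unit square
coords <- cbind(runif(n), runif(n))

# Design matrix: intercept plus two made-up covariates
X <- cbind(1, rnorm(n), rnorm(n))
beta <- c(2, 0.5, -1)                # assumed fixed-effect coefficients

# Spatially dependent errors: covariance decays exponentially with distance
D <- as.matrix(dist(coords))
Sigma_tau <- 1.0 * exp(-D / 0.3)     # variance 1.0 and range 0.3 are assumed
tau <- drop(t(chol(Sigma_tau)) %*% rnorm(n))

# Independent errors
epsilon <- rnorm(n, sd = 0.5)

# Response: fixed effects + spatial error + independent error
y <- drop(X %*% beta) + tau + epsilon
```

Nearby neighborhoods share similar values of `tau`, while `epsilon` varies independently from one neighborhood to the next.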

Response vector structure (\(\boldsymbol{y}\)):

\[ \boldsymbol{y} = \begin{bmatrix} y_{1} \\ y_{2} \\ \vdots \\ y_{n} \end{bmatrix} \]

  • Each element, \(y_i\), represents the observed response at a neighborhood’s location \(i\)

  • These are ordered by adjacency relationships to preserve the geographical context

  • We also review the distribution of \(\boldsymbol{y}\), its spatial autocorrelation (i.e., \(\operatorname{Cov}(y_i, y_j)\) for \(i \neq j\)), and its decomposition
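One common summary of spatial autocorrelation is Moran's I. The sketch below computes it in base R for a toy set of six neighborhoods arranged along a line. The function name `morans_i`, the binary adjacency matrix, and the response values are illustrative assumptions.

```r
# Moran's I for values y and a binary spatial weights matrix W (zero diagonal).
morans_i <- function(y, W) {
  z <- y - mean(y)
  (length(y) / sum(W)) * sum(W * outer(z, z)) / sum(z^2)
}

# Toy example: six neighborhoods along a line, each adjacent to the next
n <- 6
W <- matrix(0, n, n)
for (i in seq_len(n - 1)) {
  W[i, i + 1] <- 1
  W[i + 1, i] <- 1
}

y_clustered   <- c(1, 1, 1, 5, 5, 5)  # similar values sit next to each other
y_alternating <- c(1, 5, 1, 5, 1, 5)  # neighbors always differ

morans_i(y_clustered, W)    # 0.6: neighbors resemble each other
morans_i(y_alternating, W)  # -1: neighbors are dissimilar
```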

Design matrix of explanatory variables structure (\(\boldsymbol{X}\)):

\[ \boldsymbol{X} = \begin{bmatrix} 1 & x_{1, 1} & \ldots & x_{1, p} \\ 1 & x_{2, 1} & \ldots & x_{2, p} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & x_{n, 1} & \ldots & x_{n, p} \\ \end{bmatrix} \]

  • First column is the intercept term

  • Subsequent columns represent \(p\) explanatory variables

  • Each row corresponds to a specific neighborhood’s covariates

Sample design matrix of explanatory variables

\[ \boldsymbol{X} = \begin{bmatrix} 1 & 65{,}000 & 0.62 & 3{,}200 \\ 1 & 28{,}000 & 0.32 & 5{,}100 \\ \vdots & \vdots & \vdots & \vdots \\ 1 & 127{,}000 & 0.75 & 6{,}840 \\ \end{bmatrix} \]

  • Column 1 is a column of ones for the intercept; its coefficient is the expected value of \(\boldsymbol{y}\) when all other predictors are zero

  • Variable 1 (column 2) is median income

  • Variable 2 (column 3) is the proportion of residents with a high school diploma

  • Variable 3 (column 4) is population density (residents/sq. mi)
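In R, a design matrix of this shape can be built with `model.matrix()`, which adds the column of ones automatically. The sketch below uses the three rows written out in the sample matrix. The data frame and column names (`neighborhoods`, `prop_hs`, and so on) are hypothetical.

```r
# The three rows shown explicitly in the sample matrix (elided rows omitted)
neighborhoods <- data.frame(
  median_income = c(65000, 28000, 127000),
  prop_hs       = c(0.62, 0.32, 0.75),
  pop_density   = c(3200, 5100, 6840)
)

# model.matrix() prepends the column of ones for the intercept
X <- model.matrix(~ median_income + prop_hs + pop_density, data = neighborhoods)
X
dim(X)  # 3 rows, 4 columns: intercept plus p = 3 explanatory variables
```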

Information and spatial segregation

Model selection

Several models are under consideration:

  • Spatial regression using intersectional interactions

  • Structural Equation Modeling (SEM) with CRE components

  • Multilevel Analysis of Individual Heterogeneity and Discriminatory Accuracy (MAIHDA)

    – Evans et al. (2024). A tutorial for conducting intersectional MAIHDA. SSM - Population Health, 26, 101664

    – Combines intersectional stratification with neighborhood-level clustering

    – Models individuals nested within intersectional strata (e.g., low-income Black men) and within community typologies drawn from a classification framework (e.g., Patchwork Nation)
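As a sketch of the simplest (null) MAIHDA specification, individuals \(i\) are nested within intersectional strata \(j\) in a two-level random-intercept model. The notation below is chosen for illustration and is not taken from the project's own model:

\[ y_{ij} = \beta_0 + u_j + e_{ij}, \qquad u_j \sim N(0, \sigma_u^2), \quad e_{ij} \sim N(0, \sigma_e^2) \]

The share of total variance that lies between strata is the variance partition coefficient:

\[ \text{VPC} = \frac{\sigma_u^2}{\sigma_u^2 + \sigma_e^2} \]

A larger VPC indicates that stratum membership accounts for more of the variation in the outcome. Adding neighborhood-level clustering would introduce a further random effect for the community classification.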

Overview

Case analysis: Health

Hypertension

Hypertension, also known as high blood pressure, is a condition in which the force of blood pushing against the walls of the arteries is consistently too high.

The condition is a compounding health concern in the United States.

In 2017–2020, an estimated 115.3 million US adults had high blood pressure, representing up to 45% of the adult population.

The prevalence of high blood pressure has fluctuated over time:

  • In 1999–2000, prevalence was at its highest, 47.9%
  • It reached its lowest points in 2009–2010 and 2013–2014, at 43%
  • As of 2017–2020, the national average was around 48%

Hypertension rate in the US

Figure: Time trend of hypertension mortality rates in the US

Comparative rates in the DMV

Figure: Bar chart of hypertension morbidity rates in proximal states

CRE and hypertension morbidity

While the Community Resilience Estimates (CRE) do not directly measure hypertension, there are several indirect connections between hypertension and community resilience:

  • Health Insurance: One of the CRE risk factors is lack of health insurance. Individuals without health insurance are less likely to receive regular blood pressure screenings and treatment for hypertension.

  • Socioeconomic Factors: The CRE includes factors like poverty and employment, which are known to influence hypertension rates.

  • Education: Limited education is a CRE risk factor. Lower educational attainment is associated with higher rates of hypertension.

  • Age: The CRE considers households with individuals aged 65 or older as a risk factor. Hypertension increases with age, making older populations more vulnerable to its effects.

CRE and hypertension morbidity for DC

Correlation Matrix of Hypertension and Socioeconomic Factors

Variables          hypertension_rate  POVERTY_RATE  UNEMPLOYMENT_RATE  EDUCATION_RATE
hypertension_rate              1.000         0.240              0.565          -0.706
POVERTY_RATE                   0.240         1.000             -0.152          -0.006
UNEMPLOYMENT_RATE              0.565        -0.152              1.000          -0.642
EDUCATION_RATE                -0.706        -0.006             -0.642           1.000
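To read the matrix programmatically, the sketch below re-enters the reported coefficients and ranks the socioeconomic variables by the strength of their correlation with `hypertension_rate`. Only the values from the table above are used. The object names `R` and `r_hyp` are illustrative.

```r
vars <- c("hypertension_rate", "POVERTY_RATE", "UNEMPLOYMENT_RATE", "EDUCATION_RATE")

# Correlation matrix as reported in the table above
R <- matrix(
  c( 1.000,  0.240,  0.565, -0.706,
     0.240,  1.000, -0.152, -0.006,
     0.565, -0.152,  1.000, -0.642,
    -0.706, -0.006, -0.642,  1.000),
  nrow = 4, byrow = TRUE, dimnames = list(vars, vars)
)

stopifnot(isSymmetric(R))  # sanity check on the transcription

# Rank the socioeconomic variables by |r| with hypertension_rate
r_hyp <- R["hypertension_rate", -1]
r_hyp[order(abs(r_hyp), decreasing = TRUE)]
```

EDUCATION_RATE ranks first (r = -0.706), followed by UNEMPLOYMENT_RATE (0.565) and POVERTY_RATE (0.240).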

Racial Differences and Hypertension Morbidity for DC

Correlation Matrix of Hypertension Rate and Racial Demographics in DC

Variables          hypertension_rate      white      black      asian     native  hawai_pac
hypertension_rate              1.000     -0.876      0.898     -0.673      0.098     -0.113
white                         -0.876      1.000     -0.966      0.567     -0.103      0.112
black                          0.898     -0.966      1.000     -0.650      0.074     -0.124
asian                         -0.673      0.567     -0.650      1.000     -0.050      0.163
native                         0.098     -0.103      0.074     -0.050      1.000     -0.031
hawai_pac                     -0.113      0.112     -0.124      0.163     -0.031      1.000

Hypertension and CRE

Race and Ethnicity

We are also developing a dashboard for internal use to automate some processes.

US Census Demographics Data Dashboard

Political analysis

Examining the impact of broader political shifts on neighborhood measures.

  • Project title: Examining Polling Location Changes After The Shelby County Decision: How a Lack of Federal Oversight Impacts Poll Accessibility in Black and Brown Communities

  • Abstract: Since the Supreme Court’s 2013 decision in Shelby County v. Holder, jurisdictions previously subject to preclearance are no longer required to have polling station changes or closures federally reviewed. Given that over 1,600 polling locations have been closed or changed since 2013, we ask what factors contribute to these changes and closures. We examine the driving forces behind them and use Census data to assess their implications for accessibility in communities of color.

Special Thanks

Research assistants: Myles Ndiritu (Morehouse College), Zoe Williams (Howard University), Kade Davis (Morehouse College), Amari Gray (Morehouse College)

Lab manager: Lyrric Jackson (Spelman College)

Funding: Alfred P. Sloan Foundation, AUC Data Science Initiative, Data.org

Partners: The Carpentries

References

Chinni, D., & Gimpel, J. (2010). Our Patchwork Nation: The Surprising Truth about the “Real” America. Gotham Books.

Evans, C. R., Leckie, G., Subramanian, S. V., Bell, A., & Merlo, J. (2024). A tutorial for conducting intersectional multilevel analysis of individual heterogeneity and discriminatory accuracy (MAIHDA). SSM - Population Health, 26, Article 101664. https://doi.org/10.1016/j.ssmph.2024.101664.

U.S. Census Bureau. (2024). Community Resilience Estimates. Retrieved March 26, 2025, from https://www.census.gov/programs-surveys/community-resilience-estimates/about.html.