UrbanFinal

The Relationship Between Eviction and Incarceration in NYC

Introduction/Background

This project investigates the relationship between housing instability and criminal legal system involvement in New York City. Specifically, it asks whether neighborhoods with higher rates of eviction filings also see more police contact, arrests, or jail admissions, and how these patterns intersect with race and income. One hypothesis is that eviction acts as a destabilizing force that increases the likelihood of shelter entry and criminal justice contact, a pattern sometimes described as the public-housing-to-prison pipeline.

New York City is the most populous city in the United States, and understanding this relationship there is urgent because the city sits at the center of the affordable housing crisis, shelter overcrowding, and the policing of low-income communities. By focusing on evictions, low-level arrests, and shelter entries, this project seeks to show how housing policy and criminal justice outcomes are deeply intertwined. Data for this analysis come from official sources: the NYC Open Data portal and the Census Bureau's American Community Survey (ACS). This report presents initial findings on evictions in NYC and how they relate to neighborhood racial composition and income.

Figures and Tables

library(dplyr)
library(lubridate)
library(tidyr)
library(tidycensus)
library(stringr)

evictions <- read.csv("newevictions.csv")
arrests <- read.csv("nycarrests.csv")
shelter <- read.csv("sheltercensus.csv")

# Read the Census API key from the environment instead of hard-coding it in the report
census_api_key(Sys.getenv("CENSUS_API_KEY"))
# Parse the execution date, derive year and month, and standardize borough names
evictions_clean <- evictions %>%
  mutate(Executed.Date = mdy(Executed.Date),
    year = year(Executed.Date),
    month = month(Executed.Date, label = TRUE, abbr = TRUE),
    Borough = toupper(BOROUGH)) %>%
  select(Executed.Date, year, month, Borough, Eviction.Postcode,
         Latitude, Longitude, Census.Tract, NTA)

# Keep NYPD jurisdictions (0 = Patrol, 1 = Transit, 2 = Housing), parse dates,
# expand borough codes, and drop arrests without a usable location or date
arrests_clean <- arrests %>%
  filter(JURISDICTION_CODE %in% c(0, 1, 2)) %>%
  mutate(ARREST_DATE = mdy(ARREST_DATE),
    year = year(ARREST_DATE),
    month = month(ARREST_DATE, label = TRUE, abbr = TRUE),
    Borough = case_when(
      ARREST_BORO == "B" ~ "BRONX",
      ARREST_BORO == "K" ~ "BROOKLYN",
      ARREST_BORO == "M" ~ "MANHATTAN",
      ARREST_BORO == "Q" ~ "QUEENS",
      ARREST_BORO == "S" ~ "STATEN ISLAND")) %>%
  filter(!is.na(Latitude), Latitude != 0) %>%
  filter(!is.na(ARREST_DATE))

# Average the shelter census counts within each calendar month
shelter_clean <- shelter %>%
  mutate(Date = mdy(Date.of.Census),
    year = year(Date),
    month = month(Date, label = TRUE, abbr = TRUE)) %>%
  group_by(year, month) %>%
  summarize(avg_total_sheltered = mean(Total.Individuals.in.Shelter),
    .groups = "drop")

# Tract-level ACS estimates: median household income, total population,
# Black (alone) population, and Hispanic or Latino population
nyc_census <- get_acs(geography = "tract",
  variables = c(median_income = "B19013_001",
    total_pop = "B01003_001",
    black_pop = "B02001_003",
    latinx_pop = "B03003_003"),
  state = "NY",
  county = c("New York", "Kings", "Bronx", "Queens", "Richmond"),
  year = 2022,
  survey = "acs5",
  geometry = FALSE) %>%
  select(-moe) %>%
  pivot_wider(names_from = variable, values_from = estimate) %>%
  mutate(pct_black = black_pop / total_pop,
    pct_latinx = latinx_pop / total_pop,
    tract_id = str_sub(GEOID, -6)) # last six digits of GEOID = tract code (state and county prefix dropped)
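# Zero-pad the eviction tract code to six digits so it matches tract_id in nyc_census
evictions_clean <- evictions_clean %>%
  mutate(tract_id = str_pad(as.character(Census.Tract), width = 6, pad = "0"))

# Tract codes repeat across counties, so joining on the six-digit code alone
# can match one eviction to tracts in several boroughs (hence many-to-many)
evictions_census <- evictions_clean %>%
  inner_join(nyc_census, by = "tract_id", relationship = "many-to-many")

# One row per tract code: eviction count plus income and demographic summaries
evictions_by_tract <- evictions_census %>%
  group_by(tract_id) %>%
  summarize(evictions = n(),
    median_income = median(median_income, na.rm = TRUE),
    pct_black = mean(pct_black, na.rm = TRUE),
    pct_latinx = mean(pct_latinx, na.rm = TRUE))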
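The join above matches on the six-digit tract code alone, but tract codes are unique only within a county, so the same code can belong to tracts in different boroughs. The following is a minimal sketch, on invented rows, of one way to qualify the key: map each borough to its county FIPS code and join on the full 11-digit GEOID. The `county_fips` lookup and the toy data frames are hypothetical names introduced for illustration, and the sketch assumes `Census.Tract` is stored in a form that pads cleanly to six digits, which should be checked against the real file.

```r
library(dplyr)
library(stringr)

# Hypothetical lookup: borough name -> county FIPS (the state FIPS for NY is "36")
county_fips <- c("BRONX" = "005", "BROOKLYN" = "047", "MANHATTAN" = "061",
                 "QUEENS" = "081", "STATEN ISLAND" = "085")

# Toy stand-ins for evictions_clean and nyc_census (all values are invented)
toy_evictions <- data.frame(
  Borough      = c("BRONX", "BROOKLYN", "BROOKLYN"),
  Census.Tract = c(100, 100, 201)
)
toy_census <- data.frame(
  GEOID         = c("36005000100", "36047000100", "36047000201"),
  median_income = c(40000, 90000, 60000)
)

# Build the full GEOID (state + county + tract) and join on it
toy_joined <- toy_evictions %>%
  mutate(
    tract_id = str_pad(as.character(Census.Tract), width = 6, pad = "0"),
    GEOID    = paste0("36", county_fips[Borough], tract_id)
  ) %>%
  inner_join(toy_census, by = "GEOID")
```

With a county-qualified key, each eviction matches at most one tract, so tract code 000100 in the Bronx and in Brooklyn are no longer pooled together.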

Table of Summary Statistics by Race/Income Quartile

Table 1: Evictions by Income Quartile and Demographics
income_quartile  mean_evictions  mean_pct_black  mean_pct_latinx
1 (lowest)               106.75           0.308            0.481
2                         99.04           0.221            0.400
3                         46.85           0.139            0.216
4 (highest)               35.64           0.070            0.139
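The code that produced Table 1 is not shown in this report. As a rough sketch, and assuming the quartiles come from ranking tracts by median income, a table of this shape could be built with `dplyr::ntile()`. The `toy_tracts` data frame below is invented and only mirrors the columns of `evictions_by_tract`.

```r
library(dplyr)

# Toy stand-in for evictions_by_tract (eight invented tracts)
toy_tracts <- data.frame(
  tract_id      = sprintf("%06d", 1:8),
  evictions     = c(120, 100, 90, 80, 50, 40, 30, 20),
  median_income = c(25000, 30000, 45000, 50000, 70000, 75000, 110000, 130000),
  pct_black     = c(0.40, 0.30, 0.25, 0.20, 0.15, 0.10, 0.05, 0.05),
  pct_latinx    = c(0.50, 0.45, 0.40, 0.35, 0.25, 0.20, 0.15, 0.10)
)

toy_table <- toy_tracts %>%
  filter(!is.na(median_income)) %>%                      # tracts without income cannot be ranked
  mutate(income_quartile = ntile(median_income, 4)) %>%  # 1 = lowest income, 4 = highest
  group_by(income_quartile) %>%
  summarize(mean_evictions  = mean(evictions),
            mean_pct_black  = mean(pct_black, na.rm = TRUE),
            mean_pct_latinx = mean(pct_latinx, na.rm = TRUE),
            .groups = "drop")
```

`ntile()` splits the tracts into four groups of nearly equal size, so each row of the result averages over roughly a quarter of the tracts.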

Evictions and Percent Black Population by Census Tract

Figure 1: Evictions and % Black Population (NYC Tracts)
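The plotting code for Figure 1 is not shown. Assuming the trend line is an ordinary least-squares fit of eviction counts on Black population share, its slope and the corresponding correlation could be reported alongside the figure. The sketch below uses an invented `toy` data frame in place of `evictions_by_tract` and drops tracts with a missing population share before fitting.

```r
# Toy stand-in for evictions_by_tract; the numbers are invented
toy <- data.frame(
  pct_black = c(0.05, 0.10, 0.20, 0.40, 0.60, NA),
  evictions = c(20, 30, 35, 60, 90, 15)
)

# Keep only tracts with a usable Black population share
toy_complete <- toy[!is.na(toy$pct_black), ]

# Straight-line fit: evictions as a function of Black population share
fit   <- lm(evictions ~ pct_black, data = toy_complete)
slope <- unname(coef(fit)["pct_black"])  # change in evictions per unit of share (0 to 1)
r     <- cor(toy_complete$pct_black, toy_complete$evictions)
```

Reporting `slope` and `r` makes it possible to say how steep and how tight the relationship is, which a scatterplot alone leaves to visual judgment.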

Evictions and Demographics Across Income Quartiles

Figure 2: Evictions and Demographics across Income Quartiles

Discussion & Conclusion

Table 1 summarizes average eviction counts by income quartile, along with the average share of Black and Latinx residents in each quartile. Eviction counts are highest in the lowest-income neighborhoods and decline as income rises: tracts in the lowest quartile average about 107 evictions over the period analyzed, compared with about 36 in the highest-income quartile. The quartiles also differ sharply in demographic composition. The lowest-income tracts are on average about 48% Latinx and 31% Black, and both shares fall steadily in higher-income neighborhoods. The table supports the idea that housing instability in NYC is strongly patterned by race and class.

Figure 1 plots eviction counts against the percentage of Black residents in each census tract. The fitted line has a slight positive slope, suggesting that tracts with larger Black populations tend to have more evictions. Most tracts fall in the lower-left corner of the graph, with both few evictions and a small Black population share, but some outliers combine high eviction counts with a high Black population share, which may point to areas of housing displacement along racial lines.

Figure 2 reinforces these patterns by combining mean evictions with racial composition across income quartiles. It presents the information in Table 1 visually, showing that evictions decrease sharply as income increases while the shares of Black and Latinx residents also drop. In Table 1 the decline continues through the top of the income distribution: the mean Black share falls from about 14% in the third quartile to about 7% in the fourth.

These initial findings support one of the project’s main hypotheses: that evictions are concentrated in lower-income, predominantly Black and Latinx neighborhoods, reflecting structural inequality in the housing system. The results point toward racial inequalities in evictions that may expose residents to other systems of state control. Next, the project will examine how these eviction patterns correlate with arrests, and whether shelter entry helps explain the possible connection between eviction and contact with the criminal legal system. This will involve merging the housing and policing datasets by geography and time and fitting regression models to assess correlation, confounding, and possible mediation. The ultimate goal is to show how housing policy could serve as a tool for criminal justice reform, and how eviction prevention might help reduce carceral exposure in vulnerable communities.
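The merge by geography and time described above has not been carried out yet. One possible shape for it, sketched on invented rows, is a borough-by-month panel of eviction and arrest counts. The toy data frames reuse the `Borough`, `year`, and `month` columns created in the cleaning steps; a finer geography such as census tract would follow the same pattern.

```r
library(dplyr)

# Toy stand-ins for evictions_clean and arrests_clean (one row per event; invented)
toy_evictions <- data.frame(
  Borough = c("BRONX", "BRONX", "QUEENS"),
  year    = c(2023, 2023, 2023),
  month   = c("Jan", "Jan", "Feb")
)
toy_arrests <- data.frame(
  Borough = c("BRONX", "BRONX", "BRONX", "QUEENS"),
  year    = c(2023, 2023, 2023, 2023),
  month   = c("Jan", "Jan", "Jan", "Feb")
)

# Count events per borough-month in each source
eviction_counts <- count(toy_evictions, Borough, year, month, name = "evictions")
arrest_counts   <- count(toy_arrests,   Borough, year, month, name = "arrests")

# full_join keeps borough-months present in only one source; fill those gaps with 0
toy_panel <- full_join(eviction_counts, arrest_counts,
                       by = c("Borough", "year", "month")) %>%
  mutate(evictions = coalesce(evictions, 0L),
         arrests   = coalesce(arrests, 0L))
```

A panel like `toy_panel` has one row per place and period, which is the layout a regression of arrests on evictions (with income and demographic controls) would need.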