Introduction
Import and Prepare Brooklyn NTA Boundary Data
Preliminary Map of Brooklyn NTAs
Import and Clean ACS Data
Convert ACS Variables to Numeric Format
Get Brooklyn Census Tract Geometry
Prepare ACS Data for Joining
Join ACS Data to Census Tract Geometry
Map 1: SNAP Participation by Census Tract
Map 2: Poverty Rate by Census Tract
Create Standardized Need Variables
Create Composite Food-Related Need Index
Map 3: Composite Food-Related Need Index
Current Progress Summary

# -----------------------------
# Load Required Libraries
# -----------------------------

library(tidyverse)   # data manipulation and plotting
library(sf)          # spatial data handling
library(readxl)      # importing Excel files
library(stringr)     # string manipulation for GEOID cleaning
library(tidycensus)  # census data and tract geometry
library(scales)      # transparency settings for maps

Introduction

This R Markdown document presents the current progress for my spatial analysis project focused on food-related need in Brooklyn, New York. The project examines patterns of economic vulnerability using American Community Survey data, especially SNAP participation and poverty rates. These indicators are mapped at the census tract level and displayed with Neighborhood Tabulation Area boundaries for neighborhood-level context.

At this stage, the project focuses on preparing the spatial and demographic data, joining ACS indicators to census tract geometry, and creating initial visualizations. Later steps will incorporate food infrastructure data, such as farmers markets, community gardens, community fridges, and other food resources, to compare areas of need with areas of food access.

Import and Prepare Brooklyn NTA Boundary Data

The first dataset used in this project is the 2020 Neighborhood Tabulation Area boundary file. This file contains a geometry column stored as Well-Known Text. I convert the data into an sf spatial object so that it can be mapped and used as a geographic reference layer.

# Import NTA data
nta <- read_csv("Data/2020_Neighborhood_Tabulation_Areas_(NTAs)_20260501.csv", show_col_types = FALSE)

# Convert to spatial object
nta_sf <- st_as_sf(nta, wkt = "the_geom", crs = 4326)

# Filter to Brooklyn only
bk_nta <- nta_sf %>%
  filter(BoroName == "Brooklyn")
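
The same conversion can be tried on a small made-up table before running it on the full file. This is a minimal sketch: the polygon coordinates and the `toy_nta` object are hypothetical, and only the `the_geom` and `BoroName` column names come from the code above.

```r
library(sf)

# Hypothetical two-row table with WKT geometry, mimicking the NTA file layout
toy_nta <- data.frame(
  BoroName = c("Brooklyn", "Queens"),
  the_geom = c(
    "POLYGON ((-73.95 40.65, -73.94 40.65, -73.94 40.66, -73.95 40.66, -73.95 40.65))",
    "POLYGON ((-73.85 40.72, -73.84 40.72, -73.84 40.73, -73.85 40.73, -73.85 40.72))"
  )
)

# Parse the WKT column into an sf geometry column (WGS84, EPSG:4326)
toy_sf <- st_as_sf(toy_nta, wkt = "the_geom", crs = 4326)

# Keep Brooklyn only, then confirm the result is still spatial and valid
toy_bk <- toy_sf[toy_sf$BoroName == "Brooklyn", ]
inherits(toy_bk, "sf")
all(st_is_valid(toy_bk))
st_crs(toy_bk)$epsg
```

If the filter returns zero rows on the real file, the borough label in `BoroName` is probably spelled differently than expected.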

Preliminary Map of Brooklyn NTAs

This first map confirms that the NTA spatial data imported correctly and that the Brooklyn filter worked. It is an early data validation step: if the boundaries draw in the right place, the project's geographic reference layer is usable.

# Plot Brooklyn NTAs
ggplot() +
  geom_sf(data = bk_nta, fill = "lightblue", color = "black") +
  labs(title = "Brooklyn NTA Boundaries") +
  theme_minimal()

Import and Clean ACS Data

The next step is to import American Community Survey data. The ACS file contains demographic and economic indicators related to SNAP participation, poverty, and race and ethnicity. These variables are used to measure food-related need across Brooklyn.

The Excel file includes an extra descriptive row that is not part of the actual data, so that row is removed. Then, only the variables needed for this stage of the analysis are selected and renamed.

# -----------------------------
# Import and Clean ACS Data
# -----------------------------

# Import ACS Excel file (Census data)
acs <- read_excel("Data/MINE_ACSST5Y2020.S2201.xlsx")

# Remove the first data row (row 2 of the spreadsheet), which holds descriptive labels, not actual data
acs_clean <- acs %>%
  slice(-1)

# Select only relevant variables and rename for easier use
acs_clean <- acs_clean %>%
  select(
    GEO_ID,
    NAME,
    snap = S2201_C04_001E,        # % households receiving SNAP
    poverty = S2201_C02_021E,     # % households below poverty
    pct_black = S2201_C02_026E,   # % Black households
    pct_hispanic = S2201_C02_032E # % Hispanic households
  )
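
The effect of dropping the label row and renaming a column can be seen on a miniature table. This is a sketch under assumptions: the `toy_acs` values and the label text are made up for illustration, and only the `GEO_ID` and `S2201_C04_001E` column names come from the code above.

```r
library(dplyr)

# Hypothetical miniature of the ACS export: the first data row holds labels
toy_acs <- tibble(
  GEO_ID = c("Geography", "1400000US36047000100", "1400000US36047000200"),
  S2201_C04_001E = c("Percent households receiving SNAP (label text)", "21.4", "-")
)

# Drop the label row, then keep and rename the column of interest
toy_clean <- toy_acs %>%
  slice(-1) %>%
  select(GEO_ID, snap = S2201_C04_001E)

toy_clean
```

Because the label row is text, the whole column is stored as character, which is why a numeric conversion is still needed afterwards.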

Convert ACS Variables to Numeric Format

The selected ACS variables were imported as character values. Since these variables need to be mapped and analyzed as percentages, they are converted into numeric format. Any non-numeric Census values become NA during the conversion, which is expected when the original dataset contains missing or suppressed values.

# Convert selected variables from character to numeric
# (necessary for mapping and analysis)
acs_clean <- acs_clean %>%
  mutate(
    snap = as.numeric(snap),
    poverty = as.numeric(poverty),
    pct_black = as.numeric(pct_black),
    pct_hispanic = as.numeric(pct_hispanic)
  )

# Filter ACS data to Brooklyn (Kings County census tracts only)
acs_bk <- acs_clean %>%
  filter(str_detect(NAME, "Kings County"))

# Check cleaned dataset
#glimpse(acs_bk)

# Check summary statistics (data validation step)
# summary(acs_clean$snap)
# summary(acs_clean$poverty)
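
A short sketch of what the conversion does to non-numeric entries. The placeholder strings `"-"` and `"**"` are assumptions about what the file may contain, and `raw_snap` is a hypothetical vector, not data from the project.

```r
# Hypothetical raw values; "-" and "**" stand in for Census placeholders
raw_snap <- c("21.4", "0.0", "-", "**")

# as.numeric() turns anything it cannot parse into NA and emits a warning;
# suppressWarnings() is used here only to keep the example output quiet
snap_num <- suppressWarnings(as.numeric(raw_snap))

# Count how many values became NA during the conversion
n_missing <- sum(is.na(snap_num))
n_missing
```

Comparing `n_missing` against the number of tracts is a quick way to judge how much of the map will be drawn in the missing-data color.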

Get Brooklyn Census Tract Geometry

The ACS Excel data contains tract-level values but does not contain spatial geometry. To map these values, I use tidycensus to retrieve Brooklyn census tract boundaries. Because get_acs() needs at least one variable to be requested, total population (B01003_001) is pulled as a placeholder; the tract geometry is what this step is really after.

options(tigris_use_cache = TRUE)
# -----------------------------
# Get Brooklyn Census Tract Geometry
# -----------------------------

# Get Brooklyn census tract boundaries using ACS 2020 data
bk_tracts <- get_acs(
  geography = "tract",
  variables = "B01003_001",
  state = "NY",
  county = "Kings",
  year = 2020,
  survey = "acs5",
  geometry = TRUE
)

# Check tract geometry
#glimpse(bk_tracts)

Prepare ACS Data for Joining

The ACS file uses a GEO_ID field that includes a Census prefix. The tract geometry file uses a shorter GEOID field. To join the two datasets correctly, I create a matching GEOID field by removing the Census prefix from the ACS identifier.

I also drop duplicate tract records as a data validation step, keeping the first row for each GEOID. This ensures that each tract appears only once before the join.

# -----------------------------
# Prepare ACS Data for Joining
# -----------------------------

# Create a GEOID column from the ACS GEO_ID field
# This removes the Census prefix and keeps only the tract ID
acs_bk <- acs_bk %>%
  mutate(GEOID = str_remove(GEO_ID, "1400000US"))

# Keep one row per GEOID (the first occurrence) if any duplicates are present
# This prevents the join from duplicating tract rows
acs_bk <- acs_bk %>%
  distinct(GEOID, .keep_all = TRUE)

# Check that each census tract appears only once
#acs_bk %>%
  #count(GEOID) %>%
  #filter(n > 1)

# Check that ACS GEOID was created correctly
#glimpse(acs_bk)
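
The prefix removal can be checked on a couple of identifiers. This is a minimal sketch: the two example IDs are chosen for illustration, and the checks rely on the standard tract GEOID layout of state (2 digits), county (3 digits), and tract (6 digits).

```r
library(stringr)

# Example identifiers in the ACS export format (chosen for illustration)
geo_id <- c("1400000US36047000100", "1400000US36047000200")

# Strip the prefix from the start of the string, leaving the tract GEOID
geoid <- str_remove(geo_id, "^1400000US")

# A tract GEOID should be 11 characters long,
# and Kings County, NY tracts should start with "36047"
all(nchar(geoid) == 11)
all(str_starts(geoid, "36047"))
```

Anchoring the pattern with `^` is a small safeguard so that only a leading prefix is removed.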

Join ACS Data to Census Tract Geometry

This step joins the cleaned ACS variables to the Brooklyn census tract geometry. The result is a spatial dataset that contains both geometry and demographic indicators. This joined dataset is the main dataset used for the maps in the progress report.

# -----------------------------
# Join ACS Data to Census Tract Geometry
# -----------------------------

# Join SNAP, poverty, and demographic variables to the Brooklyn census tract shapes
# This creates one spatial dataset that has both geometry and ACS data
bk_tracts_acs <- bk_tracts %>%
  left_join(acs_bk, by = "GEOID")

# Confirm the join did not duplicate rows
#nrow(bk_tracts_acs)

# Check the joined spatial dataset
# This lets us confirm that the ACS variables were added to the tract geometry
#glimpse(bk_tracts_acs)
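
One way to check the join is to compare row counts and list the tracts that found no ACS match. The tables below are hypothetical stand-ins for `bk_tracts` and `acs_bk`, with made-up values.

```r
library(dplyr)

# Hypothetical stand-ins for the tract geometry table and the ACS table
toy_tracts <- tibble(GEOID = c("36047000100", "36047000200", "36047000301"))
toy_acs    <- tibble(GEOID = c("36047000100", "36047000200"), snap = c(21.4, 8.9))

# A left join keeps every tract and fills unmatched ACS columns with NA
toy_joined <- left_join(toy_tracts, toy_acs, by = "GEOID")

# The row count should equal the number of tracts when ACS keys are unique
nrow(toy_joined) == nrow(toy_tracts)

# Tracts with no matching ACS record
unmatched <- anti_join(toy_tracts, toy_acs, by = "GEOID")
unmatched
```

Unmatched tracts will appear in the missing-data color on the maps, so a long `unmatched` list usually points to a GEOID formatting problem rather than suppressed data.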

Map 1: SNAP Participation by Census Tract

This map shows the percentage of households receiving SNAP benefits by census tract in Brooklyn. SNAP participation is used as one indicator of food-related economic need. Darker areas represent higher SNAP participation, while lighter areas represent lower SNAP participation. NTA boundaries are included as a black outline to provide neighborhood-level context.

# -----------------------------
# Map SNAP Participation with Visible Tract Boundaries
# -----------------------------

# Create a SNAP choropleth map and draw tract boundaries on top
# This keeps missing-data tracts visible while showing their individual outlines
ggplot() +
  # Fill census tracts by SNAP percentage
  geom_sf(data = bk_tracts_acs, aes(fill = snap), color = NA) +
  
  # Draw all census tract boundaries on top of the fill layer
  geom_sf(data = bk_tracts_acs, fill = NA, color = "white", linewidth = 0.08) +
  
  # Draw NTA boundaries in black for neighborhood context
  geom_sf(data = bk_nta, fill = NA, color = "black", linewidth = 0.5) +
  
  # Set color scale
  scale_fill_gradient(
    low = "lightblue",
    high = "darkblue",
    na.value = scales::alpha("grey60", 0.45)
  ) +
  
  labs(
    title = "Households Receiving SNAP Benefits by Census Tract\nin Brooklyn (with NTA Boundaries)",
    fill = "% SNAP"
  ) +
  
  theme_minimal()

Map 2: Poverty Rate by Census Tract

This map shows the percentage of households below poverty by census tract in Brooklyn. Poverty is used as a broader measure of economic vulnerability. Comparing this map with the SNAP map helps identify whether food assistance participation and poverty follow similar spatial patterns.

# -----------------------------
# Map Poverty Rate by Census Tract
# -----------------------------

# Create a choropleth map showing percent of households below poverty
ggplot() +
  geom_sf(data = bk_tracts_acs, aes(fill = poverty), color = NA) +
  
  # Add tract boundaries
  geom_sf(data = bk_tracts_acs, fill = NA, color = "white", linewidth = 0.08) +
  
  # Add NTA boundaries
  geom_sf(data = bk_nta, fill = NA, color = "black", linewidth = 0.4) +
  
  scale_fill_gradient(
    low = "lightgreen",
    high = "darkgreen",
    na.value = scales::alpha("grey60", 0.45)
  ) +
  
  labs(
    title = "Households Below Poverty by Census Tract in Brooklyn (with NTA Boundaries)",
    fill = "% Poverty"
  ) +
  
  theme_minimal()

Create Standardized Need Variables

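To combine SNAP and poverty into one index, both variables need to be placed on a common scale. Both are percentages, but their means and spreads across tracts can differ, so I standardize each one as a z-score: the tract value minus the mean across Brooklyn tracts, divided by the standard deviation. This makes the two indicators comparable before combining them.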

# -----------------------------
# Create Standardized Need Variables
# -----------------------------

bk_tracts_acs <- bk_tracts_acs %>%
  mutate(
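    # scale() returns a one-column matrix, so convert back to a plain numeric vector
    snap_z = as.numeric(scale(snap)),
    poverty_z = as.numeric(scale(poverty))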
  )
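
The z-score calculation can be written out by hand to confirm what `scale()` does. This is a small sketch with made-up SNAP percentages, including one missing value.

```r
# Hypothetical SNAP percentages for five tracts, one of them missing
snap <- c(5, 10, 20, 35, NA)

# Manual z-score: subtract the mean, then divide by the standard deviation
snap_z_manual <- (snap - mean(snap, na.rm = TRUE)) / sd(snap, na.rm = TRUE)

# scale() performs the same calculation but returns a one-column matrix,
# so as.numeric() is used to get a plain vector back
snap_z <- as.numeric(scale(snap))

# The two approaches should agree; the missing tract stays NA
all.equal(snap_z, snap_z_manual)
```

A z-score of 0 means a tract sits at the Brooklyn average for that indicator, and positive values mean above-average need.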

Create Composite Food-Related Need Index

The composite need index is the average of standardized SNAP participation and standardized poverty rates. Higher values represent higher combined food-related need. This index is an early step toward the project's broader goal of identifying areas with high need.

# -----------------------------
# Create Composite Need Index
# -----------------------------

# Combine standardized SNAP and poverty into one index
# Higher values = higher overall need
bk_tracts_acs <- bk_tracts_acs %>%
  mutate(
    need_index = (snap_z + poverty_z) / 2
  )
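
It helps to see how the averaging formula treats tracts with a missing indicator. The z-scores below are made up, and the `rowMeans()` variant is an alternative shown for comparison only; it is an assumption, not the method used in this report.

```r
# Hypothetical z-scores for four tracts; the last two have missing indicators
snap_z    <- c(1.2, -0.5, 0.8, NA)
poverty_z <- c(0.8, -0.3, NA, NA)

# The formula used above: an NA in either input makes the index NA
need_index <- (snap_z + poverty_z) / 2

# Alternative for comparison: average whichever indicators are available
# (a tract with both indicators missing comes out as NaN)
need_index_partial <- rowMeans(cbind(snap_z, poverty_z), na.rm = TRUE)

need_index
need_index_partial
```

Keeping the stricter formula means a tract is only scored when both indicators are present, at the cost of more tracts drawn in the missing-data color.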

Map 3: Composite Food-Related Need Index

This map shows the composite food-related need index by census tract. The index combines SNAP participation and poverty rates into one measure. This map is useful because it summarizes the two indicators of need in a single spatial pattern.

# -----------------------------
# Map Composite Need Index
# -----------------------------

# Create a choropleth map showing combined food-related need
# The index combines standardized SNAP participation and poverty rates
# Higher values indicate greater overall need
ggplot() +
  geom_sf(data = bk_tracts_acs, aes(fill = need_index), color = NA) +
  
  # Add tract boundaries
  geom_sf(data = bk_tracts_acs, fill = NA, color = "white", linewidth = 0.08) +
  
  # Add NTA boundaries for neighborhood context
  geom_sf(data = bk_nta, fill = NA, color = "black", linewidth = 0.4) +
  
  scale_fill_gradient(
    low = "pink",
    high = "red",
    na.value = scales::alpha("grey60", 0.45)
  ) +
  
  labs(
    title = "Composite Food-Related Need Index by Census Tract in Brooklyn (with NTA Boundaries)",
    fill = "Need Index"
  ) +
  
  theme_minimal()

Current Progress Summary

So far, I have completed the main data preparation steps for the need side of the project. I imported and converted Brooklyn NTA boundaries, cleaned ACS demographic data, pulled Brooklyn census tract geometry, joined the ACS variables to the tract boundaries, and created initial maps of SNAP participation, poverty, and a composite need index.

The next step will be to incorporate food infrastructure datasets, including CEANYC food resource data and farmers market locations. These point datasets will be converted into spatial objects and compared against the need index to identify areas where high need may not be matched by nearby food infrastructure.

Progress Report - Identifying Spatial Gaps in Food Access and Infrastructure in Brooklyn

Amall Ali

05/01/2026