library(sf)
library(tidyverse)
library(readxl)
library(stringr)
library(tidycensus)
library(tidygeocoder)
library(scales)
library(knitr)

Introduction

Access to affordable and nutritious food is an important part of neighborhood well-being, public health, and everyday quality of life. However, food access is not evenly distributed across urban communities. In many cities, neighborhoods with higher poverty, higher reliance on food assistance programs, and longer histories of disinvestment may not have the same level of access to food infrastructure as more advantaged areas.

This project examines spatial patterns of food-related need and food infrastructure across Brooklyn, New York. The main goal is to identify places where food-related need is high but food infrastructure is relatively limited. Rather than simply mapping poverty or counting food resources, this project compares need and access together in order to identify possible food access gaps.

The analysis uses census tracts as the main unit of analysis because census tracts provide detailed local geography. Neighborhood Tabulation Areas, or NTAs, are also used for neighborhood context on maps and for summarizing the final results at a more recognizable neighborhood scale.

Data Sources and Study Area

The study area for this project is Brooklyn, New York. The project uses American Community Survey data, New York City spatial boundary data, and food infrastructure datasets. The demographic data come from the 2020 American Community Survey five-year estimates, specifically table S2201. The spatial boundary data include Brooklyn census tract geometry and New York City Neighborhood Tabulation Areas. The food infrastructure data include food pantries, farmers markets, and CEANYC community food resources.

Food infrastructure in this analysis includes farmers markets, CSAs, community gardens, community fridges, and food pantries. These resources are not all the same. They differ in cost, accessibility, reliability, and target population. For example, farmers markets and CSAs may be more common in higher-income neighborhoods, while food pantries and community fridges may play a more direct role in emergency food access. For that reason, this project treats the food resource count as a broad measure of food infrastructure, not as a perfect measure of food access.

Import and Prepare Brooklyn NTA Boundary Data

The first spatial dataset used in this project is the 2020 New York City Neighborhood Tabulation Area boundary file. This file contains a geometry column stored as Well-Known Text. I converted the file into an sf spatial object and filtered it to Brooklyn.

nta <- read_csv("Data/2020_Neighborhood_Tabulation_Areas_(NTAs)_20260501.csv")
nta_sf <- st_as_sf(nta, wkt = "the_geom", crs = 4326)
bk_nta <- nta_sf %>%
  filter(BoroName == "Brooklyn")

Preliminary Map of Brooklyn NTAs

This preliminary map was created to confirm that the NTA spatial data imported correctly and that the Brooklyn filter worked.

ggplot() +
  geom_sf(data = bk_nta, fill = "lightblue", color = "black") +
  labs(title = "Brooklyn NTA Boundaries") +
  theme_minimal()

Import and Clean ACS Data

The American Community Survey data were imported from an Excel file. The file contained an extra descriptive row that was not part of the actual tract-level data, so that row was removed. I selected the variables needed for the analysis, including SNAP participation, poverty, and selected racial and ethnic household composition variables.

acs <- read_excel("Data/MINE_ACSST5Y2020.S2201.xlsx")
acs_clean <- acs %>%
  slice(-1)
acs_clean <- acs_clean %>%
  select(
    GEO_ID,
    NAME,
    snap = S2201_C04_001E,
    poverty = S2201_C02_021E,
    pct_black = S2201_C02_026E,
    pct_hispanic = S2201_C02_032E
  )

Convert ACS Variables to Numeric Format

The ACS variables imported as character values, so I converted them into numeric format before mapping and analysis. Values that could not be converted became missing values, which is expected when Census files include unavailable or suppressed entries.

acs_clean <- acs_clean %>%
  mutate(
    snap = as.numeric(snap),
    poverty = as.numeric(poverty),
    pct_black = as.numeric(pct_black),
    pct_hispanic = as.numeric(pct_hispanic)
  )
acs_bk <- acs_clean %>%
  filter(str_detect(NAME, "Kings County"))

Retrieve Brooklyn Census Tract Geometry

The ACS Excel file contains tract-level values but does not include spatial geometry. I used tidycensus to retrieve Brooklyn census tract boundaries. The variable B01003_001 represents total population and was used later to calculate per capita food infrastructure measures.

bk_tracts <- readRDS("Data/bk_tracts.rds")

Prepare ACS Data for Joining

The ACS table and tract geometry use different versions of the tract identifier. The ACS file uses a GEO_ID field with a Census prefix, while the tract geometry uses a shorter GEOID field. I removed the prefix from the ACS identifier so the two datasets could be joined correctly. I also removed duplicate tract records to avoid multiplying rows during the join.

acs_bk <- acs_bk %>%
  mutate(GEOID = str_remove(GEO_ID, "1400000US"))
acs_bk <- acs_bk %>%
  distinct(GEOID, .keep_all = TRUE)

Join ACS Data to Census Tract Geometry

The cleaned ACS data were joined to the Brooklyn census tract geometry. This created the main tract-level spatial dataset used for the need maps and later food access analysis.

bk_tracts_acs <- bk_tracts %>%
  left_join(acs_bk, by = "GEOID")
bk_tracts_acs <- bk_tracts_acs %>%
  mutate(
    population = ifelse(population == 0, NA, population)
  )

Map SNAP Participation by Census Tract

This map shows the percentage of households receiving SNAP benefits by census tract in Brooklyn. SNAP participation is one indicator of food-related economic need.

ggplot() +
  geom_sf(data = bk_tracts_acs, aes(fill = snap), color = NA) +
  geom_sf(data = bk_tracts_acs, fill = NA, color = "white", size = 0.08) +
  geom_sf(data = bk_nta, fill = NA, color = "black", size = 0.5) +
  scale_fill_gradient(
    low = "lightblue",
    high = "darkblue",
    na.value = scales::alpha("grey60", 0.45)
  ) +
  labs(
    title = "Households Receiving SNAP Benefits by Census Tract
              in Brooklyn (with NTA Boundaries)",
    fill = "% SNAP"
  ) +
  theme_minimal()

Map Poverty Rate by Census Tract

This map shows the percentage of households below poverty by census tract in Brooklyn. Poverty is used as a broader measure of economic vulnerability.

ggplot() +
  geom_sf(data = bk_tracts_acs, aes(fill = poverty), color = NA) +
  geom_sf(data = bk_tracts_acs, fill = NA, color = "white", size = 0.08) +
  geom_sf(data = bk_nta, fill = NA, color = "black", size = 0.4) +
  scale_fill_gradient(
    low = "lightgreen",
    high = "darkgreen",
    na.value = scales::alpha("grey60", 0.45)
  ) +
  labs(
    title = "Households Below Poverty by Census Tract in Brooklyn (with NTA Boundaries)",
    fill = "% Poverty"
  ) +
  theme_minimal()

Import Food Infrastructure Data

The food infrastructure datasets include food pantries, CEANYC community food resources, and farmers markets.

food_pantries <- read_excel("Data/DYCD_Food_Pantries.xlsx")
ceanyc <- read_csv("Data/CEANYC_Seeding_Solidarity_Map_20260324.csv", show_col_types = FALSE)
farmers_markets <- read_csv("Data/NYC_Farmers_Markets_20260501.csv", show_col_types = FALSE)

Clean Food Pantry Addresses

The food pantry dataset did not initially include coordinates. To include pantries in the spatial analysis, I cleaned the address fields and created complete address strings for geocoding.

food_pantries_clean <- food_pantries %>%
  filter(Borough == "Brooklyn") %>%
  mutate(
    Street_Address = str_squish(Street_Address),
    ZIP = str_squish(as.character(ZIP))
  ) %>%
  mutate(
    full_address = paste(
      Street_Address,
      "Brooklyn",
      "NY",
      ZIP,
      sep = ", "
    )
  )

Geocode Food Pantry Addresses

Food pantry addresses were geocoded using OpenStreetMap through the tidygeocoder package. This allowed the food pantries to be converted into spatial points.

food_pantries_geo <- food_pantries_clean %>%
  geocode(
    address = full_address,
    method = "osm",
    lat = latitude,
    long = longitude
  )

Inspect Failed Food Pantry Geocodes

Some pantry addresses did not geocode successfully on the first attempt. I inspected those records and created simplified addresses by removing floor, suite, and unit information that could interfere with geocoding.

food_pantries_retry <- food_pantries_geo %>%
  filter(is.na(latitude) | is.na(longitude)) %>%
  mutate(
    simplified_address = full_address %>%
      str_remove(", 2nd Floor.*") %>%
      str_remove(", 3rd Floor.*") %>%
      str_remove(" Flr.*") %>%
      str_remove(" Suite.*") %>%
      str_remove(" #.*") %>%
      str_remove(" 2FL.*")
  )

Retry Geocoding Simplified Food Pantry Addresses

The simplified pantry addresses were geocoded again, and successful retry coordinates were merged back into the main pantry dataset.

food_pantries_retry_geo <- food_pantries_retry %>%
  geocode(
    address = simplified_address,
    method = "osm",
    lat = latitude_retry,
    long = longitude_retry
  )
food_pantries_geo_final <- food_pantries_geo %>%
  left_join(
    food_pantries_retry_geo %>%
      select(Name, latitude_retry, longitude_retry),
    by = "Name"
  ) %>%
  mutate(
    latitude = ifelse(is.na(latitude), latitude_retry, latitude),
    longitude = ifelse(is.na(longitude), longitude_retry, longitude)
  )

Manually Fix Remaining Failed Pantry Addresses

A small number of remaining pantry addresses required manual correction. These corrections were made to improve geocoding accuracy without changing the underlying pantry records.

food_pantries_geo_final <- food_pantries_geo_final %>%
  mutate(
    manual_address = case_when(
      Name == "Community Help in Park Slope, Inc." ~ "200 4th Avenue, Brooklyn, NY 11217",
      Name == "Mixteca Organization, Inc." ~ "245 23rd Street, Brooklyn, NY 11215",
      TRUE ~ full_address
    )
  )
food_pantries_manual_retry <- food_pantries_geo_final %>%
  filter(is.na(latitude) | is.na(longitude)) %>%
  geocode(
    address = manual_address,
    method = "osm",
    lat = latitude_manual,
    long = longitude_manual
  )
food_pantries_geo_final <- food_pantries_geo_final %>%
  left_join(
    food_pantries_manual_retry %>%
      select(Name, latitude_manual, longitude_manual),
    by = "Name"
  ) %>%
  mutate(
    latitude = ifelse(is.na(latitude), latitude_manual, latitude),
    longitude = ifelse(is.na(longitude), longitude_manual, longitude)
  )

Fix Food Pantries Geocoded Outside New York City

Some pantry records geocoded outside the expected New York City coordinate range. These locations were reviewed and corrected using clearer street addresses.

food_pantries_geo_final <- food_pantries_geo_final %>%
  mutate(
    manual_address = case_when(
      Name == "Apna Brooklyn Community Center, Inc." ~ "1236 Neptune Avenue, Brooklyn, NY 11235",
      Name == "Health Essential Association, Inc." ~ "2101 East 16th Street, Brooklyn, NY 11229",
      TRUE ~ manual_address
    )
  )
food_pantries_location_retry <- food_pantries_geo_final %>%
  filter(Name %in% c(
    "Apna Brooklyn Community Center, Inc.",
    "Health Essential Association, Inc."
  )) %>%
  geocode(
    address = manual_address,
    method = "osm",
    lat = latitude_location_fix,
    long = longitude_location_fix
  )
food_pantries_geo_final <- food_pantries_geo_final %>%
  left_join(
    food_pantries_location_retry %>%
      select(Name, latitude_location_fix, longitude_location_fix),
    by = "Name"
  ) %>%
  mutate(
    latitude = ifelse(!is.na(latitude_location_fix), latitude_location_fix, latitude),
    longitude = ifelse(!is.na(longitude_location_fix), longitude_location_fix, longitude)
  )

Convert Food Pantries to Spatial Points

After the pantry coordinates were corrected, the food pantry dataset was converted into spatial points and transformed to match the census tract coordinate reference system.

food_pantries_sf <- food_pantries_geo_final %>%
  st_as_sf(
    coords = c("longitude", "latitude"),
    crs = 4326
  )
food_pantries_sf <- st_transform(
  food_pantries_sf,
  st_crs(bk_tracts_acs)
)

Assign Food Pantries to Census Tracts

Food pantry points were spatially joined to Brooklyn census tracts and then counted by tract.

food_pantries_tracts <- food_pantries_sf %>%
  st_join(
    bk_tracts_acs %>% select(GEOID),
    join = st_within,
    left = FALSE
  )
food_pantry_counts <- food_pantries_tracts %>%
  st_drop_geometry() %>%
  count(GEOID, name = "food_pantry_count")

Filter CEANYC to Relevant Food Resource Types

The CEANYC dataset includes multiple resource types. For this project, I kept only gardens, CSAs, and community fridges.

ceanyc_clean <- ceanyc %>%
  filter(Sector %in% c("Gardens", "CSAs", "Community Fridges"))

Convert CEANYC Resources to Spatial Points

The CEANYC records were converted into spatial points using longitude and latitude.

ceanyc_sf <- ceanyc_clean %>%
  st_as_sf(coords = c("lon", "lat"), crs = 4326)

Clean CEANYC Coordinates

Some CEANYC points were outside a reasonable New York City coordinate range, so they were removed.

ceanyc_sf <- ceanyc_sf %>%
  filter(
    st_coordinates(.)[, "X"] > -75,
    st_coordinates(.)[, "X"] < -72,
    st_coordinates(.)[, "Y"] > 40,
    st_coordinates(.)[, "Y"] < 42
  )

Filter CEANYC Points to Brooklyn

The CEANYC layer was transformed to match the tract layer. Since the CEANYC dataset did not have a borough column, Brooklyn resources were identified spatially by keeping only points that fell within Brooklyn census tracts.

ceanyc_sf <- st_transform(ceanyc_sf, st_crs(bk_tracts_acs))
ceanyc_bk <- ceanyc_sf %>%
  st_join(
    bk_tracts_acs %>% select(GEOID),
    join = st_within,
    left = FALSE
  )

Prepare Farmers Market Data

The farmers market dataset was filtered to Brooklyn. Invalid or missing coordinates were removed, and only records from 2025 were retained to avoid counting repeated markets from earlier years.

farmers_markets_bk <- farmers_markets %>%
  filter(Borough == "Brooklyn")
farmers_markets_bk_clean <- farmers_markets_bk %>%
  filter(
    !is.na(Latitude),
    !is.na(Longitude),
    Latitude > 40,
    Latitude < 42,
    Longitude > -75,
    Longitude < -72
  )
farmers_markets_bk_clean <- farmers_markets_bk_clean %>%
  filter(Year == 2025)

Convert Farmers Markets to Spatial Points

The cleaned farmers market data were converted into spatial points and transformed to match the census tract layer.

farmers_markets_sf <- farmers_markets_bk_clean %>%
  st_as_sf(coords = c("Longitude", "Latitude"), crs = 4326)
farmers_markets_sf <- st_transform(farmers_markets_sf, st_crs(bk_tracts_acs))

Assign Farmers Markets to Census Tracts

Farmers market points were spatially joined to Brooklyn census tracts and counted by tract.

farmers_markets_tracts <- farmers_markets_sf %>%
  st_join(
    bk_tracts_acs %>% select(GEOID),
    join = st_within,
    left = FALSE
  )
farmers_market_counts <- farmers_markets_tracts %>%
  st_drop_geometry() %>%
  count(GEOID, name = "farmers_market_count")

Count CEANYC Resources by Census Tract

CEANYC resources were counted by census tract.

ceanyc_counts <- ceanyc_bk %>%
  st_drop_geometry() %>%
  count(GEOID, name = "ceanyc_count")

Join Food Resource Counts to Census Tracts

Food pantry, farmers market, and CEANYC count tables were joined back to the main census tract dataset. Missing counts were replaced with zero because a missing value means that no matching resource was found in that tract.

bk_tracts_food <- bk_tracts_acs %>%
  left_join(ceanyc_counts, by = "GEOID") %>%
  left_join(farmers_market_counts, by = "GEOID") %>%
  left_join(food_pantry_counts, by = "GEOID") %>%
  mutate(
    ceanyc_count = replace_na(ceanyc_count, 0),
    farmers_market_count = replace_na(farmers_market_count, 0),
    food_pantry_count = replace_na(food_pantry_count, 0),
    total_food_resources =
      ceanyc_count +
      farmers_market_count +
      food_pantry_count
  )

Calculate Food Resources Per 1,000 Residents

Because census tracts vary in population size, raw counts alone are not enough to compare access across tracts. I calculated food resources per 1,000 residents.

bk_tracts_food <- bk_tracts_food %>%
  mutate(
    food_resources_per_1000 = (total_food_resources / population) * 1000
  )

Examine Low-Population Census Tracts

Per capita measures can be unstable in very low-population areas. I checked how many tracts had populations below 100, 250, 500, and 1,000 residents.

bk_tracts_food %>%
  st_drop_geometry() %>%
  summarize(
    tracts_under_100 = sum(population < 100, na.rm = TRUE),
    tracts_under_250 = sum(population < 250, na.rm = TRUE),
    tracts_under_500 = sum(population < 500, na.rm = TRUE),
    tracts_under_1000 = sum(population < 1000, na.rm = TRUE)
  ) %>%
  kable(caption = "Number of Brooklyn census tracts below selected population thresholds.")
Number of Brooklyn census tracts below selected population thresholds.
tracts_under_100 tracts_under_250 tracts_under_500 tracts_under_1000
4 4 5 8

Remove Low-Population Census Tracts

Census tracts with fewer than 500 residents were removed from the final access and gap analysis to reduce distortion from unstable per capita values.

bk_tracts_food <- bk_tracts_food %>%
  filter(population >= 500)

Standardize Food Access

Food resources per 1,000 residents were standardized using z-scores so that food access could be compared directly with the need index.

bk_tracts_food <- bk_tracts_food %>%
  mutate(
    access_z = as.numeric(scale(food_resources_per_1000))
  )

Create the Food Access Gap Index

The food access gap index was calculated by subtracting standardized access from the need index. Higher values indicate tracts where food-related need is high relative to food infrastructure access.

bk_tracts_food <- bk_tracts_food %>%
  mutate(
    gap_index = as.numeric(need_index) - access_z
  )

Inspect Highest and Lowest Gap Index Tracts

To better understand the final index, I inspected the tracts with the highest and lowest gap values.

lowest_gap_tracts <- bk_tracts_food %>%
  st_drop_geometry() %>%
  select(GEOID, NAME.x, population, need_index, food_resources_per_1000, access_z, gap_index, total_food_resources) %>%
  arrange(gap_index) %>%
  head(10)
highest_gap_tracts <- bk_tracts_food %>%
  st_drop_geometry() %>%
  select(GEOID, NAME.x, population, need_index, food_resources_per_1000, access_z, gap_index, total_food_resources) %>%
  arrange(desc(gap_index)) %>%
  head(10)
kable(highest_gap_tracts, caption = "Ten census tracts with the highest food access gap index values.")
Ten census tracts with the highest food access gap index values.
GEOID NAME.x population need_index food_resources_per_1000 access_z gap_index total_food_resources
36047035200 Census Tract 352, Kings County, New York 1162 4.431523 0 -0.4839898 4.915513 0
36047121000 Census Tract 1210, Kings County, New York 4203 3.482719 0 -0.4839898 3.966709 0
36047080800 Census Tract 808, Kings County, New York 1668 3.454304 0 -0.4839898 3.938293 0
36047025902 Census Tract 259.02, Kings County, New York 3650 3.111843 0 -0.4839898 3.595832 0
36047091000 Census Tract 910, Kings County, New York 4199 3.082848 0 -0.4839898 3.566838 0
36047038200 Census Tract 382, Kings County, New York 5714 3.064796 0 -0.4839898 3.548786 0
36047008500 Census Tract 85, Kings County, New York 7577 2.993533 0 -0.4839898 3.477523 0
36047053101 Census Tract 531.01, Kings County, New York 2487 2.992971 0 -0.4839898 3.476960 0
36047091200 Census Tract 912, Kings County, New York 5549 2.906133 0 -0.4839898 3.390123 0
36047048900 Census Tract 489, Kings County, New York 3591 2.882646 0 -0.4839898 3.366636 0
kable(lowest_gap_tracts, caption = "Ten census tracts with the lowest food access gap index values.")
Ten census tracts with the lowest food access gap index values.
GEOID NAME.x population need_index food_resources_per_1000 access_z gap_index total_food_resources
36047011901 Census Tract 119.01, Kings County, New York 1358 -0.6405380 2.945508 7.884159 -8.524697 4
36047115000 Census Tract 1150, Kings County, New York 2717 1.2046176 3.312477 8.926713 -7.722095 9
36047005100 Census Tract 51, Kings County, New York 2815 -1.2461460 2.131439 5.571399 -6.817545 6
36047040500 Census Tract 405, Kings County, New York 1512 -0.5810074 1.984127 5.152888 -5.733896 3
36047029900 Census Tract 299, Kings County, New York 2297 0.0752661 2.176752 5.700134 -5.624868 5
36047005900 Census Tract 59, Kings County, New York 1143 -0.8783265 1.749781 4.487115 -5.365442 2
36047050001 Census Tract 500.01, Kings County, New York 2154 -1.2696329 1.392758 3.472816 -4.742449 3
36047012901 Census Tract 129.01, Kings County, New York 2246 -1.2702234 1.335708 3.310738 -4.580962 3
36047051500 Census Tract 515, Kings County, New York 2608 -1.2015384 1.150307 2.784016 -3.985555 3
36047102801 Census Tract 1028.01, Kings County, New York 1652 -0.9667787 1.210654 2.955461 -3.922240 2

Correlation Test Between Need and Food Access

A Pearson correlation test was used to examine whether census tracts with higher food-related need also tend to have more food resources per 1,000 residents. This does not prove causation, but it helps summarize whether need and food infrastructure are aligned.

cor_test_result <- cor.test(
  bk_tracts_food$need_index,
  bk_tracts_food$food_resources_per_1000,
  use = "complete.obs"
)
cor_test_result
## 
##  Pearson's product-moment correlation
## 
## data:  bk_tracts_food$need_index and bk_tracts_food$food_resources_per_1000
## t = 1.1821, df = 771, p-value = 0.2375
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.02806616  0.11270992
## sample estimates:
##        cor 
## 0.04253299

The correlation is very close to zero, indicating little to no linear relationship between food-related need and food infrastructure per capita. This suggests that food resources are not systematically located in areas with higher levels of need across Brooklyn.

Distribution of the Gap Index

This histogram shows the distribution of food access gap scores across Brooklyn census tracts. Positive scores indicate higher need relative to food infrastructure access.

bk_tracts_food %>%
  st_drop_geometry() %>%
  filter(!is.na(as.numeric(gap_index))) %>%
  ggplot(aes(x = as.numeric(gap_index))) +
  geom_histogram(
    bins = 28,
    fill = "indianred3",
    color = "white",
    linewidth = 0.3,
    alpha = 0.85
  ) +
  geom_vline(
    xintercept = 0,
    linetype = "dashed",
    color = "black",
    linewidth = 0.7
  ) +
  annotate(
    "text",
    x = 0.15,
    y = 115,
    label = "0 = average balance\nbetween need and access",
    hjust = 0,
    size = 3.5
  ) +
  labs(
    title = "Distribution of Food Access Gap Scores Across Brooklyn",
    subtitle = "Positive scores indicate higher need relative to food infrastructure access",
    x = "Food Access Gap Index",
    y = "Number of Census Tracts"
  ) +
  theme_minimal(base_size = 13) +
  theme(
    plot.title = element_text(face = "bold"),
    plot.subtitle = element_text(size = 11),
    panel.grid.minor = element_blank()
  )

Scatterplot: Need vs Food Access

This scatterplot compares the food-related need index with food resources per 1,000 residents. Each point represents a census tract. The trend line shows whether food resources increase or decrease as need increases.

cor_label <- paste0("Correlation = ", round(as.numeric(cor_test_result$estimate), 2))
bk_tracts_food %>%
  st_drop_geometry() %>%
  filter(
    !is.na(as.numeric(need_index)),
    !is.na(as.numeric(food_resources_per_1000)),
    !is.na(as.numeric(gap_index))
  ) %>%
  ggplot(aes(
    x = as.numeric(need_index),
    y = as.numeric(food_resources_per_1000)
  )) +
  geom_vline(
    xintercept = 0,
    linetype = "dashed",
    color = "grey40",
    linewidth = 0.6
  ) +
  geom_hline(
    yintercept = mean(bk_tracts_food$food_resources_per_1000, na.rm = TRUE),
    linetype = "dashed",
    color = "grey40",
    linewidth = 0.6
  ) +
  geom_point(aes(color = as.numeric(gap_index)), alpha = 0.65, size = 2) +
  geom_smooth(
    method = "lm",
    color = "black",
    linewidth = 1,
    se = FALSE
  ) +
  scale_color_gradient2(
    low = "blue",
    mid = "white",
    high = "red",
    midpoint = 0,
    name = "Gap Index"
  ) +
  annotate(
    "text",
    x = min(as.numeric(bk_tracts_food$need_index), na.rm = TRUE),
    y = max(bk_tracts_food$food_resources_per_1000, na.rm = TRUE),
    label = cor_label,
    hjust = 0,
    vjust = 1,
    size = 4
  ) +
  labs(
    title = "Relationship Between Food Need and Food Access in Brooklyn",
    subtitle = "Each point represents a census tract; red points indicate higher need relative to food access",
    x = "Food-Related Need Index",
    y = "Food Resources per 1,000 Residents"
  ) +
  theme_minimal(base_size = 13) +
  theme(
    plot.title = element_text(face = "bold"),
    plot.subtitle = element_text(size = 11),
    panel.grid.minor = element_blank()
  )

Join NTA Names to Census Tracts

Although the main analysis is conducted at the census tract level, NTA names were added to make the final results easier to interpret at a neighborhood scale. I used representative points inside each tract to avoid duplicating tracts that touch more than one NTA boundary.

bk_tracts_food <- bk_tracts_food %>%
  select(-any_of("NTAName"))
bk_nta <- st_transform(bk_nta, st_crs(bk_tracts_food))
tract_points <- st_point_on_surface(bk_tracts_food)
tract_nta_lookup <- tract_points %>%
  st_join(
    bk_nta %>% select(NTAName),
    join = st_within,
    left = TRUE
  ) %>%
  st_drop_geometry() %>%
  select(GEOID, NTAName)
bk_tracts_food <- bk_tracts_food %>%
  left_join(tract_nta_lookup, by = "GEOID")

Summarize Gap Index and Need Index by NTA

After adding NTA names, I summarized the average gap index and average need index for each neighborhood.

nta_summary <- bk_tracts_food %>%
  st_drop_geometry() %>%
  filter(!is.na(NTAName)) %>%
  group_by(NTAName) %>%
  summarize(
    mean_gap = mean(as.numeric(gap_index), na.rm = TRUE),
    mean_need = mean(as.numeric(need_index), na.rm = TRUE),
    tract_count = n()
  ) %>%
  arrange(desc(mean_gap))
kable(
  nta_summary %>% head(15),
  caption = "Top 15 Brooklyn NTAs by Average Food Access Gap Index"
)
Top 15 Brooklyn NTAs by Average Food Access Gap Index
NTAName mean_gap mean_need tract_count
South Williamsburg 2.2994719 2.1553055 10
Coney Island-Sea Gate 1.5527298 1.3755191 12
Brighton Beach 1.4224115 1.0299955 8
Borough Park 1.2013610 0.8872324 21
Spring Creek-Starrett City 1.1769006 0.9980540 3
Sunset Park (East)-Borough Park (West) 1.0879354 0.6039456 10
East New York-City Line 0.8973269 0.7934019 12
Gravesend (South) 0.8271401 0.5166298 7
Sunset Park (Central) 0.7862316 0.3022419 13
East Flatbush-Remsen Village 0.6938158 0.4135765 9
Brownsville 0.5972298 1.7870418 14
Gravesend (West) 0.4160755 0.0728520 16
Bensonhurst 0.3972562 0.0280856 28
Gravesend (East)-Homecrest 0.3604554 -0.1235344 18
Sunset Park (West) 0.3212744 0.1092701 15

Distribution of Average Gap Index by NTA

This chart shows whether neighborhood-level food access gaps are spread evenly or concentrated among a smaller number of higher-gap neighborhoods.

ggplot(nta_summary, aes(x = mean_gap)) +
  geom_histogram(
    bins = 20,
    fill = "steelblue",
    color = "white"
  ) +
  labs(
    title = "Distribution of Average Gap Index by Brooklyn NTA",
    x = "Average Gap Index",
    y = "Number of NTAs"
  ) +
  theme_minimal()

Top 15 NTAs by Average Gap Index

This bar chart shows the 15 Brooklyn neighborhoods with the highest average food access gap scores.

nta_summary %>%
  slice_max(mean_gap, n = 15) %>%
  mutate(NTAName = fct_reorder(NTAName, mean_gap)) %>%
  ggplot(aes(x = NTAName, y = mean_gap, fill = mean_gap)) +
  geom_col(width = 0.75) +
  coord_flip() +
  scale_fill_gradient(
    low = "lightsalmon",
    high = "darkred"
  ) +
  labs(
    title = "Top 15 Brooklyn Neighborhoods by Average Food Access Gap Index",
    subtitle = "Higher values indicate greater mismatch between food need and food infrastructure",
    x = "Neighborhood (NTA)",
    y = "Average Gap Index",
    fill = "Gap Index"
  ) +
  theme_minimal(base_size = 13) +
  theme(
    plot.title = element_text(face = "bold"),
    plot.subtitle = element_text(size = 11),
    panel.grid.minor = element_blank(),
    axis.text.y = element_text(size = 10)
  )

# Top 15 NTAs by Average Need Index This bar chart shows the 15 Brooklyn neighborhoods with the highest average food-related need scores.

nta_summary %>%
  slice_max(mean_need, n = 15) %>%
  mutate(NTAName = fct_reorder(NTAName, mean_need)) %>%
  ggplot(aes(x = NTAName, y = mean_need, fill = mean_need)) +
  geom_col(width = 0.75) +
  coord_flip() +
  scale_fill_gradient(
    low = "lightyellow",
    high = "darkred"
  ) +
  labs(
    title = "Top 15 Brooklyn Neighborhoods by Average Food-Related Need Index",
    subtitle = "Based on standardized SNAP participation and poverty rates",
    x = "Neighborhood (NTA)",
    y = "Average Need Index",
    fill = "Need Index"
  ) +
  theme_minimal(base_size = 13) +
  theme(
    plot.title = element_text(face = "bold"),
    plot.subtitle = element_text(size = 11),
    panel.grid.minor = element_blank(),
    axis.text.y = element_text(size = 10)
  )

kable(
  nta_summary %>% slice_max(mean_need, n = 15),
  caption = "Top 15 NTAs by Average Food-Related Need Index"
)
Top 15 NTAs by Average Food-Related Need Index
NTAName mean_gap mean_need tract_count
South Williamsburg 2.2994719 2.1553055 10
Brownsville 0.5972298 1.7870418 14
Coney Island-Sea Gate 1.5527298 1.3755191 12
Brighton Beach 1.4224115 1.0299955 8
Spring Creek-Starrett City 1.1769006 0.9980540 3
East New York (North) -0.9898533 0.9678449 13
Borough Park 1.2013610 0.8872324 21
East New York-City Line 0.8973269 0.7934019 12
East New York-New Lots 0.1912495 0.7896870 14
Sunset Park (East)-Borough Park (West) 1.0879354 0.6039456 10
Ocean Hill 0.2108762 0.5979447 10
Bushwick (East) -0.1573075 0.5339574 16
Gravesend (South) 0.8271401 0.5166298 7
Bedford-Stuyvesant (West) -0.2410075 0.4600780 21
Cypress Hills 0.2259705 0.4199031 14

Identify Top NTA Labels

The final maps label the top 10 neighborhoods by average gap index and average need index. This makes the maps easier to interpret.

top_gap_ntas <- nta_summary %>%
  slice_max(mean_gap, n = 10) %>%
  pull(NTAName)
top_need_ntas <- nta_summary %>%
  slice_max(mean_need, n = 10) %>%
  pull(NTAName)
bk_nta_labeled <- bk_nta %>%
  left_join(nta_summary, by = "NTAName")
bk_nta_centroids <- bk_nta_labeled %>%
  st_centroid()

Final Map: Food Access Gap Index

This final gap map shows where food-related need is high relative to food infrastructure access. Higher values indicate higher need and lower relative access.

top_gap_centroids <- bk_nta_centroids %>%
  filter(NTAName %in% top_gap_ntas)
top_gap_polygons <- bk_nta %>%
  filter(NTAName %in% top_gap_ntas)
top_gap_centroids_regular <- top_gap_centroids %>%
  filter(NTAName != "Sunset Park (Central)")
sunset_park_central_label <- top_gap_centroids %>%
  filter(NTAName == "Sunset Park (Central)") %>%
  mutate(
    x = st_coordinates(.)[, 1],
    y = st_coordinates(.)[, 2] + 0.005
  ) %>%
  st_drop_geometry()
ggplot() +
  geom_sf(
    data = bk_tracts_food,
    fill = "grey90",
    color = NA
  ) +
  geom_sf(
    data = bk_tracts_food,
    aes(fill = as.numeric(gap_index)),
    color = NA
  ) +
  geom_sf(
    data = bk_tracts_food,
    fill = NA,
    color = "white",
    size = 0.08
  ) +
  geom_sf(
    data = bk_nta,
    fill = NA,
    color = "black",
    size = 0.35
  ) +
  geom_sf(
    data = top_gap_polygons,
    fill = NA,
    color = "black",
    size = 0.8
  ) +
  geom_sf_label(
    data = top_gap_centroids_regular,
    aes(label = NTAName),
    size = 2.4,
    fill = "white",
    color = "black",
    alpha = 0.92,
    label.size = 0.18,
    label.padding = unit(0.12, "lines")
  ) +
  geom_label(
    data = sunset_park_central_label,
    aes(x = x, y = y, label = NTAName),
    size = 2.4,
    fill = "white",
    color = "black",
    alpha = 0.92,
    linewidth = 0.18,
    label.padding = unit(0.12, "lines")
  ) +
  scale_fill_gradient2(
    low = "blue",
    mid = "white",
    high = "red",
    midpoint = 0,
    na.value = "grey75",
    breaks = c(-8, 0, 4),
    labels = c("-8", "0", "4")
  ) +
  labs(
    title = "Food Access Gap Index by Census Tract in Brooklyn",
    subtitle = "Higher values indicate higher food-related need and lower food infrastructure access
    Labels and bold outlines show the 10 neighborhoods with greatest need-to-access mismatch",
    fill = "Gap Index"
  ) +
  coord_sf(clip = "off") +
  theme_minimal(base_size = 13) +
  theme(
    plot.title = element_text(face = "bold"),
    plot.subtitle = element_text(size = 11),
    panel.grid.minor = element_blank(),
    axis.title.x = element_blank(),
    axis.title.y = element_blank(),
    axis.text.x = element_blank(),
    axis.text.y = element_blank(),
    axis.ticks = element_blank()
  )

Results and Interpretation

The SNAP and poverty maps show that food-related economic vulnerability is not evenly distributed across Brooklyn. Several parts of the borough show higher SNAP participation and higher poverty, especially in central, eastern, and parts of southern Brooklyn. These patterns support the decision to combine SNAP participation and poverty into a broader need index.

The food-related need index identifies areas where multiple indicators of vulnerability overlap. Neighborhoods such as South Williamsburg, Brownsville, East New York, Borough Park, and parts of southern Brooklyn appear as important high-need areas in the analysis.

The food access gap index answers a different question. It does not only show where need is high. Instead, it shows where need is high relative to available food infrastructure. This distinction matters because a neighborhood can have high need but still have comparatively more food infrastructure than another high-need neighborhood.

The correlation analysis and scatterplot show a very weak relationship between the need index and food resources per 1,000 residents. In simple terms, higher-need census tracts do not consistently have more food resources per person. This suggests that food infrastructure is not systematically aligned with the areas of greatest food-related need across Brooklyn.

The neighborhood-level summary adds another layer of interpretation. South Williamsburg ranks highly for both average need and average gap, making it a key priority area in the analysis. Other neighborhoods, such as Brownsville and East New York, also show high need, but their gap scores differ. This shows that need and access mismatch are related, but they are not identical.

Limitations

One limitation of this analysis is that food access is measured based on resources located within census tract boundaries. In real life, residents are not limited to their own census tract and may travel to nearby areas for food. This means the tract-based approach may underestimate access in places where resources are close by but fall outside the tract boundary.

Another limitation is that the project combines different types of food infrastructure into one total count. Food pantries, farmers markets, CSAs, community gardens, and community fridges do not all provide the same kind of access. They differ in cost, hours, eligibility, reliability, and purpose. For that reason, the total food resource count should be interpreted as a broad measure of food infrastructure rather than a perfect measure of actual food access.

Finally, this project measures where resources are located, but not whether residents actually use them. Future research could improve the analysis by adding distance-based or network-based accessibility measures, such as walking distance, transit access, or travel time.

Conclusion

This project used R to analyze spatial gaps between food-related need and food infrastructure across Brooklyn. By combining ACS demographic indicators, spatial boundary data, food infrastructure records, geocoding, spatial joins, per capita measures, and index construction, the project identified census tracts and neighborhoods where food-related need is high relative to available food resources.

The results show that food-related need is spatially clustered and that food infrastructure is not consistently aligned with the areas of greatest need. The weak correlation between need and food resources per 1,000 residents supports this finding. The food access gap index provides the clearest summary of the project’s main finding: some Brooklyn neighborhoods face a stronger mismatch between food-related vulnerability and food infrastructure access than others.

Overall, this project demonstrates how spatial analysis in R can move beyond simply mapping poverty or counting resources. By comparing need and access together, the analysis provides a more useful way to identify potential food access gaps and areas that may benefit from additional food-related resources, planning attention, or community investment.

References

Alaimo, K., Beavers, A. W., Crawford, C., Hodges Snyder, E., & Litt, J. S. (2016). Amplifying health through community gardens: A framework for advancing multicomponent, behaviorally based neighborhood interventions. Current Environmental Health Reports, 3(3), 302–312.

Pebesma, E. (2018). Simple features for R: Standardized support for spatial vector data. The R Journal, 10(1), 439–446.

Walker, K. (2023). tidycensus: Load US Census Boundary and Attribute Data as tidyverse and sf-ready data frames. R package.

Walker, R. E., Keane, C. R., & Burke, J. G. (2010). Disparities and access to healthy food in the United States: A review of food deserts literature. Health & Place, 16(5), 876–884.

Wickham, H. (2016). ggplot2: Elegant graphics for data analysis. Springer.

Wickham, H., Averick, M., Bryan, J., Chang, W., McGowan, L. D., François, R., Grolemund, G., Hayes, A., Henry, L., Hester, J., Kuhn, M., Pedersen, T. L., Miller, E., Bache, S. M., Müller, K., Ooms, J., Robinson, D., Seidel, D. P., Spinu, V., Takahashi, K., Vaughan, D., Wilke, C., Woo, K., & Yutani, H. (2019). Welcome to the tidyverse. Journal of Open Source Software, 4(43), 1686.

Zenk, S. N., Schulz, A. J., Israel, B. A., James, S. A., Bao, S., & Wilson, M. L. (2005). Neighborhood racial composition, neighborhood poverty, and the spatial accessibility of supermarkets in metropolitan Detroit. American Journal of Public Health, 95(4), 660–667.