library(sf)
library(tidyverse)
library(readxl)
library(stringr)
library(tidycensus)
library(tidygeocoder)
library(scales)
library(knitr)
Access to affordable and nutritious food is an important part of neighborhood well-being, public health, and everyday quality of life. However, food access is not evenly distributed across urban communities. In many cities, neighborhoods with higher poverty, higher reliance on food assistance programs, and longer histories of disinvestment may not have the same level of access to food infrastructure as more advantaged areas.
This project examines spatial patterns of food-related need and food infrastructure across Brooklyn, New York. The main goal is to identify places where food-related need is high but food infrastructure is relatively limited. Rather than simply mapping poverty or counting food resources, this project compares need and access together in order to identify possible food access gaps.
The analysis uses census tracts as the main unit of analysis because census tracts provide detailed local geography. Neighborhood Tabulation Areas, or NTAs, are also used for neighborhood context on maps and for summarizing the final results at a more recognizable neighborhood scale.
The study area for this project is Brooklyn, New York. The project uses American Community Survey data, New York City spatial boundary data, and food infrastructure datasets. The demographic data come from the 2020 American Community Survey five-year estimates, specifically table S2201. The spatial boundary data include Brooklyn census tract geometry and New York City Neighborhood Tabulation Areas. The food infrastructure data include food pantries, farmers markets, and CEANYC community food resources.
Food infrastructure in this analysis includes farmers markets, CSAs, community gardens, community fridges, and food pantries. These resources are not all the same. They differ in cost, accessibility, reliability, and target population. For example, farmers markets and CSAs may be more common in higher-income neighborhoods, while food pantries and community fridges may play a more direct role in emergency food access. For that reason, this project treats the food resource count as a broad measure of food infrastructure, not as a perfect measure of food access.
The first spatial dataset used in this project is the 2020 New York
City Neighborhood Tabulation Area boundary file. This file contains a
geometry column stored as Well-Known Text. I converted the file into an
sf spatial object and filtered it to Brooklyn.
nta <- read_csv("Data/2020_Neighborhood_Tabulation_Areas_(NTAs)_20260501.csv")
nta_sf <- st_as_sf(nta, wkt = "the_geom", crs = 4326)
bk_nta <- nta_sf %>%
filter(BoroName == "Brooklyn")
This preliminary map was created to confirm that the NTA spatial data imported correctly and that the Brooklyn filter worked.
ggplot() +
geom_sf(data = bk_nta, fill = "lightblue", color = "black") +
labs(title = "Brooklyn NTA Boundaries") +
theme_minimal()
The American Community Survey data were imported from an Excel file. The file contained an extra descriptive row that was not part of the actual tract-level data, so that row was removed. I selected the variables needed for the analysis, including SNAP participation, poverty, and selected racial and ethnic household composition variables.
acs <- read_excel("Data/MINE_ACSST5Y2020.S2201.xlsx")
acs_clean <- acs %>%
slice(-1)
acs_clean <- acs_clean %>%
select(
GEO_ID,
NAME,
snap = S2201_C04_001E,
poverty = S2201_C02_021E,
pct_black = S2201_C02_026E,
pct_hispanic = S2201_C02_032E
)
The ACS variables imported as character values, so I converted them into numeric format before mapping and analysis. Values that could not be converted became missing values, which is expected when Census files include unavailable or suppressed entries.
acs_clean <- acs_clean %>%
mutate(
snap = as.numeric(snap),
poverty = as.numeric(poverty),
pct_black = as.numeric(pct_black),
pct_hispanic = as.numeric(pct_hispanic)
)
acs_bk <- acs_clean %>%
filter(str_detect(NAME, "Kings County"))
The ACS Excel file contains tract-level values but does not include
spatial geometry. I used tidycensus to retrieve Brooklyn
census tract boundaries. The variable B01003_001 represents
total population and was used later to calculate per capita food
infrastructure measures.
bk_tracts <- readRDS("Data/bk_tracts.rds")
The ACS table and tract geometry use different versions of the tract
identifier. The ACS file uses a GEO_ID field with a Census
prefix, while the tract geometry uses a shorter GEOID
field. I removed the prefix from the ACS identifier so the two datasets
could be joined correctly. I also removed duplicate tract records to
avoid multiplying rows during the join.
acs_bk <- acs_bk %>%
mutate(GEOID = str_remove(GEO_ID, "1400000US"))
acs_bk <- acs_bk %>%
distinct(GEOID, .keep_all = TRUE)
The cleaned ACS data were joined to the Brooklyn census tract geometry. This created the main tract-level spatial dataset used for the need maps and later food access analysis.
bk_tracts_acs <- bk_tracts %>%
left_join(acs_bk, by = "GEOID")
bk_tracts_acs <- bk_tracts_acs %>%
mutate(
population = ifelse(population == 0, NA, population)
)
This map shows the percentage of households receiving SNAP benefits by census tract in Brooklyn. SNAP participation is one indicator of food-related economic need.
ggplot() +
geom_sf(data = bk_tracts_acs, aes(fill = snap), color = NA) +
geom_sf(data = bk_tracts_acs, fill = NA, color = "white", size = 0.08) +
geom_sf(data = bk_nta, fill = NA, color = "black", size = 0.5) +
scale_fill_gradient(
low = "lightblue",
high = "darkblue",
na.value = scales::alpha("grey60", 0.45)
) +
labs(
title = "Households Receiving SNAP Benefits by Census Tract
in Brooklyn (with NTA Boundaries)",
fill = "% SNAP"
) +
theme_minimal()
This map shows the percentage of households below poverty by census tract in Brooklyn. Poverty is used as a broader measure of economic vulnerability.
ggplot() +
geom_sf(data = bk_tracts_acs, aes(fill = poverty), color = NA) +
geom_sf(data = bk_tracts_acs, fill = NA, color = "white", size = 0.08) +
geom_sf(data = bk_nta, fill = NA, color = "black", size = 0.4) +
scale_fill_gradient(
low = "lightgreen",
high = "darkgreen",
na.value = scales::alpha("grey60", 0.45)
) +
labs(
title = "Households Below Poverty by Census Tract in Brooklyn (with NTA Boundaries)",
fill = "% Poverty"
) +
theme_minimal()
The food infrastructure datasets include food pantries, CEANYC community food resources, and farmers markets.
food_pantries <- read_excel("Data/DYCD_Food_Pantries.xlsx")
ceanyc <- read_csv("Data/CEANYC_Seeding_Solidarity_Map_20260324.csv", show_col_types = FALSE)
farmers_markets <- read_csv("Data/NYC_Farmers_Markets_20260501.csv", show_col_types = FALSE)
The food pantry dataset did not initially include coordinates. To include pantries in the spatial analysis, I cleaned the address fields and created complete address strings for geocoding.
food_pantries_clean <- food_pantries %>%
filter(Borough == "Brooklyn") %>%
mutate(
Street_Address = str_squish(Street_Address),
ZIP = str_squish(as.character(ZIP))
) %>%
mutate(
full_address = paste(
Street_Address,
"Brooklyn",
"NY",
ZIP,
sep = ", "
)
)
Food pantry addresses were geocoded using OpenStreetMap through the
tidygeocoder package. This allowed the food pantries to be
converted into spatial points.
food_pantries_geo <- food_pantries_clean %>%
geocode(
address = full_address,
method = "osm",
lat = latitude,
long = longitude
)
Some pantry addresses did not geocode successfully on the first attempt. I inspected those records and created simplified addresses by removing floor, suite, and unit information that could interfere with geocoding.
food_pantries_retry <- food_pantries_geo %>%
filter(is.na(latitude) | is.na(longitude)) %>%
mutate(
simplified_address = full_address %>%
str_remove(", 2nd Floor.*") %>%
str_remove(", 3rd Floor.*") %>%
str_remove(" Flr.*") %>%
str_remove(" Suite.*") %>%
str_remove(" #.*") %>%
str_remove(" 2FL.*")
)
The simplified pantry addresses were geocoded again, and successful retry coordinates were merged back into the main pantry dataset.
food_pantries_retry_geo <- food_pantries_retry %>%
geocode(
address = simplified_address,
method = "osm",
lat = latitude_retry,
long = longitude_retry
)
food_pantries_geo_final <- food_pantries_geo %>%
left_join(
food_pantries_retry_geo %>%
select(Name, latitude_retry, longitude_retry),
by = "Name"
) %>%
mutate(
latitude = ifelse(is.na(latitude), latitude_retry, latitude),
longitude = ifelse(is.na(longitude), longitude_retry, longitude)
)
A small number of remaining pantry addresses required manual correction. These corrections were made to improve geocoding accuracy without changing the underlying pantry records.
food_pantries_geo_final <- food_pantries_geo_final %>%
mutate(
manual_address = case_when(
Name == "Community Help in Park Slope, Inc." ~ "200 4th Avenue, Brooklyn, NY 11217",
Name == "Mixteca Organization, Inc." ~ "245 23rd Street, Brooklyn, NY 11215",
TRUE ~ full_address
)
)
food_pantries_manual_retry <- food_pantries_geo_final %>%
filter(is.na(latitude) | is.na(longitude)) %>%
geocode(
address = manual_address,
method = "osm",
lat = latitude_manual,
long = longitude_manual
)
food_pantries_geo_final <- food_pantries_geo_final %>%
left_join(
food_pantries_manual_retry %>%
select(Name, latitude_manual, longitude_manual),
by = "Name"
) %>%
mutate(
latitude = ifelse(is.na(latitude), latitude_manual, latitude),
longitude = ifelse(is.na(longitude), longitude_manual, longitude)
)
Some pantry records geocoded outside the expected New York City coordinate range. These locations were reviewed and corrected using clearer street addresses.
food_pantries_geo_final <- food_pantries_geo_final %>%
mutate(
manual_address = case_when(
Name == "Apna Brooklyn Community Center, Inc." ~ "1236 Neptune Avenue, Brooklyn, NY 11235",
Name == "Health Essential Association, Inc." ~ "2101 East 16th Street, Brooklyn, NY 11229",
TRUE ~ manual_address
)
)
food_pantries_location_retry <- food_pantries_geo_final %>%
filter(Name %in% c(
"Apna Brooklyn Community Center, Inc.",
"Health Essential Association, Inc."
)) %>%
geocode(
address = manual_address,
method = "osm",
lat = latitude_location_fix,
long = longitude_location_fix
)
food_pantries_geo_final <- food_pantries_geo_final %>%
left_join(
food_pantries_location_retry %>%
select(Name, latitude_location_fix, longitude_location_fix),
by = "Name"
) %>%
mutate(
latitude = ifelse(!is.na(latitude_location_fix), latitude_location_fix, latitude),
longitude = ifelse(!is.na(longitude_location_fix), longitude_location_fix, longitude)
)
After the pantry coordinates were corrected, the food pantry dataset was converted into spatial points and transformed to match the census tract coordinate reference system.
food_pantries_sf <- food_pantries_geo_final %>%
st_as_sf(
coords = c("longitude", "latitude"),
crs = 4326
)
food_pantries_sf <- st_transform(
food_pantries_sf,
st_crs(bk_tracts_acs)
)
Food pantry points were spatially joined to Brooklyn census tracts and then counted by tract.
food_pantries_tracts <- food_pantries_sf %>%
st_join(
bk_tracts_acs %>% select(GEOID),
join = st_within,
left = FALSE
)
food_pantry_counts <- food_pantries_tracts %>%
st_drop_geometry() %>%
count(GEOID, name = "food_pantry_count")
The CEANYC dataset includes multiple resource types. For this project, I kept only gardens, CSAs, and community fridges.
ceanyc_clean <- ceanyc %>%
filter(Sector %in% c("Gardens", "CSAs", "Community Fridges"))
The CEANYC records were converted into spatial points using longitude and latitude.
ceanyc_sf <- ceanyc_clean %>%
st_as_sf(coords = c("lon", "lat"), crs = 4326)
Some CEANYC points were outside a reasonable New York City coordinate range, so they were removed.
ceanyc_sf <- ceanyc_sf %>%
filter(
st_coordinates(.)[, "X"] > -75,
st_coordinates(.)[, "X"] < -72,
st_coordinates(.)[, "Y"] > 40,
st_coordinates(.)[, "Y"] < 42
)
The CEANYC layer was transformed to match the tract layer. Since the CEANYC dataset did not have a borough column, Brooklyn resources were identified spatially by keeping only points that fell within Brooklyn census tracts.
ceanyc_sf <- st_transform(ceanyc_sf, st_crs(bk_tracts_acs))
ceanyc_bk <- ceanyc_sf %>%
st_join(
bk_tracts_acs %>% select(GEOID),
join = st_within,
left = FALSE
)
The farmers market dataset was filtered to Brooklyn. Invalid or missing coordinates were removed, and only records from 2025 were retained to avoid counting repeated markets from earlier years.
farmers_markets_bk <- farmers_markets %>%
filter(Borough == "Brooklyn")
farmers_markets_bk_clean <- farmers_markets_bk %>%
filter(
!is.na(Latitude),
!is.na(Longitude),
Latitude > 40,
Latitude < 42,
Longitude > -75,
Longitude < -72
)
farmers_markets_bk_clean <- farmers_markets_bk_clean %>%
filter(Year == 2025)
The cleaned farmers market data were converted into spatial points and transformed to match the census tract layer.
farmers_markets_sf <- farmers_markets_bk_clean %>%
st_as_sf(coords = c("Longitude", "Latitude"), crs = 4326)
farmers_markets_sf <- st_transform(farmers_markets_sf, st_crs(bk_tracts_acs))
Farmers market points were spatially joined to Brooklyn census tracts and counted by tract.
farmers_markets_tracts <- farmers_markets_sf %>%
st_join(
bk_tracts_acs %>% select(GEOID),
join = st_within,
left = FALSE
)
farmers_market_counts <- farmers_markets_tracts %>%
st_drop_geometry() %>%
count(GEOID, name = "farmers_market_count")
CEANYC resources were counted by census tract.
ceanyc_counts <- ceanyc_bk %>%
st_drop_geometry() %>%
count(GEOID, name = "ceanyc_count")
Food pantry, farmers market, and CEANYC count tables were joined back to the main census tract dataset. Missing counts were replaced with zero because a missing value means that no matching resource was found in that tract.
bk_tracts_food <- bk_tracts_acs %>%
left_join(ceanyc_counts, by = "GEOID") %>%
left_join(farmers_market_counts, by = "GEOID") %>%
left_join(food_pantry_counts, by = "GEOID") %>%
mutate(
ceanyc_count = replace_na(ceanyc_count, 0),
farmers_market_count = replace_na(farmers_market_count, 0),
food_pantry_count = replace_na(food_pantry_count, 0),
total_food_resources =
ceanyc_count +
farmers_market_count +
food_pantry_count
)
Because census tracts vary in population size, raw counts alone are not enough to compare access across tracts. I calculated food resources per 1,000 residents.
bk_tracts_food <- bk_tracts_food %>%
mutate(
food_resources_per_1000 = (total_food_resources / population) * 1000
)
Per capita measures can be unstable in very low-population areas. I checked how many tracts had populations below 100, 250, 500, and 1,000 residents.
bk_tracts_food %>%
st_drop_geometry() %>%
summarize(
tracts_under_100 = sum(population < 100, na.rm = TRUE),
tracts_under_250 = sum(population < 250, na.rm = TRUE),
tracts_under_500 = sum(population < 500, na.rm = TRUE),
tracts_under_1000 = sum(population < 1000, na.rm = TRUE)
) %>%
kable(caption = "Number of Brooklyn census tracts below selected population thresholds.")
| tracts_under_100 | tracts_under_250 | tracts_under_500 | tracts_under_1000 |
|---|---|---|---|
| 4 | 4 | 5 | 8 |
Census tracts with fewer than 500 residents were removed from the final access and gap analysis to reduce distortion from unstable per capita values.
bk_tracts_food <- bk_tracts_food %>%
filter(population >= 500)
Food resources per 1,000 residents were standardized using z-scores so that food access could be compared directly with the need index.
bk_tracts_food <- bk_tracts_food %>%
mutate(
access_z = as.numeric(scale(food_resources_per_1000))
)
The food access gap index was calculated by subtracting standardized access from the need index. Higher values indicate tracts where food-related need is high relative to food infrastructure access.
bk_tracts_food <- bk_tracts_food %>%
mutate(
gap_index = as.numeric(need_index) - access_z
)
To better understand the final index, I inspected the tracts with the highest and lowest gap values.
lowest_gap_tracts <- bk_tracts_food %>%
st_drop_geometry() %>%
select(GEOID, NAME.x, population, need_index, food_resources_per_1000, access_z, gap_index, total_food_resources) %>%
arrange(gap_index) %>%
head(10)
highest_gap_tracts <- bk_tracts_food %>%
st_drop_geometry() %>%
select(GEOID, NAME.x, population, need_index, food_resources_per_1000, access_z, gap_index, total_food_resources) %>%
arrange(desc(gap_index)) %>%
head(10)
kable(highest_gap_tracts, caption = "Ten census tracts with the highest food access gap index values.")
| GEOID | NAME.x | population | need_index | food_resources_per_1000 | access_z | gap_index | total_food_resources |
|---|---|---|---|---|---|---|---|
| 36047035200 | Census Tract 352, Kings County, New York | 1162 | 4.431523 | 0 | -0.4839898 | 4.915513 | 0 |
| 36047121000 | Census Tract 1210, Kings County, New York | 4203 | 3.482719 | 0 | -0.4839898 | 3.966709 | 0 |
| 36047080800 | Census Tract 808, Kings County, New York | 1668 | 3.454304 | 0 | -0.4839898 | 3.938293 | 0 |
| 36047025902 | Census Tract 259.02, Kings County, New York | 3650 | 3.111843 | 0 | -0.4839898 | 3.595832 | 0 |
| 36047091000 | Census Tract 910, Kings County, New York | 4199 | 3.082848 | 0 | -0.4839898 | 3.566838 | 0 |
| 36047038200 | Census Tract 382, Kings County, New York | 5714 | 3.064796 | 0 | -0.4839898 | 3.548786 | 0 |
| 36047008500 | Census Tract 85, Kings County, New York | 7577 | 2.993533 | 0 | -0.4839898 | 3.477523 | 0 |
| 36047053101 | Census Tract 531.01, Kings County, New York | 2487 | 2.992971 | 0 | -0.4839898 | 3.476960 | 0 |
| 36047091200 | Census Tract 912, Kings County, New York | 5549 | 2.906133 | 0 | -0.4839898 | 3.390123 | 0 |
| 36047048900 | Census Tract 489, Kings County, New York | 3591 | 2.882646 | 0 | -0.4839898 | 3.366636 | 0 |
kable(lowest_gap_tracts, caption = "Ten census tracts with the lowest food access gap index values.")
| GEOID | NAME.x | population | need_index | food_resources_per_1000 | access_z | gap_index | total_food_resources |
|---|---|---|---|---|---|---|---|
| 36047011901 | Census Tract 119.01, Kings County, New York | 1358 | -0.6405380 | 2.945508 | 7.884159 | -8.524697 | 4 |
| 36047115000 | Census Tract 1150, Kings County, New York | 2717 | 1.2046176 | 3.312477 | 8.926713 | -7.722095 | 9 |
| 36047005100 | Census Tract 51, Kings County, New York | 2815 | -1.2461460 | 2.131439 | 5.571399 | -6.817545 | 6 |
| 36047040500 | Census Tract 405, Kings County, New York | 1512 | -0.5810074 | 1.984127 | 5.152888 | -5.733896 | 3 |
| 36047029900 | Census Tract 299, Kings County, New York | 2297 | 0.0752661 | 2.176752 | 5.700134 | -5.624868 | 5 |
| 36047005900 | Census Tract 59, Kings County, New York | 1143 | -0.8783265 | 1.749781 | 4.487115 | -5.365442 | 2 |
| 36047050001 | Census Tract 500.01, Kings County, New York | 2154 | -1.2696329 | 1.392758 | 3.472816 | -4.742449 | 3 |
| 36047012901 | Census Tract 129.01, Kings County, New York | 2246 | -1.2702234 | 1.335708 | 3.310738 | -4.580962 | 3 |
| 36047051500 | Census Tract 515, Kings County, New York | 2608 | -1.2015384 | 1.150307 | 2.784016 | -3.985555 | 3 |
| 36047102801 | Census Tract 1028.01, Kings County, New York | 1652 | -0.9667787 | 1.210654 | 2.955461 | -3.922240 | 2 |
A Pearson correlation test was used to examine whether census tracts with higher food-related need also tend to have more food resources per 1,000 residents. This does not prove causation, but it helps summarize whether need and food infrastructure are aligned.
cor_test_result <- cor.test(
bk_tracts_food$need_index,
bk_tracts_food$food_resources_per_1000,
use = "complete.obs"
)
cor_test_result
##
## Pearson's product-moment correlation
##
## data: bk_tracts_food$need_index and bk_tracts_food$food_resources_per_1000
## t = 1.1821, df = 771, p-value = 0.2375
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.02806616 0.11270992
## sample estimates:
## cor
## 0.04253299
The correlation is very close to zero, indicating little to no linear relationship between food-related need and food infrastructure per capita. This suggests that food resources are not systematically located in areas with higher levels of need across Brooklyn.
This histogram shows the distribution of food access gap scores across Brooklyn census tracts. Positive scores indicate higher need relative to food infrastructure access.
bk_tracts_food %>%
st_drop_geometry() %>%
filter(!is.na(as.numeric(gap_index))) %>%
ggplot(aes(x = as.numeric(gap_index))) +
geom_histogram(
bins = 28,
fill = "indianred3",
color = "white",
linewidth = 0.3,
alpha = 0.85
) +
geom_vline(
xintercept = 0,
linetype = "dashed",
color = "black",
linewidth = 0.7
) +
annotate(
"text",
x = 0.15,
y = 115,
label = "0 = average balance\nbetween need and access",
hjust = 0,
size = 3.5
) +
labs(
title = "Distribution of Food Access Gap Scores Across Brooklyn",
subtitle = "Positive scores indicate higher need relative to food infrastructure access",
x = "Food Access Gap Index",
y = "Number of Census Tracts"
) +
theme_minimal(base_size = 13) +
theme(
plot.title = element_text(face = "bold"),
plot.subtitle = element_text(size = 11),
panel.grid.minor = element_blank()
)
This scatterplot compares the food-related need index with food resources per 1,000 residents. Each point represents a census tract. The trend line shows whether food resources increase or decrease as need increases.
cor_label <- paste0("Correlation = ", round(as.numeric(cor_test_result$estimate), 2))
bk_tracts_food %>%
st_drop_geometry() %>%
filter(
!is.na(as.numeric(need_index)),
!is.na(as.numeric(food_resources_per_1000)),
!is.na(as.numeric(gap_index))
) %>%
ggplot(aes(
x = as.numeric(need_index),
y = as.numeric(food_resources_per_1000)
)) +
geom_vline(
xintercept = 0,
linetype = "dashed",
color = "grey40",
linewidth = 0.6
) +
geom_hline(
yintercept = mean(bk_tracts_food$food_resources_per_1000, na.rm = TRUE),
linetype = "dashed",
color = "grey40",
linewidth = 0.6
) +
geom_point(aes(color = as.numeric(gap_index)), alpha = 0.65, size = 2) +
geom_smooth(
method = "lm",
color = "black",
linewidth = 1,
se = FALSE
) +
scale_color_gradient2(
low = "blue",
mid = "white",
high = "red",
midpoint = 0,
name = "Gap Index"
) +
annotate(
"text",
x = min(as.numeric(bk_tracts_food$need_index), na.rm = TRUE),
y = max(bk_tracts_food$food_resources_per_1000, na.rm = TRUE),
label = cor_label,
hjust = 0,
vjust = 1,
size = 4
) +
labs(
title = "Relationship Between Food Need and Food Access in Brooklyn",
subtitle = "Each point represents a census tract; red points indicate higher need relative to food access",
x = "Food-Related Need Index",
y = "Food Resources per 1,000 Residents"
) +
theme_minimal(base_size = 13) +
theme(
plot.title = element_text(face = "bold"),
plot.subtitle = element_text(size = 11),
panel.grid.minor = element_blank()
)
Although the main analysis is conducted at the census tract level, NTA names were added to make the final results easier to interpret at a neighborhood scale. I used representative points inside each tract to avoid duplicating tracts that touch more than one NTA boundary.
bk_tracts_food <- bk_tracts_food %>%
select(-any_of("NTAName"))
bk_nta <- st_transform(bk_nta, st_crs(bk_tracts_food))
tract_points <- st_point_on_surface(bk_tracts_food)
tract_nta_lookup <- tract_points %>%
st_join(
bk_nta %>% select(NTAName),
join = st_within,
left = TRUE
) %>%
st_drop_geometry() %>%
select(GEOID, NTAName)
bk_tracts_food <- bk_tracts_food %>%
left_join(tract_nta_lookup, by = "GEOID")
After adding NTA names, I summarized the average gap index and average need index for each neighborhood.
nta_summary <- bk_tracts_food %>%
st_drop_geometry() %>%
filter(!is.na(NTAName)) %>%
group_by(NTAName) %>%
summarize(
mean_gap = mean(as.numeric(gap_index), na.rm = TRUE),
mean_need = mean(as.numeric(need_index), na.rm = TRUE),
tract_count = n()
) %>%
arrange(desc(mean_gap))
kable(
nta_summary %>% head(15),
caption = "Top 15 Brooklyn NTAs by Average Food Access Gap Index"
)
| NTAName | mean_gap | mean_need | tract_count |
|---|---|---|---|
| South Williamsburg | 2.2994719 | 2.1553055 | 10 |
| Coney Island-Sea Gate | 1.5527298 | 1.3755191 | 12 |
| Brighton Beach | 1.4224115 | 1.0299955 | 8 |
| Borough Park | 1.2013610 | 0.8872324 | 21 |
| Spring Creek-Starrett City | 1.1769006 | 0.9980540 | 3 |
| Sunset Park (East)-Borough Park (West) | 1.0879354 | 0.6039456 | 10 |
| East New York-City Line | 0.8973269 | 0.7934019 | 12 |
| Gravesend (South) | 0.8271401 | 0.5166298 | 7 |
| Sunset Park (Central) | 0.7862316 | 0.3022419 | 13 |
| East Flatbush-Remsen Village | 0.6938158 | 0.4135765 | 9 |
| Brownsville | 0.5972298 | 1.7870418 | 14 |
| Gravesend (West) | 0.4160755 | 0.0728520 | 16 |
| Bensonhurst | 0.3972562 | 0.0280856 | 28 |
| Gravesend (East)-Homecrest | 0.3604554 | -0.1235344 | 18 |
| Sunset Park (West) | 0.3212744 | 0.1092701 | 15 |
This chart shows whether neighborhood-level food access gaps are spread evenly or concentrated among a smaller number of higher-gap neighborhoods.
ggplot(nta_summary, aes(x = mean_gap)) +
geom_histogram(
bins = 20,
fill = "steelblue",
color = "white"
) +
labs(
title = "Distribution of Average Gap Index by Brooklyn NTA",
x = "Average Gap Index",
y = "Number of NTAs"
) +
theme_minimal()
This bar chart shows the 15 Brooklyn neighborhoods with the highest average food access gap scores.
nta_summary %>%
slice_max(mean_gap, n = 15) %>%
mutate(NTAName = fct_reorder(NTAName, mean_gap)) %>%
ggplot(aes(x = NTAName, y = mean_gap, fill = mean_gap)) +
geom_col(width = 0.75) +
coord_flip() +
scale_fill_gradient(
low = "lightsalmon",
high = "darkred"
) +
labs(
title = "Top 15 Brooklyn Neighborhoods by Average Food Access Gap Index",
subtitle = "Higher values indicate greater mismatch between food need and food infrastructure",
x = "Neighborhood (NTA)",
y = "Average Gap Index",
fill = "Gap Index"
) +
theme_minimal(base_size = 13) +
theme(
plot.title = element_text(face = "bold"),
plot.subtitle = element_text(size = 11),
panel.grid.minor = element_blank(),
axis.text.y = element_text(size = 10)
)
# Top 15 NTAs by Average Need Index This bar chart shows the 15 Brooklyn
neighborhoods with the highest average food-related need scores.
nta_summary %>%
slice_max(mean_need, n = 15) %>%
mutate(NTAName = fct_reorder(NTAName, mean_need)) %>%
ggplot(aes(x = NTAName, y = mean_need, fill = mean_need)) +
geom_col(width = 0.75) +
coord_flip() +
scale_fill_gradient(
low = "lightyellow",
high = "darkred"
) +
labs(
title = "Top 15 Brooklyn Neighborhoods by Average Food-Related Need Index",
subtitle = "Based on standardized SNAP participation and poverty rates",
x = "Neighborhood (NTA)",
y = "Average Need Index",
fill = "Need Index"
) +
theme_minimal(base_size = 13) +
theme(
plot.title = element_text(face = "bold"),
plot.subtitle = element_text(size = 11),
panel.grid.minor = element_blank(),
axis.text.y = element_text(size = 10)
)
kable(
nta_summary %>% slice_max(mean_need, n = 15),
caption = "Top 15 NTAs by Average Food-Related Need Index"
)
| NTAName | mean_gap | mean_need | tract_count |
|---|---|---|---|
| South Williamsburg | 2.2994719 | 2.1553055 | 10 |
| Brownsville | 0.5972298 | 1.7870418 | 14 |
| Coney Island-Sea Gate | 1.5527298 | 1.3755191 | 12 |
| Brighton Beach | 1.4224115 | 1.0299955 | 8 |
| Spring Creek-Starrett City | 1.1769006 | 0.9980540 | 3 |
| East New York (North) | -0.9898533 | 0.9678449 | 13 |
| Borough Park | 1.2013610 | 0.8872324 | 21 |
| East New York-City Line | 0.8973269 | 0.7934019 | 12 |
| East New York-New Lots | 0.1912495 | 0.7896870 | 14 |
| Sunset Park (East)-Borough Park (West) | 1.0879354 | 0.6039456 | 10 |
| Ocean Hill | 0.2108762 | 0.5979447 | 10 |
| Bushwick (East) | -0.1573075 | 0.5339574 | 16 |
| Gravesend (South) | 0.8271401 | 0.5166298 | 7 |
| Bedford-Stuyvesant (West) | -0.2410075 | 0.4600780 | 21 |
| Cypress Hills | 0.2259705 | 0.4199031 | 14 |
The final maps label the top 10 neighborhoods by average gap index and average need index. This makes the maps easier to interpret.
top_gap_ntas <- nta_summary %>%
slice_max(mean_gap, n = 10) %>%
pull(NTAName)
top_need_ntas <- nta_summary %>%
slice_max(mean_need, n = 10) %>%
pull(NTAName)
bk_nta_labeled <- bk_nta %>%
left_join(nta_summary, by = "NTAName")
bk_nta_centroids <- bk_nta_labeled %>%
st_centroid()
This final gap map shows where food-related need is high relative to food infrastructure access. Higher values indicate higher need and lower relative access.
top_gap_centroids <- bk_nta_centroids %>%
filter(NTAName %in% top_gap_ntas)
top_gap_polygons <- bk_nta %>%
filter(NTAName %in% top_gap_ntas)
top_gap_centroids_regular <- top_gap_centroids %>%
filter(NTAName != "Sunset Park (Central)")
sunset_park_central_label <- top_gap_centroids %>%
filter(NTAName == "Sunset Park (Central)") %>%
mutate(
x = st_coordinates(.)[, 1],
y = st_coordinates(.)[, 2] + 0.005
) %>%
st_drop_geometry()
ggplot() +
geom_sf(
data = bk_tracts_food,
fill = "grey90",
color = NA
) +
geom_sf(
data = bk_tracts_food,
aes(fill = as.numeric(gap_index)),
color = NA
) +
geom_sf(
data = bk_tracts_food,
fill = NA,
color = "white",
size = 0.08
) +
geom_sf(
data = bk_nta,
fill = NA,
color = "black",
size = 0.35
) +
geom_sf(
data = top_gap_polygons,
fill = NA,
color = "black",
size = 0.8
) +
geom_sf_label(
data = top_gap_centroids_regular,
aes(label = NTAName),
size = 2.4,
fill = "white",
color = "black",
alpha = 0.92,
label.size = 0.18,
label.padding = unit(0.12, "lines")
) +
geom_label(
data = sunset_park_central_label,
aes(x = x, y = y, label = NTAName),
size = 2.4,
fill = "white",
color = "black",
alpha = 0.92,
linewidth = 0.18,
label.padding = unit(0.12, "lines")
) +
scale_fill_gradient2(
low = "blue",
mid = "white",
high = "red",
midpoint = 0,
na.value = "grey75",
breaks = c(-8, 0, 4),
labels = c("-8", "0", "4")
) +
labs(
title = "Food Access Gap Index by Census Tract in Brooklyn",
subtitle = "Higher values indicate higher food-related need and lower food infrastructure access
Labels and bold outlines show the 10 neighborhoods with greatest need-to-access mismatch",
fill = "Gap Index"
) +
coord_sf(clip = "off") +
theme_minimal(base_size = 13) +
theme(
plot.title = element_text(face = "bold"),
plot.subtitle = element_text(size = 11),
panel.grid.minor = element_blank(),
axis.title.x = element_blank(),
axis.title.y = element_blank(),
axis.text.x = element_blank(),
axis.text.y = element_blank(),
axis.ticks = element_blank()
)
The SNAP and poverty maps show that food-related economic vulnerability is not evenly distributed across Brooklyn. Several parts of the borough show higher SNAP participation and higher poverty, especially in central, eastern, and parts of southern Brooklyn. These patterns support the decision to combine SNAP participation and poverty into a broader need index.
The food-related need index identifies areas where multiple indicators of vulnerability overlap. Neighborhoods such as South Williamsburg, Brownsville, East New York, Borough Park, and parts of southern Brooklyn appear as important high-need areas in the analysis.
The food access gap index answers a different question. It does not only show where need is high. Instead, it shows where need is high relative to available food infrastructure. This distinction matters because a neighborhood can have high need but still have comparatively more food infrastructure than another high-need neighborhood.
The correlation analysis and scatterplot show a very weak relationship between the need index and food resources per 1,000 residents. In simple terms, higher-need census tracts do not consistently have more food resources per person. This suggests that food infrastructure is not systematically aligned with the areas of greatest food-related need across Brooklyn.
The neighborhood-level summary adds another layer of interpretation. South Williamsburg ranks highly for both average need and average gap, making it a key priority area in the analysis. Other neighborhoods, such as Brownsville and East New York, also show high need, but their gap scores differ. This shows that need and access mismatch are related, but they are not identical.
One limitation of this analysis is that food access is measured based on resources located within census tract boundaries. In real life, residents are not limited to their own census tract and may travel to nearby areas for food. This means the tract-based approach may underestimate access in places where resources are close by but fall outside the tract boundary.
Another limitation is that the project combines different types of food infrastructure into one total count. Food pantries, farmers markets, CSAs, community gardens, and community fridges do not all provide the same kind of access. They differ in cost, hours, eligibility, reliability, and purpose. For that reason, the total food resource count should be interpreted as a broad measure of food infrastructure rather than a perfect measure of actual food access.
Finally, this project measures where resources are located, but not whether residents actually use them. Future research could improve the analysis by adding distance-based or network-based accessibility measures, such as walking distance, transit access, or travel time.
This project used R to analyze spatial gaps between food-related need and food infrastructure across Brooklyn. By combining ACS demographic indicators, spatial boundary data, food infrastructure records, geocoding, spatial joins, per capita measures, and index construction, the project identified census tracts and neighborhoods where food-related need is high relative to available food resources.
The results show that food-related need is spatially clustered and that food infrastructure is not consistently aligned with the areas of greatest need. The weak correlation between need and food resources per 1,000 residents supports this finding. The food access gap index provides the clearest summary of the project’s main finding: some Brooklyn neighborhoods face a stronger mismatch between food-related vulnerability and food infrastructure access than others.
Overall, this project demonstrates how spatial analysis in R can move beyond simply mapping poverty or counting resources. By comparing need and access together, the analysis provides a more useful way to identify potential food access gaps and areas that may benefit from additional food-related resources, planning attention, or community investment.
Alaimo, K., Beavers, A. W., Crawford, C., Hodges Snyder, E., & Litt, J. S. (2016). Amplifying health through community gardens: A framework for advancing multicomponent, behaviorally based neighborhood interventions. Current Environmental Health Reports, 3(3), 302–312.
Pebesma, E. (2018). Simple features for R: Standardized support for spatial vector data. The R Journal, 10(1), 439–446.
Walker, K. (2023). tidycensus: Load US Census Boundary and Attribute Data as tidyverse and sf-ready data frames. R package.
Walker, R. E., Keane, C. R., & Burke, J. G. (2010). Disparities and access to healthy food in the United States: A review of food deserts literature. Health & Place, 16(5), 876–884.
Wickham, H. (2016). ggplot2: Elegant graphics for data analysis. Springer.
Wickham, H., Averick, M., Bryan, J., Chang, W., McGowan, L. D., François, R., Grolemund, G., Hayes, A., Henry, L., Hester, J., Kuhn, M., Pedersen, T. L., Miller, E., Bache, S. M., Müller, K., Ooms, J., Robinson, D., Seidel, D. P., Spinu, V., Takahashi, K., Vaughan, D., Wilke, C., Woo, K., & Yutani, H. (2019). Welcome to the tidyverse. Journal of Open Source Software, 4(43), 1686.
Zenk, S. N., Schulz, A. J., Israel, B. A., James, S. A., Bao, S., & Wilson, M. L. (2005). Neighborhood racial composition, neighborhood poverty, and the spatial accessibility of supermarkets in metropolitan Detroit. American Journal of Public Health, 95(4), 660–667.