This project explores food access and pricing inequality in NYC, focusing on differences between Brownsville (Brooklyn) and Lower Manhattan. The motivating observation is that essential items (e.g., eggs) can be more expensive in Brooklyn bodegas than at “expensive” chains (Whole Foods, Trader Joe’s) in Manhattan.
Research question: Are staple grocery items more expensive and/or less accessible in low-income neighborhoods compared with higher-income neighborhoods in NYC?
Why it matters: If staples cost more or are harder to find where incomes are lowest, the results bear directly on food justice, urban planning, and resource allocation.
## Part 2 - Data
Cases. Each case is a geographic area (ZIP code or census tract), summarized by its store counts and density and by its demographics (income, poverty, SNAP). Optionally, a small self-collected price table (4–6 staple items) will supplement these area-level measures.
Collection method. Public administrative datasets (NYC Open Data, USDA), plus optional self-collected prices.
Type of study. Observational.
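The optional price table could be set up as an empty template before any prices are collected. The following is only a sketch: the item list (apart from eggs), the column names `store_name` and `price_usd`, and the object names are assumptions for illustration, and every price is left as `NA` until real observations are recorded.

```r
library(dplyr)
library(tidyr)

# Hypothetical item list: only eggs is named in this proposal
staple_items <- c("eggs_dozen", "milk_gallon", "bread_loaf", "rice_1lb")

# One row per area and item, with empty fields to fill in during collection
price_template <- expand_grid(
  area_id = c("Brownsville_11212", "LowerManhattan_10013"),
  item    = staple_items
) |>
  mutate(
    store_name = NA_character_,  # where the price was observed
    price_usd  = NA_real_        # observed shelf price, not yet collected
  )

# Once prices are entered, this gives the mean price per item and area
price_summary <- price_template |>
  group_by(item, area_id) |>
  summarise(mean_price = mean(price_usd, na.rm = TRUE), .groups = "drop")
```

Keeping the table in long format (one row per area and item) means the same `area_id` key can later be used to join prices to the area-level data.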
Data sources (for later):

- NYC Open Data – food retailers / FRESH
- USDA Food Access Research Atlas
- U.S. Census / ACS
For this test, we’ll use a tiny toy dataset so knitting works without any external files.
# ---------------------------------------------------------------
# NOTE: This "toy_store_counts" dataset is a TEMPORARY PLACEHOLDER
# It allows the R Markdown file to knit successfully before I
# download and import the real datasets (NYC Open Data, USDA, ACS).
#
# The toy data below simply mimics the structure of the actual data
# I will use later (area_id, n_stores, population, income, poverty).
# Once I have real data, I will replace this section with code that
# reads and joins my CSVs, as shown in later instructions.
# ---------------------------------------------------------------
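library(tidyverse)  # provides tribble(), mutate(), and ggplot()

toy_store_counts <- tribble(
  ~area_id,               ~n_stores, ~population, ~median_income, ~poverty_rate,
  "Brownsville_11212",    35,        86000,       36500,          0.27,
  "LowerManhattan_10013", 80,        62000,       112000,         0.11
) |>
  # Stores per 10,000 residents, so areas of different size are comparable
  mutate(store_density_per_10k = n_stores / (population / 10000))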
# Quick summary of store density
toy_store_counts
## # A tibble: 2 × 6
## area_id n_stores population median_income poverty_rate store_density_per_10k
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Brownsvi… 35 86000 36500 0.27 4.07
## 2 LowerMan… 80 62000 112000 0.11 12.9
summary(toy_store_counts$store_density_per_10k)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 4.070 6.278 8.486 8.486 10.695 12.903
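The placeholder note above says the toy data will later be replaced by code that reads and joins CSVs. One way that step might look is sketched here. The file names `store_counts.csv` and `demographics.csv` and the object names are hypothetical, and the files are written to a temporary directory from the toy values only so that the sketch runs without external data.

```r
library(dplyr)
library(readr)

# Hypothetical file names; the real exports will differ
stores_path <- file.path(tempdir(), "store_counts.csv")
demo_path   <- file.path(tempdir(), "demographics.csv")

# Stand-in files built from the toy values, so the sketch is self-contained
write_csv(
  data.frame(
    area_id  = c("Brownsville_11212", "LowerManhattan_10013"),
    n_stores = c(35, 80)
  ),
  stores_path
)
write_csv(
  data.frame(
    area_id       = c("Brownsville_11212", "LowerManhattan_10013"),
    population    = c(86000, 62000),
    median_income = c(36500, 112000),
    poverty_rate  = c(0.27, 0.11)
  ),
  demo_path
)

# Read each source, then join on the shared area identifier
store_counts <- read_csv(stores_path, show_col_types = FALSE)
demographics <- read_csv(demo_path, show_col_types = FALSE)

area_data <- store_counts |>
  left_join(demographics, by = "area_id") |>
  mutate(store_density_per_10k = n_stores / (population / 10000))
```

A `left_join()` keeps every area that has a store count, so areas missing demographic data show up as `NA` rather than silently disappearing.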
# Simple visualization (toy example)
ggplot(toy_store_counts, aes(median_income, store_density_per_10k, label = area_id)) +
  geom_point(size = 3) +
  geom_text(nudge_y = 0.2, size = 3) +
  labs(
    title = "Income vs Store Density (toy example)",
    x = "Median household income ($)",
    y = "Stores per 10,000 residents"
  )
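# Toy model to illustrate structure only: with 2 rows and 3 coefficients the
# fit is saturated, so poverty_rate is dropped (NA) and no standard errors exist.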
toy_model <- lm(store_density_per_10k ~ median_income + poverty_rate, data = toy_store_counts)
summary(toy_model)
##
## Call:
## lm(formula = store_density_per_10k ~ median_income + poverty_rate,
## data = toy_store_counts)
##
## Residuals:
## ALL 2 residuals are 0: no residual degrees of freedom!
##
## Coefficients: (1 not defined because of singularities)
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.200712 NaN NaN NaN
## median_income 0.000117 NaN NaN NaN
## poverty_rate NA NA NA NA
##
## Residual standard error: NaN on 0 degrees of freedom
## Multiple R-squared: 1, Adjusted R-squared: NaN
## F-statistic: NaN on 1 and 0 DF, p-value: NA
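The output above reports zero residual degrees of freedom and an undefined `poverty_rate` coefficient, because two observations cannot identify an intercept plus two slopes. A small check like the following could flag this before any results are interpreted. It is a sketch in base R that rebuilds the toy values; the names `toy`, `toy_fit`, and `has_residual_df` are illustrative, not part of the analysis plan.

```r
# Rebuild the toy data in base R so this check runs on its own
toy <- data.frame(
  area_id               = c("Brownsville_11212", "LowerManhattan_10013"),
  median_income         = c(36500, 112000),
  poverty_rate          = c(0.27, 0.11),
  store_density_per_10k = c(35 / 8.6, 80 / 6.2)
)

toy_fit <- lm(store_density_per_10k ~ median_income + poverty_rate, data = toy)

n_obs    <- nrow(toy)                  # number of areas
n_coef   <- length(coef(toy_fit))      # intercept + 2 slopes, including NA ones
resid_df <- df.residual(toy_fit)       # observations minus estimable coefficients

# Coefficients lm() could not estimate (reported as NA)
aliased <- names(coef(toy_fit))[is.na(coef(toy_fit))]

# Hypothetical helper: TRUE only if the fit leaves room to estimate error
has_residual_df <- function(fit) df.residual(fit) > 0

if (!has_residual_df(toy_fit)) {
  message("Model is saturated: ", n_obs, " observations, ", n_coef,
          " coefficients; not estimated: ", paste(aliased, collapse = ", "))
}
```

With the real data, the same check should pass once the number of areas comfortably exceeds the number of coefficients.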
## Part 5 - Conclusion
This analysis will quantify disparities in grocery access (and optionally staple prices) between NYC neighborhoods, informing equity and policy discussions.
## References

- NYC Open Data – Food retail/FRESH
- USDA ERS – Food Access Research Atlas
- U.S. Census Bureau – ACS