DATA 110_ Final Submission Project 3

Author

Catherine Z. Matenje

Disparities in Food Access across United States Census Tracts

“Food Desert” by Shirley Cannon

Introduction

This project explores disparities in food access across U.S. census tracts using data from the USDA Food Access Research Atlas. The dataset contains 72,531 observations and 147 variables and provides information on socioeconomic conditions, transportation access, and geographic characteristics. After I trimmed the original Excel file to the relevant variables, 17 variables remained.

Research Question: How are socioeconomic and structural factors associated with low food access across communities?

The key predictors in this analysis are poverty rate, median family income, vehicle access, and urban classification. Food access is measured both as a continuous variable (the percentage of the population living far from supermarkets) and as a categorical indicator of food deserts.

I selected this topic because limited access to healthy food is associated with disparities in nutrition, chronic disease, and overall well-being. I believe that understanding these patterns at the community level can help inform public health interventions and policy decisions aimed at achieving food equity.

Background Research:

Limited access to healthy and affordable food remains an important public health issue in the United States. Areas with low food access, also known as “food deserts,” are communities where residents have difficulty accessing supermarkets or stores with nutritious food options due to distance, transportation barriers, or low income (U.S. Department of Agriculture [USDA], 2023). Research shows that transportation access plays a major role in food access disparities, especially in rural and low-income communities where residents may lack reliable access to vehicles or public transportation (Centers for Disease Control and Prevention [CDC], 2025; U.S. Hunger, 2023). In addition, food access challenges have been linked to increased risks of chronic conditions such as obesity and diabetes, as limited access to healthy food can lead to increased reliance on ultra-processed or less nutritious food options (Walker et al., 2010; Beaulac et al., 2009). According to the USDA, both income and vehicle access are key factors that influence whether households can consistently obtain healthy food (USDA, 2023). This highlights the importance of examining how poverty, transportation, and geographic location contribute to food access disparities across communities.

Variables

| Variable | Definition | Role |
|---|---|---|
| CensusTract | Unique identifier for each census tract (small geographic unit used by the U.S. Census Bureau) | Identifier |
| State | State in which the census tract is located | Identifier |
| County | County in which the census tract is located | Identifier |
| Urban | Indicates whether the tract is classified as urban (1) or rural (0) | Predictor |
| PovertyRate | Percentage of the population living below the federal poverty line | Predictor |
| MedianFamilyIncome | Median family income within the census tract | Predictor |
| HUNVFlag | Indicates low vehicle access (tracts where many households do not have a vehicle and are far from a supermarket) | Predictor |
| LILATracts_1And10 | Indicates whether a tract is both low-income and has low access to supermarkets (food desert indicator) | Outcome |
| lapop1share | Percentage of the population living more than 1 mile from a supermarket | Alternative Outcome |

Load Libraries

# Load required libraries
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.6
✔ forcats   1.0.1     ✔ stringr   1.6.0
✔ ggplot2   4.0.2     ✔ tibble    3.3.1
✔ lubridate 1.9.4     ✔ tidyr     1.3.2
✔ purrr     1.2.1     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(RColorBrewer)
library(ggalluvial)
library(ggfortify)

Load Data set

# Load dataset

setwd("C:/Users/cathe/OneDrive/Desktop/Montgomery College Transition/2025-2026 MONTGOMERY COLLEGE TRANSITION/MC COURSES 25-26/Spring 2026/DATA 110/02. Projects/Project 3 - Final")

food_data <- read_csv("FoodAccessResearchAtlasData2019_original.csv")
Warning: One or more parsing issues, call `problems()` on your data frame for details,
e.g.:
  dat <- vroom(...)
  problems(dat)
Rows: 72531 Columns: 147
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (114): CensusTract, State, County, MedianFamilyIncome, LAPOP1_10, LAPOP0...
dbl  (33): Urban, Pop2010, OHU2010, GroupQuartersFlag, NUMGQTRS, PCTGQTRS, L...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
head(food_data)
# A tibble: 6 × 147
  CensusTract State   County    Urban Pop2010 OHU2010 GroupQuartersFlag NUMGQTRS
  <chr>       <chr>   <chr>     <dbl>   <dbl>   <dbl>             <dbl>    <dbl>
1 01001020100 Alabama Autauga …     1    1912     693                 0        0
2 01001020200 Alabama Autauga …     1    2170     743                 0      181
3 01001020300 Alabama Autauga …     1    3373    1256                 0        0
4 01001020400 Alabama Autauga …     1    4386    1722                 0        0
5 01001020500 Alabama Autauga …     1   10766    4082                 0      181
6 01001020600 Alabama Autauga …     1    3668    1311                 0        0
# ℹ 139 more variables: PCTGQTRS <dbl>, LILATracts_1And10 <dbl>,
#   LILATracts_halfAnd10 <dbl>, LILATracts_1And20 <dbl>,
#   LILATracts_Vehicle <dbl>, HUNVFlag <dbl>, LowIncomeTracts <dbl>,
#   PovertyRate <dbl>, MedianFamilyIncome <chr>, LA1and10 <dbl>,
#   LAhalfand10 <dbl>, LA1and20 <dbl>, LATracts_half <dbl>, LATracts1 <dbl>,
#   LATracts10 <dbl>, LATracts20 <dbl>, LATractsVehicle_20 <dbl>,
#   LAPOP1_10 <chr>, LAPOP05_10 <chr>, LAPOP1_20 <chr>, LALOWI1_10 <chr>, …
dim(food_data)
[1] 72531   147
food_data_updated <- read_csv("FoodAccessResearchAtlasData2019_upd.csv")
Warning: One or more parsing issues, call `problems()` on your data frame for details,
e.g.:
  dat <- vroom(...)
  problems(dat)
Rows: 72531 Columns: 17
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr  (6): CensusTract, State, County, MedianFamilyIncome, lapop1, lapop1share
dbl (11): Urban, Pop2010, OHU2010, GroupQuartersFlag, NUMGQTRS, PCTGQTRS, LI...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

Data Cleaning

Explore the data

# View structure of dataset
head(food_data_updated)
# A tibble: 6 × 17
  CensusTract State   County    Urban Pop2010 OHU2010 GroupQuartersFlag NUMGQTRS
  <chr>       <chr>   <chr>     <dbl>   <dbl>   <dbl>             <dbl>    <dbl>
1 01001020100 Alabama Autauga …     1    1912     693                 0        0
2 01001020200 Alabama Autauga …     1    2170     743                 0      181
3 01001020300 Alabama Autauga …     1    3373    1256                 0        0
4 01001020400 Alabama Autauga …     1    4386    1722                 0        0
5 01001020500 Alabama Autauga …     1   10766    4082                 0      181
6 01001020600 Alabama Autauga …     1    3668    1311                 0        0
# ℹ 9 more variables: PCTGQTRS <dbl>, LILATracts_1And10 <dbl>, HUNVFlag <dbl>,
#   LowIncomeTracts <dbl>, PovertyRate <dbl>, MedianFamilyIncome <chr>,
#   LA1and10 <dbl>, lapop1 <chr>, lapop1share <chr>
dim(food_data_updated)
[1] 72531    17

Cleaning and Renaming Variables

# Select relevant variables
clean_data <- food_data_updated |>
  select(CensusTract, State, County, Urban,
         PovertyRate, MedianFamilyIncome,
         HUNVFlag, LILATracts_1And10, lapop1share) |>
# Remove missing values
  drop_na() |>
# Rename variables to shorter names
  rename(
    poverty_rate = PovertyRate,
    median_income = MedianFamilyIncome,
    low_vehicle_access = HUNVFlag,
    food_desert = LILATracts_1And10,
    low_access_share = lapop1share,
    urban = Urban
  ) |>
# Convert categorical variables to factors
  mutate(
    urban = as.factor(urban),
    low_vehicle_access = as.factor(low_vehicle_access),
    food_desert = as.factor(food_desert),
# Convert some variables to numeric
    low_access_share = as.numeric(low_access_share),
    median_income = as.numeric(median_income)
  ) |>

 # Remove rows with missing outcome values
  drop_na(low_access_share, median_income)
Warning: There were 2 warnings in `mutate()`.
The first warning was:
ℹ In argument: `low_access_share = as.numeric(low_access_share)`.
Caused by warning:
! NAs introduced by coercion
ℹ Run `dplyr::last_dplyr_warnings()` to see the 1 remaining warning.
# Check cleaned data
str(clean_data)
tibble [52,047 × 9] (S3: tbl_df/tbl/data.frame)
 $ CensusTract       : chr [1:52047] "01001020100" "01001020200" "01001020300" "01001020400" ...
 $ State             : chr [1:52047] "Alabama" "Alabama" "Alabama" "Alabama" ...
 $ County            : chr [1:52047] "Autauga County" "Autauga County" "Autauga County" "Autauga County" ...
 $ urban             : Factor w/ 2 levels "0","1": 2 2 2 2 2 2 2 1 1 1 ...
 $ poverty_rate      : num [1:52047] 11.3 17.9 15 2.8 15.2 21.6 30.5 8.9 13.7 9.8 ...
 $ median_income     : num [1:52047] 81250 49000 62609 70607 96334 ...
 $ low_vehicle_access: Factor w/ 2 levels "0","1": 1 1 1 1 2 1 1 1 2 1 ...
 $ food_desert       : Factor w/ 2 levels "0","1": 1 2 1 1 1 2 2 1 1 1 ...
 $ low_access_share  : num [1:52047] 99.2 58.1 46 31.1 24.6 ...
names(clean_data)
[1] "CensusTract"        "State"              "County"            
[4] "urban"              "poverty_rate"       "median_income"     
[7] "low_vehicle_access" "food_desert"        "low_access_share"  
sum(is.na(clean_data$low_access_share))
[1] 0
sum(is.na(clean_data$median_income))
[1] 0
summary(clean_data)
 CensusTract           State              County          urban    
 Length:52047       Length:52047       Length:52047       0:17219  
 Class :character   Class :character   Class :character   1:34828  
 Mode  :character   Mode  :character   Mode  :character            
                                                                   
                                                                   
                                                                   
  poverty_rate  median_income    low_vehicle_access food_desert
 Min.   : 0.0   Min.   :  2499   0:39375            0:42837    
 1st Qu.: 6.0   1st Qu.: 54061   1:12672            1: 9210    
 Median :10.8   Median : 70455                                 
 Mean   :13.7   Mean   : 78216                                 
 3rd Qu.:18.4   3rd Qu.: 94375                                 
 Max.   :99.5   Max.   :250001                                 
 low_access_share
 Min.   :  0.00  
 1st Qu.: 17.86  
 Median : 55.04  
 Mean   : 53.97  
 3rd Qu.: 93.78  
 Max.   :100.00  

To prepare the data for analysis, I first selected only the variables relevant to my research question and removed the unnecessary columns from the Excel file. I then renamed the variables to make them easier to understand and work with. Some variables were converted to the correct type: categorical variables became factors, and the low access and median income variables became numeric so they could be used in graphs and regression. Some values could not be converted and became missing values (NA). To keep the analysis accurate, I removed observations with missing values in the outcome variable or in median income, which left 52,047 census tracts. After cleaning, the dataset was ready for exploratory and statistical analysis.
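Before dropping rows, it can help to see how many values failed the numeric conversion and which text tokens caused the failures. The sketch below uses a small made-up vector. The `"NULL"` token is an assumption standing in for whatever marks missing data in the real file, and the names `raw_share`, `num_share`, `failed`, `n_failed`, and `bad_tokens` are hypothetical.

```r
# Hypothetical raw values; "NULL" stands in for whatever text token marks
# missing data in the real file (an assumption, not confirmed by the source).
raw_share <- c("99.2", "58.1", "NULL", "31.1", "NULL")

# Convert to numeric; suppressWarnings() hides the expected coercion warning.
num_share <- suppressWarnings(as.numeric(raw_share))

# Entries that were present as text but could not be converted.
failed <- !is.na(raw_share) & is.na(num_share)

n_failed   <- sum(failed)                 # how many rows would be dropped
bad_tokens <- unique(raw_share[failed])   # which tokens caused the problem

print(n_failed)
print(bad_tokens)
```

Running the same check on the real columns would show whether the dropped rows share one missing-value code or come from several different problems.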

Updated List of Variables after cleaning

| Variable | Definition | Role |
|---|---|---|
| CensusTract | Unique identifier for each census tract, a small geographic unit used for statistical analysis by the U.S. Census Bureau | Identifier |
| State | U.S. state in which the census tract is located | Identifier |
| County | County in which the census tract is located | Identifier |
| urban | Indicator of whether the census tract is classified as urban (1) or rural (0) based on population density and land use | Predictor |
| poverty_rate | Percentage of individuals in the census tract living below the federal poverty line | Predictor |
| median_income | Median family income within the census tract, representing the midpoint of the family income distribution | Predictor |
| low_vehicle_access | Indicator of whether the census tract has low vehicle access, meaning a significant share of households do not have access to a vehicle and may face transportation barriers to reaching food retailers | Predictor |
| food_desert | Binary indicator of whether the census tract is classified as a low-income, low-access area (i.e., a “food desert”) based on distance to supermarkets and income thresholds | Secondary Outcome |
| low_access_share | Percentage of the population in the census tract that lives more than one mile from the nearest supermarket, representing the level of geographic food access | Primary Outcome |

In this analysis, low_access_share is used as the primary outcome variable because it is a continuous measure of food access, while food_desert is included as a secondary categorical indicator.

Statistical Analysis

I plan to fit a multiple linear regression. Before doing so, I will examine a correlation matrix to check for multicollinearity among the numeric predictors.

library(corrplot)
Warning: package 'corrplot' was built under R version 4.5.3
corrplot 0.95 loaded
# Select numeric variables only
numeric_data <- clean_data |>
  select(poverty_rate, median_income, low_access_share)

# Correlation matrix
cor_matrix <- cor(numeric_data)

corrplot(cor_matrix, 
         method = "color",
         col = colorRampPalette(c("lightblue", "white", "lightpink"))(200),
         addCoef.col = "black",
         tl.col = "black",
         tl.srt = 45)
Warning in ind1:ind2: numerical expression has 2 elements: only the first used

The correlation matrix shows that most of the relationships between the variables are weak. The strongest relationship is between poverty rate and median income (r = -0.66), which is a moderately strong negative relationship: areas with higher poverty tend to have lower income, as expected. However, the relationships between low food access and both poverty rate (r = -0.11) and median income (r = -0.02) are very weak. This is surprising, and it suggests that poverty and income on their own do not strongly explain differences in low food access across census tracts in this dataset. Overall, there is only a weak linear relationship between these variables and low food access. The multiple linear regression below examines this further.
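To put the r = -0.66 correlation in multicollinearity terms, one rough check is the variance inflation factor (VIF). The sketch below is a simplification that considers only the two numeric predictors. The regression models also include urban classification and vehicle access, so the VIFs for the fitted models could differ. The name `vif_two_predictors` is hypothetical.

```r
# Correlation between poverty rate and median income reported above.
r <- -0.66

# With only two predictors, the R-squared from regressing one on the other
# is simply r^2, so VIF = 1 / (1 - r^2).
vif_two_predictors <- 1 / (1 - r^2)

print(round(vif_two_predictors, 2))
```

A value of about 1.77 is well below the commonly cited warning thresholds of 5 or 10, so under this simplification the two predictors do not appear to be problematically collinear.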

Multiple Linear Regression

Model 1

model <- lm(low_access_share ~ poverty_rate + median_income + urban + low_vehicle_access, 
            data = clean_data)

summary(model)

Call:
lm(formula = low_access_share ~ poverty_rate + median_income + 
    urban + low_vehicle_access, data = clean_data)

Residuals:
    Min      1Q  Median      3Q     Max 
-88.658 -25.990   0.538  16.765  89.460 

Coefficients:
                      Estimate Std. Error  t value Pr(>|t|)    
(Intercept)          8.675e+01  5.732e-01  151.352  < 2e-16 ***
poverty_rate        -3.276e-01  1.723e-02  -19.016  < 2e-16 ***
median_income        1.805e-05  5.002e-06    3.607 0.000309 ***
urban1              -4.578e+01  2.858e-01 -160.190  < 2e-16 ***
low_vehicle_access1  3.821e+00  3.298e-01   11.587  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 29.62 on 52042 degrees of freedom
Multiple R-squared:  0.3489,    Adjusted R-squared:  0.3489 
F-statistic:  6972 on 4 and 52042 DF,  p-value: < 2.2e-16

Model 2:

As a manual backward-elimination step, I removed median income from the model. Its coefficient is statistically significant (p = 0.0003), but its effect size is very small.

model2 <- lm(low_access_share ~ poverty_rate + urban + low_vehicle_access, data = clean_data)

summary(model2)

Call:
lm(formula = low_access_share ~ poverty_rate + urban + low_vehicle_access, 
    data = clean_data)

Residuals:
    Min      1Q  Median      3Q     Max 
-87.715 -25.959   0.543  16.691  91.562 

Coefficients:
                     Estimate Std. Error t value Pr(>|t|)    
(Intercept)          88.55951    0.27821  318.32   <2e-16 ***
poverty_rate         -0.36719    0.01329  -27.64   <2e-16 ***
urban1              -45.53610    0.27759 -164.04   <2e-16 ***
low_vehicle_access1   3.74428    0.32914   11.38   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 29.62 on 52043 degrees of freedom
Multiple R-squared:  0.3487,    Adjusted R-squared:  0.3487 
F-statistic:  9289 on 3 and 52043 DF,  p-value: < 2.2e-16
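One caveat when judging the median income coefficient by its size: Model 1 reports it per dollar (1.805e-05), so the number looks tiny partly because of the unit. Per $10,000 of income, the same estimate is about 0.18 percentage points. The sketch below uses simulated, hypothetical data (not the Atlas data) to show that rescaling a predictor changes the slope by the unit factor and leaves the fit unchanged. All variable names are made up for illustration.

```r
set.seed(110)  # arbitrary seed for reproducibility

# Simulated, hypothetical data (not the Food Access Research Atlas).
n      <- 200
income <- runif(n, 20000, 150000)                     # dollars
share  <- 60 + 0.00002 * income + rnorm(n, sd = 10)   # made-up outcome

# Same model, two units for income.
fit_dollars <- lm(share ~ income)
income_10k  <- income / 10000
fit_10k     <- lm(share ~ income_10k)

coef_dollars <- unname(coef(fit_dollars)["income"])
coef_10k     <- unname(coef(fit_10k)["income_10k"])
r2_dollars   <- summary(fit_dollars)$r.squared
r2_10k       <- summary(fit_10k)$r.squared

# The slope scales by the unit change; R-squared stays the same.
print(c(per_dollar = coef_dollars, per_10k = coef_10k))
print(c(r2_dollars = r2_dollars, r2_10k = r2_10k))
```

Expressing income in units of $10,000 before fitting would make the coefficient easier to read without changing the model.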

Interpretation of Final Regression Model

  1. The intercept is 88.56 (p < 2e-16). Because urban and low_vehicle_access are factors with "0" as the reference level, this is the expected percentage of low food access for a rural tract with adequate vehicle access and a poverty rate of 0%. While this value is statistically significant, it mainly serves as a baseline, because a poverty rate of exactly 0% sits at the extreme low end of the data.

  2. For poverty rate, a 1 percentage point increase in poverty is associated, on average, with about a 0.37 percentage point decrease in low food access, holding the other predictors constant (β = -0.37, p < 2e-16). This relationship is statistically significant. However, the negative direction is unusual. Perhaps poverty on its own does not fully explain differences in food access.

  3. For urban classification, urban areas have about 45.54 percentage points lower low food access compared to rural areas (β = -45.54, p < 2e-16). This difference is statistically significant and is the strongest effect in the model. This result indicates that urban areas generally have much better access to food than rural areas.

  4. For low vehicle access, areas with limited access to vehicles have about 3.74 percentage points higher low food access (β = 3.74, p < 2e-16). This relationship is statistically significant and suggests that transportation barriers are an important factor influencing food access.

  5. The adjusted R-squared for Model 1 (full model) is 0.3489 (34.89%), while the adjusted R-squared for Model 2 (reduced model) is 0.3487 (34.87%). Removing median income therefore changed the model’s explanatory power only negligibly.

  6. Because the adjusted R-squared remained nearly the same, I selected Model 2 as the final model.
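To make the coefficients concrete, here is a worked example that plugs the Model 2 estimates into the regression equation for two hypothetical tracts, both with a 20% poverty rate. The helper `predict_share()` and the example tracts are made up for illustration. Only the coefficient values come from the output above.

```r
# Coefficients copied from the Model 2 summary output.
b0        <- 88.55951   # intercept: rural, adequate vehicle access, 0% poverty
b_poverty <- -0.36719   # per percentage point of poverty
b_urban   <- -45.53610  # urban vs. rural
b_lowveh  <- 3.74428    # low vehicle access vs. adequate

# Hypothetical helper for illustration only.
predict_share <- function(poverty, urban, low_vehicle) {
  b0 + b_poverty * poverty + b_urban * urban + b_lowveh * low_vehicle
}

# Rural tract with adequate vehicle access, 20% poverty.
rural_pred <- predict_share(poverty = 20, urban = 0, low_vehicle = 0)

# Urban tract with low vehicle access, 20% poverty.
urban_pred <- predict_share(poverty = 20, urban = 1, low_vehicle = 1)

print(round(c(rural = rural_pred, urban_low_vehicle = urban_pred), 2))
```

The predicted low-access share is about 81.22% for the rural tract and about 39.42% for the urban tract, which shows how strongly the urban coefficient dominates the other terms.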

Diagnostic Plots

autoplot(model2, nrow = 2, ncol = 2)
Warning: `fortify(<lm>)` was deprecated in ggplot2 4.0.0.
ℹ Please use `broom::augment(<lm>)` instead.
ℹ The deprecated feature was likely used in the ggfortify package.
  Please report the issue at <https://github.com/sinhrks/ggfortify/issues>.
Warning: `aes_string()` was deprecated in ggplot2 3.0.0.
ℹ Please use tidy evaluation idioms with `aes()`.
ℹ See also `vignette("ggplot2-in-packages")` for more information.
ℹ The deprecated feature was likely used in the ggfortify package.
  Please report the issue at <https://github.com/sinhrks/ggfortify/issues>.
Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
ℹ Please use `linewidth` instead.
ℹ The deprecated feature was likely used in the ggfortify package.
  Please report the issue at <https://github.com/sinhrks/ggfortify/issues>.

Interpretation of Diagnostic Plots

  1. Residuals vs Fitted: The residuals are mostly scattered around zero, but there is a slight curve in the smooth line. The clustered patterns are expected because of categorical predictors such as urban. Overall, linearity is mostly satisfied with minor deviations.

  2. Normal Q-Q: The points generally follow the reference line, indicating approximate normality. There are some deviations at the tails, suggesting minor non-normality, but no major issues.

  3. Scale-Location: The spread of residuals is not perfectly constant and shows a slight trend. This suggests mild heteroscedasticity, but it is not severe.

  4. Residuals vs Leverage: Most points have low leverage, with a few slightly higher ones. However, none appear highly influential, so no single observation strongly affects the model.
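The visual impression of mild heteroscedasticity in the Scale-Location plot could be supplemented with a formal test. The sketch below computes a studentized (Koenker) Breusch-Pagan statistic by hand in base R on simulated, hypothetical data. It was not run on the project data, and all variable names are made up. Packages such as lmtest offer a ready-made `bptest()` function for the same purpose.

```r
set.seed(110)  # arbitrary seed for reproducibility

# Simulated, hypothetical data whose error variance grows with x.
n <- 500
x <- runif(n, 0, 10)
y <- 5 + 2 * x + rnorm(n, sd = 0.5 + 0.5 * x)

fit <- lm(y ~ x)

# Auxiliary regression: squared residuals on the same predictor(s).
sq_resid <- resid(fit)^2
aux      <- lm(sq_resid ~ x)

# Studentized Breusch-Pagan statistic: n * R^2 of the auxiliary regression,
# compared to a chi-squared with df = number of predictors.
bp_stat <- n * summary(aux)$r.squared
bp_df   <- length(coef(aux)) - 1
p_value <- pchisq(bp_stat, df = bp_df, lower.tail = FALSE)

print(c(statistic = bp_stat, df = bp_df, p_value = p_value))
```

With more than 52,000 tracts, a test like this can flag even slight departures from constant variance, so the result is best read alongside the plot.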

Visualizations

Alluvial

I chose an alluvial plot to visualize how multiple factors, such as urban/rural classification and vehicle access, influence food access. This type of plot is useful for showing how observations flow across categories and helps reveal patterns that are not easily seen in simpler plots.

Creating categorical variables for alluvial

clean_data2 <- clean_data |>
  mutate(
    poverty_category = case_when(
      poverty_rate < 10 ~ "Low Poverty",
      poverty_rate >= 10 & poverty_rate < 20 ~ "Medium Poverty",
      poverty_rate >= 20 ~ "High Poverty"
    ),
    
    foodaccess_category = case_when(
      low_access_share < 33 ~ "Low Access Issues",
      low_access_share >= 33 & low_access_share < 66 ~ "Moderate Access Issues",
      low_access_share >= 66 ~ "High Access Issues"
    )
  )
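The same binning can be written with base R's `cut()`, which also returns an ordered set of factor levels in one step. This is an alternative sketch, not the method used above. The example values are hypothetical and chosen to sit on and around the cut points.

```r
# Hypothetical poverty rates chosen to sit on and around the cut points.
poverty_rate <- c(5, 10, 19.9, 20, 50)

# right = FALSE makes intervals closed on the left: [a, b),
# matching the ">= 10 & < 20" logic of the case_when() rules.
poverty_category <- cut(
  poverty_rate,
  breaks = c(-Inf, 10, 20, Inf),
  labels = c("Low Poverty", "Medium Poverty", "High Poverty"),
  right  = FALSE
)

print(data.frame(poverty_rate, poverty_category))
```

Here 10 falls in "Medium Poverty" and 20 falls in "High Poverty", the same boundary behavior as the `case_when()` version.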

Summarizing counts for the alluvial plot and converting categorical variables to factors

alluvial_data <- clean_data2 |>
  group_by(urban, low_vehicle_access, foodaccess_category) |>
  summarise(count = n(), .groups = "drop") |>
  mutate(
    foodaccess_category = factor(
      foodaccess_category,
      levels = c("Low Access Issues", "Moderate Access Issues", "High Access Issues")
    ),
    urban = ifelse(urban == 1, "Urban", "Rural"),
    low_vehicle_access = ifelse(low_vehicle_access == 1, "Low Vehicle Access", "Adequate Vehicle Access")
  )

unique(alluvial_data$foodaccess_category)
[1] High Access Issues     Low Access Issues      Moderate Access Issues
Levels: Low Access Issues Moderate Access Issues High Access Issues
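Because the alluvial flows are drawn from raw counts, and the cleaned data contain roughly twice as many urban tracts as rural ones, row proportions can make the urban and rural groups easier to compare. The sketch below uses made-up counts, not values from the dataset, and the names `counts` and `row_props` are hypothetical.

```r
# Hypothetical tract counts for illustration; not taken from the dataset.
counts <- matrix(
  c(10, 20, 70,
    60, 25, 15),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    area   = c("Rural", "Urban"),
    access = c("Low Access Issues", "Moderate Access Issues", "High Access Issues")
  )
)

# margin = 1 divides each row by its row total.
row_props <- prop.table(counts, margin = 1)

print(round(row_props, 2))
```

Each row then sums to 1, so the share of tracts with high access issues can be read directly for each group.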

Alluvial Plot

ggplot(alluvial_data,
       aes(axis1 = urban,
           axis2 = low_vehicle_access,
           axis3 = foodaccess_category,
           y = count)) +
  
  geom_alluvium(aes(fill = foodaccess_category),
                width = 0.15,
                alpha = 0.85,
                color = "white") +
  
  scale_fill_manual(values = c(
    "Low Access Issues" = "#A8D5BA",      
    "Moderate Access Issues" = "#AFCBFF", 
    "High Access Issues" = "#F4A6A6"    
  )) +
  
  scale_x_discrete(limits = c("Urban", "Vehicle Access", "Food Access Level"),
                   expand = c(0.05, 0.05)) +
  
  labs(
    title = "Flow of Food Access Disparities Across Communities",
    y = "Number of Census Tracts",
    fill = "Access Level",
    caption = "Urban: 1 = Urban, 0 = Rural | Vehicle Access: 1 = Low Access, 0 = Adequate Access | Food Access Level reflects severity of limited access to supermarkets."
  ) +
  
  theme_minimal() +
  
  theme(
    plot.title = element_text(size = 16, face = "bold"),
    axis.title.x = element_blank(),
    axis.text.x = element_text(size = 11),
    legend.position = "right"
  )

Faceted Density Plot

I used a faceted density plot to better compare how low food access is distributed across urban/rural groups. This type of plot provides a clearer view of patterns than a scatterplot and allows me to compare urban and rural areas while also separating the data by vehicle access.

ggplot(clean_data, aes(x = low_access_share, fill = urban)) +
  
  geom_density(alpha = 0.5) +
  
  facet_wrap(~ low_vehicle_access,
             labeller = labeller(
               low_vehicle_access = c(
                 "0" = "Adequate Vehicle Access",
                 "1" = "Low Vehicle Access"
               )
             )) +
  
  scale_fill_manual(
    values = c("0" = "#AFCBFF", "1" = "#F4A6A6"),
    labels = c("Rural", "Urban")
  ) +
  
  labs(
    title = "Distribution of Low Food Access by Urban Status and Vehicle Access",
    x = "Low Food Access (%)",
    y = "Density",
    fill = "Urban Classification",
    caption = "This plot compares the distribution of low food access across urban and rural areas, separated by levels of vehicle access."
  ) +
  
  theme_minimal()
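A small table of group medians can back up what the density curves show visually. The sketch below uses a toy data frame with made-up values that mirrors the column names in `clean_data`. It is an illustration under those assumptions, not output from the project data.

```r
# Hypothetical toy data; values are made up for illustration.
toy <- data.frame(
  urban              = factor(c(0, 0, 0, 0, 1, 1, 1, 1)),
  low_vehicle_access = factor(c(0, 0, 1, 1, 0, 0, 1, 1)),
  low_access_share   = c(90, 80, 100, 96, 20, 30, 40, 50)
)

# Median low-access share for each urban x vehicle-access combination.
group_medians <- aggregate(
  low_access_share ~ urban + low_vehicle_access,
  data = toy,
  FUN  = median
)

print(group_medians)
```

Applying the same call to `clean_data` would give one median per panel and fill color in the density plot.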

Essay

The alluvial plot shows how census tracts move across urban/rural classification, vehicle access, and levels of food access. The width of each flow represents the number of census tracts in each category. A clear pattern is that areas with low vehicle access tend to fall into higher levels of food access issues, while areas with better vehicle access are more likely to have fewer access issues. Additionally, rural areas appear to have a larger proportion of census tracts with high access issues compared to urban areas, suggesting that both transportation access and geographic location play important roles in food access disparities.

The faceted density plot shows how low food access is distributed across urban and rural areas, separated by vehicle access. In both panels, rural areas tend to have higher levels of low food access, as distributions are concentrated at higher percentages. These differences are more noticeable in areas with low vehicle access, where both urban and rural communities experience higher food access challenges. Overall, both visualizations highlight consistent patterns showing that transportation and location are key factors influencing food access.

One interesting pattern that stands out is that poverty does not appear to have a strong relationship with food access on its own, per the regression results, even though it is often assumed to be a major factor. Instead, structural factors such as transportation access and whether an area is urban or rural seem to play a more important role.

One limitation of this project is that some potential visualizations were difficult to display. For example, the initial scatterplots from exploratory data analysis were severely cluttered because of the large dataset and did not clearly show patterns, and most showed weak relationships. I also considered making the alluvial plot interactive using plotly, but this was not compatible with the required format of this project (RPubs). Additionally, there may be other important variables, such as public transportation access or the number of nearby grocery stores, that were not included in the dataset but could have provided further insight. Overall, the visualizations suggest that food access disparities are complex and influenced more by structural factors than by income on its own.

References:

Beaulac, J., Kristjansson, E., & Cummins, S. (2009). A systematic review of food deserts, 1966–2007. Preventing Chronic Disease, 6(3).

Centers for Disease Control and Prevention (CDC). (2025). Food access and public health. https://www.cdc.gov/pcd/issues/2025/24_0458.htm

U.S. Department of Agriculture (USDA). (2023). Food access research atlas documentation. https://www.ers.usda.gov/data-products/food-access-research-atlas/documentation

U.S. Hunger. (2023). Transportation and food insecurity. https://ushunger.org/blog/transportation-food-insecurity/

Walker, R. E., Keane, C. R., & Burke, J. G. (2010). Disparities and access to healthy food in the United States: A review of food deserts literature. Health & Place, 16(5), 876–884.

Visualization techniques for the faceted density plot were created with assistance from ChatGPT (OpenAI).