Introduction

Obesity is a major public health issue in the United States. National estimates suggest that approximately one in three Americans has obesity, reflecting a substantial and widespread burden. Obesity is closely linked to a range of adverse health outcomes, including diabetes, heart disease, metabolic syndrome, and stroke (NIH, n.d.). Given these serious health risks, dietary habits and the environments in which people obtain food play a critical role in maintaining and promoting good health.

Food habits are shaped by food environments, which influence people’s ability to access affordable and healthy foods. Communities with fewer healthy food outlets, such as supermarkets and grocery stores, and higher concentrations of unhealthy options, such as fast-food restaurants, tend to experience higher rates of obesity and poorer dietary outcomes (Pineda et al., 2024). Individuals who lack convenient or affordable access to healthy foods are less likely to incorporate them into their diets and may instead rely on readily available options like fast food, further exacerbating unhealthy eating patterns.

The Retail Food Environment Index (RFEI) is a commonly used metric to quantify local retail food environments. In this study, it is calculated as the ratio of supermarkets and produce vendors to fast-food restaurants and convenience stores, so that higher values indicate a greater relative presence of healthier outlets; note that the original formulation of the index places the less healthy outlets in the numerator. However, this count-based measure does not account for mobility or daily travel patterns, which are crucial for understanding how people actually experience and access food in their everyday lives. We therefore propose incorporating mobility-based accessibility measures to evaluate the quality of retail food environments, as accessibility better reflects perceived distance, travel time, and ease of reaching food outlets than simple counts or ratios of stores.

Examining how retail food environments influence obesity is essential for informing public health and planning interventions. More nuanced measures of the retail food environment can guide planners and policymakers in designing land-use strategies that improve access to healthy foods, particularly in vulnerable neighborhoods with limited grocery options. By identifying areas with poor accessibility and rethinking the location and mix of food outlets, it becomes possible to support healthier food choices and contribute to reducing obesity-related health disparities.

Research Questions

In this project, we examine how retail food environments influence obesity by posing three research questions.

Data

Obesity Prevalence

We use obesity prevalence as our dependent variable. The data come from the CDC’s PLACES program and the UCSF Health Atlas (Centers for Disease Control and Prevention, 2024).

## Simple feature collection with 6 features and 3 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: -84.44623 ymin: 33.57743 xmax: -84.16538 ymax: 33.96102
## Geodetic CRS:  NAD83
##         GEOID db_r            NAMELSAD                       geometry
## 1 13089023212 38.2 Census Tract 232.12 MULTIPOLYGON (((-84.18651 3...
## 2 13089020400 24.9    Census Tract 204 MULTIPOLYGON (((-84.34921 3...
## 3 13089020500 30.2    Census Tract 205 MULTIPOLYGON (((-84.34919 3...
## 4 13089022800 27.4    Census Tract 228 MULTIPOLYGON (((-84.2965 33...
## 5 13135050311 29.7 Census Tract 503.11 MULTIPOLYGON (((-84.22908 3...
## 6 13063040510 47.2 Census Tract 405.10 MULTIPOLYGON (((-84.4452 33...

Obesity prevalence exhibits a clear spatial pattern across the study area. Southern and eastern neighborhoods show notably higher rates, with prevalence exceeding 30%. In practical terms, this means that roughly one in three residents in these areas is classified as having obesity. Such a high prevalence suggests a substantial local burden of obesity-related health risks, including diabetes, cardiovascular disease, and other chronic conditions, and indicates areas where targeted public health interventions may be particularly needed.

In contrast, northern neighborhoods and the downtown area have comparatively lower obesity prevalence. These areas may have better access to health-promoting resources—such as walkable environments, recreational facilities, or healthier food options—or differ in socioeconomic composition and lifestyle patterns, which can contribute to lower obesity rates. The spatial contrast between high-prevalence (south and east) and low-prevalence (north and downtown) areas highlights potential geographic inequities in health outcomes and supports the need for place-based strategies that prioritize the most affected neighborhoods.

POIs

We used point-of-interest (POI) data from SafeGraph, which includes North American Industry Classification System (NAICS) codes and vendor categories that can be used for classification. From this dataset, we first selected food-related POIs using NAICS codes 7225 (Restaurants and Other Eating Places) and 445110 (Supermarkets and Other Grocery (except Convenience) Stores).

Within these, we identified unhealthy food outlets by filtering POIs whose categories included terms such as “Burgers,” “Chicken Wings,” “Pizza,” “Waffles,” “Fast Food,” “Ice Cream Shop,” and “Donut Shop.” Conversely, we defined healthy food outlets by selecting POIs categorized as “Healthy Food,” “Organic Food,” “Salad,” “Vegetarian Food,” as well as “Grocery Stores.” This categorization allowed us to distinguish between relatively healthier and less healthy components of the retail food environment for subsequent analysis.
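The classification rule described above can be sketched in a few lines. This is only an illustration, not the code used in the study: the function name `classify_poi`, the prefix-matching of NAICS codes, and the tie-break for POIs that carry both kinds of tags are assumptions, while the category terms are taken from the text.

```python
# Category terms quoted from the text; everything else is illustrative.
UNHEALTHY_TERMS = {
    "Burgers", "Chicken Wings", "Pizza", "Waffles",
    "Fast Food", "Ice Cream Shop", "Donut Shop",
}
HEALTHY_TERMS = {
    "Healthy Food", "Organic Food", "Salad",
    "Vegetarian Food", "Grocery Stores",
}
# Assumption: a POI is food-related if its NAICS code starts with one of these.
FOOD_NAICS_PREFIXES = ("7225", "445110")


def classify_poi(naics_code, categories):
    """Return 'healthy', 'unhealthy', or None for a single POI.

    naics_code: NAICS code as a string or integer.
    categories: iterable of category tags attached to the POI.
    """
    if not str(naics_code).startswith(FOOD_NAICS_PREFIXES):
        return None  # not a food-related POI
    tags = set(categories)
    # Assumption: unhealthy tags take precedence when both kinds are present.
    if tags & UNHEALTHY_TERMS:
        return "unhealthy"
    if tags & HEALTHY_TERMS:
        return "healthy"
    return None  # food-related, but matches neither list


# Hypothetical records, for illustration only.
examples = [
    ("722513", ["Fast Food", "Burgers"]),
    ("445110", ["Grocery Stores"]),
    ("722511", ["Salad", "Pizza"]),
    ("812112", ["Salad"]),
]
labels = [classify_poi(code, tags) for code, tags in examples]
```

How mixed-tag POIs are handled (the third example) can shift the healthy and unhealthy counts noticeably, so this choice is worth documenting alongside the results.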

We observe that unhealthy POIs (“bad” POIs) are more numerous than healthy POIs (“good” POIs) in the study area. Unhealthy outlets tend to cluster in specific locations, forming concentrated pockets of fast-food and similar establishments. In contrast, healthy POIs appear more spatially dispersed across neighborhoods. This pattern is partly driven by our classification, which includes grocery stores as healthy POIs. Because grocery stores are typically distributed more evenly to serve residential areas, they contribute to a more spread-out spatial distribution of healthy POIs across the urban fabric.

RFEI

The Retail Food Environment Index (RFEI) is a conventional measure of local food environments that relies on counts of food outlets within a given area, making it inherently sensitive to the chosen unit of analysis (e.g., census tract or block group). In this study, we calculate RFEI as the ratio of supermarkets and produce vendors to fast-food restaurants and convenience stores. Higher RFEI values indicate a more favorable retail food environment, characterized by a greater relative presence of healthy outlets compared to unhealthy ones.
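As a minimal sketch of the ratio defined above, the snippet below counts healthy and unhealthy outlets per area and divides one by the other. The function name `rfei_by_area`, the input layout, and the handling of areas with no unhealthy outlets (returned as `None`) are assumptions for illustration; the study may treat such areas differently.

```python
from collections import Counter


def rfei_by_area(pois):
    """Compute healthy / unhealthy outlet ratios per area.

    pois: iterable of (area_id, label) pairs, label in {'healthy', 'unhealthy'}.
    Returns {area_id: ratio}, with None where the ratio is undefined.
    """
    healthy = Counter()
    unhealthy = Counter()
    for area_id, label in pois:
        if label == "healthy":
            healthy[area_id] += 1
        elif label == "unhealthy":
            unhealthy[area_id] += 1

    ratios = {}
    for area_id in set(healthy) | set(unhealthy):
        if unhealthy[area_id] == 0:
            # Assumption: leave the ratio undefined rather than dividing by zero.
            ratios[area_id] = None
        else:
            ratios[area_id] = healthy[area_id] / unhealthy[area_id]
    return ratios


# Hypothetical outlets in three areas, for illustration only.
sample = [
    ("A", "healthy"), ("A", "unhealthy"), ("A", "unhealthy"),
    ("B", "healthy"),
    ("C", "unhealthy"),
]
sample_rfei = rfei_by_area(sample)
```

Because the index depends only on counts inside each boundary, moving a single outlet across a tract line changes two ratios at once, which is the unit-of-analysis sensitivity noted above.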

RFEI values for most neighborhoods in the study area range between 0 and 1, indicating that unhealthy outlets are at least as numerous as healthy outlets in most places. Among them, Lilburn and Clarkston exhibit comparatively high RFEI values, suggesting a more favorable retail food environment with a greater relative presence of healthy food outlets. Forest Park also shows elevated RFEI values, indicating better access to healthier food options than many surrounding areas.

Overall, neighborhoods tend to have higher RFEI values in and around their downtown or central commercial areas. This pattern implies that the cores of these communities often concentrate a larger share of grocery stores and other healthy food retailers relative to fast-food restaurants and convenience stores, potentially offering residents in these areas better opportunities to access nutritious food.

Accessibility to Retail Foods POIs

We calculated accessibility to retail food POIs using a walking network derived from OpenStreetMap (OSM). First, we defined the origins of trips as the centroids of each Census Block Group (CBG). We chose CBG centroids instead of census tract centroids because tracts cover larger areas and are less suitable for capturing fine-scale walking distances to nearby POIs. Using CBG centroids allows us to better represent local accessibility within neighborhoods.
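To make the idea of network-based walking time concrete, here is a small sketch that runs Dijkstra's shortest-path algorithm on a toy graph and averages the walking times from one origin to the reachable POIs. It is not the study's routing pipeline: the graph, the node names, the `cutoff_minutes` parameter, and the choice to average over all POIs within the cutoff are assumptions.

```python
import heapq


def walking_times(graph, origin):
    """Dijkstra's algorithm: minutes from origin to every reachable node.

    graph: {node: {neighbor: minutes}} describing walkable links.
    """
    best = {origin: 0.0}
    queue = [(0.0, origin)]
    while queue:
        minutes, node = heapq.heappop(queue)
        if minutes > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, edge_minutes in graph.get(node, {}).items():
            candidate = minutes + edge_minutes
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return best


def mean_time_to_pois(graph, origin, poi_nodes, cutoff_minutes):
    """Mean walking time to POIs reachable within the cutoff, else None."""
    times = walking_times(graph, origin)
    reachable = [
        times[poi] for poi in poi_nodes
        if poi in times and times[poi] <= cutoff_minutes
    ]
    return sum(reachable) / len(reachable) if reachable else None


# Toy network (minutes on each link); all values are hypothetical.
toy_graph = {
    "centroid": {"a": 5, "b": 12},
    "a": {"centroid": 5, "b": 4, "grocer": 10},
    "b": {"centroid": 12, "a": 4, "salad_bar": 20},
    "grocer": {"a": 10},
    "salad_bar": {"b": 20},
    "far_market": {},  # disconnected from the walking network
}
healthy_pois = ["grocer", "salad_bar", "far_market"]
mean_minutes = mean_time_to_pois(toy_graph, "centroid", healthy_pois, 30)
```

Returning `None` when nothing is reachable mirrors the "no accessible destinations" case on the maps; how those missing values are treated later in the analysis is a separate modeling decision.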

The maps illustrate spatial variation in mean walking time from Census Block Groups (CBGs) to healthy and unhealthy Points of Interest (POIs) across the five-county Atlanta metropolitan region. Walking times to POIs are highly uneven, with short travel times concentrated almost exclusively in the urban core. In many suburban and exurban areas, residents face substantially longer walking times or have no accessible healthy destinations at all, indicating limited walkable access to health-supportive amenities.

These disparities are most pronounced in the suburban and exurban parts of the region. Large portions of Cobb, Gwinnett, North Fulton, and South DeKalb exhibit mean walking times to healthy POIs of roughly 45 minutes, or have no accessible destinations within the modeled walking threshold. These long or missing walking times indicate that many residents in these areas effectively lack walkable access to healthy food outlets.

Although there are isolated pockets of shorter walking times (0–19 minutes) outside the urban core, these are typically limited to small town centers or older commercial corridors where land-use patterns are somewhat more compact. Overall, areas with long walking times to healthy POIs tend to coincide with lower-density suburban development and limited mixed-use zoning. This built environment pattern implies that residents in many neighborhoods must rely on driving to reach healthy food outlets, reinforcing car dependence and constraining opportunities for routine, walkable access to nutritious foods.

The map of unhealthy POIs reveals that mean walking times to these destinations are, somewhat unexpectedly, quite similar to those for healthy POIs across many Census Block Groups. This diverges from our initial hypothesis that unhealthy POIs would systematically offer much shorter walking times than healthy POIs due to their greater numbers. Although the total count of unhealthy POIs is indeed larger, the resulting accessibility patterns suggest a different spatial dynamic than anticipated. In particular, the similarity in average walking times indicates that unhealthy POIs tend to be more tightly concentrated in specific commercial clusters rather than being evenly distributed throughout all neighborhoods.

##      tract_id ttime_bad ttime_good
## 1 13063040202  37.25000   33.43085
## 2 13063040203  33.12500   35.25298
## 3 13063040204  27.56250   28.95833
## 4 13063040302  28.65714   35.50607
## 5 13063040306  22.41667   33.84900
## 6 13063040307  42.50000   29.98990

We aggregated block-group walking times to the census tract level by averaging across the block groups within each tract.
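The aggregation step can be sketched as follows. It relies on the fact that a 12-digit block group GEOID begins with the 11-digit GEOID of its tract. The function name, the unweighted mean, and the decision to skip block groups with no walking time are assumptions, and the input values are made up.

```python
from collections import defaultdict
from statistics import mean


def aggregate_to_tract(cbg_times):
    """Average block-group walking times within each census tract.

    cbg_times: {12-digit block group GEOID: minutes, or None if unreachable}.
    Returns {11-digit tract GEOID: mean minutes}.
    """
    by_tract = defaultdict(list)
    for geoid, minutes in cbg_times.items():
        if minutes is None:
            continue  # assumption: block groups without a value are skipped
        tract_id = geoid[:11]  # state (2) + county (3) + tract (6)
        by_tract[tract_id].append(minutes)
    return {tract_id: mean(values) for tract_id, values in by_tract.items()}


# Hypothetical block-group values, for illustration only.
example_cbg = {
    "130630402021": 30.0,
    "130630402022": 40.0,
    "130630402031": 25.0,
    "130630402032": None,
}
tract_means = aggregate_to_tract(example_cbg)
```

An unweighted mean treats every block group equally; a population-weighted mean would be a reasonable alternative if block group populations differ substantially within a tract.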

Socioeconomic Analysis using ACS Data By Block Group

We include socioeconomic variables in our models to account for factors that are often strongly associated with obesity prevalence. Prior research consistently shows that low-income communities and neighborhoods with higher shares of minoritized racial and ethnic groups tend to experience disproportionately high rates of obesity. By incorporating these socioeconomic indicators, we aim to better isolate the association between the retail food environment and obesity outcomes, while acknowledging underlying structural and social inequalities that shape health risks across neighborhoods.

We defined the income buckets for the median household income map following FOX 5 Digital Team (2023), which reports that the middle-income range in Atlanta is considered to run from $49,652 to $148,214. Lower-income block groups cluster in the southern and western portions of the metro area.

The four socioeconomic indicators reveal clear spatial disparities across the Atlanta metro area. Median household income is lowest in the southern, western, and some northwestern parts of the region, which coincide with areas that have high concentrations of Black and Hispanic residents. These same neighborhoods appear to have a higher percentage of households without access to a vehicle, which limits residents’ ability to travel outside their immediate vicinity for healthy food options. Low vehicle access and public transit commuting are both concentrated in the denser parts of the city, but public transit commuting rates do not appear to be correspondingly high in every area with many households lacking a vehicle. Overall, the four maps highlight the geographic overlap between lower-income neighborhoods and limited mobility, which can affect access to healthy food outlets. We assess this overlap further in the statistical analysis below.

Results

Comparing RFEI with Accessibility to Retail Foods POIs

We compared the traditional Retail Food Environment Index (RFEI) with the accessibility index developed in this study. Overall, the accessibility index produces smoother spatial gradients, whereas RFEI tends to show abrupt breaks at tract or neighborhood boundaries, reflecting its sensitivity to the chosen unit of analysis.

The accessibility index also appears to capture more nuanced variation in the distribution of food environments. By incorporating network-based travel times rather than simple counts or ratios of outlets, the accessibility measure better reflects how residents actually experience access to food across space. In this sense, food accessibility provides a more refined depiction of neighborhood food environments than RFEI alone. Despite these differences, both measures exhibit broadly similar spatial patterns, with higher values concentrated in Downtown, Midtown, and Buckhead.

Correlation Plot

The correlation analysis indicates that socioeconomic variables are highly interrelated, reflecting the broader structural linkages between income, education, and household resources. SNAP participation shows a particularly strong positive correlation with obesity prevalence, suggesting that neighborhoods with higher reliance on food assistance programs tend to experience greater obesity burdens. Similarly, lower educational attainment is strongly associated with higher obesity prevalence.

The share of people who commute by walking is positively correlated with diabetes prevalence, which may reflect complex neighborhood dynamics. For example, dense urban areas where walking is more common may also be places where chronic conditions are concentrated due to other social and environmental stressors. RFEI and our food accessibility index are only slightly correlated with each other, even though both are built from the same set of food POIs. Moreover, both measures exhibit relatively low correlations with the health outcomes, indicating that the relationship between retail food environments and chronic disease is not straightforward and may be mediated or overshadowed by underlying socioeconomic conditions.

## # Check for Multicollinearity
## 
## Low Correlation
## 
##         Term  VIF   VIF 95% CI adj. VIF Tolerance Tolerance 95% CI
##  pct_transit 1.35 [1.26, 1.49]     1.16      0.74     [0.67, 0.80]
##     pct_walk 1.08 [1.03, 1.21]     1.04      0.92     [0.82, 0.97]
##        tt_45 1.03 [1.00, 1.30]     1.02      0.97     [0.77, 1.00]
##      pct_bch 1.62 [1.49, 1.79]     1.27      0.62     [0.56, 0.67]
##  pct_poverty 2.07 [1.88, 2.30]     1.44      0.48     [0.43, 0.53]
##   pct_renter 1.46 [1.35, 1.61]     1.21      0.69     [0.62, 0.74]
##     pct_snap 2.17 [1.97, 2.42]     1.47      0.46     [0.41, 0.51]
##        fd_ht 1.11 [1.05, 1.23]     1.05      0.90     [0.81, 0.95]
##         rfei 1.12 [1.06, 1.24]     1.06      0.89     [0.81, 0.94]

We conduct a multicollinearity analysis to assess the degree of correlation among socioeconomic variables and retail food environment indicators. The percentage of SNAP participants (VIF = 2.17) and the percentage of the population living below the poverty level (VIF = 2.07) have the highest Variance Inflation Factor (VIF) values, followed by the percentage of adults with a bachelor’s degree or higher (VIF = 1.62). All values remain well below the thresholds of 5 or 10 that are commonly used to flag problematic multicollinearity, and the diagnostic output classifies every term as low correlation. The relative elevation is nonetheless expected, as these socioeconomic indicators are closely related and often capture overlapping dimensions of economic disadvantage.

Because these variables overlap conceptually, we chose to retain only the SNAP participation rate in our final models. SNAP percentage not only serves as a proxy for economic vulnerability, but also has a particularly meaningful conceptual link to food access and dietary behavior. Moreover, its strong observed association with obesity prevalence provides clear interpretive value for this study. By including SNAP percentage and excluding the other overlapping socioeconomic variables, we reduce redundancy in the model while preserving a key indicator of economic and food-related hardship.

## # Check for Multicollinearity
## 
## Low Correlation
## 
##         Term  VIF   VIF 95% CI adj. VIF Tolerance Tolerance 95% CI
##     pct_walk 1.05 [1.01, 1.22]     1.03      0.95     [0.82, 0.99]
##  pct_transit 1.35 [1.25, 1.48]     1.16      0.74     [0.67, 0.80]
##        tt_45 1.02 [1.00, 1.45]     1.01      0.98     [0.69, 1.00]
##   pct_renter 1.30 [1.21, 1.43]     1.14      0.77     [0.70, 0.82]
##     pct_snap 1.38 [1.28, 1.52]     1.17      0.72     [0.66, 0.78]
##        fd_ht 1.11 [1.05, 1.23]     1.05      0.90     [0.81, 0.95]
##         rfei 1.11 [1.05, 1.23]     1.05      0.90     [0.81, 0.95]
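For readers less familiar with the columns in the VIF tables above, the short sketch below applies the standard relationships VIF = 1 / (1 - R²) and tolerance = 1 / VIF to three VIF values reported in the first table. The function name is illustrative. The square root of the VIF, which gives the factor by which a coefficient's standard error is inflated, appears to correspond to the "adj. VIF" column in the output.

```python
import math

# VIF values copied from the first multicollinearity table.
reported_vif = {"pct_snap": 2.17, "pct_poverty": 2.07, "pct_bch": 1.62}


def vif_summary(vif):
    """Derive related diagnostics from a single VIF value."""
    return {
        # Tolerance is the reciprocal of the VIF.
        "tolerance": 1.0 / vif,
        # R-squared from regressing this predictor on the other predictors.
        "implied_r2": 1.0 - 1.0 / vif,
        # Factor by which the coefficient's standard error is inflated.
        "se_inflation": math.sqrt(vif),
    }


summary = {term: vif_summary(vif) for term, vif in reported_vif.items()}

for term, stats in summary.items():
    print(
        f"{term}: tolerance={stats['tolerance']:.2f}, "
        f"implied R2={stats['implied_r2']:.2f}, "
        f"SE inflation={stats['se_inflation']:.2f}"
    )
```

Read this way, a VIF of 2.17 means that a little over half of the variation in SNAP participation is explained by the other predictors, which is noticeable overlap but far from a degenerate model.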

Regression Plot

We estimate a series of linear regression models to examine how food environment indices are associated with obesity prevalence, while controlling for key socioeconomic variables that may also influence obesity. Because the accessibility index and RFEI are somewhat correlated with each other and both describe the retail food environment, we specified separate models for each measure rather than including both in the same regression. This approach allows us to assess the association of each index with obesity prevalence separately and avoids potential issues related to multicollinearity between the two food environment indicators. We exclude observations with missing values in any model variable.

## 
## Call:
## lm(formula = db_r ~ pct_walk + tt_45 + +pct_renter + pct_snap + 
##     fd_ht, data = df_clean)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -17.3227  -3.5064  -0.7726   3.4028  17.5924 
## 
## Coefficients:
##             Estimate Std. Error t value             Pr(>|t|)    
## (Intercept) 26.99159    1.06937  25.241 < 0.0000000000000002 ***
## pct_walk    -2.42923    0.41811  -5.810        0.00000000894 ***
## tt_45        0.24455    0.10234   2.389               0.0171 *  
## pct_renter  -0.03762    0.01468  -2.562               0.0106 *  
## pct_snap     1.06711    0.04219  25.296 < 0.0000000000000002 ***
## fd_ht        1.55497    0.98469   1.579               0.1147    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 5.129 on 819 degrees of freedom
## Multiple R-squared:  0.4837, Adjusted R-squared:  0.4805 
## F-statistic: 153.5 on 5 and 819 DF,  p-value: < 0.00000000000000022
## # A tibble: 6 × 5
##   term        estimate std.error statistic   p.value
##   <chr>          <dbl>     <dbl>     <dbl>     <dbl>
## 1 (Intercept)  27.0       1.07       25.2  1.94e-104
## 2 pct_walk     -2.43      0.418      -5.81 8.94e-  9
## 3 tt_45         0.245     0.102       2.39 1.71e-  2
## 4 pct_renter   -0.0376    0.0147     -2.56 1.06e-  2
## 5 pct_snap      1.07      0.0422     25.3  8.89e-105
## 6 fd_ht         1.55      0.985       1.58 1.15e-  1
## # A tibble: 1 × 12
##   r.squared adj.r.squared sigma statistic   p.value    df logLik   AIC   BIC
##       <dbl>         <dbl> <dbl>     <dbl>     <dbl> <dbl>  <dbl> <dbl> <dbl>
## 1     0.484         0.481  5.13      153. 5.77e-115     5 -2516. 5047. 5080.
## # ℹ 3 more variables: deviance <dbl>, df.residual <int>, nobs <int>

In the first regression model using the food accessibility index, the coefficient for the accessibility measure is not statistically significant (t = 1.579, p = 0.115), so we find no clear evidence of an association between food accessibility and obesity prevalence after controlling for socioeconomic factors. By contrast, all other covariates in the model are statistically significant at the 5% level.

Notably, the percentage of people who commute by walking has a large and highly significant negative coefficient (t = -5.810, p < 0.001). A one-percentage-point increase in the share of walking commuters is associated with a decrease of about 2.4 percentage points in obesity prevalence, indicating a strong negative relationship between active commuting and obesity. The percentage of SNAP recipients is also highly significant (t = 25.296, p < 0.001): a one-percentage-point increase in SNAP participation is associated with an increase of about 1.07 percentage points in obesity prevalence, highlighting the link between economic vulnerability and obesity risk. In addition, the share of residents with commute times longer than 45 minutes is positively associated with obesity prevalence, while the percentage of renter households has a negative association with the outcome variable. Together, these results suggest that socioeconomic and commuting characteristics play a much stronger role in explaining spatial variation in obesity than the food accessibility index alone.
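As a quick way to read the coefficient table, the sketch below recomputes each t value as the estimate divided by its standard error and adds an approximate 95% confidence interval. The estimates and standard errors are copied from the first model's output; the 1.96 multiplier is a normal approximation, which is an assumption that seems reasonable with 819 residual degrees of freedom, and the function names are illustrative.

```python
# (estimate, standard error) pairs copied from the first model's output.
model_1 = {
    "(Intercept)": (26.99159, 1.06937),
    "pct_walk": (-2.42923, 0.41811),
    "tt_45": (0.24455, 0.10234),
    "pct_renter": (-0.03762, 0.01468),
    "pct_snap": (1.06711, 0.04219),
    "fd_ht": (1.55497, 0.98469),
}

Z_95 = 1.96  # assumption: normal approximation to the t distribution


def t_value(estimate, std_error):
    """t statistic for the null hypothesis that the coefficient is zero."""
    return estimate / std_error


def approx_ci(estimate, std_error, z=Z_95):
    """Approximate 95% confidence interval for a coefficient."""
    return (estimate - z * std_error, estimate + z * std_error)


t_values = {term: t_value(est, se) for term, (est, se) in model_1.items()}
intervals = {term: approx_ci(est, se) for term, (est, se) in model_1.items()}

for term in model_1:
    low, high = intervals[term]
    print(f"{term}: t={t_values[term]:.2f}, CI=({low:.3f}, {high:.3f})")
```

The interval for `fd_ht` spans zero while the interval for `pct_walk` lies entirely below zero, which matches the significance pattern in the table.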

## 
## Call:
## lm(formula = db_r ~ pct_walk + tt_45 + +pct_renter + pct_snap + 
##     rfei, data = df_clean)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -16.9458  -3.4744  -0.7618   3.4644  17.6401 
## 
## Coefficients:
##             Estimate Std. Error t value             Pr(>|t|)    
## (Intercept) 28.24611    0.41431  68.176 < 0.0000000000000002 ***
## pct_walk    -2.44056    0.41708  -5.852        0.00000000704 ***
## tt_45        0.25534    0.10224   2.498               0.0127 *  
## pct_renter  -0.03551    0.01462  -2.429               0.0153 *  
## pct_snap     1.05709    0.04222  25.037 < 0.0000000000000002 ***
## rfei         0.42993    0.16871   2.548               0.0110 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 5.117 on 819 degrees of freedom
## Multiple R-squared:  0.4862, Adjusted R-squared:  0.4831 
## F-statistic:   155 on 5 and 819 DF,  p-value: < 0.00000000000000022
## # A tibble: 6 × 5
##   term        estimate std.error statistic   p.value
##   <chr>          <dbl>     <dbl>     <dbl>     <dbl>
## 1 (Intercept)  28.2       0.414      68.2  0        
## 2 pct_walk     -2.44      0.417      -5.85 7.04e-  9
## 3 tt_45         0.255     0.102       2.50 1.27e-  2
## 4 pct_renter   -0.0355    0.0146     -2.43 1.53e-  2
## 5 pct_snap      1.06      0.0422     25.0  3.49e-103
## 6 rfei          0.430     0.169       2.55 1.10e-  2
## # A tibble: 1 × 12
##   r.squared adj.r.squared sigma statistic   p.value    df logLik   AIC   BIC
##       <dbl>         <dbl> <dbl>     <dbl>     <dbl> <dbl>  <dbl> <dbl> <dbl>
## 1     0.486         0.483  5.12      155. 7.95e-116     5 -2514. 5043. 5076.
## # ℹ 3 more variables: deviance <dbl>, df.residual <int>, nobs <int>

In the second regression model, which incorporates RFEI instead of the accessibility index, all socioeconomic variables as well as RFEI are statistically significant. The RFEI coefficient is positive and significant (t = 2.548, p = 0.011), indicating that higher RFEI values are associated with higher obesity prevalence. The remaining socioeconomic covariates retain similar directions and magnitudes to those observed in the accessibility model, suggesting that their associations with obesity are robust to the choice of food environment measure.

Taken together, the two models do not support our original hypotheses that: (1) retail food environment indices would have a strong protective effect against obesity, and (2) the accessibility index would better explain obesity prevalence than RFEI. Contrary to expectations, the “healthy” RFEI measure is positively, rather than negatively, associated with obesity prevalence. This indicates that neighborhoods with relatively more healthy food outlets (as captured by RFEI) also tend to have higher obesity rates, once socioeconomic factors are controlled.
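To put the comparison of the two models in numbers, the sketch below recomputes the adjusted R-squared and the F statistic from the reported R-squared values using the standard formulas. The sample size of 825 is inferred from the 819 residual degrees of freedom plus six estimated coefficients; the function names are illustrative.

```python
N_OBS = 825        # inferred: 819 residual df + 6 estimated coefficients
N_PREDICTORS = 5   # predictors in each model, excluding the intercept

# Multiple R-squared values copied from the two model summaries.
r_squared = {"accessibility": 0.4837, "rfei": 0.4862}


def adjusted_r2(r2, n, k):
    """Adjusted R-squared for n observations and k predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


def f_statistic(r2, n, k):
    """Overall F statistic implied by R-squared."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))


fit = {
    name: {
        "adj_r2": adjusted_r2(r2, N_OBS, N_PREDICTORS),
        "f": f_statistic(r2, N_OBS, N_PREDICTORS),
    }
    for name, r2 in r_squared.items()
}

# Difference in adjusted R-squared between the two specifications.
adj_r2_gap = fit["rfei"]["adj_r2"] - fit["accessibility"]["adj_r2"]
```

The gap in adjusted R-squared is under 0.003, so the two specifications fit the data almost equally well; neither food environment measure adds much beyond the shared socioeconomic covariates.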

Several features of Atlanta’s retail food environment and mobility patterns may help explain these counterintuitive findings. First, in many parts of the Atlanta region, socioeconomically vulnerable groups are concentrated in or near downtown areas within each jurisdiction, while wealthier populations are more likely to reside farther out. As a result, healthy food outlets and higher RFEI values often coincide with areas that also have higher concentrations of low-income and minoritized residents, who face multiple structural barriers to maintaining healthy diets. This suggests that obesity distribution is not determined solely—or even primarily—by the built environment, but is deeply intertwined with socioeconomic conditions, household resources, and the capacity to act on available food choices.

Second, Atlanta’s strong dependence on automobile travel may weaken the importance of local, walkable proximity to food outlets. Even when healthy POIs are located relatively close in network distance, many residents may still access both healthy and unhealthy food primarily by car, reducing the behavioral relevance of marginal differences in walking-based accessibility. In this context, the presence of healthy food outlets (as measured by RFEI) may co-occur with, rather than offset, underlying socioeconomic vulnerabilities that are more directly driving obesity risk.

Conclusion

Overall, the results suggest that socioeconomic conditions play a much stronger role in shaping obesity prevalence than either food accessibility or retail food environment indices. SNAP participation is strongly associated with obesity, indicating that economic vulnerability and social resources are central to understanding spatial patterns of obesity. Daily travel patterns, particularly travel mode, also matter, as areas with higher shares of walking commuters show lower obesity prevalence.

By contrast, food environment indices show relatively weak or inconsistent relationships with obesity. The physical accessibility of food outlets, as captured by our network-based accessibility measure, is not a statistically significant predictor of obesity prevalence. Taken together, these findings highlight the importance of addressing underlying socioeconomic inequities alongside improvements in the built and food environments.

References

NIH. 2025. https://www.niddk.nih.gov/health-information/weight-management/adult-overweight-obesity/health-risks.

Centers for Disease Control and Prevention. 2024. “About PLACES: Local Data for Better Health.” October 29, 2024. https://www.cdc.gov/places/about/index.html.

FOX 5 Digital Team. 2023. “This Is How Much It Takes to Be ’Middle Class’ in Georgia.” FOX 5 Atlanta. https://www.fox5atlanta.com/news/georgia-middle-class-economic-survey.

Pineda, E., D. J. Jimenez, S. Friel, and J. A. Rivera. 2024. “Food Environment and Obesity: A Systematic Review.” Public Health Nutrition 27 (1): 1–15. https://doi.org/10.1017/S1368980023002403.