Food Insecurity Intervention

Author

Angel Gallardo

Published

December 15, 2025

Abstract

Food insecurity remains a global issue. In the United States alone, 13.5% of households, or 18 million households, were food insecure in 2023. At the same time, restaurants generate significant food loss, much of it food that is still ready for consumption. Restaurant food loss therefore represents a potential but underutilized resource for intervention. This research aims to build a model that accurately predicts high food insecurity at the county level and to demonstrate the potential impact that local restaurants and government programs can have through efficient intervention. Addressing food insecurity in this way can also reduce restaurant food waste in the community. Decisions about food assistance programs are currently made at the state level; policy-making at the county level could better mitigate both food insecurity and food loss. By incorporating food assistance program participation data into the model, we gain insight into how Americans are securing their food, and we test whether calorie intake can predict high food insecurity. Finally, this research simulates an intervention in which a portion of restaurant food loss is converted into recovered calories through SNAP policy changes.

Introduction

In 2023, the USDA, FDA, and EPA (U.S. Food and Drug Administration (FDA), U.S. Department of Agriculture (USDA), and U.S. Environmental Protection Agency (EPA) 2023) announced a national goal to reduce food loss and waste in the U.S. by 50% by 2030. Food insecurity (Feeding America 2025) is described as people not having access to the food needed to live their fullest lives. Food waste (ReFED, Inc. 2025), by contrast, is defined as the edible portion of post-harvest food that is available for human consumption but is not consumed for any reason. Currently, more than one-third of all available food goes uneaten through loss or waste.

According to Feeding America (Feeding America 2025), the primary drivers of food insecurity include income-related factors (low-wage jobs, unreliable work, or financial emergencies); high cost of living (essentials that are increasingly difficult to afford); community factors (limited access to transportation); health-related issues (reduced budgets for food and other essentials); and systemic barriers (unfair systems in policies and institutions).

Building on Feeding America's food insecurity estimates, we develop a model that identifies the counties most in need of intervention by predicting high food insecurity, and we determine the demographic and program participation variables most strongly associated with food insecurity in the United States. Finally, the study simulates a policy-driven intervention in which a portion of restaurant food loss is recovered and redirected through SNAP, estimating its potential impact on reducing food insecurity at the county level.

Research Question

This research focuses on the county level, where predictive modeling can reveal the high-priority areas most in need of food security interventions. A model that incorporates existing county-level food insecurity estimates makes it possible to identify the counties at greatest risk and the indicators that predict food insecurity, while helping government agencies target resources rapidly. Such a model could help federal and state programs collaborate more effectively with local retailers and food service providers to mitigate hunger. Retailers may also benefit through reduced food waste and opportunities to participate in discounted food assistance programs that increase both customer engagement and revenue. In essence, a county-level predictive framework can guide state leaders in expanding food assistance partnerships, such as restaurant participation in SNAP, where they are most needed. Because each state currently determines which restaurants may participate in SNAP, a data-driven approach could strengthen local food systems and improve equitable access to food assistance programs.

The significance of this phenomenon is twofold. First, more than 95 percent of food waste in the United States ends up in landfills, where it breaks down anaerobically and releases methane, carbon dioxide, and other greenhouse gases into the atmosphere (Mu et al. 2017). If food waste is not addressed, it will continue to contribute significantly to climate change. Second, the USDA reported that in 2023, 13.5 percent of U.S. households experienced food insecurity at some point during the year. Food insecurity is a global crisis, and countries around the world are actively seeking solutions to eliminate it.

There are numerous real-world applications for a predictive model that connects food recovery with food insecurity reduction. For retailers and food service businesses, reducing food waste can generate revenue increases, improve supply chain efficiency, and enhance sustainability practices. Food that would otherwise be discarded can instead be recovered and redirected to food banks, community programs, or subsidized food assistance markets, contributing to reductions in hunger and improvements in public health. Local nonprofits and government agencies can use model outputs to develop targeted initiatives that match surplus food with the areas most in need. Taxpayers also benefit when public program funding is allocated more efficiently. Additionally, reducing food waste directly lowers greenhouse gas emissions and alleviates pressure on natural resources, supporting long-term environmental sustainability and economic resilience.

Research Questions

  • How well can regression-based models predict county-level food insecurity using socioeconomic and food access data?

  • Which demographic, socioeconomic, and food assistance indicators are most significant in predicting food insecurity?

  • Is expanding the Restaurant Meal Program an effective intervention? If so, where?

Literature Review

Measuring Food Insecurity and Hunger in the United States (Carlson, Andrews, and Bickel 1999) developed a standardized household scale to quantify food insecurity in the US. The measurement tool uses categorical household survey responses to quantify the severity of food insecurity, ranging from concerns about food supply to skipping meals or experiencing hunger. This benchmark defines food security as assured access at all times to enough food for an active, healthy life. The study also defined malnutrition as a more severe form of food insecurity, one that is generally not observed in the United States. The scale revealed that severe food insecurity was more common in households with children under eighteen, in minority households, and in households with low income-to-poverty ratios. The scale serves as the foundation that links food insecurity to socioeconomic factors. However, the measurement is descriptive rather than predictive. A predictive model can build on this work by including these factors, along with economic access and recoverable calories, to estimate the probability that a county reduces food insecurity.

A data-driven approach improves food insecurity crisis prediction (Lentz et al. 2019) highlighted similar challenges in measuring food insecurity globally. The research argues that the Integrated Food Security Phase Classification (IPC), a global framework that classifies food insecurity, does not use readily available secondary data in a replicable and transparent manner. Furthermore, no current food security early warning and monitoring system incorporates readily available data into a predictive model, which limits timely, targeted humanitarian response. The authors build linear and log-linear models from readily available secondary data (weather and price fluctuations, geographic variation, demographics, and intra-annual seasonal trends) to predict food security at the subnational level. Their model correctly predicts 83% to 99% of the most food insecure clusters, while the IPC’s approach predicts only between 0% and 10%. Although the model improves on the IPC’s approach, it does not incorporate solutions to reduce food insecurity. In the U.S., unlike in other countries, food insecurity is driven less by weather and geographic shocks than by socioeconomic factors and access to federal assistance programs. By accurately predicting high food insecurity, the US government can assess whether counties are trending toward high food insecurity and intervene early.

Several studies examine the relationship between food waste and food insecurity. Leveraging Machine Learning for Food Waste Reduction: An Analysis of Predictive Models (Yang, Zhao, and Lu 2024) applied machine learning models, including Random Forests, to predict food waste using data on household consumption patterns, retail demand, and food service estimates. The study concluded that household-level food waste estimates are the most influential predictor of per-capita food waste. However, household data may be difficult to collect reliably because of self-reporting biases and a lack of incentives for participation, which limits the practicality of household-level interventions. As a result, redistributing household food excess may be a less practical or effective intervention.

Another study (Irani et al. 2018) examines how food waste and loss in supply chains affect food security, using causal modeling and simulation to identify key drivers from qualitative data. It also distinguishes between food loss and food waste, the latter being food that is still fit for human consumption. The findings show that food waste is not solely the result of kitchen inefficiencies or over-ordering, but arises from a combination of factors, including consumer behavior, storage logistics, and supplier practices. However, one study was noted as identifying plate waste as comprising the largest share of food waste, followed by storage and preparation loss. The research employs a Fuzzy Cognitive Map (FCM) to model the interrelated factors driving food waste and loss. The study concludes that no single factor, such as inventory management, can reduce food waste effectively without complementary actions, such as menu planning and staff training. The study’s reliance on qualitative data and prototype simulations limits its generalizability, and restaurants may need context-specific data to accurately quantify the impact of interventions. An estimated 17% of food is wasted at the retail level, where redistribution can have a bigger impact than at the production or consumption stages of the supply chain. Although food waste at the retail level has been studied less than at other supply chain stages, retail stores tend to generate more food waste than food loss, as the discarded food is generally still consumable.

Feeding America’s Map the Meal Gap (2023) (Hake, Engelhard, and Dewey 2023) provides the leading annual estimates of food insecurity at the county level in the United States. Because no national survey directly measures county-level food insecurity each year, the study develops a statistical model that estimates food insecurity rates using available socioeconomic indicators, including poverty, unemployment, home ownership, food budget shortfall, and local food cost indices. These estimates also incorporate calculations of the additional dollars food-insecure households would need to meet basic nutritional requirements. However, because the MMG model is based on descriptive state-level regression coefficients applied to counties, the resulting estimates are not exact measurements and are subject to considerable uncertainty. The model is also not designed to forecast future food insecurity, assess causal relationships, or evaluate how policy interventions—such as changes in benefits or program access—might affect food insecurity.

Despite these limitations, MMG provides a valuable foundation by identifying broad geographic patterns and highlighting socioeconomic predictors that signal significant vulnerability. Building on this analysis, a predictive regression model can estimate the probability that a county falls into a high food insecurity category and can identify the most influential drivers of vulnerability. Furthermore, incorporating a current intervention policy such as the Restaurant Meal Program offers a way to evaluate whether additional food resources could meaningfully reduce the predicted number of high food insecurity counties.

The Restaurant Meal Program (RMP) allows certain SNAP clients to buy prepared meals from restaurants using their SNAP benefits. The program is available in only a handful of states. To qualify, an individual must be a SNAP recipient who is elderly, disabled, homeless, or the spouse of a qualifying recipient. A county-level predictive model could inform where expanding restaurant participation might have the greatest effect.

Overall, in the context of the United States, food insecurity rarely reaches levels of malnutrition, in part because federal nutrition programs serve as an important buffer. However, despite the availability of these programs, the USDA reported that 13.5% of US households experienced food insecurity at some point in 2023. Addressing both food waste and food insecurity simultaneously provides an opportunity to improve public health, reduce greenhouse gas emissions, and strengthen local food systems. A predictive model that integrates socioeconomic predictors with measures of available excess food could support targeted interventions. Solving food insecurity benefits everyone, from residents to food retailers and taxpayers.

Data

Food insecurity

The Food Environment Atlas (U.S. Department of Agriculture, Economic Research Service 2025) provides secondary data by state and county. The data set includes demographics and food-program participation rates, but reports food insecurity at the state level only.

The Thrifty Food Plan was re-evaluated in 2021 (U.S. Department of Agriculture 2021) to estimate the cost of a practical, nutritious, and budget-conscious diet. The 2021 plan references a family of four, for which the market basket cost was calculated at $835.57 per month. The reference family consists of two adults, a male and a female between 20 and 50 years old, with daily calorie intakes of about 3,000 and 2,200, and two children, one aged 6-8 with a daily intake of 1,800 calories and one aged 9-11 with a daily intake of 2,200 calories. Together the family consumes about 9,200 calories per day, or roughly 276,000 calories over a 30-day month, so on average one dollar of SNAP benefit purchases about 330 calories.
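The calories-per-dollar figure can be reproduced with a short calculation. This is a minimal sketch: the 30-day month is our assumption (the source does not state the month length), and the names `daily_kcal`, `monthly_cost_usd`, and `kcal_per_dollar` are illustrative.

```r
# Daily calorie needs of the Thrifty Food Plan reference family (from the text)
daily_kcal <- c(adult_1 = 3000, adult_2 = 2200,
                child_6_8 = 1800, child_9_11 = 2200)

monthly_cost_usd <- 835.57  # 2021 monthly market basket cost for the family
days_per_month   <- 30      # assumption: month length is not given in the source

# Calories the family needs in a month, divided by what the basket costs
family_kcal_per_month <- sum(daily_kcal) * days_per_month
kcal_per_dollar <- family_kcal_per_month / monthly_cost_usd

round(kcal_per_dollar)
```

Under these assumptions the result rounds to the 330 calories per dollar used in the rest of the analysis.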

Restaurant excess estimates

Food loss and waste data are scarce because information is uneven across the stages of the food chain and aggregated data are difficult to replicate. In fact, a UNEP study (Soloha and Dace 2025) noted that food waste per capita in the EU cannot be fully explained because the methodology used to quantify food waste in each member state is not known, making it difficult to conclude whether the differences in the data are justified or are due to differences in scope. One review paper (Xue et al. 2017) found that the data are heavily biased towards industrialized and high-income countries. The paper also points out that fewer studies focus on the retail, household, and distribution levels of the supply chain.

The US EPA (U.S. Environmental Protection Agency 2025) provides a map of restaurants at the county level with excess food estimates. We use these data to find the aggregate food loss, in tons, from restaurants in each county, and from that estimate the calories that could be recovered at the local level. A percentage of the recoverable amount is then estimated by finding the average cost per calorie and discounting it by the 10% used in the RMP. Finally, the recovered calories are modeled to assess their relationship with food insecurity.
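One possible reading of the discounting step is sketched below. It is not the exact procedure used in the analysis. The function name `recovered_kcal` and its arguments are hypothetical, the 330 kcal per dollar default comes from the Thrifty Food Plan calculation, and capping purchases at the calories actually wasted is our assumption.

```r
# Hypothetical helper: calories recoverable from restaurant excess food when
# additional SNAP dollars buy meals at a discounted cost per calorie.
recovered_kcal <- function(extra_snap_dollars, wasted_kcal,
                           kcal_per_dollar = 330, discount = 0.10) {
  # Average cost of one calorie, reduced by the RMP-style discount
  cost_per_kcal <- (1 / kcal_per_dollar) * (1 - discount)
  # Calories the additional dollars could buy at the discounted cost
  purchasable <- extra_snap_dollars / cost_per_kcal
  # Assumption: a county cannot recover more calories than restaurants waste
  pmin(purchasable, wasted_kcal)
}

# Two illustrative counties with the same budget but different excess food
recovered_kcal(extra_snap_dollars = c(9, 9), wasted_kcal = c(5000, 2000))
```

In the first county the budget is the binding constraint; in the second, the available excess food is.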

Socioeconomic factors - median income, poverty, and family characteristics

Demographics - Age and race

Health - obesity rate

Food assistance benefits - Accessibility for SNAP recipients, Store availability

Data Exploration

The data summary shows that most variables are nearly complete, except for the WIC redemption variable, which is 55% complete. PCH_LACCESS_LOWI_15_19, the percent change between 2015 and 2019 in the low-income population with low access to stores, has a large standard deviation, which could be due to population migration. The mean county food insecurity rate is about 12.5%, with a standard deviation of about 3.7%, and the highest rate is 29%. On the socioeconomic side, the mean of county median income is about $59,000, with some counties as low as about $26,000. The mean county poverty rate is 14.6%, and some counties experience poverty rates as high as 44%. Most of the predictors are right-skewed; one of the few left-skewed predictors is the white population share, meaning that most counties have a high white population share.

Data summary
Name combined
Number of rows 3161
Number of columns 39
_______________________
Column type frequency:
character 3
factor 1
numeric 35
________________________
Group variables None

Variable type: character

skim_variable complete_rate min max empty n_unique whitespace
FIPS 1 3 5 0 3157 0
State 1 2 2 0 51 0
County 1 3 30 0 1850 0

Variable type: factor

skim_variable complete_rate ordered n_unique top_counts
METRO23 0.99 FALSE 2 0: 1958, 1: 1185

Variable type: numeric

skim_variable complete_rate mean sd p0 p25 p50 p75 p100 hist
Food_insecure 0.98 12.47 3.73 3.00 10.00 12.00 15.00 29.00 ▂▇▆▁▁
PCT_NHWHITE20 0.99 74.15 19.80 1.78 62.58 80.73 90.34 97.40 ▁▁▂▃▇
PCT_NHBLACK20 0.99 8.58 14.03 0.00 0.50 1.98 9.67 87.13 ▇▁▁▁▁
PCH_WICWOMEN_16_21 0.99 6.50 1.57 3.15 5.19 6.42 7.36 9.56 ▁▇▅▆▃
PCT_HISP20 0.99 9.79 13.68 0.17 2.37 4.61 10.46 97.68 ▇▁▁▁▁
PCT_NHTMR20 0.99 3.77 1.75 0.18 2.75 3.51 4.40 23.09 ▇▂▁▁▁
MEDHHINC21 0.99 58941.63 15267.69 25653.00 49016.25 56634.00 65687.25 153716.00 ▅▇▁▁▁
PCH_SNAP_17_22 0.99 -1.21 1.70 -5.09 -2.35 -1.22 -0.14 3.88 ▂▆▇▃▁
PCH_LACCESS_LOWI_15_19 0.98 29.03 651.98 -100.00 -18.82 -5.85 9.98 34101.74 ▇▁▁▁▁
PCH_LACCESS_HHNV_15_19 0.99 6.18 104.29 -100.00 -21.54 -4.34 16.29 3807.55 ▇▁▁▁▁
PCT_OBESE_ADULTS22 0.99 35.22 3.27 24.30 33.40 35.50 37.70 41.00 ▁▂▅▇▅
PCH_SFSP_17_21 0.99 4.83 5.81 -6.84 0.75 3.15 6.78 22.80 ▂▇▅▁▁
PCH_WIC_17_21 0.99 -0.33 0.23 -0.86 -0.51 -0.35 -0.19 0.21 ▂▇▇▆▂
REDEMP_SNAPS23 0.94 2840380.40 10913019.94 4856.34 193521.51 601624.18 1781044.29 338046560.60 ▇▁▁▁▁
PC_SNAPBEN22 0.98 30.92 19.38 0.39 17.01 26.90 40.78 259.97 ▇▁▁▁▁
REDEMP_WICS22 0.55 78416.08 50303.86 0.00 42862.06 71615.40 102446.62 629470.75 ▇▁▁▁▁
SNAPSPTH23 0.99 1.03 0.44 0.19 0.76 0.97 1.22 11.01 ▇▁▁▁▁
PCT_LACCESS_HHNV19 0.99 3.13 2.97 0.00 1.61 2.54 3.88 54.57 ▇▁▁▁▁
PCH_SNAPSPTH_17_23 0.98 16.04 53.29 -58.46 1.14 9.84 22.35 2269.16 ▇▁▁▁▁
PCT_LACCESS_SNAP19 0.99 7.91 5.45 0.00 3.89 6.70 10.88 42.09 ▇▅▁▁▁
POVRATE21 0.99 14.61 5.66 2.90 10.60 13.60 17.58 43.90 ▅▇▂▁▁
total_SNAP 0.97 3426233.75 13749766.54 696.92 261064.64 767620.21 2139400.87 448439197.91 ▇▁▁▁▁
SNAP_participants 0.98 13111.19 42177.60 6.56 1143.67 3001.98 8659.66 1146732.98 ▇▁▁▁▁
SNAP_per_participant 0.97 257.85 140.41 8.00 168.94 234.28 317.64 2298.06 ▇▁▁▁▁
calories_SNAP_part 0.97 85090.47 46334.79 2639.43 55751.10 77313.68 104820.08 758360.21 ▇▁▁▁▁
PCT_65OLDER20 0.99 20.14 4.70 4.80 17.24 19.79 22.63 58.86 ▂▇▁▁▁
PCT_18YOUNGER20 0.99 22.02 3.31 6.98 20.05 21.99 23.81 40.88 ▁▃▇▁▁
Population 0.98 106491.60 335716.59 57.00 10808.25 25801.00 68481.75 9829544.00 ▇▁▁▁▁
Household_burden 0.98 10.61 3.48 1.00 8.00 10.00 12.00 31.00 ▂▇▂▁▁
Rural_percent 0.98 59.30 31.04 0.00 34.50 60.10 88.85 100.00 ▃▅▆▆▇
Total_Excess_Food 0.96 2369.61 11250.20 0.00 88.78 282.52 1066.62 341656.49 ▇▁▁▁▁
n_restaurants 0.96 148.29 575.88 1.00 8.00 24.00 72.00 16753.00 ▇▁▁▁▁
kcal_wasted_per_capita 0.95 36733.31 36203.02 0.00 19820.20 30705.70 45549.28 912615.29 ▇▁▁▁▁
restaurants_per_capita 0.95 0.00 0.00 0.00 0.00 0.00 0.00 0.00 ▇▅▁▁▁
PCT_LACCESS_POP19 0.99 24.39 18.60 0.00 11.92 21.46 32.37 100.00 ▇▇▂▁▁

The histograms show that most of the variables are skewed. For instance, most counties have a high percentage of white residents, but there are extreme cases where counties have a low white population. However, food insecurity, poverty, and the shares of residents under 18 and over 65 are approximately normally distributed. Because the dependent variable, food insecurity, is approximately normally distributed, it is more compatible with linear regression and may improve model accuracy. The skewed independent variables suggest that a transformation, or a model robust to non-normal predictors, may be more appropriate. In addition, the scatter plots show that most predictors exhibit heteroscedasticity against food insecurity, meaning that the variance is not constant across the range of the predictors. Overall, the skewed predictors and the presence of outliers indicate that variable transformations or non-linear models are better suited to predicting food insecurity.
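To illustrate why a log transformation helps with right-skewed predictors, the sketch below computes a simple moment-based skewness before and after `log1p()`. The data are simulated, not taken from the county data set, and the `skewness` helper is our own illustrative definition.

```r
# Moment-based sample skewness: 0 for symmetric data, > 0 for a long right tail
skewness <- function(x) {
  x <- x[is.finite(x)]
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^1.5
}

set.seed(1)
# Simulated right-skewed predictor (stand-in for e.g. SNAP redemptions)
x <- rlnorm(1000, meanlog = 5, sdlog = 1.5)

c(raw = skewness(x), log1p = skewness(log1p(x)))
```

The raw values have a large positive skewness, while the transformed values are close to symmetric, which is the motivation for the `log()` and `log1p()` terms in the model formula.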

Indicators vs. Food insecurity

Distribution of indicators


The predictors with the strongest positive correlation with food insecurity are the poverty rate (0.77), the share of SNAP households with low access to stores (0.66), and SNAP benefits per capita (0.58). The only variable with a strong negative correlation, median household income (-0.68), indicates that higher income is associated with lower food insecurity.
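A ranking like the one in the correlation table can be produced along the following lines. This is a sketch on a toy data frame, not the code behind the reported output; `rank_correlations` is a hypothetical helper, and pairwise-complete observations are our assumption for handling missing values.

```r
# Hypothetical helper: correlate every numeric column with the outcome and
# sort by absolute correlation, strongest first.
rank_correlations <- function(df, outcome) {
  num <- df[vapply(df, is.numeric, logical(1))]   # drop character columns
  r <- cor(num, num[[outcome]], use = "pairwise.complete.obs")[, 1]
  r <- r[order(-abs(r))]
  data.frame(Variable = names(r), Correlation = round(unname(r), 2))
}

# Toy data with the same column names as the county data set
toy <- data.frame(
  Food_insecure = c(10, 12, 15, 20, NA),
  POVRATE21     = c(8, 11, 14, 22, 30),
  MEDHHINC21    = c(70, 60, 50, 35, 40),
  State         = c("AL", "AL", "GA", "GA", "TX")
)

rank_correlations(toy, "Food_insecure")
```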

Correlation

# A tibble: 35 × 2
   Variable             Correlation
   <chr>                      <dbl>
 1 Food_insecure               1   
 2 POVRATE21                   0.77
 3 MEDHHINC21                 -0.68
 4 PCT_LACCESS_SNAP19          0.66
 5 PC_SNAPBEN22                0.58
 6 SNAP_per_participant        0.52
 7 calories_SNAP_part          0.52
 8 PCT_LACCESS_HHNV19          0.4 
 9 SNAPSPTH23                  0.38
10 PCT_NHBLACK20               0.32
# ℹ 25 more rows
# A tibble: 6 × 43
  fips  State County  Food_insecure PCT_NHWHITE20 PCT_NHBLACK20
  <chr> <chr> <chr>           <int>         <dbl>         <dbl>
1 01001 AL    Autauga            15          70.7         19.3 
2 01003 AL    Baldwin            12          80.5          7.77
3 01005 AL    Barbour            20          44.0         47.0 
4 01007 AL    Bibb               16          73.8         19.7 
5 01009 AL    Blount             14          84.2          1.40
6 01011 AL    Bullock            16          22.0         71.3 
# ℹ 37 more variables: PCH_WICWOMEN_16_21 <dbl>, PCT_HISP20 <dbl>,
#   PCT_NHTMR20 <dbl>, MEDHHINC21 <dbl>, PCH_SNAP_17_22 <dbl>,
#   PCH_LACCESS_LOWI_15_19 <dbl>, PCH_LACCESS_HHNV_15_19 <dbl>,
#   PCT_OBESE_ADULTS22 <dbl>, PCH_SFSP_17_21 <dbl>, PCH_WIC_17_21 <dbl>,
#   REDEMP_SNAPS23 <dbl>, PC_SNAPBEN22 <dbl>, REDEMP_WICS22 <dbl>,
#   SNAPSPTH23 <dbl>, PCT_LACCESS_HHNV19 <dbl>, PCH_SNAPSPTH_17_23 <dbl>,
#   PCT_LACCESS_SNAP19 <dbl>, POVRATE21 <dbl>, METRO23 <fct>, …

Food insecurity normalized by population


Restaurant food waste per capita


Areas with high food insecurity and food waste


Data Preparation

Missing values are replaced with the state median. To reduce skewness in the predictors, large outliers are removed. If the predictors remain non-normal, logistic models will be used to predict food insecurity; in that case the dependent variable is converted to a binary variable, where any value higher than the median (12%) is classified as high food insecurity.
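The imputation and binary-outcome steps can be sketched in base R as follows. The toy data frame and the helper `impute_state_median` are illustrative, not the code used to prepare the county data.

```r
# Hypothetical helper: replace NA values with the median of the same state
impute_state_median <- function(x, state) {
  state_median <- ave(x, state, FUN = function(v) median(v, na.rm = TRUE))
  ifelse(is.na(x), state_median, x)
}

toy <- data.frame(
  State         = c("AL", "AL", "AL", "GA", "GA"),
  Food_insecure = c(15, NA, 12, 10, 14)
)

# The missing AL county receives the AL median of 15 and 12
toy$Food_insecure <- impute_state_median(toy$Food_insecure, toy$State)

# Binary outcome: 1 when the rate exceeds the 12% median reported in the text
toy$high_insecurity <- as.integer(toy$Food_insecure > 12)
toy
```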

Certain variables will be renamed for clarity.

An additional variable, recovered calories from restaurants, is created using an example scenario in which SNAP allots additional dollars to purchase meals from restaurants at a discounted rate. The scenario uses the same discount rate as the Restaurant Meal Program.


Add Restaurant Meal Program counties

# A tibble: 40 × 2
   Variable                  Correlation
   <chr>                           <dbl>
 1 Foodinsecurity                   1   
 2 Poverty_rate                     0.76
 3 Medianincome                    -0.69
 4 SNAP_low_access_pct              0.66
 5 SNAP_per_capita                  0.59
 6 SNAP_per_participant             0.51
 7 calories_SNAP_part               0.51
 8 HouseHolds_LowAcces_NoCar        0.5 
 9 SNAP_auth_stores                 0.42
10 WIC_redemptions                  0.32
# ℹ 30 more rows

Models

Methodology

Because the predictors are persistently skewed and heteroscedastic against food insecurity, weighted least squares (WLS) and generalized least squares (GLS) models seem appropriate for predicting food insecurity. This approach predicts a county's food insecurity rate from its demographic, economic, health, and government program accessibility characteristics. The outcome is the food insecurity rate itself, which keeps predictions easy to interpret and forward-looking. Although some demographic variables had low correlation with food insecurity, the first model includes socioeconomic, age, and race predictors because prior research used those characteristics to identify groups that may have unequal access to food. This may be especially relevant in the United States, whose population is racially diverse.

Socioeconomic conditions are represented by poverty, which is a strong determinant of food affordability and financial stability. Household burden, or food resource limitations due to other financial obligations, is also included because, according to previous research, food affordability affects at least one member of the family. The model also includes the number of restaurants per county, with the hypothesis that access to restaurants is a significant factor in predicting high food insecurity. Together, these variables form a comprehensive predictive model for understanding the drivers of food insecurity across U.S. counties.

The second model removes the insignificant variables for better interpretability. The third model adds the RMP variable to assess how effective that intervention is.
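The idea behind population-weighted least squares can be illustrated on simulated data. This sketch is not the fitted model: the variables `population`, `poverty`, and `insecurity` are simulated, and the assumption that noise shrinks with population is ours, chosen so that weighting by population is appropriate.

```r
set.seed(42)
n <- 500

# Simulated counties: population is right-skewed, poverty rate is uniform
population <- round(exp(rnorm(n, mean = 10, sd = 1)))
poverty    <- runif(n, 5, 40)

# Assumption: estimates for small counties are noisier (heteroscedastic errors)
insecurity <- 5 + 0.3 * poverty + rnorm(n, sd = 50 / sqrt(population))

# Ordinary least squares treats every county equally;
# WLS gives more weight to counties with more precise estimates
ols <- lm(insecurity ~ poverty)
wls <- lm(insecurity ~ poverty, weights = population)

rbind(ols = coef(ols), wls = coef(wls))
```

Both fits recover a slope near the simulated value of 0.3; the `weights` argument is the same mechanism used in the `lm` call shown in the model output.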

Models

# A tibble: 1,896 × 43
   FIPS  Foodinsecurity WhitePct BlackPct WomenWICpart HispPct TwoOrMorePct
   <chr>          <dbl>    <dbl>    <dbl>        <dbl>   <dbl>        <dbl>
 1 55033              8     90.0   0.854          5.80    2.28         3.07
 2 21017             13     82.8   5.10           9.54    7.34         3.85
 3 17051             12     90.4   3.63           4.01    3.06         2.08
 4 51820              9     72.4  11.8            5.19    8.76         5.19
 5 48335             13     48.1  10.3            7.10   38.4          2.04
 6 20087              8     90.1   0.539          5.47    3.09         5.07
 7 13021             16     36.1  54.2            5.47    4.28         2.83
 8 51770              8     55.9  27.1            5.19    8.48         5.26
 9 40025             13     71.7   0.0436         7.97   24.5          3.22
10 48497             14     73.6   0.957          7.10   20.0          4.02
# ℹ 1,886 more rows
# ℹ 36 more variables: Medianincome <dbl>, SNAPparticipantschange <dbl>,
#   No_car_low_store <dbl>, PCH_LACCESS_HHNV_15_19 <dbl>,
#   adult_obese_rate <dbl>, SummerFoodProgChange <dbl>, WICChange <dbl>,
#   SNAP_redemptions <dbl>, SNAP_per_capita <dbl>, WIC_redemptions <dbl>,
#   SNAP_auth_stores <dbl>, HouseHolds_LowAcces_NoCar <dbl>,
#   SNAPstoreschange <dbl>, SNAP_low_access_pct <dbl>, Poverty_rate <dbl>, …
# A tibble: 1,265 × 43
   FIPS  Foodinsecurity WhitePct BlackPct WomenWICpart HispPct TwoOrMorePct
   <chr>          <dbl>    <dbl>    <dbl>        <dbl>   <dbl>        <dbl>
 1 01001             15     70.7    19.3          8.22   3.60          4.23
 2 01005             20     44.0    47.0          8.22   5.99          2.19
 3 01015             16     68.3    21.8          8.22   4.30          3.92
 4 01017             15     53.5    38.7          8.22   3.56          2.58
 5 01019             14     90.4     3.95         8.22   1.60          3.24
 6 01023             17     55.6    41.2          8.22   0.892         1.89
 7 01027             15     79.1    13.6          8.22   3.15          3.25
 8 01035             16     50.3    43.9          8.22   2.21          2.50
 9 01051             13     71.1    20.6          8.22   3.17          3.72
10 01057             18     83.0    10.5          8.22   2.43          3.25
# ℹ 1,255 more rows
# ℹ 36 more variables: Medianincome <dbl>, SNAPparticipantschange <dbl>,
#   No_car_low_store <dbl>, PCH_LACCESS_HHNV_15_19 <dbl>,
#   adult_obese_rate <dbl>, SummerFoodProgChange <dbl>, WICChange <dbl>,
#   SNAP_redemptions <dbl>, SNAP_per_capita <dbl>, WIC_redemptions <dbl>,
#   SNAP_auth_stores <dbl>, HouseHolds_LowAcces_NoCar <dbl>,
#   SNAPstoreschange <dbl>, SNAP_low_access_pct <dbl>, Poverty_rate <dbl>, …

The first model, a WLS regression, explained about 73% of the variance in the county-level food insecurity rate. The key drivers are the county’s poverty rate, household burden, the share of SNAP recipients with low access to stores, the percentage of residents under 18, and the multiracial population share. In this model, the senior citizen population is not a significant predictor of food insecurity. This could indicate that families with children under 18 should be eligible for the Restaurant Meal Program. Interestingly, rural counties show slightly lower food insecurity, which may be due to affordable local options. The residual histogram shows an approximately normal distribution with a median of -0.28; however, the residuals-versus-fitted plot reveals diagonal banding and heteroskedasticity, reflecting the discrete nature of the outcome and non-constant variance across fitted values.


Call:
lm(formula = Foodinsecurity ~ Poverty_rate + log(Medianincome) + 
    WhitePct + adult_obese_rate + TwoOrMorePct + BlackPct + HispPct + 
    SNAP_low_access_pct + log1p(HouseHolds_LowAcces_NoCar) + 
    log1p(SNAP_auth_stores) + Rural_percent + Pct_65_older + 
    PCT_18_younger + Household_burden + MetroStatus + Rural_percent * 
    SNAP_low_access_pct, data = foodinsec_train, weights = Population)

Weighted Residuals:
    Min      1Q  Median      3Q     Max 
-3578.0  -228.9   -28.1   193.2  4027.4 

Coefficients:
                                    Estimate Std. Error t value
(Intercept)                       56.7105290  5.2900626  10.720
Poverty_rate                       0.1984288  0.0227972   8.704
log(Medianincome)                 -5.0629086  0.4087600 -12.386
WhitePct                          -0.0148988  0.0059982  -2.484
adult_obese_rate                   0.0864007  0.0142613   6.058
TwoOrMorePct                       0.2059338  0.0343139   6.001
BlackPct                          -0.0479749  0.0069100  -6.943
HispPct                           -0.0043581  0.0075447  -0.578
SNAP_low_access_pct                0.1333587  0.0328217   4.063
log1p(HouseHolds_LowAcces_NoCar)  -0.2956887  0.1804798  -1.638
log1p(SNAP_auth_stores)            3.0439465  0.4165468   7.308
Rural_percent                     -0.0168620  0.0037994  -4.438
Pct_65_older                       0.0211570  0.0150477   1.406
PCT_18_younger                     0.0789701  0.0188592   4.187
Household_burden                   0.1676457  0.0182708   9.176
MetroStatus2                       0.2998158  0.1391923   2.154
SNAP_low_access_pct:Rural_percent  0.0008422  0.0004051   2.079
                                              Pr(>|t|)    
(Intercept)                       < 0.0000000000000002 ***
Poverty_rate                      < 0.0000000000000002 ***
log(Medianincome)                 < 0.0000000000000002 ***
WhitePct                                        0.0131 *  
adult_obese_rate                      0.00000000165664 ***
TwoOrMorePct                          0.00000000234053 ***
BlackPct                              0.00000000000528 ***
HispPct                                         0.5636    
SNAP_low_access_pct                   0.00005041129214 ***
log1p(HouseHolds_LowAcces_NoCar)                0.1015    
log1p(SNAP_auth_stores)               0.00000000000040 ***
Rural_percent                         0.00000960463315 ***
Pct_65_older                                    0.1599    
PCT_18_younger                        0.00002952803364 ***
Household_burden                  < 0.0000000000000002 ***
MetroStatus2                                    0.0314 *  
SNAP_low_access_pct:Rural_percent               0.0377 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 482.7 on 1879 degrees of freedom
Multiple R-squared:  0.7352,    Adjusted R-squared:  0.7329 
F-statistic:   326 on 16 and 1879 DF,  p-value: < 0.00000000000000022
there are higher-order terms (interactions) in this model
consider setting type = 'predictor'; see ?vif
# A tibble: 16 × 2
   Variable                            VIF
   <chr>                             <dbl>
 1 SNAP_low_access_pct               11.9 
 2 WhitePct                          11.7 
 3 SNAP_low_access_pct:Rural_percent 10.9 
 4 HispPct                            8.73
 5 log(Medianincome)                  8.52
 6 Poverty_rate                       8.44
 7 Rural_percent                      6.93
 8 BlackPct                           5.94
 9 log1p(HouseHolds_LowAcces_NoCar)   5.01
10 Household_burden                   3.65
11 Pct_65_older                       2.83
12 log1p(SNAP_auth_stores)            2.69
13 PCT_18_younger                     2.28
14 adult_obese_rate                   2.24
15 MetroStatus                        1.98
16 TwoOrMorePct                       1.92
[1] -0.2394513
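
The weighted residuals and the residual standard error in the output above (482.7) are on the weighted scale, not in percentage points. The following is a minimal sketch, on simulated data, of how `lm()` with `weights` relates the two scales. The data frame `toy`, its values, and the noise model are hypothetical stand-ins for `foodinsec_train`, not the study's data.

```r
# Simulated stand-in data (hypothetical; not the study's data)
set.seed(42)
n <- 300
toy <- data.frame(
  Poverty_rate = runif(n, 5, 35),
  Population   = round(exp(runif(n, log(2e3), log(1e6))))
)
# Assumed noise model: smaller counties have noisier rates
toy$Foodinsecurity <- 3 + 0.4 * toy$Poverty_rate +
  rnorm(n, sd = 6 / toy$Population^0.1)

# Population-weighted least squares, mirroring the form of the call above
wls_fit <- lm(Foodinsecurity ~ Poverty_rate, data = toy, weights = Population)

raw_res      <- residuals(wls_fit)           # in outcome units
weighted_res <- weighted.residuals(wls_fit)  # sqrt(weight) * raw residual

# summary() reports quantiles of the weighted residuals, which is why they
# can reach the hundreds or thousands when weights are county populations
scale_check <- all.equal(unname(weighted_res),
                         unname(sqrt(toy$Population) * raw_res))

# Error in outcome units, comparable across models
rmse_raw <- sqrt(mean(raw_res^2))
print(scale_check)
print(rmse_raw)
```

For comparing models, an error measure computed from the raw residuals (such as RMSE) is easier to interpret than the weighted residual standard error.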

The second model, a generalized least squares (GLS) specification, improves on the WLS specification by accounting for heteroskedasticity through a population-based power variance structure. The poverty rate remains the strongest predictor of food insecurity. Interestingly, SNAP authorized stores per capita has a positive relationship with food insecurity. This could be due to mobility constraints that limit access to those stores in certain counties. In addition, low SNAP store accessibility is a significant predictor of food insecurity, which is consistent with this explanation for the positive coefficient on SNAP authorized stores. Furthermore, with a residual standard error of 6.48, the model explains a substantial portion of the variation, but unexplained structural and local components remain. Finally, the model's residuals show no strong patterns of heteroskedasticity.

Generalized least squares fit by REML
  Model: Foodinsecurity ~ Poverty_rate + TwoOrMorePct + BlackPct + HispPct +      log1p(SNAP_auth_stores) + PCT_18_younger + Household_burden +      Rural_percent * SNAP_low_access_pct 
  Data: foodinsec_train 
       AIC      BIC    logLik
  8182.282 8254.324 -4078.141

Variance function:
 Structure: Power of variance covariate
 Formula: ~Population 
 Parameter estimates:
     power 
-0.1120844 

Coefficients:
                                       Value Std.Error   t-value p-value
(Intercept)                        2.4867203 0.5463083  4.551862  0.0000
Poverty_rate                       0.3798557 0.0148820 25.524423  0.0000
TwoOrMorePct                       0.2104072 0.0317640  6.624085  0.0000
BlackPct                          -0.0294668 0.0043625 -6.754498  0.0000
HispPct                            0.0081823 0.0044504  1.838545  0.0661
log1p(SNAP_auth_stores)            2.0756585 0.3724924  5.572351  0.0000
PCT_18_younger                     0.0167300 0.0167836  0.996808  0.3190
Household_burden                   0.0596750 0.0181746  3.283423  0.0010
Rural_percent                     -0.0065568 0.0029350 -2.234008  0.0256
SNAP_low_access_pct                0.2822901 0.0278175 10.147932  0.0000
Rural_percent:SNAP_low_access_pct -0.0008148 0.0003303 -2.466486  0.0137

 Correlation: 
                                  (Intr) Pvrty_ TwOrMP BlckPc HspPct l1(SNA
Poverty_rate                       0.000                                   
TwoOrMorePct                      -0.245  0.040                            
BlackPct                           0.097 -0.315  0.225                     
HispPct                            0.151 -0.201  0.271  0.309              
log1p(SNAP_auth_stores)           -0.274 -0.408  0.012 -0.021 -0.029       
PCT_18_younger                    -0.804  0.027 -0.009 -0.103 -0.322  0.054
Household_burden                  -0.589 -0.225 -0.148 -0.230 -0.238  0.102
Rural_percent                     -0.335  0.007  0.091  0.148  0.156 -0.319
SNAP_low_access_pct               -0.038 -0.303 -0.051 -0.019  0.112 -0.141
Rural_percent:SNAP_low_access_pct  0.127  0.018  0.045  0.019 -0.027  0.174
                                  PCT_18 Hshld_ Rrl_pr SNAP__
Poverty_rate                                                 
TwoOrMorePct                                                 
BlackPct                                                     
HispPct                                                      
log1p(SNAP_auth_stores)                                      
PCT_18_younger                                               
Household_burden                   0.336                     
Rural_percent                      0.082  0.350              
SNAP_low_access_pct               -0.099  0.109  0.410       
Rural_percent:SNAP_low_access_pct  0.066 -0.127 -0.653 -0.852

Standardized residuals:
        Min          Q1         Med          Q3         Max 
-3.40533233 -0.71498854 -0.02769379  0.65913715  3.45616036 

Residual standard error: 6.481644 
Degrees of freedom: 1896 total; 1885 residual
# A tibble: 10 × 2
   Variable                            VIF
   <chr>                             <dbl>
 1 Rural_percent:SNAP_low_access_pct 12.1 
 2 SNAP_low_access_pct                9.48
 3 Rural_percent                      4.15
 4 Poverty_rate                       3.12
 5 log1p(SNAP_auth_stores)            1.97
 6 Household_burden                   1.91
 7 BlackPct                           1.62
 8 HispPct                            1.55
 9 PCT_18_younger                     1.27
10 TwoOrMorePct                       1.19
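
The variance function in the output above can be read as follows: under a power-of-covariate structure, the residual standard deviation for a county is the reported residual standard error multiplied by the county's population raised to the estimated power. A short sketch of that arithmetic, using the estimates printed above; the example population sizes are arbitrary illustrations, not counties from the data.

```r
# Estimates copied from the GLS output above
sigma_hat <- 6.481644    # residual standard error (scale parameter)
power_hat <- -0.1120844  # estimated power for the ~Population covariate

# Implied residual SD for a county: sigma * |Population|^power
implied_sd <- function(population) sigma_hat * abs(population)^power_hat

# Arbitrary example county sizes (illustrative only)
pops <- c(small = 5e3, medium = 5e4, large = 1e6)
sd_by_size <- implied_sd(pops)
print(round(sd_by_size, 2))
```

Because the estimated power is negative, the implied residual spread shrinks as population grows. This is also why the residual standard error of 6.48 is a scale parameter and is not directly comparable to an RMSE in percentage points.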

The last model is a GLS model that adds Restaurant Meal Program (RMP) participation and its interaction with low SNAP store accessibility. The RMP main effect is negative but not statistically significant (p = 0.19), while the interaction with low SNAP access is positive (p = 0.04). This indicates that current RMP programs do not significantly reduce food insecurity at the county level. Given that food insecurity is strongly associated with access limitations, families with children, and financial burdens, the results imply that targeted expansion of RMP eligibility to counties facing mobility and socioeconomic constraints may improve program effectiveness.

Generalized least squares fit by REML
  Model: Foodinsecurity ~ Poverty_rate + TwoOrMorePct + BlackPct + HispPct +      log1p(SNAP_auth_stores) + PCT_18_younger + Household_burden +      Rural_percent + SNAP_low_access_pct + RMP_county * SNAP_low_access_pct 
  Data: foodinsec_train 
       AIC      BIC    logLik
  8177.598 8255.174 -4074.799

Variance function:
 Structure: Power of variance covariate
 Formula: ~Population 
 Parameter estimates:
     power 
-0.1130277 

Coefficients:
                                       Value Std.Error   t-value p-value
(Intercept)                        2.6444739 0.5437274  4.863602  0.0000
Poverty_rate                       0.3825901 0.0149667 25.562674  0.0000
TwoOrMorePct                       0.2117501 0.0317678  6.665552  0.0000
BlackPct                          -0.0287352 0.0043707 -6.574448  0.0000
HispPct                            0.0083772 0.0044575  1.879345  0.0604
log1p(SNAP_auth_stores)            2.1814556 0.3685300  5.919343  0.0000
PCT_18_younger                     0.0200236 0.0168275  1.189930  0.2342
Household_burden                   0.0564571 0.0182777  3.088843  0.0020
Rural_percent                     -0.0112245 0.0022332 -5.026107  0.0000
SNAP_low_access_pct                0.2197558 0.0147061 14.943218  0.0000
RMP_countyRMP                     -0.3674844 0.2781083 -1.321371  0.1865
SNAP_low_access_pct:RMP_countyRMP  0.0983417 0.0476417  2.064193  0.0391

 Correlation: 
                                  (Intr) Pvrty_ TwOrMP BlckPc HspPct l1(SNA
Poverty_rate                      -0.008                                   
TwoOrMorePct                      -0.253  0.038                            
BlackPct                           0.095 -0.310  0.222                     
HispPct                            0.156 -0.197  0.271  0.312              
log1p(SNAP_auth_stores)           -0.298 -0.422  0.006 -0.029 -0.027       
PCT_18_younger                    -0.822  0.034 -0.011 -0.104 -0.320  0.038
Household_burden                  -0.563 -0.229 -0.146 -0.222 -0.236  0.126
Rural_percent                     -0.339  0.032  0.159  0.212  0.182 -0.278
SNAP_low_access_pct                0.131 -0.545 -0.018 -0.013  0.162  0.020
RMP_countyRMP                     -0.052  0.022  0.034 -0.038 -0.041  0.015
SNAP_low_access_pct:RMP_countyRMP  0.007  0.046 -0.035  0.056  0.053 -0.064
                                  PCT_18 Hshld_ Rrl_pr SNAP_l__ RMP_RM
Poverty_rate                                                          
TwoOrMorePct                                                          
BlackPct                                                              
HispPct                                                               
log1p(SNAP_auth_stores)                                               
PCT_18_younger                                                        
Household_burden                   0.330                              
Rural_percent                      0.172  0.339                       
SNAP_low_access_pct               -0.079 -0.013 -0.363                
RMP_countyRMP                      0.061 -0.155  0.058  0.119         
SNAP_low_access_pct:RMP_countyRMP -0.005  0.093 -0.006 -0.137   -0.777

Standardized residuals:
        Min          Q1         Med          Q3         Max 
-3.43788282 -0.70046988 -0.01882958  0.66590193  3.43899672 

Residual standard error: 6.549499 
Degrees of freedom: 1896 total; 1884 residual
# A tibble: 11 × 2
   Variable                         VIF
   <chr>                          <dbl>
 1 Poverty_rate                    3.15
 2 RMP_county                      2.77
 3 SNAP_low_access_pct             2.65
 4 SNAP_low_access_pct:RMP_county  2.6 
 5 Rural_percent                   2.4 
 6 Household_burden                1.93
 7 log1p(SNAP_auth_stores)         1.93
 8 BlackPct                        1.63
 9 HispPct                         1.55
10 PCT_18_younger                  1.28
11 TwoOrMorePct                    1.19
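
Because RMP participation enters the model through an interaction, its estimated association with food insecurity depends on the county's low-SNAP-access share. A brief sketch of that arithmetic using the coefficients printed above; the grid of access values is an arbitrary illustration, and the uncertainty in the estimates (the RMP main effect is not significant) is ignored here.

```r
# Coefficients copied from the GLS + RMP output above
b_access     <- 0.2197558   # SNAP_low_access_pct
b_rmp        <- -0.3674844  # RMP_countyRMP
b_access_rmp <- 0.0983417   # SNAP_low_access_pct:RMP_countyRMP

# Slope of low SNAP access in non-RMP and RMP counties
slope_non_rmp <- b_access
slope_rmp     <- b_access + b_access_rmp

# Estimated RMP vs non-RMP gap (percentage points) at a given access level
rmp_gap <- function(low_access_pct) b_rmp + b_access_rmp * low_access_pct

# Access level at which the estimated gap changes sign
break_even <- -b_rmp / b_access_rmp

access_grid <- c(0, 2, 5, 10)  # illustrative values only
print(data.frame(low_access_pct = access_grid,
                 rmp_gap = round(rmp_gap(access_grid), 3)))
print(round(c(slope_non_rmp = slope_non_rmp,
              slope_rmp     = slope_rmp,
              break_even    = break_even), 3))
```

Read this way, the point estimates suggest that RMP counties have slightly lower predicted food insecurity only when the low-access share is small, though the imprecise main effect means the break-even point should be treated as rough.
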
Comparison of Food Insecurity Models

                                  OLS (Weighted)  GLS (Pop-Weighted)  GLS + RMP
Poverty rate (%)                  0.198***        0.380***            0.383***
                                  (0.023)         (0.015)             (0.015)
Adult obesity rate (%)            0.086***
                                  (0.014)
Two or more races (%)             0.206***        0.210***            0.212***
                                  (0.034)         (0.032)             (0.032)
Black population (%)              -0.048***       -0.029***           -0.029***
                                  (0.007)         (0.004)             (0.004)
Hispanic population (%)           -0.004          0.008+              0.008+
                                  (0.008)         (0.004)             (0.004)
Low SNAP access (%)               0.133***        0.282***            0.220***
                                  (0.033)         (0.028)             (0.015)
HH low access & no vehicle (log)  -0.296
                                  (0.180)
SNAP authorized stores (log)      3.044***        2.076***            2.181***
                                  (0.417)         (0.372)             (0.369)
Rural population (%)              -0.017***       -0.007*             -0.011***
                                  (0.004)         (0.003)             (0.002)
Under 18 (%)                      0.079***        0.017               0.020
                                  (0.019)         (0.017)             (0.017)
Household burden                  0.168***        0.060**             0.056**
                                  (0.018)         (0.018)             (0.018)
Num.Obs.                          1896            1896                1896
R2                                0.735           0.667               0.667
R2 Adj.                           0.733
RMSE                              2.05            2.10                2.10

Standard errors in parentheses. + p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001
# A tibble: 3 × 3
  Model                      RMSE   MAE
  <chr>                     <dbl> <dbl>
1 WLS (Population-Weighted)  2.05  1.59
2 GLS (Baseline)             2.10  1.66
3 GLS (With RMP)             2.10  1.66

Moving from the training data (above) to the testing data (below), RMSE increases by 0.12 to 0.14 and MAE by 0.07 across the three models, indicating that the models do not overfit the data.

# A tibble: 3 × 3
  Model                      RMSE   MAE
  <chr>                     <dbl> <dbl>
1 WLS (Population-Weighted)  2.17  1.66
2 GLS (Baseline)             2.23  1.73
3 GLS (With RMP)             2.24  1.73
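
The error metrics in the two tables can be computed and compared as in the sketch below. The helper names `rmse` and `mae` are illustrative, and the values are copied from the tables above, assuming the first table reports training error and the second reports testing error.

```r
# Illustrative helpers (names are assumptions, not the author's code)
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))
mae  <- function(actual, predicted) mean(abs(actual - predicted))

# Tiny made-up example to show the helpers in use
print(rmse(c(10, 12, 15), c(11, 12, 13)))
print(mae(c(10, 12, 15), c(11, 12, 13)))

# Values copied from the two tables above
metrics <- data.frame(
  Model      = c("WLS (Population-Weighted)", "GLS (Baseline)", "GLS (With RMP)"),
  RMSE_train = c(2.05, 2.10, 2.10),
  RMSE_test  = c(2.17, 2.23, 2.24),
  MAE_train  = c(1.59, 1.66, 1.66),
  MAE_test   = c(1.66, 1.73, 1.73)
)

# Train-to-test gap per model; a small gap suggests limited overfitting
metrics$RMSE_gap <- round(metrics$RMSE_test - metrics$RMSE_train, 2)
metrics$MAE_gap  <- round(metrics$MAE_test - metrics$MAE_train, 2)
print(metrics[, c("Model", "RMSE_gap", "MAE_gap")])
```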

The plot of residuals against rural percentage indicates that rurality alone is not a key indicator of food insecurity. This could be because rural residents have other means of acquiring food. However, low access to SNAP stores is positively correlated with food insecurity.

Based on the plots of residuals against the key predictors, food insecurity is overpredicted at higher poverty rates and in areas with lower accessibility to SNAP stores.

Intervention counties

By predicting which counties have the highest food insecurity, we can identify the counties that could benefit most from the Restaurant Meal Program.
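
A minimal sketch of this selection step, assuming a fitted model has already produced a predicted food insecurity rate for each county. The data frame `county_preds`, its column names, the county labels, the predicted values, and the cutoff `top_n` are all hypothetical.

```r
# Hypothetical predictions (made-up values, not model output)
county_preds <- data.frame(
  county = c("County A", "County B", "County C", "County D", "County E"),
  predicted_food_insecurity = c(12.4, 18.9, 9.7, 21.3, 15.0),
  stringsAsFactors = FALSE
)

top_n <- 2  # hypothetical number of counties to target

# Rank counties from highest to lowest predicted food insecurity
ranked <- county_preds[order(-county_preds$predicted_food_insecurity), ]

# Keep the top candidates for intervention
intervention_counties <- head(ranked, top_n)
print(intervention_counties)
```

In practice the cutoff could instead be a threshold on the predicted rate; the fixed count is used here only to keep the example short.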

Conclusion

This research developed a generalized least squares model to explain and predict food insecurity in the US using socioeconomic, demographic, and food access indicators. By modeling heteroskedasticity through a population-based variance function, the GLS model improves upon the weighted regression model. These indicators make it possible to identify which counties are structurally vulnerable and where policy interventions should be prioritized. Overall, the model explains why a county is at high risk, but it is not yet a highly accurate prediction engine. A next step could be a logistic regression to better predict high-risk counties while maintaining interpretability.

References

Carlson, Steven J, Margaret S Andrews, and Gary W Bickel. 1999. “Measuring Food Insecurity and Hunger in the United States: Development of a National Benchmark Measure and Prevalence Estimates.” The Journal of Nutrition 129 (2): 510S–516S.
Feeding America. 2025. “What Is Food Insecurity?” 2025. https://www.feedingamerica.org/hunger-in-america/food-insecurity.
Hake, Monica, Emily Engelhard, and Adam Dewey. 2023. “Map the Meal Gap 2023: A Report on County and Congressional District Food Insecurity and County Food Cost in the United States in 2021.” Technical Report. Feeding America. https://www.feedingamerica.org/sites/default/files/2023-05/Map%20the%20Meal%20Gap%202023.pdf.
Irani, Zahir, Amir M Sharif, Habin Lee, Emel Aktas, Zeynep Topaloğlu, Tamara van’t Wout, and Samsul Huda. 2018. “Managing Food Security Through Food Waste and Loss: Small Data to Big Data.” Computers & Operations Research 98: 367–83.
Lentz, Erin C, Hope Michelson, Katherine Baylis, and Yang Zhou. 2019. “A Data-Driven Approach Improves Food Insecurity Crisis Prediction.” World Development 122: 399–409.
Mu, Dongyan, Naomi Horowitz, Maeve Casey, and Kimmera Jones. 2017. “Environmental and Economic Analysis of an in-Vessel Food Waste Composting System at Kean University in the US.” Waste Management 59: 476–86.
ReFED, Inc. 2025. “The Problem: Food Waste in the U.S.” ReFED. 2025. https://refed.org/food-waste/the-problem/?sort=economic-value-per-ton.
Soloha, Raimonda, and Elina Dace. 2025. “Research on Quantification of Food Loss and Waste in Europe: A Systematic Literature Review and Synthesis of Methodological Limitations.” Resources, Conservation & Recycling Advances 28: 200287. https://doi.org/10.1016/j.rcradv.2025.200287.
U.S. Department of Agriculture. 2021. “Thrifty Food Plan, 2021.” Food and Nutrition Service, USDA. https://www.fns.usda.gov/cnpp/thrifty-food-plan-2021.
U.S. Department of Agriculture, Economic Research Service. 2025. “Food Environment Atlas: Data Access and Documentation Downloads.” https://www.ers.usda.gov/data-products/food-environment-atlas/data-access-and-documentation-downloads/.
U.S. Environmental Protection Agency. 2025. “Excess Food Opportunities Map Data Download 3.1.” https://epa.maps.arcgis.com/home/item.html?id=62103e6a6f004217b9ee49fdfd1c2615.
U.S. Food and Drug Administration (FDA), U.S. Department of Agriculture (USDA), and U.S. Environmental Protection Agency (EPA). 2023. “FDA, USDA and EPA Propose National Strategy to Reduce U.S. Food Loss and Waste.” December 2023. https://www.fda.gov/news-events/press-announcements/fda-usda-and-epa-propose-national-strategy-reduce-us-food-loss-and-waste.
Xue, Li, Gang Liu, Julian Parfitt, Xuejun Liu, Erica Van Herpen, Åsa Stenmarck, Ciaran O’Connor, Karin Östergren, and Shengkui Cheng. 2017. “Missing Food, Missing Data? A Critical Review of Global Food Losses and Food Waste Data.” Environmental Science & Technology 51 (12): 6618–33. https://doi.org/10.1021/acs.est.7b00401.
Yang, Yi, Chen Zhao, and Hang Lu. 2024. “Leveraging Machine Learning for Food Waste Reduction: An Analysis of Predictive Models.” Applied and Computational Engineering 112: 154–60. https://direct.ewa.pub/proceedings/ace/article/view/18115.