Understanding how socioeconomic factors shape well-being remains a central challenge in the developmental and social sciences. Recent research has shown that economic disparities play a significant role in health outcomes and developmental trajectories, particularly among children and their families. Notably, Weissman et al. (2023) showed that state-level economic policies and cost of living can moderate the association between family income and child brain development and mental health. Building on this foundation, our study examines how income relates to child and parent well-being across different levels of material hardship and economic assistance, and analyzes both environmental and policy-driven moderators.
The detrimental effects of poverty extend beyond simple measures of income, often manifesting through material hardships that include food insecurity, inadequate housing, and unstable living conditions. These hardships can exert profound psychological and physical effects on families, potentially exacerbating the stressors associated with low income. Moreover, the assistance sought by families, whether in the form of direct financial aid or supportive services, may provide a buffering effect against these hardships. Our analysis employs a dual-model approach to dissect these dynamics further: the first model explores how the cost of living interacts with income and material hardships to affect well-being, while the second model assesses the impact of the type and extent of aid requested by families on their overall well-being.
By integrating these dimensions, this study aims to provide a more comprehensive understanding of the complex interplay between economic factors and family well-being. This approach not only contributes to the theoretical discourse on socioeconomic impacts but also holds practical implications for policy-making and the design of interventions aimed at alleviating poverty’s effects on vulnerable populations. Through this research, we aspire to refine the metrics and models used to assess poverty, advocating for policies that recognize and respond to the nuanced realities faced by low-income families.
This study addresses the following critical questions:
Study 1 aims to explore
Based on the literature and the conceptual framework guiding this study, we propose the following hypotheses: (1) Higher cost of living will exacerbate the negative effects of low income on family well-being, particularly for families experiencing greater material hardships. However, these effects will be mitigated in contexts where material hardships are less severe. (2) Families receiving more comprehensive aid, in terms of both scope and magnitude, will exhibit better well-being outcomes, with this effect being more pronounced among those facing higher levels of material hardship and living in areas with a higher cost of living.
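As a rough illustration of how hypothesis (1) could be expressed as a statistical model, the sketch below fits a logistic regression with an income-by-cost-of-living interaction on simulated data. The variable names (`centered_logINR`, `COLI`, `fs_any`) mirror those used later in this document, but the simulated values, the effect sizes, and the choice of `glm()` are assumptions made for illustration; this is not the study's actual model specification.

```r
set.seed(1)
n <- 2000

# Simulated predictors (assumed distributions, not the RAPID data)
centered_logINR <- rnorm(n)
COLI <- rnorm(n, mean = 1, sd = 0.15)

# Assumed data-generating process: hardship is more likely at low income,
# and the income effect is stronger where cost of living is high
eta <- -0.5 - 1.0 * centered_logINR + 1.0 * (COLI - 1) -
  1.5 * centered_logINR * (COLI - 1)
fs_any <- rbinom(n, size = 1, prob = plogis(eta))
sim <- data.frame(fs_any, centered_logINR, COLI)

# The product term tests whether cost of living moderates the income effect
fit <- glm(fs_any ~ centered_logINR * COLI, family = binomial, data = sim)
summary(fit)$coefficients
```

The coefficient labelled `centered_logINR:COLI` is the moderation term; hypothesis (2) could be sketched in the same way by replacing `COLI` with an aid indicator.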
Data for this study were obtained from the Rapid Assessment of Pandemic Impact on Development–Early Childhood (RAPID-EC) project. This national project conducts weekly or biweekly surveys to explore how the pandemic affects households with children aged 0 to 5 years. The University of Oregon’s and Stanford University’s institutional review boards approved all study procedures. Participants were recruited via community organization email lists, Facebook advertisements, and panel services. Initially, families completed an online survey to verify eligibility. Those who qualified provided consent online and filled out a baseline survey covering demographics, employment and financial challenges, health and well-being, and access to childcare. After completing this initial survey, families joined a participant pool and received email invitations for follow-up surveys. These follow-up surveys revisited core baseline topics and introduced new special topics on a weekly or biweekly schedule. Each sampling point for the follow-up surveys was demographically representative of the U.S. population concerning race, income, and geographic location. Families were compensated $5 for each survey they completed.
########################## Clean Data - County Data ###########################
# Step 1: Clean the master_dem zip code column
master_dem <- master_dem %>%
mutate(zipcode1 = as.character(zipcode)) %>% # work on a copy so changes can be tracked
# Keep 5-digit zips, strip the +4 suffix from ZIP+4 codes, set anything else to NA
mutate(zipcode1 = case_when(
grepl("^\\d{5}$", zipcode1) ~ zipcode1,
grepl("^\\d{5}-\\d{4}$", zipcode1) ~ sub("-.*", "", zipcode1),
TRUE ~ NA_character_)) %>%
# QA flag: TRUE when a non-missing original zip code was invalid and set to NA
mutate(zipcode_modified = is.na(zipcode1) & !is.na(zipcode))
# Step 2: Merge zip codes and fips codes
# The main issue with this step is that about 20% of the data have a single zip
# code that maps to multiple fips codes, which complicates merging the RAPID dataset
# later on. So first, keep only the zip codes that appear in the data
# Step 2.1: Extract all zip codes in rapid_data into a vector
zip_codes_in_rapid <- unique(master_dem$zipcode1) # use zipcode1 because it is transformed
# Step 2.2: Subset zip_data to only include data with zip codes in RAPID
zip_data <- zip_data %>%
filter(zipcode %in% zip_codes_in_rapid)
# Step 2.3: figure out zip codes for each county fips code
zipcodes_by_county <- zip_data %>%
group_by(county_fips, County, State) %>%
summarise(zipcodes = paste(unique(zipcode), collapse = ", "), # column with zips
n_zipcodes = n_distinct(zipcode)) %>% # this column tells us how many zips per county
ungroup()
# Step 2.4: Add zipcodes column from zipcodes_by_county to county_data
county_data <- county_data %>%
left_join(zipcodes_by_county, by = c("county_fips", "State", "County")) %>%
drop_na(zipcodes) # drop counties with no zip codes represented in the RAPID data (confirm!)
# Step 3: Merge county_data with master_dem data
# Merge carefully: most rows match on a unique combination of zip code, nchild, and
# nfamily, which is fine. However, a zip code that spans several counties produces
# multiple matches in the county data. For those, we take the AVERAGE grouped by
# zip code, nchild, and nfamily.
# Step 3.1: Figure out which zip codes have multiple county fips codes
zip_multi_fips <- zip_data %>%
group_by(zipcode) %>%
summarise(county_fips_codes = paste(unique(county_fips), collapse = ", "),
n_fips = n_distinct(county_fips)) %>%
filter(n_fips > 1) %>%
ungroup()
# Step 3.2: Create a list of all zip codes with multiple fips
zip_codes_with_issues <- zip_multi_fips$zipcode
# Step 3.3: Create a column that signals which zip codes need to be modified
# this was necessary to help ensure we are modifying the right ones correctly for QA
master_dem <- master_dem %>%
mutate(zip_issues = zipcode1 %in% zip_codes_with_issues)
county_data <- county_data %>%
separate_rows(zipcodes, sep = ",\\s*") %>%
mutate(zip_issues = zipcodes %in% zip_codes_with_issues)
# Step 3.4: Ensure single unique combination by zip code, nchild, and nfamily
county_data_final <- county_data %>%
group_by(zipcodes, nchild, nfamily) %>%
summarise(Housing_Monthly = mean(Housing_Monthly, na.rm = TRUE),
Food_Monthly = mean(Food_Monthly, na.rm = TRUE),
Transportation_Monthly = mean(Transportation_Monthly, na.rm = TRUE),
Healthcare_Monthly = mean(Healthcare_Monthly, na.rm = TRUE),
OtherNecessities_Monthly = mean(OtherNecessities_Monthly, na.rm = TRUE),
Childcare_Monthly = mean(Childcare_Monthly, na.rm = TRUE),
Taxes_Monthly = mean(Taxes_Monthly, na.rm = TRUE),
Total_Monthly = mean(Total_Monthly, na.rm = TRUE),
Housing_Yearly = mean(Housing_Yearly, na.rm = TRUE),
Food_Yearly = mean(Food_Yearly, na.rm = TRUE),
Transportation_Yearly = mean(Transportation_Yearly, na.rm = TRUE),
Healthcare_Yearly = mean(Healthcare_Yearly, na.rm = TRUE),
OtherNecessities._Yearly = mean(OtherNecessities._Yearly, na.rm = TRUE),
Childcare_Yearly = mean(Childcare_Yearly, na.rm = TRUE),
Taxes_Yearly = mean(Taxes_Yearly, na.rm = TRUE),
Total_Yearly = mean(Total_Yearly, na.rm = TRUE),
median_family_income = mean(median_family_income, na.rm = TRUE)
) %>%
rename(zipcode = zipcodes)
# Step 3.5: Add county data to master_dem
# Step 3.5.1: Count the rows in master_dem for each nchild/nfamily combination
combo_count <- master_dem %>%
group_by(nchild, nfamily) %>%
summarise(count = n(), .groups = "drop") # ungroup after summarising
# Step 3.5.2: Extract the distinct combinations available in county_data_final
distinct_county_combinations <- county_data_final %>%
ungroup() %>%
dplyr::select(nchild, nfamily) %>%
distinct()
# Step 3.5.3: Identify the combinations missing from county_data_final
missing_combinations <- combo_count %>%
anti_join(distinct_county_combinations, by = c("nchild", "nfamily"))
# Step 3.5.4: Flag rows in master_dem whose combination is missing, then merge
master_dem <- master_dem %>%
left_join(missing_combinations, by = c("nchild", "nfamily")) %>%
mutate(missing_combination = !is.na(count)) %>%
dplyr::select(-count) %>%
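# join on the cleaned zip code (zipcode1); the raw zipcode column can still contain ZIP+4 or invalid values
left_join(county_data_final, by = c("zipcode1" = "zipcode", "nchild", "nfamily"))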
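The averaging in Step 3.4 can be seen on a toy example. The sketch below uses made-up zip codes, FIPS codes, and costs (all hypothetical) in which one zip code spans two counties. It shows that grouping by zip code, `nchild`, and `nfamily` collapses the duplicate rows into a single averaged row, so a later `left_join()` cannot duplicate respondents.

```r
library(dplyr)

# Hypothetical county-level costs, already expanded to one row per zip code
county_toy <- data.frame(
  zipcodes      = c("11111", "22222", "22222"),
  county_fips   = c("01001", "01001", "01003"), # zip 22222 spans two counties
  nchild        = c(1, 1, 1),
  nfamily       = c(3, 3, 3),
  Total_Monthly = c(5000, 5000, 6000)
)

# One row per zip/nchild/nfamily; costs are averaged across counties
county_avg <- county_toy %>%
  group_by(zipcodes, nchild, nfamily) %>%
  summarise(Total_Monthly = mean(Total_Monthly, na.rm = TRUE),
            .groups = "drop") %>%
  rename(zipcode = zipcodes)

# Hypothetical respondents, one row each
respondents <- data.frame(
  CaregiverID = c("a", "b"),
  zipcode     = c("11111", "22222"),
  nchild      = c(1, 1),
  nfamily     = c(3, 3)
)

merged <- respondents %>%
  left_join(county_avg, by = c("zipcode", "nchild", "nfamily"))
merged
```

Respondent `b` receives the mean of the two county values rather than two rows. Whether a simple mean is the right way to combine counties (as opposed to, say, a population-weighted mean) is a design choice that this toy example does not settle.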
## aid_inv$ReceivedAid n percent valid_percent
## 0 8924 0.5192901 0.7019034
## 1 3790 0.2205412 0.2980966
## NA 4471 0.2601688 NA
## aid_inv$ReceivedAid_basic1 n percent valid_percent
## 0 9207 0.5357579 0.7241623
## 1 3507 0.2040733 0.2758377
## NA 4471 0.2601688 NA
## aid_inv$ReceivedAid_basic2 n percent valid_percent
## 0 9059 0.5271458 0.7125216
## 1 3655 0.2126855 0.2874784
## NA 4471 0.2601688 NA
## aid_inv$ReceivedAid_other n percent valid_percent
## 0 12151 0.70707012 0.95571811
## 1 563 0.03276113 0.04428189
## NA 4471 0.26016875 NA
## aid_inv$Health_Medical_Services n percent valid_percent
## 0 10511 0.6116381 0.8267264
## 1 2203 0.1281932 0.1732736
## NA 4471 0.2601688 NA
## aid_inv$Food_Benefits n percent valid_percent
## 0 10199 0.5934827 0.8021866
## 1 2515 0.1463486 0.1978134
## NA 4471 0.2601688 NA
## aid_inv$Income_Benefits n percent valid_percent
## 0 12148 0.7068955 0.95548215
## 1 566 0.0329357 0.04451785
## NA 4471 0.2601688 NA
## aid_inv$Disability_Benefits n percent valid_percent
## 0 12361 0.71929008 0.97223533
## 1 353 0.02054117 0.02776467
## NA 4471 0.26016875 NA
## aid_inv$Military_Benefits n percent valid_percent
## 0 12637 0.735350596 0.993943684
## 1 77 0.004480652 0.006056316
## NA 4471 0.260168752 NA
## aid_inv$Housing_Benefits n percent valid_percent
## 0 12338 0.71795170 0.9704263
## 1 376 0.02187955 0.0295737
## NA 4471 0.26016875 NA
## aid_inv$Childcare_Subsidy n percent valid_percent
## 0 12287 0.71498400 0.96641498
## 1 427 0.02484725 0.03358502
## NA 4471 0.26016875 NA
## aid_inv$Transportation_Benefits n percent valid_percent
## 0 12630 0.734943264 0.99339311
## 1 84 0.004887984 0.00660689
## NA 4471 0.260168752 NA
## aid_inv$Training_Benefits n percent valid_percent
## 0 12630 0.734943264 0.99339311
## 1 84 0.004887984 0.00660689
## NA 4471 0.260168752 NA
## aid_inv$Clothing_Benefits n percent valid_percent
## 0 12670 0.737270876 0.996539248
## 1 44 0.002560372 0.003460752
## NA 4471 0.260168752 NA
## aid_inv$Unemployment_Benefits n percent valid_percent
## 0 12259 0.71335467 0.96421268
## 1 455 0.02647658 0.03578732
## NA 4471 0.26016875 NA
## aid_inv$Other_Benefits n percent valid_percent
## 0 12322 0.71702066 0.96916785
## 1 392 0.02281059 0.03083215
## NA 4471 0.26016875 NA
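The `percent` and `valid_percent` columns in the tables above differ only in their denominators: `percent` divides by all respondents, while `valid_percent` divides by respondents with a non-missing answer. As a check, the sketch below recomputes both for `ReceivedAid` from the counts printed above using base R; the object names are illustrative.

```r
# Counts copied from the ReceivedAid table above
n <- c("0" = 8924, "1" = 3790, "NA" = 4471)

total <- sum(n)                # all respondents, including missing
valid <- sum(n[c("0", "1")])   # respondents with a non-missing answer

percent       <- n / total                  # share of all respondents
valid_percent <- n[c("0", "1")] / valid     # share of non-missing respondents

round(percent, 7)
round(valid_percent, 7)
```

Because every table shares the same 4471 missing responses, the missing share is identical across all aid variables.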
# Code the FSTR module to derive a few variables for follow-up analyses:
## fs_any: binary variable indicating whether at least one type of hardship was reported
## fs_num: numeric count of how many types of hardship were reported in total
## fs_hardship: taken directly from the question "how hard it is to pay for basic needs?" with a 4-point Likert scale response.
# Here we take the most recent response (when multiple are available)
# to be consistent with taking the most recent income levels.
fs <- rapid_data %>%
# select necessary variables in the FSTR module using contains()
dplyr::select (CaregiverID, StartDate, SurveyType, Week,
FSTR.001, contains ("FSTR.002"),
contains("JOB.015.a.2"), JOB.008.2,
STRESS.002) %>%
mutate (fs_hardship = FSTR.001,
fs_food = FSTR.002_1,
fs_housing = FSTR.002_2,
fs_utility = FSTR.002_3,
fs_healthcare = FSTR.002_4,
fs_childcare = FSTR.002_7,
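# fs_wellbeing is 1 if any of items 5, 6, or 10 is endorsed, and NA otherwise
fs_wellbeing = case_when(FSTR.002_5 == 1 | FSTR.002_6 == 1 | FSTR.002_10 == 1 ~ 1),
# provisional flag: NAs propagate through `|` and are resolved further down
fs_any = ifelse(fs_food == 1 |
fs_housing == 1 |
fs_utility == 1 |
fs_healthcare == 1 |
fs_childcare == 1 |
fs_wellbeing == 1, 1, 0))
fs_items <- c("fs_food", "fs_housing", "fs_utility",
"fs_healthcare", "fs_childcare", "fs_wellbeing")
# count the hardship types reported, treating NA as "not reported"
fs$fs_num <- rowSums(fs[, fs_items], na.rm = TRUE)
# final fs_any: 1 if any hardship was reported, 0 only if fs_hardship is 0, NA otherwise
fs <- fs %>%
mutate(fs_any = case_when(fs_any == 1 ~ 1,
fs_hardship == 0 ~ 0,
TRUE ~ NA_real_))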
fs_rct <- fs %>%
group_by (CaregiverID)%>%
# filter (is.na(fs_any) == F)%>% (why was this here?)
filter (Week == max (Week))%>%
ungroup()%>%
dplyr::select(CaregiverID, Week,
fs_any, fs_num, fs_hardship, fs_food, fs_housing,
fs_utility, fs_healthcare, fs_childcare, fs_wellbeing,
STRESS.002, contains("JOB.015.a.2"), JOB.008.2)%>%
mutate(across(all_of(fs_items), ~ifelse(is.na(.x), 0, .x)))
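The two-step construction of `fs_any` above matters because of how R's `|` operator treats missing values: `NA | TRUE` is `TRUE`, but `NA | FALSE` is `NA`. The sketch below, on made-up responses with only two hypothetical items, mirrors that logic in base R and shows why a respondent with no endorsed items is coded 0 only when `fs_hardship` is 0.

```r
# Hypothetical responses: 1 = endorsed, 0 = not endorsed, NA = missing
toy <- data.frame(
  fs_food      = c(1, 0, 0, NA),
  fs_wellbeing = c(NA, NA, NA, NA),  # coded 1 or NA only, as in the code above
  fs_hardship  = c(2, 0, 3, 0)
)

# Step 1: provisional flag; the result is NA unless some item equals 1
any_raw <- ifelse(toy$fs_food == 1 | toy$fs_wellbeing == 1, 1, 0)

# Step 2: resolve the NAs with fs_hardship, mirroring the case_when() above
toy$fs_any <- ifelse(!is.na(any_raw) & any_raw == 1, 1,
                     ifelse(!is.na(toy$fs_hardship) & toy$fs_hardship == 0, 0, NA))
toy
```

In this toy data the third respondent stays `NA`: no item is endorsed, but `fs_hardship` is not 0, so neither rule applies.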
### Final Data Prep
final_data_prep <- fs_rct %>%
# Add demographic data to final_data by CaregiverID
merge(master_dem_final,
by = "CaregiverID",
all.x = TRUE) %>%
# drop extreme INRs (|log_INR| > 4; N = 125); filter() also drops missing log_INR (9%)
dplyr::filter(abs(log_INR) <= 4) %>%
# grand-mean center log INR (no NAs remain after the filter above)
mutate(centered_logINR = log_INR - mean(log_INR),
# factor income group
income_group = factor(income_group,
levels = c("Below FPL",
"100-200% Above FPL",
"200-400% Above FPL",
"400%+ Above FPL")),
aid_group = factor(ReceivedAid_basic1,
levels = c("0", "1")),
Health_Medical_Services = factor(Health_Medical_Services,
levels = c("0", "1")),
Food_Benefits = factor(Food_Benefits,
levels = c("0", "1"))
)
final_data <- final_data_prep %>%
# dplyr:: prefix for consistency with the other select() calls
dplyr::select(CaregiverID,
# select potential dependent variables
fs_num, fs_any, fs_food, fs_healthcare,
# select independent variables for model and plots
log_INR, centered_logINR, income_group, aid_group,
COLI, COLI_Merge, COLI_Merge_food, COLI_Merge_health,
# select COLI median variables
contains("median"),
Health_Medical_Services, Food_Benefits,
# select potential demo + control variables
race_ethnic, Page_impute, Pgender, Pedu, region
) %>%
# condense Pedu variable
mutate(Pedu_condensed = factor(
case_when(
Pedu %in% c("Less than high school", "Some high school", "High school diploma/GED") ~ "High School/GED or Below",
Pedu %in% c("Some college", "Associate degree", "other") ~ "Some College/Associate Degree",
Pedu == "Bachelor's degree" ~ "Bachelor's Degree",
Pedu %in% c("Master's degree", "Doctorate/Professional") ~ "Postgraduate Degrees",
TRUE ~ "Other"
),
levels = c("High School/GED or Below", "Some College/Associate Degree", "Bachelor's Degree", "Postgraduate Degrees", "Other"),
ordered = TRUE
)) %>%
mutate(aid_group = as.factor(aid_group),
Page_impute = as.numeric(Page_impute),
race_ethnic = factor(race_ethnic,
levels = c("White", "Black", "Latinx", "Other minorities"),
ordered = FALSE))
# Summary of variables in analysis
sum(is.na(final_data$fs_num))
## [1] 0
sum(is.na(final_data$COLI)) # 3368 (20%)
## [1] 3368
sum(is.na(final_data$log_INR))
## [1] 0
sum(is.na(final_data$aid_group)) # 3873 (25%)
## [1] 3873
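The per-variable checks above can also be run for all analysis variables at once. A minimal sketch on a made-up data frame (the values are hypothetical) reports the count and proportion of missing values per column using base R:

```r
# Hypothetical stand-in for final_data
toy <- data.frame(
  fs_num    = c(0, 2, 1, 3),
  COLI      = c(1.1, NA, 0.9, NA),
  aid_group = factor(c("0", "1", NA, "0"))
)

# is.na() on a data frame returns a logical matrix, so column sums and
# means give the number and share of missing values per variable
missing_summary <- data.frame(
  n_missing    = colSums(is.na(toy)),
  prop_missing = colMeans(is.na(toy))
)
missing_summary
```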
################################ IMPUTATION ###################################
# impute the missing data
imputed_data <- final_data %>%
# select only the necessary variables
dplyr::select(fs_num, COLI, log_INR, aid_group,
Page_impute, Pedu_condensed, race_ethnic)
# impute with the missForest package (following Mateus's suggestions / code);
# $ximp holds the imputed data frame
imputed_data <- missForest(imputed_data)$ximp
imputed_data_final <- imputed_data %>%
# Add groups for new graph with imputed data
mutate(coli_group = case_when(COLI >= 1 ~ "High CoL",
COLI < 1 ~ "Low CoL"),
# log_INR thresholds that keep the income groups the same:
# 0 = log(1), 0.693 ~ log(2), 1.386 ~ log(4)
income_group = factor(
case_when(
log_INR < 0 ~ "Below FPL",
log_INR >= 0 & log_INR < 0.693 ~ "100-200% Above FPL",
log_INR >= 0.693 & log_INR < 1.386 ~ "200-400% Above FPL",
log_INR >= 1.386 ~ "400%+ Above FPL"
),
levels = c("Below FPL",
"100-200% Above FPL",
"200-400% Above FPL",
"400%+ Above FPL")),
fs_any = ifelse(fs_num > 0, 1, 0)
)
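The cut points 0, 0.693, and 1.386 used above correspond to income-to-needs ratios of 1, 2, and 4 on the natural-log scale, assuming `log_INR` is the natural log of income relative to the federal poverty level (which the group labels suggest). The sketch below checks this and shows a closely matching grouping with `cut()`; the example ratios are made up.

```r
# Natural-log thresholds for 100%, 200%, and 400% of the poverty line
thresholds <- log(c(1, 2, 4))
round(thresholds, 3)

# Hypothetical income-to-needs ratios and their logs
inr <- c(0.5, 1.5, 3, 6)
log_INR <- log(inr)

# right = FALSE gives intervals [a, b), matching the >= / < rules above
income_group <- cut(log_INR,
                    breaks = c(-Inf, thresholds, Inf),
                    labels = c("Below FPL", "100-200% Above FPL",
                               "200-400% Above FPL", "400%+ Above FPL"),
                    right = FALSE)
income_group
```

Using `log(2)` and `log(4)` directly avoids the small rounding in 0.693 and 1.386, which could only affect values that fall almost exactly on a boundary.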
The following graphs show the proportion of families experiencing material hardship by each of the following groupings:
Next, we present two-way graphs showing different combinations of the following:
We then show a three-way graph of the primary variables.
Given the large percentage of data missing for CoL and aid, we also provide graphs based on the imputed data below.