Analyzing Social Vulnerability and Mental Health Outcomes at the County Level

Shota Hasui

2023-06-10

The idea for this project was to understand if there was an association, at the county level, between social vulnerability and mental health outcomes. To assess this, I used two publicly available datasets. The CDC’s Social Vulnerability Index (SVI) and the University of Wisconsin’s County Health Rankings . SVI is an index that quantifies social vulnerability which “refers to the potential negative effects on communities caused by external stresses on human health.” The County Health Rankings “measures the health of nearly all counties in the nation and ranks them within states. The Rankings are compiled using county-level measures from a variety of national and state data sources. These measures are standardized and combined using scientifically-informed weights.”

The first step necessary was to join the datasets on the FIPS code. The FIPS code is a unique identifier that every county in the U.S. has.

library(tidyverse)
library(janitor)
library(MatchIt)
library(WeightIt)
library(cobalt)
library(sandwich)
library(lmtest)
library(tipr)
library(biscale)
library(sf)
library(cowplot)
library(tigris)
svi = read_csv("SVI2020_US.csv") %>% clean_names()
county = read_csv("analytic_data2020_0.csv") %>% clean_names()
county = county %>% select(x5_digit_fips_code, poor_mental_health_days_raw_value)
county = county[-1, ]
county$fips = county$x5_digit_fips_code
county = county %>% select(fips, poor_mental_health_days_raw_value)
svi$fips <- substr(svi$fips, start=1, stop=5)
df <- full_join(svi, county, by = "fips")

Next, I wanted to visualize how these two variables are distributed in the U.S. For this, I decided to use a bivalent choropleth map. This type of map allows you to visually assess areas where only one variable is high, where both variables are low, and every such combination.

The SVI variable is called svi and the County Health mental health outcomes variable is called mh.

df$svi = df$rpl_themes
df <- na.omit(df, "svi")
df <- df %>% filter(svi != -999)
quantile(as.numeric(df$poor_mental_health_days_raw_value), na.rm = TRUE)
##       0%      25%      50%      75%     100% 
## 2.532818 3.635372 4.017173 4.382190 6.313753
quantile(as.numeric(df$svi), na.rm = TRUE)
##   0%  25%  50%  75% 100% 
## 0.00 0.25 0.50 0.75 1.00
df$mh <- as.numeric(df$poor_mental_health_days_raw_value)
svi_mh <- bi_class(df, x = svi, y = mh, style = "quantile", dim = 3)
svi_mh <- svi_mh %>% select(fips, svi, mh, bi_class)
us_counties <- counties(cb = TRUE, resolution = "20m")
## 
  |                                                                            
  |                                                                      |   0%
  |                                                                            
  |=                                                                     |   2%
  |                                                                            
  |===                                                                   |   4%
  |                                                                            
  |====                                                                  |   6%
  |                                                                            
  |=====                                                                 |   7%
  |                                                                            
  |======                                                                |   9%
  |                                                                            
  |=======                                                               |  11%
  |                                                                            
  |========                                                              |  12%
  |                                                                            
  |=========                                                             |  13%
  |                                                                            
  |==========                                                            |  14%
  |                                                                            
  |===========                                                           |  16%
  |                                                                            
  |=============                                                         |  18%
  |                                                                            
  |=============                                                         |  19%
  |                                                                            
  |===============                                                       |  21%
  |                                                                            
  |===============                                                       |  22%
  |                                                                            
  |================                                                      |  23%
  |                                                                            
  |==================                                                    |  25%
  |                                                                            
  |==================                                                    |  26%
  |                                                                            
  |====================                                                  |  29%
  |                                                                            
  |======================                                                |  31%
  |                                                                            
  |=======================                                               |  33%
  |                                                                            
  |=======================                                               |  34%
  |                                                                            
  |=========================                                             |  35%
  |                                                                            
  |=========================                                             |  36%
  |                                                                            
  |===========================                                           |  38%
  |                                                                            
  |============================                                          |  40%
  |                                                                            
  |=============================                                         |  41%
  |                                                                            
  |==============================                                        |  43%
  |                                                                            
  |==============================                                        |  44%
  |                                                                            
  |================================                                      |  45%
  |                                                                            
  |=================================                                     |  47%
  |                                                                            
  |==================================                                    |  48%
  |                                                                            
  |===================================                                   |  50%
  |                                                                            
  |====================================                                  |  51%
  |                                                                            
  |=====================================                                 |  53%
  |                                                                            
  |======================================                                |  55%
  |                                                                            
  |=======================================                               |  55%
  |                                                                            
  |========================================                              |  57%
  |                                                                            
  |=========================================                             |  58%
  |                                                                            
  |==========================================                            |  60%
  |                                                                            
  |============================================                          |  63%
  |                                                                            
  |=============================================                         |  65%
  |                                                                            
  |==============================================                        |  65%
  |                                                                            
  |===============================================                       |  67%
  |                                                                            
  |=================================================                     |  70%
  |                                                                            
  |===================================================                   |  73%
  |                                                                            
  |====================================================                  |  75%
  |                                                                            
  |======================================================                |  77%
  |                                                                            
  |========================================================              |  80%
  |                                                                            
  |=========================================================             |  82%
  |                                                                            
  |===========================================================           |  85%
  |                                                                            
  |=============================================================         |  87%
  |                                                                            
  |==============================================================        |  89%
  |                                                                            
  |================================================================      |  92%
  |                                                                            
  |==================================================================    |  95%
  |                                                                            
  |====================================================================  |  97%
  |                                                                            
  |======================================================================|  99%
  |                                                                            
  |======================================================================| 100%
us_counties_shifted <- shift_geometry(us_counties) %>% st_as_sf()
us_counties_shifted$fips <- paste0(us_counties_shifted$STATEFP, us_counties_shifted$COUNTYFP)
bipal_map <- us_counties_shifted %>%
  left_join(svi_mh, by = "fips")
svi_mh_map <- ggplot() + 
  geom_sf(data = bipal_map, aes(fill = bi_class), show.legend = FALSE) + 
  bi_scale_fill(pal = "PurpleOr", dim = 3) +
  bi_theme() +
  labs(title = 'SVI and MH in the US',)
legend <- bi_legend(pal = "PurpleOr",
                    dim = 3,
                    xlab = "Higher SVI",
                    ylab = "Worse MH",
                    size = 6)
final_svi_mh <- ggdraw() +
  draw_plot(svi_mh_map, 0, 0, 1, 1) +
  draw_plot(legend, 0.83, .1, 0.2, 0.2)
final_svi_mh

We can see that many counties have both high rates of social vulnerability and mental health adverse outcomes. This seems particularly the case in the south. Now, we need to understand if these associations occur just by happenstance.

The first method I am going to use to assess this is logistic regression.

For this, I made both the svi and mh variables binary, with only the highest 10% of svi being classified as high risk and only the highest 20% mental health outcomes being classified as high risk.

I ran both an unadjusted model and an adjusted one. In the adjusted model we included factors such as unemployment, disability, county level poverty, and percent uninsured.

df$svi <- ifelse(df$svi < 0.89909, 0, 1)
df$mh <- ifelse(df$mh < 4.382615, 0, 1)
model1 = glm(mh ~ svi, data = df, family = binomial(link = "logit"))
summary(model1)
## 
## Call:
## glm(formula = mh ~ svi, family = binomial(link = "logit"), data = df)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -0.7784  -0.7558  -0.7558  -0.7558   1.6688  
## 
## Coefficients:
##              Estimate Std. Error  z value Pr(>|z|)    
## (Intercept) -1.106870   0.008455 -130.913  < 2e-16 ***
## svi          0.068068   0.026214    2.597  0.00941 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 93673  on 83327  degrees of freedom
## Residual deviance: 93666  on 83326  degrees of freedom
## AIC: 93670
## 
## Number of Fisher Scoring iterations: 4
exp(coef(model1))
## (Intercept)         svi 
##   0.3305922   1.0704378
model2 = glm(mh ~ svi + m_nohsdp + e_uninsur + m_age65 + m_disabl + m_minrty + mp_pov150 + ep_unemp, data=df, family = binomial(link = "logit"))
summary(model2)
## 
## Call:
## glm(formula = mh ~ svi + m_nohsdp + e_uninsur + m_age65 + m_disabl + 
##     m_minrty + mp_pov150 + ep_unemp, family = binomial(link = "logit"), 
##     data = df)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -5.4729  -0.7681  -0.6380  -0.0721   6.2327  
## 
## Coefficients:
##               Estimate Std. Error z value Pr(>|z|)    
## (Intercept) -1.678e+00  2.487e-02 -67.458  < 2e-16 ***
## svi         -1.369e-01  3.099e-02  -4.419 9.91e-06 ***
## m_nohsdp     6.090e-04  1.268e-04   4.803 1.57e-06 ***
## e_uninsur    1.641e-04  3.401e-05   4.825 1.40e-06 ***
## m_age65     -8.519e-04  1.022e-04  -8.336  < 2e-16 ***
## m_disabl     3.554e-03  1.092e-04  32.540  < 2e-16 ***
## m_minrty    -1.999e-03  4.132e-05 -48.371  < 2e-16 ***
## mp_pov150    5.557e-02  1.864e-03  29.818  < 2e-16 ***
## ep_unemp     5.136e-02  1.879e-03  27.327  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 93673  on 83327  degrees of freedom
## Residual deviance: 88171  on 83319  degrees of freedom
## AIC: 88189
## 
## Number of Fisher Scoring iterations: 4
exp(coef(model2))
## (Intercept)         svi    m_nohsdp   e_uninsur     m_age65    m_disabl 
##   0.1867810   0.8720196   1.0006092   1.0001641   0.9991485   1.0035606 
##    m_minrty   mp_pov150    ep_unemp 
##   0.9980031   1.0571474   1.0526989

Interestingly, we see a different interpretation of the effect of svi on mental health between the simple and complex model. In the simple, we see that higher svi is associated with a slightly higher odds of adverse mental health outcomes but in the complex we see the opposite. Both are statistically significant.

It seems there is confounding present.

Next, I want to use a causal inference technique called propensity score matching to see what inference we arrive at. The core idea behind propensity score matching is to make the two groups you are comparing (high svi and low svi in this case) as similar as possible on the confounders you are adjusting for. We assess this using a love plot, hoping to see a mean difference between the groups below 0.2.

love.plot(svi ~ m_nohsdp + e_uninsur + m_age65 + m_disabl + m_minrty + mp_pov150 + ep_unemp, data=df, abs = T, thresholds = 0.2)

m.out <- matchit(svi ~ m_nohsdp + e_uninsur + m_age65 + m_disabl + m_minrty + mp_pov150 + ep_unemp, data = df, method = "nearest")
love.plot(svi ~ m_nohsdp + e_uninsur + m_age65 + m_disabl + m_minrty + mp_pov150 + ep_unemp, data=df, weights = list(PSM = m.out), abs = T, thresholds = 0.2)

matched = match.data(m.out, data=df)
model3 = glm(mh ~ svi, data = matched, weights = weights, family = binomial(link="logit"))
confint(model3, vcov. = vcovCL, cluster = ~subclass)
##                  2.5 %      97.5 %
## (Intercept) -1.0003647 -0.90500783
## svi         -0.1544041 -0.01819132
coeftest(model3, vcov. = vcovCL, cluster = ~subclass)
## 
## z test of coefficients:
## 
##              Estimate Std. Error  z value Pr(>|z|)    
## (Intercept) -0.952518   0.024326 -39.1557  < 2e-16 ***
## svi         -0.086284   0.034960  -2.4681  0.01358 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
exp(coeftest(model3, vcov. = vcovCL, cluster = ~subclass))
## 
## z test of coefficients:
## 
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept)  0.38577    1.02462  0.0000    1.000
## svi          0.91733    1.03558  0.0847    1.014

We see in the initial love plot that all but one covariate have mean differences above 0.2 But, after matching, the differences become miniscule – the adjusted blue dots are all close to 0.0. We are comparing like with like.

Now, we can run a simple logistic regression between svi and mh on the matched data. I chose not to adjust further since the balance was already very good based on the love plot.

The odds ratio is 0.92, showing a small reduction of odds in places with high social vulnerability. This is more consistent with our adjusted logistic regression model.

Finally, I wanted to conduct sensitivity analysis using the tipr package. The e-value tells us how strong the association between an unobserved confounder would have to have with our parameters.

exp(e_value(effect_observed = -0.086284))
## [1] 1.631755

Given the e-value here (1.63), this unmeasured confounder would only need to have an odds ratio of 1.6 with the outcome and exposure for these effects to become null. Thus, I would argue that these results are not terribly robust.

I was glad to find that the sensitivity analysis came to the results it did as the findings here don’t quite mesh with my intuition. Given the limitations of the ecological nature of the study, and the low number of covariates adjusted for, there are likely many confounders that gave rise to the estimates we see.

I still believe places with high social vulnerability likely have higher rates of mental health adverse outcomes but this is not the case in this analysis.

My hypothesis is that further, more robust research likely bears this out.