The BRFSS survey tracks individuals who commit behaviors which place them at higher risk of HIV: unprotected sex, drug use and so forth. My theory is that individuals with poor mental health will be more likely to commit high-risk activities which could lead to HIV infection. The outcome is binary: “1” if someone has committed a behavior which is high-risk for HIV in the past year, “0” if they have not, “NA” for missing data.
For this assignment, I’ll be using
#bad mental health days
brfss16m<-brfss16m %>% mutate(bad_mental_days=case_when(.$menthlth %in% c(1:30)~.$menthlth,
.$menthlth==88~0,
.$menthlth %in% c(77,99)~NA_real_))
# HIV risk
brfss16m<-brfss16m %>% mutate(hiv_risk=case_when(.$hivrisk4==1~1,
.$hivrisk4==2~0,
.$hivrisk4 %in% c(7,9)~NA_real_))
Does exposure to risky behavior increase across MMSA’s?
anova(glm(hiv_risk ~ mmsa, brfss16m, family=binomial), test="Chisq")
## Analysis of Deviance Table
##
## Model: binomial, link: logit
##
## Response: hiv_risk
##
## Terms added sequentially (first to last)
##
##
## Df Deviance Resid. Df Resid. Dev Pr(>Chi)
## NULL 228761 77812
## mmsa 1 35.454 228760 77776 2.611e-09 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
It does indeed appear that propensity for HIV-spreading risky behaviors does change across geographic areas.
Here, I am using code from Dr. Spark’s example to construct higher-level predictors for the BRFSS from ACS data:
#higher-level predictors
library(acs)
## Loading required package: stringr
## Loading required package: XML
##
## Attaching package: 'acs'
## The following object is masked from 'package:dplyr':
##
## combine
## The following object is masked from 'package:base':
##
## apply
mykey<-api.key.install(key="")
#Get 2010 ACS median household incomes for tracts in Texas
msaacs<-geo.make(msa="*")
#ACS tables B17 is poverty, B19 is Gini, B25001 is housing vacancy, B25035 is median year built
acsecon<-acs.fetch(key=mykey, endyear=2010, span=5, geography=msaacs, variable = c("B19083_001","B17001_001","B17001_002" , "B25002_001","B25002_003", "B25035_001"))
msaecon<-data.frame(gini=acsecon@estimate[, "B19083_001"],
ppoverty=acsecon@estimate[, "B17001_002"]/acsecon@estimate[, "B17001_001"],
giniz=scale(acsecon@estimate[, "B19083_001"]),
pvacant=acsecon@estimate[,"B25002_003"]/acsecon@estimate[, "B25002_001"],
ppovertyz=scale(acsecon@estimate[, "B17001_002"]/acsecon@estimate[, "B17001_001"]),
pvacantz=scale(acsecon@estimate[,"B25002_003"]/acsecon@estimate[, "B25002_001"]),
medhouse=acsecon@estimate[, "B25035_001" ],
medhousez=scale(acsecon@estimate[, "B25035_001" ]))
msaecon$ids<-paste(acsecon@geography$metropolitanstatisticalareamicropolitanstatisticalarea)
head(msaecon)
## gini ppoverty giniz
## Adjuntas, PR Micro Area 0.525 0.5925656 2.718785
## Aguadilla-Isabela-San Sebastián, PR Metro Area 0.529 0.5577311 2.847562
## Coamo, PR Micro Area 0.532 0.5633984 2.944145
## Fajardo, PR Metro Area 0.483 0.4309070 1.366623
## Guayama, PR Metro Area 0.518 0.4980518 2.493425
## Jayuya, PR Micro Area 0.502 0.5446137 1.978315
## pvacant ppovertyz
## Adjuntas, PR Micro Area 0.1996029 6.271934
## Aguadilla-Isabela-San Sebastián, PR Metro Area 0.2005956 5.762039
## Coamo, PR Micro Area 0.1500353 5.844994
## Fajardo, PR Metro Area 0.3143172 3.905632
## Guayama, PR Metro Area 0.1951184 4.888474
## Jayuya, PR Micro Area 0.1562109 5.570031
## pvacantz medhouse
## Adjuntas, PR Micro Area 0.9053869 1978
## Aguadilla-Isabela-San Sebastián, PR Metro Area 0.9199516 1979
## Coamo, PR Micro Area 0.1781186 1979
## Fajardo, PR Metro Area 2.5885036 1977
## Guayama, PR Metro Area 0.8395882 1978
## Jayuya, PR Micro Area 0.2687281 1978
## medhousez ids
## Adjuntas, PR Micro Area 0.4923898 10260
## Aguadilla-Isabela-San Sebastián, PR Metro Area 0.5982027 10380
## Coamo, PR Micro Area 0.5982027 17620
## Fajardo, PR Metro Area 0.3865770 21940
## Guayama, PR Metro Area 0.4923898 25020
## Jayuya, PR Micro Area 0.4923898 27580
Now that we have higher-level predictors, we can construct GLM models with the data:
#joining the ACS and BRFSS data:
joindata<-merge(brfss16m, msaecon, by.x="mmsa",by.y="ids", all.x=T)
# bad mental health days scaled
joindata$bad_mental_daysz<-as.numeric(scale(joindata$bad_mental_days, center=T, scale=T))
Here I am creating two GLM models, one without higher-level predictors (fit.mix.bin) and one with two higher-level predictors (ppovertyz and giniz – fit.mix.bin2):
joindata$mmsa_fac<-as.factor(joindata$mmsa)
join_data<-joindata %>% filter(complete.cases(ppovertyz))
fit.mix.bin<-glmer(hiv_risk~male+bad_mental_daysz+educ+race_eth+(1|mmsa_fac), family=binomial, join_data,
control=glmerControl(optimizer="bobyqa",optCtrl=list(maxfun=2e5)))
fit.mix.bin2<-glmer(hiv_risk~male+bad_mental_daysz+educ+race_eth+ppovertyz+(1|mmsa_fac), family=binomial, join_data,
control=glmerControl(optimizer="bobyqa",optCtrl=list(maxfun=2e5)))
summary(fit.mix.bin)
## Generalized linear mixed model fit by maximum likelihood (Laplace
## Approximation) [glmerMod]
## Family: binomial ( logit )
## Formula:
## hiv_risk ~ male + bad_mental_daysz + educ + race_eth + (1 | mmsa_fac)
## Data: join_data
## Control:
## glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
##
## AIC BIC logLik deviance df.resid
## 54310.3 54431.2 -27143.1 54286.3 176103
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -0.7123 -0.2108 -0.1769 -0.1381 10.3922
##
## Random effects:
## Groups Name Variance Std.Dev.
## mmsa_fac (Intercept) 0.03297 0.1816
## Number of obs: 176115, groups: mmsa_fac, 123
##
## Fixed effects:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -3.838458 0.035518 -108.07 < 2e-16 ***
## male 0.764232 0.025680 29.76 < 2e-16 ***
## bad_mental_daysz 0.340030 0.009069 37.50 < 2e-16 ***
## educ0Prim -0.499613 0.094712 -5.28 1.33e-07 ***
## educ1somehs 0.076023 0.056535 1.34 0.17872
## educ3somecol 0.103274 0.032579 3.17 0.00152 **
## educ4colgrad -0.204533 0.033049 -6.19 6.06e-10 ***
## race_ethhispanic 0.685038 0.042995 15.93 < 2e-16 ***
## race_ethnh black 0.690156 0.039465 17.49 < 2e-16 ***
## race_ethnh multirace 0.721393 0.074520 9.68 < 2e-16 ***
## race_ethnh other 0.275873 0.068583 4.02 5.76e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr) male bd_mn_ edc0Pr edc1sm edc3sm edc4cl rc_thh rc_thb
## male -0.439
## bd_mntl_dys -0.176 0.098
## educ0Prim -0.133 -0.007 -0.023
## educ1somehs -0.253 0.002 -0.054 0.124
## educ3somecl -0.509 0.025 0.013 0.178 0.304
## educ4colgrd -0.512 -0.005 0.094 0.167 0.289 0.535
## rac_thhspnc -0.193 -0.003 0.013 -0.165 -0.089 0.036 0.083
## rc_thnhblck -0.218 0.035 0.005 -0.006 -0.046 0.021 0.076 0.162
## rc_thnhmltr -0.079 -0.013 -0.051 -0.004 -0.016 -0.015 0.019 0.079 0.088
## rc_thnhothr -0.076 -0.026 -0.008 -0.017 -0.016 0.003 -0.013 0.092 0.090
## rc_thm
## male
## bd_mntl_dys
## educ0Prim
## educ1somehs
## educ3somecl
## educ4colgrd
## rac_thhspnc
## rc_thnhblck
## rc_thnhmltr
## rc_thnhothr 0.047
summary(fit.mix.bin2)
## Generalized linear mixed model fit by maximum likelihood (Laplace
## Approximation) [glmerMod]
## Family: binomial ( logit )
## Formula:
## hiv_risk ~ male + bad_mental_daysz + educ + race_eth + ppovertyz +
## (1 | mmsa_fac)
## Data: join_data
## Control:
## glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
##
## AIC BIC logLik deviance df.resid
## 54310.9 54441.9 -27142.4 54284.9 176102
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -0.7121 -0.2110 -0.1766 -0.1382 10.4031
##
## Random effects:
## Groups Name Variance Std.Dev.
## mmsa_fac (Intercept) 0.03174 0.1782
## Number of obs: 176115, groups: mmsa_fac, 123
##
## Fixed effects:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -3.856814 0.038646 -99.80 < 2e-16 ***
## male 0.763916 0.025666 29.76 < 2e-16 ***
## bad_mental_daysz 0.340261 0.009071 37.51 < 2e-16 ***
## educ0Prim -0.499523 0.094158 -5.31 1.13e-07 ***
## educ1somehs 0.076195 0.056361 1.35 0.17640
## educ3somecol 0.103465 0.032555 3.18 0.00148 **
## educ4colgrad -0.204201 0.033013 -6.19 6.19e-10 ***
## race_ethhispanic 0.693624 0.043463 15.96 < 2e-16 ***
## race_ethnh black 0.692533 0.039455 17.55 < 2e-16 ***
## race_ethnh multirace 0.721963 0.074507 9.69 < 2e-16 ***
## race_ethnh other 0.276212 0.068378 4.04 5.36e-05 ***
## ppovertyz -0.041809 0.034950 -1.20 0.23160
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr) male bd_mn_ edc0Pr edc1sm edc3sm edc4cl rc_thh rc_thb
## male -0.399
## bd_mntl_dys -0.170 0.098
## educ0Prim -0.122 -0.007 -0.023
## educ1somehs -0.232 0.002 -0.055 0.124
## educ3somecl -0.468 0.025 0.013 0.177 0.302
## educ4colgrd -0.473 -0.005 0.095 0.166 0.287 0.534
## rac_thhspnc -0.241 -0.004 0.016 -0.162 -0.086 0.037 0.084
## rc_thnhblck -0.220 0.035 0.006 -0.006 -0.046 0.021 0.077 0.169
## rc_thnhmltr -0.074 -0.013 -0.051 -0.003 -0.015 -0.015 0.019 0.079 0.088
## rc_thnhothr -0.070 -0.027 -0.009 -0.014 -0.015 0.002 -0.013 0.091 0.090
## ppovertyz 0.406 0.011 -0.022 -0.001 -0.003 -0.004 -0.006 -0.165 -0.048
## rc_thm rc_tho
## male
## bd_mntl_dys
## educ0Prim
## educ1somehs
## educ3somecl
## educ4colgrd
## rac_thhspnc
## rc_thnhblck
## rc_thnhmltr
## rc_thnhothr 0.049
## ppovertyz -0.004 -0.001