Val Pocus
October 11, 2015
Zhou et al. examined the effect of homeownership on the number of mentally unhealthy days using the 2008 BRFSS data. They compared results from logistic, linear, Poisson, negative binomial, and zero-inflated negative binomial models.
Zhou, Hong, et al. “Models for Count Data With an Application to Healthy Days Measures: Are You Driving in Screws With a Hammer?” Preventing Chronic Disease 11 (2014).
This assignment attempted to:
Treatment of data and variables
## All states report mentally unhealthy days. For comparison with the Zhou et al. article, this analysis is limited to the states that had mentally unhealthy days available in the 2009 BRFSS: Alabama, Arkansas, California, Hawaii, Illinois, Kansas, Louisiana, Nebraska, New Mexico, Oklahoma, South Carolina, Wisconsin.
BRFSS13 <- BRFSS13[BRFSS13$X_STATE %in% c(1, 5, 6, 15, 17, 20, 22, 31, 35, 40, 45, 55), ]
BRFSS13$X_STATE<-factor(BRFSS13$X_STATE, labels = c("Alabama", "Arkansas", "California", "Hawaii", "Illinois", "Kansas", "Louisiana", "Nebraska", "New Mexico", "Oklahoma", "South Carolina", "Wisconsin"))
##Looking at states where number of mentally unhealthy days is available
BRFSS13$X_STATE3 <- factor(BRFSS13$X_STATE)
###Create new age variable to exclude under age 35
BRFSS13$X_AGE_G2 <- factor(BRFSS13$X_AGE_G, labels=c("Age 18 to 24", "Age 25 to 34", "Age 35 to 44", "Age 45 to 54", "Age 55 to 64", "Age 65+"), levels=1:6)
levels(BRFSS13$X_AGE_G2)<-c(NA ,NA, "Age 35 to 44", "Age 45 to 54", "Age 55 to 64", "Age 65+")
###Creating new race/ethnicity variable
BRFSS13$X_RACEGR3C <- factor(BRFSS13$X_RACEGR3)
levels(BRFSS13$X_RACEGR3C) <- c("White Non-Hispanic", "Black Non-Hispanic", "Others", "Others", "Hispanic", NA, NA)
###Creating new education variable
BRFSS13$X_EDUCAG2 <- factor(BRFSS13$X_EDUCAG)
levels(BRFSS13$X_EDUCAG2) <- c("Less than high school", "High school graduate", "<4 yr of college", "=> 4 y of college", NA, NA)
###Creating new income variable
BRFSS13$INCOME3 <- factor(BRFSS13$X_INCOMG)
levels(BRFSS13$INCOME3) <- c("<25,000", "<25,000","25,000 to <50,000", "25,000 to <50,000","50,000 or more", "Unknown")
###Creating new marital status variable
BRFSS13$MARITAL <- factor(BRFSS13$X_IMPMRTL)
levels(BRFSS13$MARITAL) <- c("Married", "Divorced/Widowed/Separated","Divorced/Widowed/Separated", "Divorced/Widowed/Separated","Never married", NA)
###Creating new employment variable
BRFSS13$EMPLOY2 <- factor(BRFSS13$EMPLOY1)
levels(BRFSS13$EMPLOY2) <- c("Employed", "Employed","Unemployed", "Unemployed","Homemaker", NA, "Retired", "Unable to work", NA, NA)
###Creating household size variable
BRFSS13$X_CHLDCNT2 <- BRFSS13$X_CHLDCNT
BRFSS13$X_CHLDCNT2[BRFSS13$X_CHLDCNT2 > 6] <- NA
BRFSS13$HOUSEHOLD <- BRFSS13$NUMADULT + BRFSS13$X_CHLDCNT2
###Creating household size variable for descriptive statistics
BRFSS13$HOUSEHOLD2 <- factor(BRFSS13$HOUSEHOLD)
BRFSS13$HOUSEHOLDDESC <- cut(BRFSS13$HOUSEHOLD,
                             breaks = c(-Inf, 2, 4, 6, Inf),
                             labels = c("1 or 2", "3 or 4", "5 or 6", "7 or more"))
###Homeownership
BRFSS13$X_IMPHOME2 <- factor(BRFSS13$X_IMPHOME)
levels(BRFSS13$X_IMPHOME2) <- c("Own", "Does not own","Does not own")
BRFSS13$X_IMPHOME2 <- factor(BRFSS13$X_IMPHOME2, levels = c("Does not own", "Own"))
###Number of unhealthy days
BRFSS13$MENTHLTH2 <- BRFSS13$MENTHLTH
###77 (don't know/not sure) and 99 (refused) become missing; 88 (none) becomes zero days
BRFSS13$MENTHLTH2[BRFSS13$MENTHLTH2 %in% c(77, 99)] <- NA
BRFSS13$MENTHLTH2[BRFSS13$MENTHLTH2 %in% 88] <- 0
###Creating mental health days for descriptive statistics
BRFSS13$MENTHLTHDESC <- cut(BRFSS13$MENTHLTH2,
                            breaks = c(-Inf, 0, 10, 20, Inf),
                            labels = c("Zero", "1-10", "11-20", "21-30"))
###Sex
BRFSS13$SEX2 <- factor(BRFSS13$SEX)
levels(BRFSS13$SEX2) <- c("Male", "Female")
###Keep only the analysis variables, then drop rows with any missing value
BRFSS13_2 <- subset(BRFSS13, select=c(MENTHLTH2, X_AGE_G2, SEX2, X_RACEGR3C, X_EDUCAG2, INCOME3, MARITAL, EMPLOY2, HOUSEHOLDDESC, X_IMPHOME2, MENTHLTHDESC))
BRFSS13_3 <- na.omit(BRFSS13_2)
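The model-fitting code itself is not shown in this write-up. As a rough sketch only, the first four models could be fit along the following lines. The data here are simulated, the names `sim`, `days`, `own`, and `any_days` are hypothetical stand-ins, and defining the binary outcome as "any unhealthy days" is an assumption, since the outcome used for the logistic model is not stated. `glm.nb()` comes from the MASS package; a zero-inflated negative binomial fit would need an additional package and is left out so that the block runs on a standard R installation.

```r
library(MASS)  # for glm.nb(); MASS ships with standard R installations

set.seed(751)
n <- 2000

# Simulated stand-in data (hypothetical, not BRFSS): a two-level ownership
# factor and an overdispersed count capped at 30 days
own  <- factor(sample(c("Does not own", "Own"), n, replace = TRUE),
               levels = c("Does not own", "Own"))
mu   <- exp(1.2 - 0.15 * (own == "Own"))
days <- pmin(rnbinom(n, size = 0.2, mu = mu), 30)
sim  <- data.frame(days = days, own = own, any_days = as.integer(days > 0))

# Model 1: logistic regression on a binary indicator (assumed: any days > 0)
fit_logit <- glm(any_days ~ own, family = binomial, data = sim)
# Model 2: ordinary linear regression on the raw count
fit_lm    <- lm(days ~ own, data = sim)
# Model 3: Poisson regression (log link)
fit_pois  <- glm(days ~ own, family = poisson, data = sim)
# Model 4: negative binomial regression (log link plus a dispersion parameter)
fit_nb    <- glm.nb(days ~ own, data = sim)

# Compare the two count models on AIC
aic_table <- AIC(fit_pois, fit_nb)
print(aic_table)
```

With the real data, `sim` would be replaced by `BRFSS13_3` and the right-hand side of each formula by the full covariate set.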
| Characteristic | n (%) or value |
|---|---|
| Age Group | |
| Age 35 to 44 | 7,013 (10%) |
| Age 45 to 54 | 11,983 (17%) |
| Age 55 to 64 | 19,329 (27%) |
| Age 65+ | 33,942 (47%) |
| Sex | |
| Male | 26,555 (37%) |
| Female | 45,712 (63%) |
| Race/Ethnicity | |
| White Non-Hispanic | 56,936 (79%) |
| Black Non-Hispanic | 6,463 (9%) |
| Others | 5,186 (7%) |
| Hispanic | 3,682 (5%) |
| Education | |
| Less than high school | 5,943 (8%) |
| High school graduate | 21,964 (30%) |
| <4 yr of college | 19,728 (27%) |
| => 4 y of college | 24,632 (34%) |
| Income | |
| <25,000 | 18,867 (26%) |
| 25,000 to <50,000 | 17,588 (24%) |
| 50,000 or more | 26,493 (37%) |
| Unknown | 9,319 (13%) |
| Marital status | |
| Married | 41,198 (57%) |
| Divorced/Widowed/Separated | 25,506 (35%) |
| Never married | 5,563 (8%) |
| Employment status | |
| Employed | 30,035 (42%) |
| Unemployed | 2,556 (4%) |
| Homemaker | 4,915 (7%) |
| Retired | 28,736 (40%) |
| Unable to work | 6,025 (8%) |
| Household size | |
| 1 or 2 | 24,800 (34%) |
| 3 or 4 | 38,192 (53%) |
| 5 or 6 | 7,727 (11%) |
| 7 or more | 1,548 (2%) |
| Homeownership | |
| Does not own | 12,031 (17%) |
| Own | 60,236 (83%) |
| Number of mentally unhealthy days | |
| Zero | 52,870 (73%) |
| 1-10 | 12,572 (17%) |
| 11-20 | 2,820 (4%) |
| 21-30 | 4,005 (6%) |
| Mean | 2.97 |
| Variance | 53.98 |
| Standard dev | 7.35 |
| Median | 0.00 |
| Quantiles.25% | 0.00 |
| Quantiles.75% | 1.00 |
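The summary statistics above already hint at why a plain Poisson model may fit poorly. The short check below uses only the numbers reported in the table and compares them with what a Poisson distribution with the same mean would imply. It is a marginal comparison that ignores the covariates, so it is a rough diagnostic rather than a formal test; the variable names are chosen here for illustration.

```r
# Summary statistics copied from the descriptive table above
mean_days <- 2.97
var_days  <- 53.98
share_zero_observed <- 52870 / 72267

# Under a Poisson model the variance equals the mean, so this ratio would be near 1
dispersion_ratio <- var_days / mean_days

# Share of zeros that a Poisson distribution with the same mean would produce
share_zero_poisson <- dpois(0, lambda = mean_days)

round(c(dispersion_ratio    = dispersion_ratio,
        share_zero_observed = share_zero_observed,
        share_zero_poisson  = share_zero_poisson), 3)
```

The variance is roughly 18 times the mean, and a Poisson distribution with mean 2.97 would put only about 5% of respondents at zero days, compared with the 73% observed.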
| | Model 1: Logistic |
|---|---|
| Intercept | -2.01 (0.09)*** |
| Covariates | |
| Aged 45 to 54 | -0.08 (0.05) |
| Aged 55 to 64 | -0.33 (0.05)*** |
| Aged 65+ | -0.83 (0.06)*** |
| Female | 0.29 (0.03)*** |
| Black Non-Hispanic | -0.25 (0.05)*** |
| Other race | 0.07 (0.05) |
| Hispanic | 0.02 (0.06) |
| High-school graduate | -0.21 (0.05)*** |
| Less than 4 years of college | -0.16 (0.05)*** |
| 4 or more years of college | -0.36 (0.05)*** |
| $25,000 to <$50,000 | -0.30 (0.04)*** |
| $50,000 or more | -0.61 (0.05)*** |
| Unknown income | -0.41 (0.05)*** |
| Divorced/Widowed/Separated | 0.28 (0.04)*** |
| Never married | 0.06 (0.06) |
| Unemployed | 0.95 (0.06)*** |
| Homemaker | 0.20 (0.06)** |
| Retired | 0.24 (0.04)*** |
| Unable to work | 1.74 (0.04)*** |
| 3-4 in Household | 0.08 (0.04)* |
| 5-6 in Household | 0.01 (0.06) |
| 7+ in Household | 0.02 (0.10) |
| Homeownership | |
| Homeowner | -0.13 (0.03)*** |
| AIC | 38479.44 |
| BIC | 38699.95 |
| Log Likelihood | -19215.72 |
| Deviance | 38431.44 |
| Num. obs. | 72267 |
| *** p < 0.001, ** p < 0.01, * p < 0.05 | |
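Logistic coefficients are on the log-odds scale, so exponentiating them gives odds ratios. The sketch below converts the homeowner estimate from the table into an odds ratio with a 95% Wald interval. It works from the rounded coefficient and standard error, so the result is approximate, and the variable names are illustrative.

```r
# Homeowner coefficient and standard error from the logistic table above
beta <- -0.13
se   <- 0.03

# Odds ratio with a 95% Wald interval (normal approximation on the log-odds scale)
z <- qnorm(0.975)
odds_ratio <- exp(beta)
wald_ci    <- exp(beta + c(-1, 1) * z * se)

round(c(OR = odds_ratio, lower = wald_ci[1], upper = wald_ci[2]), 2)
```

This gives an odds ratio of about 0.88 with an interval of roughly 0.83 to 0.93, meaning that homeowners have somewhat lower odds of the outcome than non-owners after adjustment.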
| | Model 1: Logistic | Model 2: Linear | Model 3: Poisson | Model 4: Negative binomial |
|---|---|---|---|---|
| Homeowner | -0.13*** | -0.53*** | -0.11*** | -0.13*** |
| | (0.03) | (0.08) | (0.01) | (0.03) |
| AIC | 38479.44 | | 720413.83 | 213462.21 |
| BIC | 38699.95 | | 720634.34 | 213691.91 |
| Log Likelihood | -19215.72 | | -360182.91 | -106706.11 |
| Deviance | 38431.44 | | 647901.06 | 41345.58 |
| Num. obs. | 72267 | 72267 | 72267 | 72267 |
| R2 | | 0.11 | | |
| Adj. R2 | | 0.11 | | |
| RMSE | | 6.93 | | |
| *** p < 0.001, ** p < 0.01, * p < 0.05 | | | | |
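The AIC and BIC values can be reproduced from the reported log-likelihoods using AIC = 2k - 2 logLik and BIC = k log(n) - 2 logLik. The check below assumes k = 24 parameters for the logistic and Poisson models (one intercept plus 23 covariate terms, as listed for Model 1) and k = 25 for the negative binomial, which adds a dispersion parameter. Small differences in the last decimal place come from rounding in the reported log-likelihoods.

```r
# Log-likelihoods and sample size as reported in the tables above
n_obs  <- 72267
loglik <- c(logistic = -19215.72, poisson = -360182.91, negbin = -106706.11)

# Assumed parameter counts: 24 regression terms, plus one dispersion
# parameter for the negative binomial model
k <- c(logistic = 24, poisson = 24, negbin = 25)

aic <- 2 * k - 2 * loglik
bic <- log(n_obs) * k - 2 * loglik

round(cbind(AIC = aic, BIC = bic), 2)
```

Because the logistic model is fit to a binary outcome and the count models to the number of days, their AIC values are not on a common scale; only the Poisson and negative binomial values are directly comparable.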
| | Model 5: Zero-Inflated Negative binomial |
|---|---|
| Negative binomial component | -0.04 (0.02)* |
| Zero-inflated component | 0.14 (0.03)*** |
| AIC | 206523.56 |
| Log Likelihood | -103212.78 |
| Num. obs. | 72267 |
| *** p < 0.001, ** p < 0.01, * p < 0.05 | |
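For reference, a zero-inflated negative binomial model is usually written as a mixture of a point mass at zero and a negative binomial count distribution. The notation below ($\pi_i$, $\mu_i$, $\theta$, $\beta$, $\gamma$) is introduced here for illustration and follows the common parameterization in which the zero-inflation component models the probability of an excess zero. Whether Model 5 was fit with exactly this parameterization is an assumption.

$$
\begin{aligned}
P(Y_i = 0) &= \pi_i + (1 - \pi_i)\left(\frac{\theta}{\theta + \mu_i}\right)^{\theta}, \\
P(Y_i = y) &= (1 - \pi_i)\,\frac{\Gamma(y + \theta)}{\Gamma(\theta)\, y!}\left(\frac{\theta}{\theta + \mu_i}\right)^{\theta}\left(\frac{\mu_i}{\theta + \mu_i}\right)^{y}, \quad y = 1, 2, \dots \\
\log \mu_i &= x_i^{\top}\beta, \qquad \operatorname{logit}(\pi_i) = z_i^{\top}\gamma
\end{aligned}
$$

If the two rows in the table are the homeowner coefficients in each component, then under this reading the zero-inflation estimate of 0.14 corresponds to about $e^{0.14} \approx 1.15$ times the odds of being an excess zero, and the count estimate of -0.04 to about $e^{-0.04} \approx 0.96$ times the expected number of days among respondents outside the excess-zero group.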
## Logistic Linear Poisson
## (Intercept) -2.01376045 3.910324922 1.238590581
## X_AGE_G2Age 45 to 54 -0.08141413 -0.007948733 -0.016999279
## X_AGE_G2Age 55 to 64 -0.32604219 -0.712660265 -0.193056768
## X_AGE_G2Age 65+ -0.83143133 -1.861768428 -0.591555646
## SEX2Female 0.28542464 0.722097185 0.282339807
## X_RACEGR3CBlack Non-Hispanic -0.24508951 -0.608271245 -0.152693736
## X_RACEGR3COthers 0.07300584 0.179696586 0.058759651
## X_RACEGR3CHispanic 0.02024939 -0.051286079 0.002704846
## X_EDUCAG2High school graduate -0.21110462 -0.492146847 -0.134320493
## X_EDUCAG2<4 yr of college -0.16369572 -0.297392856 -0.084314505
## X_EDUCAG2=> 4 y of college -0.35740385 -0.563900420 -0.200554133
## INCOME325,000 to <50,000 -0.29710646 -0.864939236 -0.222650261
## INCOME350,000 or more -0.60959528 -1.323758916 -0.439133937
## INCOME3Unknown -0.40587488 -1.193544860 -0.327554481
## MARITALDivorced/Widowed/Separated 0.28010256 0.725441632 0.192004290
## MARITALNever married 0.06447790 0.224594486 0.055598925
## EMPLOY2Unemployed 0.94754342 2.739804237 0.695162510
## EMPLOY2Homemaker 0.19712990 0.329212394 0.114245390
## EMPLOY2Retired 0.23878862 0.534208651 0.135471834
## EMPLOY2Unable to work 1.74439060 6.999061283 1.232615817
## HOUSEHOLDDESC3 or 4 0.08302426 0.398162935 0.056171743
## HOUSEHOLDDESC5 or 6 0.01174552 0.238707319 0.057857771
## HOUSEHOLDDESC7 or more 0.01823692 0.234643044 0.054679828
## X_IMPHOME2Own -0.13305486 -0.527137027 -0.108673417
## Negative Binomial
## (Intercept) 1.23425423
## X_AGE_G2Age 45 to 54 -0.02866192
## X_AGE_G2Age 55 to 64 -0.17737711
## X_AGE_G2Age 65+ -0.52877957
## SEX2Female 0.32107459
## X_RACEGR3CBlack Non-Hispanic -0.05050304
## X_RACEGR3COthers 0.09443010
## X_RACEGR3CHispanic 0.10358262
## X_EDUCAG2High school graduate -0.20708074
## X_EDUCAG2<4 yr of college -0.13408249
## X_EDUCAG2=> 4 y of college -0.27408608
## INCOME325,000 to <50,000 -0.25029728
## INCOME350,000 or more -0.44015257
## INCOME3Unknown -0.36975206
## MARITALDivorced/Widowed/Separated 0.23602659
## MARITALNever married 0.11496953
## EMPLOY2Unemployed 0.69383447
## EMPLOY2Homemaker 0.08988737
## EMPLOY2Retired 0.11361531
## EMPLOY2Unable to work 1.24583577
## HOUSEHOLDDESC3 or 4 0.05363712
## HOUSEHOLDDESC5 or 6 0.10030186
## HOUSEHOLDDESC7 or more 0.07575185
## X_IMPHOME2Own -0.13215462
## Zero-inflated Negative Binomial
## (Intercept) 2.237279896
## X_AGE_G2Age 45 to 54 -0.018661587
## X_AGE_G2Age 55 to 64 -0.063620824
## X_AGE_G2Age 65+ -0.198447400
## SEX2Female 0.081303153
## X_RACEGR3CBlack Non-Hispanic -0.024928091
## X_RACEGR3COthers 0.108776238
## X_RACEGR3CHispanic 0.074270854
## X_EDUCAG2High school graduate -0.130558804
## X_EDUCAG2<4 yr of college -0.125434133
## X_EDUCAG2=> 4 y of college -0.293121776
## INCOME325,000 to <50,000 -0.150386205
## INCOME350,000 or more -0.293894263
## INCOME3Unknown -0.112472668
## MARITALDivorced/Widowed/Separated 0.147685723
## MARITALNever married 0.039647325
## EMPLOY2Unemployed 0.401006696
## EMPLOY2Homemaker 0.097687545
## EMPLOY2Retired 0.121252368
## EMPLOY2Unable to work 0.690663655
## HOUSEHOLDDESC3 or 4 -0.002389577
## HOUSEHOLDDESC5 or 6 -0.046352810
## HOUSEHOLDDESC7 or more -0.031280354
## X_IMPHOME2Own 0.022146064
| | Logistic | Linear | Poisson | Negative Binomial | Zero-inflated Negative Binomial |
|---|---|---|---|---|---|
| logLik | -19216.00 | -242449.00 | -360183.00 | -106706.00 | -105452.00 |
| Df | 24.00 | 25.00 | 24.00 | 25.00 | 27.00 |
| Obs | 80650.00 |
| Logistic | 17998.00 |
| Poisson | 8860.00 |
| NB | 52175.00 |
| Zero-inflated Negative Binomial | 52792.00 |
## Vuong Non-Nested Hypothesis Test-Statistic:
## (test-statistic is asymptotically distributed N(0,1) under the
## null that the models are indistinguishible)
## -------------------------------------------------------------
## Vuong z-statistic H_A p-value
## Raw -24.78722 model2 > model1 < 2.22e-16
## AIC-corrected -24.74768 model2 > model1 < 2.22e-16
## BIC-corrected -24.56603 model2 > model1 < 2.22e-16
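The output above does not say which two fitted models are `model1` and `model2`. To show what the raw statistic measures, the sketch below computes it by hand for a Poisson fit against a negative binomial fit on simulated data. The data, the variable names, and the choice of these two models are assumptions made for illustration, and the AIC and BIC corrections are not reproduced.

```r
library(MASS)  # for glm.nb()

set.seed(2015)
n <- 1500
x <- rbinom(n, 1, 0.5)
y <- rnbinom(n, size = 0.3, mu = exp(1 - 0.2 * x))  # simulated overdispersed counts

fit_pois <- glm(y ~ x, family = poisson)  # plays the role of "model1"
fit_nb   <- glm.nb(y ~ x)                 # plays the role of "model2"

# Pointwise log-likelihood contributions under each fitted model
ll_pois <- dpois(y, lambda = fitted(fit_pois), log = TRUE)
ll_nb   <- dnbinom(y, size = fit_nb$theta, mu = fitted(fit_nb), log = TRUE)

# Raw Vuong statistic: standardized mean of the pointwise differences
m <- ll_pois - ll_nb
vuong_z <- sqrt(n) * mean(m) / sd(m)
p_value <- pnorm(vuong_z)  # one-sided p-value for "model2 > model1"

c(z = vuong_z, p = p_value)
```

A large negative statistic, as in the output above, means the second model assigns higher likelihood to the observed counts on average.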
install.packages("xtable")
install.packages("Gmisc")
install.packages("texreg")
install.packages("pscl")
getwd()
setwd("/Users/valeriepocus/Documents/BIOS751")
setwd("G:/My Documents/Misc/School/BIOS751")
BRFSS13 <- read.csv("BRFSS2013_Data.csv")