HW05 - Discrete Time Models
I used IPUMS NHIS 2018 Data
Use of data from IPUMS NHIS is subject to conditions including that users
should cite the data appropriately. Use command `ipums_conditions()` for more
details.
a. Form a person-period data set
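As a small illustration of what a person-period data set looks like, the sketch below splits a hypothetical three-person data frame with `survSplit()`. The `toy` data, its values, the cut points, and the `interval` column name are assumptions made for illustration only; they are not taken from the NHIS file.

```r
library(survival)

# Hypothetical toy data: three people with an age at death or censoring
toy <- data.frame(id = 1:3,
                  agedeath = c(24, 31, 40),  # age at death or censoring
                  died = c(1, 0, 1))         # 1 = died, 0 = censored

# Split each person's follow-up at ages 23, 28, 33 and 38.
# Each person contributes one row per interval they enter, and
# `died` is 1 only in the interval where the death occurs.
toy_pp <- survSplit(Surv(agedeath, died) ~ ., data = toy,
                    cut = seq(23, 38, 5),
                    episode = "interval")
toy_pp
```

Each person appears once per interval survived, which is the structure the discrete-time models below rely on.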
# packages for splitting follow-up time and fitting survey-weighted models
library(survival)
library(survey)
# person-period data set: split follow-up into 5-year age intervals
hw5pp <- survSplit(Surv(agedeath, died) ~ ., data = hw5,
                   cut = seq(18, 100, 5),
                   episode = "mort_5_yr")
# survey design
hw5des <- svydesign(ids = ~PSU,
                    strata = ~STRATA,
                    weights = ~MORTWTSA,
                    data = hw5pp,
                    nest = T)
b/c. Consider general model and other time specifications; include all main effects
m0_constant <- svyglm(died ~ race_eth + educ,
                      design = hw5des,
                      family = binomial(link = "cloglog"))
m1_general <- svyglm(died ~ factor(mort_5_yr) + race_eth + educ,
                     design = hw5des,
                     family = binomial(link = "cloglog"))
m2_linear <- svyglm(died ~ mort_5_yr + race_eth + educ,
                    design = hw5des,
                    family = binomial(link = "cloglog"))
m3_quadratic <- svyglm(died ~ mort_5_yr + I(mort_5_yr^2) + race_eth + educ,
                       design = hw5des,
                       family = binomial(link = "cloglog"))
library(splines)
m4_spline <- svyglm(died ~ ns(mort_5_yr, df = 3) + race_eth + educ,
                    design = hw5des,
                    family = binomial(link = "cloglog"))
Model fits
Comment: The general model has the lowest AIC. The spline model is the closest alternative, with an AIC only 6.8 points higher.
| model | AIC | deltaAIC |
|---|---|---|
| constant | 5468.468 | 1432.96939 |
| general | 4035.499 | 0.00000 |
| linear | 4065.222 | 29.72276 |
| quadratic | 4052.951 | 17.45245 |
| spline | 4042.341 | 6.84181 |
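The deltaAIC column is each model's AIC minus the smallest AIC in the set. A minimal sketch that reproduces it from the values in the table (the numbers are copied from the table; how the original AIC values were extracted from the `svyglm` fits is not shown, so that step is left out):

```r
# AIC values copied from the table above
aic <- c(constant = 5468.468, general = 4035.499, linear = 4065.222,
         quadratic = 4052.951, spline = 4042.341)

# deltaAIC = AIC minus the minimum AIC across candidate models
fit_table <- data.frame(model = names(aic),
                        AIC = unname(aic),
                        deltaAIC = unname(aic - min(aic)))

# Sort so the best-fitting model comes first
fit_table[order(fit_table$deltaAIC), ]
```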
d. Test for an interaction between at least two of the predictors
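Comment: Significant interaction terms appear only for the Hispanic group. (Reference groups: NH White and College educated.) The Hispanic main effect (about -14.8) and its two interaction terms (about +14.8) nearly cancel, and the fit took 21 Fisher scoring iterations, so these estimates may reflect sparse cells rather than a substantive interaction.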
m1_general_interaction <- svyglm(died ~ factor(mort_5_yr) + race_eth + educ + race_eth*educ,
                                 design = hw5des,
                                 family = binomial(link = "cloglog"))
summary(m1_general_interaction)
Call:
svyglm(formula = died ~ factor(mort_5_yr) + race_eth + educ +
race_eth * educ, design = hw5des, family = binomial(link = "cloglog"))
Survey design:
svydesign(ids = ~PSU, strata = ~STRATA, weights = ~MORTWTSA,
data = hw5pp, nest = T)
Coefficients:
                                    Estimate Std. Error  t value Pr(>|t|)
(Intercept)                       -22.970000   0.129328 -177.610  < 2e-16 ***
factor(mort_5_yr)2                  0.007605   0.001853    4.105 4.78e-05 ***
factor(mort_5_yr)3                 11.938466   0.999740   11.942  < 2e-16 ***
factor(mort_5_yr)4                 14.174818   0.581057   24.395  < 2e-16 ***
factor(mort_5_yr)5                 15.060867   0.528725   28.485  < 2e-16 ***
factor(mort_5_yr)6                 14.961732   0.379254   39.450  < 2e-16 ***
factor(mort_5_yr)7                 14.862627   0.412004   36.074  < 2e-16 ***
factor(mort_5_yr)8                 15.909650   0.394313   40.348  < 2e-16 ***
factor(mort_5_yr)9                 16.450201   0.252266   65.210  < 2e-16 ***
factor(mort_5_yr)10                17.055257   0.212807   80.144  < 2e-16 ***
factor(mort_5_yr)11                17.301264   0.163138  106.053  < 2e-16 ***
factor(mort_5_yr)12                18.012621   0.169751  106.112  < 2e-16 ***
factor(mort_5_yr)13                18.553233   0.143038  129.708  < 2e-16 ***
factor(mort_5_yr)14                18.981022   0.155804  121.826  < 2e-16 ***
factor(mort_5_yr)15                20.562151   0.114100  180.211  < 2e-16 ***
race_ethNH Black                    0.271622   0.414367    0.656    0.512
race_ethNH Other                    0.080902   0.396679    0.204    0.838
race_ethHispanic                  -14.819377   0.186422  -79.494  < 2e-16 ***
educHS or less                      0.682325   0.160352    4.255 2.53e-05 ***
educSome College                    0.379971   0.181689    2.091    0.037 *
race_ethNH Black:educHS or less    -0.156003   0.497921   -0.313    0.754
race_ethNH Other:educHS or less    -0.114122   0.616605   -0.185    0.853
race_ethHispanic:educHS or less    14.799161   0.305044   48.515  < 2e-16 ***
race_ethNH Black:educSome College  -0.223934   0.571520   -0.392    0.695
race_ethNH Other:educSome College   0.609634   0.693226    0.879    0.380
race_ethHispanic:educSome College  14.749871   0.554144   26.617  < 2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 0.7325169)
Number of Fisher Scoring iterations: 21
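With an interaction in the model, the Hispanic coefficient alone applies only to the College reference group; for the other education levels the main effect and the interaction term have to be added. Under the cloglog link, exponentiated coefficients can be read as hazard ratios. The sketch below does that arithmetic with values copied from the output above (the variable names are illustrative, not from the original analysis):

```r
# Coefficients copied from the summary output above
b_hisp      <- -14.819377  # race_ethHispanic (applies to College reference)
b_hisp_hs   <-  14.799161  # race_ethHispanic:educHS or less
b_hisp_some <-  14.749871  # race_ethHispanic:educSome College

# Hispanic vs. NH White hazard ratio within each education level
hr <- exp(c("HS or less"   = b_hisp + b_hisp_hs,
            "Some College" = b_hisp + b_hisp_some,
            "College"      = b_hisp))
round(hr, 4)
```

The ratios for HS or less and Some College come out close to 1, while the College ratio is essentially zero. A ratio that extreme is what one would expect if there were very few or no deaths among College-educated Hispanic respondents in the sample, though that is an inference from the coefficients and would need to be checked against the data.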
e. Generate hazard plots
Comment: The NH Other group has a higher hazard of death (holding education constant), likely because it includes Native American and Alaska Native respondents; it would be worth disaggregating this group to examine Asian and multiracial respondents separately. The educational gradient also holds across all race/ethnicity groups: as education increases, the hazard of death decreases.
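For reference, fitted hazards under the cloglog link are recovered from the linear predictor as h = 1 - exp(-exp(eta)). The sketch below applies that formula to the intercept and a few period coefficients copied from the interaction model output, giving the hazard for the reference group (NH White, College educated). It is not the plotting code used for the figures, which is not shown in this report; the selection of periods is arbitrary.

```r
# Intercept and selected period coefficients copied from the summary output
intercept <- -22.970000
period <- c("6" = 14.961732, "9" = 16.450201,
            "12" = 18.012621, "15" = 20.562151)

# Linear predictor for the reference group in each selected interval
eta <- intercept + period

# Inverse of the complementary log-log link
hazard <- 1 - exp(-exp(eta))
round(hazard, 5)
```

Other groups would add their race/ethnicity, education, and interaction coefficients to `eta` before applying the same transformation.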