HW05 - Discrete Time Models

Author

Ryan Labio

I used IPUMS NHIS 2018 Data

Use of data from IPUMS NHIS is subject to conditions including that users
should cite the data appropriately. Use command `ipums_conditions()` for more
details.

a. Form a person-period data set

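# person-period data set: split each person's follow-up into age intervals
# with cut points at ages 18, 23, ..., 98; mort_5_yr indexes the interval

library(survival)  # Surv(), survSplit()
library(survey)    # svydesign(), svyglm()

hw5pp <- survSplit(Surv(agedeath, died)~., data = hw5,
                   cut = seq(18, 100, 5),
                   episode = "mort_5_yr")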

# survey design

hw5des <- svydesign(ids = ~PSU,
                    strata = ~STRATA,
                    weights = ~MORTWTSA, 
                    data = hw5pp, 
                    nest = T)
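To make the person-period structure concrete, here is a minimal sketch that applies the same `survSplit()` call to a toy data set. The data frame `toy` and its two records are hypothetical and are not part of the NHIS extract; only the call pattern mirrors the code above.

```r
library(survival)

# Hypothetical toy data: one death at age 27, one person censored at age 40
toy <- data.frame(id = 1:2, agedeath = c(27, 40), died = c(1, 0))

# Same cut points and episode name as in the homework code
toy_pp <- survSplit(Surv(agedeath, died) ~ ., data = toy,
                    cut = seq(18, 100, 5),
                    episode = "mort_5_yr")

# One row per person per interval entered; died is 1 only in the
# interval where the death occurs
print(toy_pp)
```

Each person contributes one row per interval they enter, so the person who dies at 27 has three rows and the person censored at 40 has six. Interval 1 covers ages before the first cut point at 18.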

b/c. Consider general model and other time specifications; include all main effects

m0_constant <- svyglm (died ~ race_eth + educ,
                       design = hw5des,
                       family = binomial (link="cloglog"))

m1_general <- svyglm (died ~ factor(mort_5_yr) + race_eth + educ,
                       design = hw5des,
                       family = binomial (link="cloglog"))

m2_linear <- svyglm (died ~ mort_5_yr + race_eth + educ,
                       design = hw5des,
                       family = binomial (link="cloglog"))

m3_quadratic <- svyglm (died ~ mort_5_yr + I(mort_5_yr^2) + race_eth + educ,
                       design = hw5des,
                       family = binomial (link="cloglog"))

library(splines)
m4_spline <- svyglm (died ~ ns(mort_5_yr, df=3) + race_eth + educ,
                       design = hw5des,
                       family = binomial (link="cloglog"))
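The time specifications above differ mainly in how many parameters they spend on the baseline hazard. The sketch below counts them using design matrices only, with no model fitting. It assumes 15 intervals, matching the 15 levels of `factor(mort_5_yr)` that appear in the fitted output; `period` is a hypothetical stand-in for `mort_5_yr`.

```r
library(splines)

period <- 1:15  # hypothetical interval index with 15 levels

# General specification: one dummy per interval beyond the reference level
X_general <- model.matrix(~ factor(period))

# Natural cubic spline with df = 3: three basis columns, as in m4_spline
X_spline <- model.matrix(~ ns(period, df = 3))

ncol(X_general) - 1  # time parameters in the general specification
ncol(X_spline) - 1   # time parameters in the spline specification
```

Under this assumption the general specification uses 14 time parameters and the spline uses 3, which is the trade-off the AIC comparison weighs.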

Model fits

Comment: The general model has the lowest AIC. The spline model is close behind, with a difference of only 6.8.

Relative AICs for Alternative Time Specifications
model       AIC        deltaAIC
constant    5468.468   1432.96939
general     4035.499      0.00000
linear      4065.222     29.72276
quadratic   4052.951     17.45245
spline      4042.341      6.84181
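The deltaAIC column is each model's AIC minus the smallest AIC in the set. The code that produced the table is not shown in the report, so the following is only a sketch that recomputes the column from the rounded AIC values reported above. The object names `aic` and `aic_table` are illustrative.

```r
# AIC values copied from the table above (rounded as reported)
aic <- c(constant = 5468.468, general = 4035.499, linear = 4065.222,
         quadratic = 4052.951, spline = 4042.341)

# deltaAIC: difference from the best (lowest-AIC) model
aic_table <- data.frame(model = names(aic),
                        AIC = unname(aic),
                        deltaAIC = unname(aic - min(aic)))

# Sort from best to worst
aic_table[order(aic_table$deltaAIC), ]
```

Because the inputs are rounded to three decimals, the recomputed differences agree with the reported column only to about three decimals.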

d. Test for an interaction between at least two of the predictors

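Comment: Among the interaction terms, only those involving the Hispanic group are statistically significant (reference groups: NH White and College educated). The Hispanic main effect (-14.82) and its two interaction terms (about 14.8 each) are very large and nearly offsetting, so they likely reflect few or no observed deaths among College-educated Hispanic respondents rather than a substantive interaction.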

m1_general_interaction <- svyglm (died ~ factor(mort_5_yr) + race_eth + educ + race_eth*educ,
                       design = hw5des,
                       family = binomial (link="cloglog"))

summary(m1_general_interaction)

Call:
svyglm(formula = died ~ factor(mort_5_yr) + race_eth + educ + 
    race_eth * educ, design = hw5des, family = binomial(link = "cloglog"))

Survey design:
svydesign(ids = ~PSU, strata = ~STRATA, weights = ~MORTWTSA, 
    data = hw5pp, nest = T)

Coefficients:
                                    Estimate Std. Error  t value Pr(>|t|)    
(Intercept)                       -22.970000   0.129328 -177.610  < 2e-16 ***
factor(mort_5_yr)2                  0.007605   0.001853    4.105 4.78e-05 ***
factor(mort_5_yr)3                 11.938466   0.999740   11.942  < 2e-16 ***
factor(mort_5_yr)4                 14.174818   0.581057   24.395  < 2e-16 ***
factor(mort_5_yr)5                 15.060867   0.528725   28.485  < 2e-16 ***
factor(mort_5_yr)6                 14.961732   0.379254   39.450  < 2e-16 ***
factor(mort_5_yr)7                 14.862627   0.412004   36.074  < 2e-16 ***
factor(mort_5_yr)8                 15.909650   0.394313   40.348  < 2e-16 ***
factor(mort_5_yr)9                 16.450201   0.252266   65.210  < 2e-16 ***
factor(mort_5_yr)10                17.055257   0.212807   80.144  < 2e-16 ***
factor(mort_5_yr)11                17.301264   0.163138  106.053  < 2e-16 ***
factor(mort_5_yr)12                18.012621   0.169751  106.112  < 2e-16 ***
factor(mort_5_yr)13                18.553233   0.143038  129.708  < 2e-16 ***
factor(mort_5_yr)14                18.981022   0.155804  121.826  < 2e-16 ***
factor(mort_5_yr)15                20.562151   0.114100  180.211  < 2e-16 ***
race_ethNH Black                    0.271622   0.414367    0.656    0.512    
race_ethNH Other                    0.080902   0.396679    0.204    0.838    
race_ethHispanic                  -14.819377   0.186422  -79.494  < 2e-16 ***
educHS or less                      0.682325   0.160352    4.255 2.53e-05 ***
educSome College                    0.379971   0.181689    2.091    0.037 *  
race_ethNH Black:educHS or less    -0.156003   0.497921   -0.313    0.754    
race_ethNH Other:educHS or less    -0.114122   0.616605   -0.185    0.853    
race_ethHispanic:educHS or less    14.799161   0.305044   48.515  < 2e-16 ***
race_ethNH Black:educSome College  -0.223934   0.571520   -0.392    0.695    
race_ethNH Other:educSome College   0.609634   0.693226    0.879    0.380    
race_ethHispanic:educSome College  14.749871   0.554144   26.617  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 0.7325169)

Number of Fisher Scoring iterations: 21
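With an interaction in the model, the Hispanic main effect applies only to the reference education level (College). A minimal sketch of the Hispanic versus NH White contrast within each education level, using estimates copied from the summary above; the variable names are illustrative.

```r
# Estimates copied from the summary output above
b_hisp      <- -14.819377  # race_ethHispanic (contrast at College)
b_hisp_hs   <-  14.799161  # race_ethHispanic:educHS or less
b_hisp_some <-  14.749871  # race_ethHispanic:educSome College

# Hispanic vs NH White contrast within each education level (cloglog scale)
net <- c("College"      = b_hisp,
         "HS or less"   = b_hisp + b_hisp_hs,
         "Some College" = b_hisp + b_hisp_some)
round(net, 3)
```

The contrasts for the two lower education levels come out near zero (about -0.02 and -0.07), so the large estimates are driven by the College-educated Hispanic cell. A joint test of all six interaction terms, for example with `regTermTest()` from the survey package, would be a reasonable complement to reading the individual t tests.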

e. Generate hazard plots

Comment: Holding education constant, the NH Other group has a higher hazard of death, likely because the category includes Native American and Alaska Native respondents. It would be worth disaggregating this group to examine Asian and multiracial respondents separately. The educational gradient holds across all race/ethnicity groups: the hazard of death decreases as education increases.
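The plotting code is not shown in the report. As a sketch of how hazards can be recovered from a complementary log-log model, the block below applies the inverse link, hazard = 1 - exp(-exp(eta)), to estimates copied from the summary in part d. It covers only the NH White reference group and the last two intervals, and the helper `inv_cloglog` is an illustrative name, not a function from any package.

```r
# Estimates copied from the summary output in part d
b0     <- -22.970000                              # (Intercept)
b_time <- c("14" = 18.981022, "15" = 20.562151)   # factor(mort_5_yr)14, 15
b_educ <- c("College" = 0,                        # reference level
            "Some College" = 0.379971,
            "HS or less" = 0.682325)

# Inverse complementary log-log link
inv_cloglog <- function(eta) 1 - exp(-exp(eta))

# Linear predictor for NH White: intervals in rows, education in columns
eta <- outer(b0 + b_time, b_educ, "+")
hazard <- inv_cloglog(eta)
round(hazard, 3)
```

Within each interval the hazard rises from College to Some College to HS or less, which is the educational gradient described in the comment. For the other race/ethnicity groups, the corresponding main effect and interaction terms would need to be added to `eta`.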