hw4

Author

Drew Schaefer

In this analysis I use data from the National Health Interview Survey (NHIS) linked to the National Death Index (NDI). I use a 10-year sample covering 2009-2018. The mortality linkage includes deaths through December 31, 2019. The outcome variable of interest is death.

Stratified 1 - level Cluster Sampling design (with replacement)
With (1259) clusters.
svydesign(ids = ~PSU, strata = ~STRATA, weights = ~MORTWTSA, 
    data = nhis_dat[nhis_dat$MORTWTSA > 0, ], nest = T)
Call:
svycoxph(formula = Surv(death_age, d.event) ~ poverty + bmi + 
    male + age2 + AGE, design = des)

  n= 278824, number of events= 19688 

                     coef  exp(coef)   se(coef)  robust se       z Pr(>|z|)    
poverty         4.711e-01  1.602e+00  2.246e-02  2.879e-02  16.362  < 2e-16 ***
bmiobese       -8.086e-02  9.223e-01  1.726e-02  2.178e-02  -3.713 0.000205 ***
bmiunderweight  8.830e-01  2.418e+00  4.233e-02  5.798e-02  15.231  < 2e-16 ***
male            4.277e-01  1.534e+00  1.627e-02  1.880e-02  22.756  < 2e-16 ***
age2            5.213e-04  1.001e+00  7.123e-05  6.768e-05   7.702 1.34e-14 ***
AGE            -2.740e-01  7.604e-01  9.515e-03  9.048e-03 -30.281  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

               exp(coef) exp(-coef) lower .95 upper .95
poverty           1.6017     0.6243    1.5138    1.6947
bmiobese          0.9223     1.0842    0.8838    0.9625
bmiunderweight    2.4182     0.4135    2.1585    2.7093
male              1.5338     0.6520    1.4783    1.5913
age2              1.0005     0.9995    1.0004    1.0007
AGE               0.7604     1.3152    0.7470    0.7740

Concordance= 0.87  (se = 0.001 )
Likelihood ratio test= NA  on 6 df,   p=NA
Wald test            = 18280  on 6 df,   p=<2e-16
Score (logrank) test = NA  on 6 df,   p=NA

  (Note: the likelihood ratio and score tests assume independence of
     observations within a cluster, the Wald and robust score tests do not).
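
As a check on how the second table follows from the first, the hazard ratios and their 95% confidence limits can be recomputed by hand. This is a minimal sketch in base R, assuming the intervals are normal-approximation (Wald) intervals built from the robust standard errors; the object names `coefs`, `robust_se`, and `hr_table` are introduced here for illustration and are not part of the original analysis.

```r
# Coefficients and robust standard errors copied from the model output above.
coefs <- c(poverty = 4.711e-01, bmiobese = -8.086e-02, bmiunderweight = 8.830e-01,
           male = 4.277e-01, age2 = 5.213e-04, AGE = -2.740e-01)
robust_se <- c(poverty = 2.879e-02, bmiobese = 2.178e-02, bmiunderweight = 5.798e-02,
               male = 1.880e-02, age2 = 6.768e-05, AGE = 9.048e-03)

# Assumption: two-sided 95% Wald interval on the log-hazard scale.
z_crit <- qnorm(0.975)

# Exponentiate the estimate and both interval limits to get hazard ratios.
hr_table <- data.frame(
  hazard_ratio = exp(coefs),
  lower_95     = exp(coefs - z_crit * robust_se),
  upper_95     = exp(coefs + z_crit * robust_se)
)

print(round(hr_table, 4))
```

Because the inputs are rounded to four significant digits, the recomputed limits can differ from the printed ones in the last decimal place.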
  poverty         bmi   college male AGE d.event
1       0      normal some coll    0  15       1
2       1      normal some coll    0  15       1
3       0       obese some coll    0  15       1
4       1       obese some coll    0  15       1
5       0 underweight some coll    0  15       1
6       1 underweight some coll    0  15       1

The graph of hypothetical “people,” compared with the means of the covariates, shows two different profiles. The first, in red, is a 30-year-old woman with a college education who is not in poverty and whose BMI is in the normal range. The second, in green, is a 65-year-old man with a high school education or less who is in poverty and whose BMI is in the obese range.

This graph shows the hazard function for the same two hypothetical people.

Call:
svycoxph(formula = Surv(death_age, d.event) ~ poverty * race_eth + 
    bmi + male + age2 + AGE, design = des)

  n= 278824, number of events= 19688 

                              coef  exp(coef)   se(coef)  robust se       z Pr(>|z|)    
poverty                  2.800e-01  1.323e+00  6.613e-02  7.622e-02   3.674 0.000239 ***
race_ethNHBlack          4.906e-01  1.633e+00  4.491e-02  5.474e-02   8.961  < 2e-16 ***
race_ethNHOther          2.195e-02  1.022e+00  5.413e-02  6.649e-02   0.330 0.741256    
race_ethNHWhite          2.905e-01  1.337e+00  3.668e-02  4.559e-02   6.370 1.89e-10 ***
bmiobese                -8.974e-02  9.142e-01  1.734e-02  2.231e-02  -4.022 5.76e-05 ***
bmiunderweight           8.875e-01  2.429e+00  4.239e-02  5.658e-02  15.686  < 2e-16 ***
male                     4.349e-01  1.545e+00  1.628e-02  1.868e-02  23.284  < 2e-16 ***
age2                     5.135e-04  1.001e+00  7.121e-05  6.768e-05   7.587 3.27e-14 ***
AGE                     -2.734e-01  7.608e-01  9.510e-03  9.057e-03 -30.181  < 2e-16 ***
poverty:race_ethNHBlack  1.163e-01  1.123e+00  8.376e-02  9.554e-02   1.217 0.223593    
poverty:race_ethNHOther  1.628e-01  1.177e+00  1.104e-01  1.338e-01   1.216 0.223859    
poverty:race_ethNHWhite  3.094e-01  1.363e+00  7.214e-02  8.496e-02   3.641 0.000271 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

                        exp(coef) exp(-coef) lower .95 upper .95
poverty                    1.3232     0.7558    1.1396    1.5363
race_ethNHBlack            1.6332     0.6123    1.4671    1.8182
race_ethNHOther            1.0222     0.9783    0.8973    1.1645
race_ethNHWhite            1.3370     0.7479    1.2227    1.4620
bmiobese                   0.9142     1.0939    0.8751    0.9550
bmiunderweight             2.4290     0.4117    2.1740    2.7138
male                       1.5448     0.6473    1.4893    1.6024
age2                       1.0005     0.9995    1.0004    1.0006
AGE                        0.7608     1.3144    0.7474    0.7744
poverty:race_ethNHBlack    1.1233     0.8902    0.9315    1.3546
poverty:race_ethNHOther    1.1768     0.8498    0.9053    1.5298
poverty:race_ethNHWhite    1.3626     0.7339    1.1535    1.6094

Concordance= 0.871  (se = 0.001 )
Likelihood ratio test= NA  on 12 df,   p=NA
Wald test            = 19193  on 12 df,   p=<2e-16
Score (logrank) test = NA  on 12 df,   p=NA

  (Note: the likelihood ratio and score tests assume independence of
     observations within a cluster, the Wald and robust score tests do not).

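This model tests the interaction between poverty and race/ethnicity. The interaction is statistically significant only for the non-Hispanic White group (hazard ratio 1.36, 95% CI 1.15 to 1.61, p < 0.001). The interaction terms for the non-Hispanic Black and non-Hispanic Other groups are not significant (p ≈ 0.22 for both), so for those groups the association between poverty and mortality does not differ detectably from that in the reference group.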
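
With an interaction in the model, the `poverty` coefficient alone gives the poverty hazard ratio only for the reference race/ethnicity group. For the other groups the main effect and the matching interaction coefficient are added on the log scale before exponentiating. The sketch below does this arithmetic in base R with the coefficients printed above; the names `b_poverty`, `b_interaction`, and `poverty_hr` are illustrative and not from the original analysis.

```r
# Coefficients copied from the interaction model output above.
b_poverty <- 2.800e-01                      # poverty effect in the reference group
b_interaction <- c(NHBlack = 1.163e-01,     # poverty:race_ethNHBlack
                   NHOther = 1.628e-01,     # poverty:race_ethNHOther
                   NHWhite = 3.094e-01)     # poverty:race_ethNHWhite

# Hazard ratio for poverty within each group:
# exp(main effect) for the reference group,
# exp(main effect + interaction) for every other group.
poverty_hr <- exp(c(reference = b_poverty, b_poverty + b_interaction))

print(round(poverty_hr, 3))
```

Confidence intervals for these combined hazard ratios are not computed here, because they require the covariance between the main-effect and interaction estimates, which the printed output does not show.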

# A tibble: 6 × 4
  response         estimate statistic p.value
  <chr>               <dbl>     <dbl>   <dbl>
1 poverty         0.000100      0.506  0.613 
2 bmiobese        0.000287      1.20   0.230 
3 bmiunderweight  0.0000680     0.688  0.491 
4 male           -0.000439     -1.78   0.0747
5 age2           -0.0331       -0.197  0.844 
6 AGE            -0.000436     -0.361  0.718 

None of the residual tests is statistically significant at the 0.05 level, which is what we want to see. The p-value for male (0.0747) comes closest to significance.