0.1 Objective

The objective of this exercise is to execute the procedures for estimating time-to-event (TTE) functions in event history analysis with a focus on infant mortality experience. The TTE, also referred to as survival time, measures how long an individual or a cohort had survived before experiencing an event \(y\) (in this case, infant death) between two observed time points \(x-x+n\) (in this case, between age 0 and 12th month). The distribution of observed time-to-event is mainly described using three inter-related functions which one can be derived from the others; namely, the survival function (\(S(t)\)), probability density function (\(f(t)\)), and hazard function (\(h(t)\)).

0.2 Methods

0.2.1 Data

The pooled birth-recode data sets of 2003, 2008, 2013 and 2018 Nigeria Demographic and Health Surveys were employed:

hwdata <- read.dta("C:/A/02/01F20/DEM223 Event History Analysis/hwdata.dta")

hwdata <- hwdata %>% select(strata, psu, iwt, d_age, status_12, status_12b, bcat01) %>% 
    filter(complete.cases(hwdata))
hwdata$caseid <- c(1:nrow(hwdata))

hwdata$bcat01 <- factor(hwdata$bcat01, levels = c("Group 01", "Group 02", "Group 03", 
    "Group 04", "Group 05"), labels = c("Grp01", "Grp02", "Grp03", "Grp04", "Grp05"))

hwdata$pwt = c(hwdata$iwt/1000000)
dhsdes = svydesign(id = ~psu, strata = ~strata, weights = ~pwt, data = hwdata)

0.2.2 Outcome

The event variable or censoring indicator is whether a child born in the last five years prior to the surveys died during infancy (uncensored) or had survived the period (censored).

0.2.3 Predictor

The comparative indicator or main grouping variable is a child’s demographic risk attribute at birth (DRAB); a composite predictor derived from maternal age at birth, birth order and birth interval. The DRAB predictor is classified as follow:

  • Grp01: Middle-maternal age (20-34), combined with birth order 2-4, birth interval 24-59 months
  • Grp02: Middle-maternal age (20-34), combined with birth order 1
  • Grp03: Early-maternal age (<20), combined with any birth order, and any birth interval**
  • Grp04: Middle-maternal age (20-34), combined with birth order 5+, and birth interval 24-59 months
  • Grp05: Advanced-maternal age (35+), combined with any birth order, and any birth interval**

The resultant composite indicator coded as Grp01 (0) vs Grp02 (1), Grp03 (2), Grp04 (3), and Grp05 (4). The analysis compared the pattern of infant mortality risks among these groups hypothesizing that the the mortality risks will be indifferent across the DRAB dimensions.

** It is worthy to note that the DRAB categorization as presented here is for illustration purpose and subject to further refinement. Meanwhile, the categories was so constructed as the my interest is to investigate the effect of stage of childbearing age, especially the ages at the end of childbearing age spectrum, on early-life mortality experience. This is mainly conceived to capture and measure the impact of the mostly avoidable demographic risk associated with extreme maternal age-related fertility behaviors and compare the risk with those characterized by middle maternal age. Also, there were too few cases in birth order 1, and birth interval outside 24-59 range for early-age and advanced-age mothers.

0.2.4 Analytical strategies

I estimated the TTE functions for the pooled kids population and specific categories of infants by birth circumstances. In addition, I presented graphical illustrations of the functions and compared their distributions for the aggregate and classes of infants.



0.3 Results

0.3.1 Data structure

First, I examined the structure of the variables and survival status distribution by birth classification:

glimpse(hwdata)
Rows: 61,213
Columns: 9
$ strata     <chr> "2003-north west - urban", "2003-north west - urban", "2...
$ psu        <chr> "2003-183", "2003-183", "2003-183", "2003-183", "2003-18...
$ iwt        <int> 474592, 474592, 474592, 474592, 474592, 474592, 474592, ...
$ d_age      <dbl> 14, 17, 19, 49, 27, 26, 1, 27, 8, 14, 32, 12, 10, 3, 46,...
$ status_12  <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0,...
$ status_12b <fct> Alive, Alive, Alive, Alive, Alive, Alive, Alive, Alive, ...
$ bcat01     <fct> Grp02, Grp02, Grp01, Grp05, Grp04, Grp04, Grp03, Grp05, ...
$ caseid     <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 1...
$ pwt        <dbl> 0.47459, 0.47459, 0.47459, 0.47459, 0.47459, 0.47459, 0....
svytable(~bcat01 + status_12b, design = dhsdes)
       status_12b
bcat01     Alive     Dead
  Grp01 15849.54   505.85
  Grp02  5868.99   282.28
  Grp03  6559.21   447.78
  Grp04 18799.31   867.31
  Grp05 11480.22   727.71
round((prop.table(svytable(~bcat01 + status_12b, design = dhsdes), 1) * 100), 1)
       status_12b
bcat01  Alive Dead
  Grp01  96.9  3.1
  Grp02  95.4  4.6
  Grp03  93.6  6.4
  Grp04  95.6  4.4
  Grp05  94.0  6.0

Then, I examined the mortality pattern for the first 20 children in the data set:

head(hwdata[c("d_age" <= 13), c("d_age", "status_12")], n = 20)
head(Surv(hwdata$d_age, hwdata$status_12), n = 20)
 [1] 14+ 17+ 19+ 49+ 27+ 26+  1+ 27+  8+ 14+ 32+ 12+ 10   3+ 46+  0+ 26+ 14+ 14+
[20] 26+

0.3.2 Time-to-event functions estimates

0.3.2.1 Base survival and hazard functions

The pooled survival function of infant mortality:

inf.surv <- survfit(Surv(d_age, status_12) ~ 1, data = hwdata, conf.type = "none")
plot(inf.surv, ylim = c(0.94, 0.98), xlim = c(0, 12), main = "Survival function of infant mortality, NDHS 2003-2018", 
    lwd = 1.5, col = 2)

Basic life table distribution of survival:

# Store the survivorship function as an object for subsequent use:
surv12 <- inf.surv
summary(inf.surv)
Call: survfit(formula = Surv(d_age, status_12) ~ 1, data = hwdata, 
    conf.type = "none")

 time n.risk n.event survival  std.err
    0  61213    1531    0.975 0.000631
    1  58844     145    0.973 0.000660
    2  57010     113    0.971 0.000684
    3  55366     123    0.969 0.000709
    4  53635      88    0.967 0.000728
    5  51864      94    0.965 0.000749
    6  50075      91    0.963 0.000770
    7  48253     125    0.961 0.000799
    8  46444     101    0.959 0.000824
    9  44691      98    0.957 0.000849
   10  43048      86    0.955 0.000872
   11  41489      61    0.953 0.000889
   12  40041     211    0.948 0.000949

Pooled infant mortality rate:

round(1000 * (1 - summary(inf.surv)$surv[12]), 0)
[1] 47

The pooled hazard function of infant mortality:

inf.haz <- kphaz.fit(time = hwdata$d_age, status = hwdata$status_12, method = "product-limit")
kphaz.plot(inf.haz, main = "Hazard function of infant mortality, NDHS 2003-2018", 
    lwd = 1.5, col = 2)

The basic life table distribution of hazards:

data.frame(inf.haz)

The pooled cumulative hazard function of infant mortality:

plot(cumsum(inf.haz$haz) ~ inf.haz$time, main = "Cumulative hazard function of infant mortality, NDHS 2003-2018", 
    ylab = "H(t)", xlab = "Time in months", type = "s", xlim = c(0, 12), lwd = 1.5, 
    col = 2)

The pooled probability density function (PDF):

ft <- -diff(inf.surv$surv)
plot(ft, xlim = c(0.5, 12), main = "Probability density function of infant mortality, NDHS 2003-2018", 
    type = "s", ylab = "f(t)", xlab = "Time in Months", lwd = 1.5, col = 2)

The pooled cumulative distribution function (CDF):

Ft <- cumsum(ft)
plot(Ft, xlim = c(0.5, 12), main = "Cumulative density function of infant mortality, NDHS 2003-2018", 
    type = "s", ylab = "f(t)", xlab = "Time in Months", lwd = 1.5, col = 2)

0.3.2.2 Comparative survival and hazard functions

Comparative survival function estimates by birth categories:

inf.km <- survfit(Surv(d_age, status_12) ~ bcat01, data = hwdata)
summary(inf.km)
Call: survfit(formula = Surv(d_age, status_12) ~ bcat01, data = hwdata)

                bcat01=Grp01 
 time n.risk n.event survival std.err lower 95% CI upper 95% CI
    0  16317     273    0.983 0.00100        0.981        0.985
    1  15777      22    0.982 0.00104        0.980        0.984
    2  15261      23    0.980 0.00109        0.978        0.983
    3  14778      24    0.979 0.00113        0.977        0.981
    4  14284      18    0.978 0.00117        0.975        0.980
    5  13769      17    0.976 0.00120        0.974        0.979
    6  13252      14    0.975 0.00123        0.973        0.978
    7  12749      34    0.973 0.00131        0.970        0.975
    8  12227      22    0.971 0.00136        0.968        0.974
    9  11756      15    0.970 0.00139        0.967        0.972
   10  11263      16    0.968 0.00143        0.966        0.971
   11  10811       9    0.968 0.00146        0.965        0.970
   12  10411      37    0.964 0.00156        0.961        0.967

                bcat01=Grp02 
 time n.risk n.event survival std.err lower 95% CI upper 95% CI
    0   6056     166    0.973 0.00210        0.968        0.977
    1   5794      11    0.971 0.00217        0.967        0.975
    2   5590       7    0.970 0.00221        0.965        0.974
    3   5393      12    0.967 0.00229        0.963        0.972
    4   5170      11    0.965 0.00237        0.961        0.970
    5   4951       8    0.964 0.00243        0.959        0.969
    6   4756       7    0.962 0.00249        0.957        0.967
    7   4548       7    0.961 0.00254        0.956        0.966
    8   4320      10    0.959 0.00263        0.953        0.964
    9   4109      11    0.956 0.00274        0.951        0.961
   10   3912      11    0.953 0.00285        0.948        0.959
   11   3728       2    0.953 0.00287        0.947        0.959
   12   3541       7    0.951 0.00295        0.945        0.957

                bcat01=Grp03 
 time n.risk n.event survival std.err lower 95% CI upper 95% CI
    0   6914     254    0.963 0.00226        0.959        0.968
    1   6589      20    0.960 0.00235        0.956        0.965
    2   6405      15    0.958 0.00241        0.953        0.963
    3   6265      18    0.955 0.00249        0.950        0.960
    4   6092      11    0.954 0.00254        0.949        0.959
    5   5886      15    0.951 0.00261        0.946        0.956
    6   5718      14    0.949 0.00268        0.944        0.954
    7   5548      12    0.947 0.00274        0.941        0.952
    8   5378      18    0.944 0.00283        0.938        0.949
    9   5172      13    0.941 0.00290        0.936        0.947
   10   4996      11    0.939 0.00296        0.933        0.945
   11   4821      12    0.937 0.00303        0.931        0.943
   12   4652      25    0.932 0.00317        0.926        0.938

                bcat01=Grp04 
 time n.risk n.event survival std.err lower 95% CI upper 95% CI
    0  19554     436    0.978 0.00106        0.976        0.980
    1  18871      44    0.975 0.00111        0.973        0.978
    2  18311      30    0.974 0.00114        0.972        0.976
    3  17818      39    0.972 0.00119        0.969        0.974
    4  17296      30    0.970 0.00123        0.968        0.972
    5  16810      38    0.968 0.00128        0.965        0.970
    6  16234      28    0.966 0.00131        0.964        0.969
    7  15663      49    0.963 0.00138        0.960        0.966
    8  15090      30    0.961 0.00142        0.958        0.964
    9  14533      32    0.959 0.00146        0.956        0.962
   10  14041      32    0.957 0.00151        0.954        0.960
   11  13566      24    0.955 0.00155        0.952        0.958
   12  13106      79    0.949 0.00167        0.946        0.953

                bcat01=Grp05 
 time n.risk n.event survival std.err lower 95% CI upper 95% CI
    0  12372     402    0.968 0.00159        0.964        0.971
    1  11813      48    0.964 0.00169        0.960        0.967
    2  11443      38    0.960 0.00176        0.957        0.964
    3  11112      30    0.958 0.00182        0.954        0.961
    4  10793      18    0.956 0.00185        0.953        0.960
    5  10448      16    0.955 0.00188        0.951        0.958
    6  10115      28    0.952 0.00194        0.948        0.956
    7   9745      23    0.950 0.00200        0.946        0.954
    8   9429      21    0.948 0.00204        0.944        0.952
    9   9121      27    0.945 0.00211        0.941        0.949
   10   8836      16    0.943 0.00215        0.939        0.947
   11   8563      14    0.942 0.00218        0.937        0.946
   12   8331      63    0.935 0.00234        0.930        0.939

ANOVA test of Grpdifference in infant deaths:

survdiff(Surv(d_age, status_12) ~ bcat01, data = hwdata)
Call:
survdiff(formula = Surv(d_age, status_12) ~ bcat01, data = hwdata)

                 N Observed Expected (O-E)^2/E (O-E)^2/V
bcat01=Grp01 16317      524      761    73.908   102.115
bcat01=Grp02  6056      270      277     0.179     0.201
bcat01=Grp03  6914      438      326    38.622    44.215
bcat01=Grp04 19554      891      922     1.050     1.570
bcat01=Grp05 12372      744      581    45.837    58.331

 Chisq= 162  on 4 degrees of freedom, p= <2e-16 
ggsurvplot(inf.km, xlim = c(0, 12), conf.int = T, title = "Survival function for infant mortality by DRAB, NDHS 2003-2018", 
    ylim = c(0.92, 1))

Comparative hazard function estimates by birth categories:

fit1 <- coxph(Surv(d_age, status_12) ~ bcat01, data = hwdata, weights = pwt)
summary(fit1)
Call:
coxph(formula = Surv(d_age, status_12) ~ bcat01, data = hwdata, 
    weights = pwt)

  n= 61213, number of events= 2867 

              coef exp(coef) se(coef) robust se     z    Pr(>|z|)    
bcat01Grp02 0.4176    1.5184   0.0743    0.0873  4.78 0.000001739 ***
bcat01Grp03 0.7226    2.0598   0.0649    0.0741  9.75     < 2e-16 ***
bcat01Grp04 0.3467    1.4143   0.0559    0.0642  5.40 0.000000066 ***
bcat01Grp05 0.6555    1.9261   0.0579    0.0647 10.13     < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

            exp(coef) exp(-coef) lower .95 upper .95
bcat01Grp02      1.52      0.659      1.28      1.80
bcat01Grp03      2.06      0.485      1.78      2.38
bcat01Grp04      1.41      0.707      1.25      1.60
bcat01Grp05      1.93      0.519      1.70      2.19

Concordance= 0.572  (se = 0.006 )
Likelihood ratio test= 181  on 4 df,   p=<2e-16
Wald test            = 137  on 4 df,   p=<2e-16
Score (logrank) test = 180  on 4 df,   p=<2e-16,   Robust = 143  p=<2e-16

  (Note: the likelihood ratio and score tests assume independence of
     observations within a cluster, the Wald and robust score tests do not).
ggsurvplot(inf.km, data = hwdata, risk.table = F, fun = "cumhaz", conf.int = T, title = "CHF of infant mortality by DRAB, NDHS 2003-2018", 
    xlim = c(0.2, 12), ylim = c(0, 0.08))



0.4 Interpretations

The analysis revealed that pooled infant mortality rate for the entire period is 47 per 1000 live births. However, the numerical and graphical results of the pooled time-to-event distribution of risks indicate that the risk of dying during infancy is highest at age 12 month; suggesting the possibility of digit preference in reporting age at death of infants. Considering the inter-Grpdifferentials, the results showed that the infant mortality risks were generally higher for children in other comparative DRAB categories relative to their peers in the reference group. Meanwhile, the results specifically showed that the chances of surviving in infancy is lowest for children whose births were characterized by early maternal age at birth, followed closely by those in advanced maternal age risk set, compared with other categories.



0.5 Acknowledgements

  1. This exercise was inspired by Dr. Corey Sparks’ lab session on “DEM 7223: Event History Analysis” Fall 2020 coursework, and his easy-to-follow instructional materials accessible at https://rpubs.com/corey_sparks

  2. The Demographic and Health Survey data can be downloaded with permission from the DHS Program at https://www.dhsprogram.com/data/

  3. R package https://www.R-project.org/ was used for data wrangling and analysis.



Created on September 21, 2020