________________________________________________________________________________________________________________

Introduction

The COVID-19 pandemic caused disruptions in Uganda’s health systems including HIV exposed infant diagnosis (EID) services. Despite the disruptions, there is paucity of knowledge as to whether they affected the risk of Mother To Child HIV infection. During EID care, infants can be lost to follow-up, transferred, or even die before testing HIV positive. If an HIV exposed infant dies when they are still HIV negtive, it precludes them from the HIV infection risk group. In this case, Infant mortality is a competing event of Infant HIV infection. Because of the need to account for censoring and competing nature of EID outcomes, survival analysis approaches become handy. However, due to improvement in PMTCT services, infant HIV infection during EID care is rare. This would require a longer follow up time to obtain sufficient events for an adequately powered study in case traditional survival analysis approaches are to be used. In these settings, it is possible to improve upon existing estimators by considering estimation in a bounded statistical model. This bounded model incorporates existing scientific knowledge about the incidence of an event in the population. This study utilizes the framework of targeted minimum loss-based estimation (TMLE) to construct an estimator of cumulative incidence that respects the bounds, and thereby compare cumulative incidences of infant HIV infection before and after the COVID-19 pandemic at Kisenyi Health Centre IV in Kampala, Uganda.

Research Questions

  1. How does EID care Infant HIV cumulative incidence at Kisenyi Health Centre IV compare before and after the COVID-19 pandemic?

  2. Do bounded and non-bounded estimates of infant HIV cumulative incidence differ?

  3. What factors were associated with EID care infant HIV infection at Kisenyi Health Centre IV before and after the COVID-19 pandemic?

Hypothesis

The hypothesis of this study is that due to restrictions aimed at preventing the spread of COVID-19 in Uganda such as nation-wide lockdowns, suspension of public transport and curfew, travels of mothers and their infants to health centers for HIV testing, ARV refills and attendance of follow up visits were limited. The ensued interruptions in infant feeding counseling, uptake of ART and initiation or completion of infant HIV prophylaxis could have led to an increase in the risk of mothers transmitting HIV to their children.

Study Objectives

General objective

To study EID care infant HIV infection cumulative incidence at Kisenyi Health Centre IV in Kampala, Uganda, before and after the COVID-19 pandemic, using survival analysis.

Specific objectives

  1. To compare EID care Infant HIV cumulative incidence at Kisenyi HC IV in Kampala, Uganda, before and after the COVID-19 pandemic.

  2. To compare bounded and non-bounded estimates of cumulative incidence at Kisenyi HC IV in Kampala, Uganda, before and after the COVID-19 pandemic.

  3. To study factors associated with EID care Infant HIV cumulative incidence at Kisenyi HC IV in Kampala, Uganda, before and after the COVID-19 pandemic.

Analysis Plan

Section A: Data Preparation

1. Importing the EID Dataset , specifying the datatypes

library(readxl)
EID_Dataset <- read_excel("EID Dataset.xlsx", 
                          col_types = c("text", "text", "text", 
                                        "text", "text", "numeric", "text", 
                                        "text", "numeric", "text", "numeric", 
                                        "text", "text", "text", "text", "text", 
                                        "text", "numeric", "text", "text", 
                                        "text", "numeric", "text", "text", 
                                        "text", "numeric", "text", "numeric", 
                                        "text", "numeric", "text", "text", 
                                        "text", "numeric", "text"))

Description of Variables

Variable Categories Description
Cohort

-After

-Before

An indication of whether the infant was enrolled into care before or after the COVID-19 Pandemic.
Sex

1- Male

2- Female

Sex of the infant.
Age_in_months Age of the infant at registration in Months.
Clinic_referred_from

1-EMTCT

2-OPD

3-YCC

4-Post natal

5-Other

6-Referral

The Clinic through which the infant was enrolled into EID Care.
Age_NVP_initiation Age at NVP Initiation.
Age_at_cotrim Age at Cotrimoxazole Initiation.
ART_B_feeding

0-No

1-Yes

2-Already on ART

An indication of whether the Mother was initiated on ART during breastfeeding.
ARVs_EMTCT

1-Lifelong ART

2-No ART

3-Unknown

An indication of whether the Mother was given EMTCT ARVs during ANC and PNC.
Infant_ARVs_EMTCT

1-Received NVP within 72 hours of birth until 6 or 12 weeks

2-NVPV taken 72 hours after birth

3-No ARVs were given at birth

4- Unknown: infant eMTCT regimen is not known.

An indication of whether/how the infant was given EMTCT ARVs.
B_feeding_1st_DBS

1-Exclusive Breastfeeding

2-Replacement feeding(never breastfed)

3-Mixed feeding(<6Months)

4-Complementary feeding(>6Months)

5-Weaning

6-No longer breastfeeding

Breastfeeding practice at the 1st DBS.
Number_of_visits The maximum number of visits the Infant/caregiver made during EID Care.
Age_at_visit Age of the infant at the last EID Care Visit.
Age_last_visit Age of the infant at the first EID Care Visit.
feeding_at_visit

1-Exclusive Breastfeeding

2-Replacement feeding(never breastfed)

3-Mixed feeding(<6Months)

4-Complementary feeding(>6Months)

5-Weaning

6-No longer breastfeeding

Feeding practice of the infant at the last EID Care Visit.
Wasting_2 An indication of wasting in the infant at the last EID Care Visit.
final_outcome

1-Discharged Negative

2-Referred for ART

3-Lost

4-Died Negative

5-Transferred

Outcome of the EID Care

2. View a few observations of the imported Dataset

head(EID_Dataset)
##   Cohort Infant_No Date_registration Sex Date_of_birth Age_in_months
## 1  After    620001         14/7/2020   2         43867          1.50
## 2  After    620002         14/7/2020   2         43867          1.50
## 3  After    620003         15/7/2020   2         43896          1.50
## 4  After    620004         15/7/2020   2         43867          1.50
## 5  After    620005         16/7/2020   1         44080          1.25
## 6  After    620006         16/7/2020   1         43927          1.50
##   Clinic_referred_from Date_at_NVP_initiation Age_NVP_initiation Date_at_cotrim
## 1                    1                  43867                1.5      14/7/2020
## 2                    1                  43836                1.5      14/7/2020
## 3                    1                  43896                1.5      15/7/2020
## 4                    1                  43867                1.5      15/7/2020
## 5                    1                  44080                1.5      21/7/2020
## 6                    1                  43988                1.5      16/7/2020
##   Age_at_cotrim HIV_B_feeding ART_B_feeding ARVs_EMTCT Infant_ARVs_EMTCT
## 1           1.5             0             1          1                 1
## 2           1.5             0             1          1                 1
## 3           1.5             0             1          1                 1
## 4           1.5             0             1          1                 1
## 5           1.5             0             1          1                 1
## 6           1.5             0             1          1                 1
##   _1st_PCR_test Date_1st_DBS Age_1st_DBS_001 B_feeding_1st_DBS _1st_DBS_result
## 1             1    14/7/2020            1.50                 1               2
## 2             1    14/7/2020            1.50                 1               2
## 3             1    15/7/2020            1.50                 1               2
## 4             1    15/7/2020            1.50                 1               2
## 5             1    16/7/2020            1.25                 1               2
## 6             1    16/7/2020            1.50                 6               2
##   Date_2nd_DBS Age_2nd_DBS_001 B_feeding_2nd_DBS _2nd_DBS_result
## 1        44258            9.00                 4               1
## 2        44258            9.00                 4               2
## 3        44258            9.00                 4               2
## 4    24/2/2021            9.00              <NA>               2
## 5    24/2/2021            8.75                 6               2
## 6    24/2/2022            9.00                 6               2
##   result_rapid_test Number_of_visits Date_of_first_visit Age_at_visit
## 1              <NA>                7           14/7/2020         1.50
## 2                 2                6           14/7/2020         1.50
## 3                 2                6           15/7/2020         1.50
## 4                 2                4           15/7/2020         1.50
## 5              <NA>                2           16/7/2020         1.25
## 6              <NA>                5           16/7/2020         1.50
##   Date_of_last_visit Age_last_visit Cotrim_at_visit NVP_at_visit
## 1          26/5/2021           11.0               1            2
## 2          26/5/2021           11.5               2            2
## 3          26/5/2021           12.0               1            1
## 4          26/5/2021           11.0               1            2
## 5          20/7/2020            2.5               1            2
## 6          31/5/2021           12.0               1            2
##   feeding_at_visit Wasting_2 final_outcome
## 1                4      14.8             1
## 2                6      14.5             1
## 3                1      15.3             3
## 4                1      14.5             3
## 5                1        NA             1
## 6                6      14.5             1

3. View Column heads of the dataset

colnames(EID_Dataset)
##  [1] "Cohort"                 "Infant_No"              "Date_registration"     
##  [4] "Sex"                    "Date_of_birth"          "Age_in_months"         
##  [7] "Clinic_referred_from"   "Date_at_NVP_initiation" "Age_NVP_initiation"    
## [10] "Date_at_cotrim"         "Age_at_cotrim"          "HIV_B_feeding"         
## [13] "ART_B_feeding"          "ARVs_EMTCT"             "Infant_ARVs_EMTCT"     
## [16] "_1st_PCR_test"          "Date_1st_DBS"           "Age_1st_DBS_001"       
## [19] "B_feeding_1st_DBS"      "_1st_DBS_result"        "Date_2nd_DBS"          
## [22] "Age_2nd_DBS_001"        "B_feeding_2nd_DBS"      "_2nd_DBS_result"       
## [25] "result_rapid_test"      "Number_of_visits"       "Date_of_first_visit"   
## [28] "Age_at_visit"           "Date_of_last_visit"     "Age_last_visit"        
## [31] "Cotrim_at_visit"        "NVP_at_visit"           "feeding_at_visit"      
## [34] "Wasting_2"              "final_outcome"

4. Subset the dataset to keep fewer/important variables

EID_Data<-EID_Dataset[,c(
  "Cohort","Infant_No","Sex","Age_in_months","Clinic_referred_from","Age_NVP_initiation",
  "Age_at_cotrim","HIV_B_feeding","ART_B_feeding","ARVs_EMTCT","Infant_ARVs_EMTCT","B_feeding_1st_DBS",
  "Number_of_visits","Age_at_visit","Age_last_visit","feeding_at_visit", "Wasting_2",
  "final_outcome"
)]

Note:

We need to prepare the dataset to a structure suitable for the “Survtmle” package/function. The simple data structure should contain:

5. Calculate follow up time from age (in months) at registration and age at last visit(t)

library(tidyverse)

Age_last<-transmute(EID_Data,Age_last=as.integer(Age_last_visit))
Age_first<-transmute(EID_Data,Age_first=as.integer(Age_in_months))
age_diff<-cbind(Age_last,Age_first)
time<-transmute(age_diff,ftime=Age_last-Age_first)
EID_Data<-cbind(EID_Data,time)

6. View the first observations of the new Dataset

head(EID_Data)
##   Cohort Infant_No Sex Age_in_months Clinic_referred_from Age_NVP_initiation
## 1  After    620001   2          1.50                    1                1.5
## 2  After    620002   2          1.50                    1                1.5
## 3  After    620003   2          1.50                    1                1.5
## 4  After    620004   2          1.50                    1                1.5
## 5  After    620005   1          1.25                    1                1.5
## 6  After    620006   1          1.50                    1                1.5
##   Age_at_cotrim HIV_B_feeding ART_B_feeding ARVs_EMTCT Infant_ARVs_EMTCT
## 1           1.5             0             1          1                 1
## 2           1.5             0             1          1                 1
## 3           1.5             0             1          1                 1
## 4           1.5             0             1          1                 1
## 5           1.5             0             1          1                 1
## 6           1.5             0             1          1                 1
##   B_feeding_1st_DBS Number_of_visits Age_at_visit Age_last_visit
## 1                 1                7         1.50           11.0
## 2                 1                6         1.50           11.5
## 3                 1                6         1.50           12.0
## 4                 1                4         1.50           11.0
## 5                 1                2         1.25            2.5
## 6                 6                5         1.50           12.0
##   feeding_at_visit Wasting_2 final_outcome ftime
## 1                4      14.8             1    10
## 2                6      14.5             1    10
## 3                1      15.3             3    11
## 4                1      14.5             3    10
## 5                1        NA             1     1
## 6                6      14.5             1    11

7. Renaming and Re-coding the Cohort and final_outcome Variables :

EID_Data<-EID_Data %>% rename(
  trt = Cohort, ftype = final_outcome
)

8. View the data structure

str(EID_Data)
## 'data.frame':    557 obs. of  19 variables:
##  $ trt                 : chr  "After" "After" "After" "After" ...
##  $ Infant_No           : chr  "620001" "620002" "620003" "620004" ...
##  $ Sex                 : chr  "2" "2" "2" "2" ...
##  $ Age_in_months       : num  1.5 1.5 1.5 1.5 1.25 1.5 1.5 1.5 1.5 1 ...
##  $ Clinic_referred_from: chr  "1" "1" "1" "1" ...
##  $ Age_NVP_initiation  : num  1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 ...
##  $ Age_at_cotrim       : num  1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 ...
##  $ HIV_B_feeding       : chr  "0" "0" "0" "0" ...
##  $ ART_B_feeding       : chr  "1" "1" "1" "1" ...
##  $ ARVs_EMTCT          : chr  "1" "1" "1" "1" ...
##  $ Infant_ARVs_EMTCT   : chr  "1" "1" "1" "1" ...
##  $ B_feeding_1st_DBS   : chr  "1" "1" "1" "1" ...
##  $ Number_of_visits    : num  7 6 6 4 2 5 6 8 8 7 ...
##  $ Age_at_visit        : num  1.5 1.5 1.5 1.5 1.25 1.5 1.5 1.5 1.5 1 ...
##  $ Age_last_visit      : num  11 11.5 12 11 2.5 12 12 9 13 12.5 ...
##  $ feeding_at_visit    : chr  "4" "6" "1" "1" ...
##  $ Wasting_2           : num  14.8 14.5 15.3 14.5 NA 14.5 15 14.5 14.5 14 ...
##  $ ftype               : chr  "1" "1" "3" "3" ...
##  $ ftime               : int  10 10 11 10 1 11 11 8 12 11 ...

9 . Re-coding of “trt”, “ftype” and “Clinic_referred_from” variables

library(Hmisc)

EID_Data$trt<-recode(EID_Data$trt, After = "1", Before = "0")

EID_Data$ftype<-recode(EID_Data$ftype,
                       "1"="0", "2" = "1", "3" = "0", "4" = "2", "5" = "2","6" = "0" )

# Recoding the clinic through which the infant joined care to:
# 0. EMTCT and 1. Other
EID_Data$Clinic_referred_from<-recode(EID_Data$Clinic_referred_from,
                                      "1"="0","2"="1","3"="1","4"="1","5"="1","6"="1")
# Categorizing the "Number_of_visits" variable
EID_Data$Visits<- cut(EID_Data$Number_of_visits,
                       breaks=c(0,3,6,9,12),
                       labels=c('<=3','3 to 5','6 to 8','8 to 11'))

# Recoding indication of whether the Mother was given EMTCT ARVs during ANC and PNC
#0. No, 1. Yes

EID_Data$ARVs_EMTCT<-recode(EID_Data$ARVs_EMTCT,
                                      "1"="1","2"="0","3"="0","NA"="0")

# Recoding indication of whether the Infant was given EMTCT ARVs during ANC and PNC
#0. No, 1. Yes

EID_Data$Infant_ARVs_EMTCT<-recode(EID_Data$Infant_ARVs_EMTCT,
                                      "1"="1","2"="1","3"="0","4"="0","NA"="0")

EID_Data$feeding_at_visit<-recode(EID_Data$feeding_at_visit,
                       "1"="1", "2" = "0", "3" = "1", "4" = "1", "5" = "0","6" = "0","NA"="0" )

Section B: Data Exploration and Descriptive Statistics

10.(a) Exploratory Analysis of Categorical Variables: Proportions/Cross Tabulation

# Preparing for cross Tabulation: load the "sjplot" library

req <- substitute(require(x, character.only = TRUE))
libs<-c("sjPlot")
sapply(libs, function(x) eval(req) || {install.packages(x); eval(req)})
## sjPlot 
##   TRUE

(i) Proportions/Cross Tabulation: Categorical Variables/Characteristics by Cohort

sjPlot::tab_xtab(var.row =EID_Data$trt , var.col = EID_Data$ftype, 
                 title = "Cross Tabulation of the Failure Type by Cohort",
                 show.row.prc = TRUE)
Cross Tabulation of the Failure Type by Cohort
trt ftype Total
0 1 2
0 240
91.6 %
6
2.3 %
16
6.1 %
262
100 %
1 291
98.6 %
2
0.7 %
2
0.7 %
295
100 %
Total 531
95.3 %
8
1.4 %
18
3.2 %
557
100 %
χ2=15.888 · df=2 · Cramer’s V=0.169 · Fisher’s p=0.000
sjPlot::tab_xtab(var.row =EID_Data$trt , var.col = EID_Data$feeding_at_visit, 
                 title = "Cross Tabulation of the Feeding Type by Cohort",
                 show.row.prc = TRUE)
Cross Tabulation of the Feeding Type by Cohort
trt feeding_at_visit Total
0 1
0 137
56.4 %
106
43.6 %
243
100 %
1 161
60.3 %
106
39.7 %
267
100 %
Total 298
58.4 %
212
41.6 %
510
100 %
χ2=0.652 · df=1 · φ=0.040 · p=0.419
sjPlot::tab_xtab(var.row =EID_Data$trt , var.col = EID_Data$Sex, 
                 title = "Cross Tabulation of Sex by Cohort",
                 show.row.prc = TRUE)
Cross Tabulation of Sex by Cohort
trt Sex Total
1 2
0 132
50.4 %
130
49.6 %
262
100 %
1 155
52.5 %
140
47.5 %
295
100 %
Total 287
51.5 %
270
48.5 %
557
100 %
χ2=0.180 · df=1 · φ=0.022 · p=0.671
sjPlot::tab_xtab(var.row =EID_Data$trt , var.col = EID_Data$HIV_B_feeding, 
                 title = "Cross Tabulation of HIV Testing at breastfeeding by Cohort",
                 show.row.prc = TRUE)
Cross Tabulation of HIV Testing at breastfeeding by Cohort
trt HIV_B_feeding Total
0 1
0 253
96.9 %
8
3.1 %
261
100 %
1 295
100 %
0
0 %
295
100 %
Total 548
98.6 %
8
1.4 %
556
100 %
χ2=7.140 · df=1 · φ=0.128 · Fisher’s p=0.002
sjPlot::tab_xtab(var.row =EID_Data$trt , var.col = EID_Data$ART_B_feeding, 
                 title = "Cross Tabulation of ART at Breastfeeding by Cohort",
                 show.row.prc = TRUE)
Cross Tabulation of ART at Breastfeeding by Cohort
trt ART_B_feeding Total
0 1
0 252
96.2 %
10
3.8 %
262
100 %
1 230
78 %
65
22 %
295
100 %
Total 482
86.5 %
75
13.5 %
557
100 %
χ2=37.973 · df=1 · φ=0.266 · p=0.000
sjPlot::tab_xtab(var.row =EID_Data$trt , var.col = EID_Data$ARVs_EMTCT, 
                 title = "Cross Tabulation of Mother's EMTCT ARVs  by Cohort",
                 show.row.prc = TRUE)
Cross Tabulation of Mother’s EMTCT ARVs by Cohort
trt ARVs_EMTCT Total
0 1
0 13
5 %
249
95 %
262
100 %
1 5
1.7 %
290
98.3 %
295
100 %
Total 18
3.2 %
539
96.8 %
557
100 %
χ2=3.749 · df=1 · φ=0.092 · Fisher’s p=0.033
sjPlot::tab_xtab(var.row =EID_Data$trt , var.col = EID_Data$Infant_ARVs_EMTCT, 
                 title = "Cross Tabulation of Infant's EMTCT ARVs by Cohort",
                 show.row.prc = TRUE)
Cross Tabulation of Infant’s EMTCT ARVs by Cohort
trt Infant_ARVs_EMTCT Total
0 1
0 11
4.2 %
251
95.8 %
262
100 %
1 3
1 %
292
99 %
295
100 %
Total 14
2.5 %
543
97.5 %
557
100 %
χ2=4.507 · df=1 · φ=0.101 · Fisher’s p=0.027
sjPlot::tab_xtab(var.row =EID_Data$trt , var.col = EID_Data$Clinic_referred_from, 
                 title = "Cross Tabulation of Clinic to EID Care by Cohort",
                 show.row.prc = TRUE)
Cross Tabulation of Clinic to EID Care by Cohort
trt Clinic_referred_from Total
0 1
0 183
69.8 %
79
30.2 %
262
100 %
1 276
93.6 %
19
6.4 %
295
100 %
Total 459
82.4 %
98
17.6 %
557
100 %
χ2=52.189 · df=1 · φ=0.311 · p=0.000
sjPlot::tab_xtab(var.row =EID_Data$trt , var.col = EID_Data$Visits, 
                 title = "Cross Tabulation of Visits by Cohort",
                 show.row.prc = TRUE)
Cross Tabulation of Visits by Cohort
trt Visits Total
<=3 3 to 5 6 to 8 8 to 11
0 45
17.2 %
56
21.4 %
119
45.4 %
42
16 %
262
100 %
1 43
14.7 %
109
37.2 %
137
46.8 %
4
1.4 %
293
100 %
Total 88
15.9 %
165
29.7 %
256
46.1 %
46
8.3 %
555
100 %
χ2=48.145 · df=3 · Cramer’s V=0.295 · p=0.000
sjPlot::tab_xtab(var.row =EID_Data$trt , var.col = EID_Data$feeding_at_visit, 
                 title = "Cross Tabulation of the Feeding Type at the last Visit and the Cohort",
                 show.row.prc = TRUE)
Cross Tabulation of the Feeding Type at the last Visit and the Cohort
trt feeding_at_visit Total
0 1
0 137
56.4 %
106
43.6 %
243
100 %
1 161
60.3 %
106
39.7 %
267
100 %
Total 298
58.4 %
212
41.6 %
510
100 %
χ2=0.652 · df=1 · φ=0.040 · p=0.419

(ii) Proportions/Cross Tabulation: Categorical Variables/Characteristics by Failure type

sjPlot::tab_xtab(var.row =EID_Data$ftype , var.col = EID_Data$Clinic_referred_from, 
                 title = "Cross Tabulation of the Clinic the infant came from and the Failure Type",
                 show.row.prc = TRUE)
Cross Tabulation of the Clinic the infant came from and the Failure Type
ftype Clinic_referred_from Total
0 1
0 442
83.2 %
89
16.8 %
531
100 %
1 6
75 %
2
25 %
8
100 %
2 11
61.1 %
7
38.9 %
18
100 %
Total 459
82.4 %
98
17.6 %
557
100 %
χ2=6.187 · df=2 · Cramer’s V=0.105 · Fisher’s p=0.031
sjPlot::tab_xtab(var.row =EID_Data$ftype , var.col = EID_Data$HIV_B_feeding, 
                 title = "Cross Tabulation of HIV Testing at breastfeeding and the Failure Type",
                 show.row.prc = TRUE)
Cross Tabulation of HIV Testing at breastfeeding and the Failure Type
ftype HIV_B_feeding Total
0 1
0 523
98.7 %
7
1.3 %
530
100 %
1 7
87.5 %
1
12.5 %
8
100 %
2 18
100 %
0
0 %
18
100 %
Total 548
98.6 %
8
1.4 %
556
100 %
χ2=7.217 · df=2 · Cramer’s V=0.114 · Fisher’s p=0.127
sjPlot::tab_xtab(var.row =EID_Data$ftype , var.col = EID_Data$ART_B_feeding, 
                 title = "Cross Tabulation of ARVs at breastfeeding and the Failure Type",
                 show.row.prc = TRUE)
Cross Tabulation of ARVs at breastfeeding and the Failure Type
ftype ART_B_feeding Total
0 1
0 460
86.6 %
71
13.4 %
531
100 %
1 7
87.5 %
1
12.5 %
8
100 %
2 15
83.3 %
3
16.7 %
18
100 %
Total 482
86.5 %
75
13.5 %
557
100 %
χ2=0.169 · df=2 · Cramer’s V=0.017 · Fisher’s p=0.877
sjPlot::tab_xtab(var.row =EID_Data$ftype , var.col = EID_Data$ARVs_EMTCT, 
                 title = "Cross Tabulation of Mother's ARVs at EMTCT and the Failure Type",
                 show.row.prc = TRUE)
Cross Tabulation of Mother’s ARVs at EMTCT and the Failure Type
ftype ARVs_EMTCT Total
0 1
0 14
2.6 %
517
97.4 %
531
100 %
1 3
37.5 %
5
62.5 %
8
100 %
2 1
5.6 %
17
94.4 %
18
100 %
Total 18
3.2 %
539
96.8 %
557
100 %
χ2=30.954 · df=2 · Cramer’s V=0.236 · Fisher’s p=0.001
sjPlot::tab_xtab(var.row =EID_Data$ftype , var.col = EID_Data$Infant_ARVs_EMTCT, 
                 title = "Cross Tabulation of Mother's ARVs at EMTCT and the Failure Type",
                 show.row.prc = TRUE)
Cross Tabulation of Mother’s ARVs at EMTCT and the Failure Type
ftype Infant_ARVs_EMTCT Total
0 1
0 11
2.1 %
520
97.9 %
531
100 %
1 1
12.5 %
7
87.5 %
8
100 %
2 2
11.1 %
16
88.9 %
18
100 %
Total 14
2.5 %
543
97.5 %
557
100 %
χ2=9.109 · df=2 · Cramer’s V=0.128 · Fisher’s p=0.042
sjPlot::tab_xtab(var.row =EID_Data$ftype , var.col = EID_Data$Sex, 
                 title = "Cross Tabulation of Sex and the Failure Type",
                 show.row.prc = TRUE)
Cross Tabulation of Sex and the Failure Type
ftype Sex Total
1 2
0 278
52.4 %
253
47.6 %
531
100 %
1 2
25 %
6
75 %
8
100 %
2 7
38.9 %
11
61.1 %
18
100 %
Total 287
51.5 %
270
48.5 %
557
100 %
χ2=3.550 · df=2 · Cramer’s V=0.080 · Fisher’s p=0.179
sjPlot::tab_xtab(var.row =EID_Data$ftype , var.col = EID_Data$feeding_at_visit, 
                 title = "Cross Tabulation of the Feeding Type at the last Visit and the Failure Type",
                 show.row.prc = TRUE)
Cross Tabulation of the Feeding Type at the last Visit and the Failure Type
ftype feeding_at_visit Total
0 1
0 295
60.7 %
191
39.3 %
486
100 %
1 2
25 %
6
75 %
8
100 %
2 1
6.2 %
15
93.8 %
16
100 %
Total 298
58.4 %
212
41.6 %
510
100 %
χ2=22.647 · df=2 · Cramer’s V=0.211 · Fisher’s p=0.000

10(b): Exploratory analysis of continuous variables: Summary Statistics

#Summary statistics for Age at registration/first visit and the Clinic of entry into EID care

# Age at Registration by cohort
tapply(EID_Data$Age_in_months, EID_Data$trt, summary)
## $`0`
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   0.000   1.400   1.433   1.971   1.533  13.833 
## 
## $`1`
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   1.000   1.500   1.500   2.255   2.500  16.000
# Age at last Visit by Clinic of entry into EID
tapply(EID_Data$Age_in_months, EID_Data$Clinic_referred_from, summary)
## $`0`
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   0.000   1.450   1.500   1.876   1.817  12.000 
## 
## $`1`
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   0.000   1.433   1.500   3.275   3.875  16.000
# Age at last Visit by Cohort
tapply(EID_Data$Age_last_visit, EID_Data$trt, summary)
## $`0`
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##     1.5     8.0    13.0    12.1    17.0    45.5 
## 
## $`1`
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1.50    9.50   13.00   12.20   15.43   20.00
# Age at NVP Initiation by Cohort
tapply(EID_Data$Age_NVP_initiation, EID_Data$trt, summary)
## $`0`
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##  0.0000  0.0000  0.0000  0.1818  0.0000  6.0000     196 
## 
## $`1`
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##  0.0000  0.0000  0.0000  0.8166  1.5000 14.0000       6
# Age at Cotrimoxazole Initiation by Cohort
tapply(EID_Data$Age_at_cotrim, EID_Data$trt, summary)
## $`0`
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##   1.500   1.500   1.500   2.003   1.500  15.000     148 
## 
## $`1`
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##   0.000   1.500   1.500   2.092   2.000  13.000       5

10(b) ii Use the Kruskal-Wallis test to check for difference in the groups

If the p-value is less than the significance level 0.05, we can conclude that there are significant differences between the treatment groups. The test can be performed using the function “kruskal.test()”

From Multiple pairwise-comparison between groups From the output of the Kruskal-Wallis test, we may know that there is a significant difference between groups, but we don’t know which pairs of groups are different.It’s possible to use the function “pairwise.wilcox.test()” to calculate pairwise comparisons between group levels with corrections for multiple testing.

## Comparison of age at first visit in the two cohorts

kruskal.test(Age_in_months ~ trt, data = EID_Data)
## 
##  Kruskal-Wallis rank sum test
## 
## data:  Age_in_months by trt
## Kruskal-Wallis chi-squared = 120.48, df = 1, p-value < 2.2e-16
## Comparison of age at first visit in the treatment groups

kruskal.test(Age_in_months ~ ftype, data = EID_Data)
## 
##  Kruskal-Wallis rank sum test
## 
## data:  Age_in_months by ftype
## Kruskal-Wallis chi-squared = 3.1702, df = 2, p-value = 0.2049

Section C: Cumulative Incidence Estimates

11. Infant HIV Infection Cumulative Incidence Estimates

# Load the necessary Packages

library(dplyr)
library(tidycmprsk)
library(ggsurvfit)
library(survival)
library(gtsummary)
library(ggplot2)
# Convert the Cohort (trt) and Outcome (ftype) variables factors

EID<-EID_Data

EID$trt<-as.factor(EID$trt)
EID$ftype<-as.factor(EID$ftype)
# Split the Dataset by cohort

EID_Before<-filter(EID,trt==0)
EID_After<-filter(EID,trt==1)
# Cumulative Incidence Estimates before COVID-19

Z<-cuminc(Surv(ftime, ftype) ~ 1, data = EID_Before)
tidy(Z,time = c(0, 6,12, 18))
## # A tibble: 8 × 12
##    time outcome estimate std.er…¹ conf.…² conf.…³ n.risk n.event n.cen…⁴ cum.e…⁵
##   <dbl> <chr>      <dbl>    <dbl>   <dbl>   <dbl>  <int>   <int>   <int>   <int>
## 1     0 1        0.00763  0.00539 0.00154  0.0255    262       2       7       2
## 2     6 1        0.0155   0.00770 0.00517  0.0369    196       2      49       4
## 3    12 1        0.0155   0.00770 0.00517  0.0369    127       0      67       4
## 4    18 1        0.0317   0.0137  0.0121   0.0672     11       2     115       6
## 5     0 2        0.0191   0.00847 0.00721  0.0417    262       5       7       5
## 6     6 2        0.0533   0.0145  0.0297   0.0866    196       8      49      13
## 7    12 2        0.0643   0.0163  0.0373   0.101     127       2      67      15
## 8    18 2        0.0880   0.0286  0.0425   0.154      11       1     115      16
## # … with 2 more variables: cum.censor <int>, time_max <dbl>, and abbreviated
## #   variable names ¹​std.error, ²​conf.low, ³​conf.high, ⁴​n.censor, ⁵​cum.event
# Cumulative Incidence Estimates After COVID-19

Z<-cuminc(Surv(ftime, ftype) ~ 1, data = EID_After)
tidy(Z,time = c(0, 6,12, 18))
## # A tibble: 8 × 12
##    time outcome estimate std.er…¹ conf.…² conf.…³ n.risk n.event n.cen…⁴ cum.e…⁵
##   <dbl> <chr>      <dbl>    <dbl>   <dbl>   <dbl>  <int>   <int>   <int>   <int>
## 1     0 1        0.00678  0.00479 1.38e-3  0.0227    295       2      11       2
## 2     6 1        0.00678  0.00479 1.38e-3  0.0227    236       0      50       2
## 3    12 1        0.00678  0.00479 1.38e-3  0.0227    136       0     120       2
## 4    18 1        0.00678  0.00479 1.38e-3  0.0227      3       0     110       2
## 5     0 2        0.00339  0.00339 3.27e-4  0.0178    295       1      11       1
## 6     6 2        0.00339  0.00339 3.27e-4  0.0178    236       0      50       1
## 7    12 2        0.00905  0.00658 1.71e-3  0.0310    136       1     120       2
## 8    18 2        0.00905  0.00658 1.71e-3  0.0310      3       0     110       2
## # … with 2 more variables: cum.censor <int>, time_max <dbl>, and abbreviated
## #   variable names ¹​std.error, ²​conf.low, ³​conf.high, ⁴​n.censor, ⁵​cum.event
#Estimate of the cumulative incidence at various times by Cohort and Gray’s test to test for a difference between groups over the entire follow-up period

cuminc(Surv(ftime, ftype) ~ trt, data = EID)%>% 
  tbl_cuminc(
    times =c(0,6,12,18), 
    label_header = "**Month {time}**") %>% 
  add_p()
Characteristic Month 0 Month 6 Month 12 Month 18 p-value1
trt 0.15
    0 0.76% (0.15%, 2.5%) 1.5% (0.52%, 3.7%) 1.5% (0.52%, 3.7%) 3.2% (1.2%, 6.7%)
    1 0.68% (0.14%, 2.3%) 0.68% (0.14%, 2.3%) 0.68% (0.14%, 2.3%) 0.68% (0.14%, 2.3%)
1 Gray's Test

11 (a) Cumulative Incidence Plot: Infant HIV Infection

# Plot of HIV Infant Infection over the follow-up period

J<-cuminc(Surv(ftime, ftype) ~ trt, data = EID)


ggcuminc(
  J,
  outcome = c("1"),
  linetype_aes = FALSE,
  theme = theme_ggsurvfit_default()
)+add_confidence_interval()+add_risktable()+
  geom_point()+
  coord_cartesian(
    xlim = c(0,18),
    ylim = NULL,
  )+
  theme(axis.text.x = element_text(),
        axis.text.y = element_text())+
  xlab("Follow up Time (Months)")

11 (b) Factors Associated with EID care Infant HIV Infection

# Estimation of the subdistribution hazards, adjusting for the Cohort and Infant's Sex
crr(Surv(ftime, ftype) ~ trt+Sex,data = EID)
## 
## Variable   Coef    SE      HR     95% CI       p-value    
## trt1       -1.09   0.806   0.34   0.07, 1.63   0.18       
## Sex2       1.20    0.808   3.32   0.68, 16.2   0.14
# Estimation of the subdistribution hazards, adjusting for the Cohort and Age at first visit
crr(Surv(ftime, ftype) ~ trt+Age_in_months,data = EID)
## 
## Variable        Coef    SE      HR     95% CI       p-value    
## trt1            -1.08   0.791   0.34   0.07, 1.59   0.17       
## Age_in_months   0.301   0.055   1.35   1.21, 1.50   <0.001
# Estimation of the subdistribution hazards, adjusting for the Cohort  and Mother's ARVs enrollment
crr(Surv(ftime, ftype) ~ trt+ARVs_EMTCT,data = EID)
## 
## Variable      Coef     SE      HR     95% CI       p-value    
## trt1          -0.801   0.861   0.45   0.08, 2.43   0.35       
## ARVs_EMTCT1   -2.95    0.807   0.05   0.01, 0.25   <0.001
# Estimation of the subdistribution hazards, adjusting for the Cohort and Infant's ARVs Prophylaxis
crr(Surv(ftime, ftype) ~ trt+Infant_ARVs_EMTCT,data = EID)
## 
## Variable             Coef    SE      HR     95% CI       p-value    
## trt1                 -1.01   0.805   0.36   0.07, 1.76   0.21       
## Infant_ARVs_EMTCT1   -1.67   1.12    0.19   0.02, 1.70   0.14
# Estimation of the subdistribution hazards, adjusting for the Cohort and the Clinic through which the infant joined EID Care
crr(Surv(ftime, ftype) ~ trt+Clinic_referred_from,data = EID)
## 
## Variable                Coef    SE      HR     95% CI       p-value    
## trt1                    -1.09   0.810   0.34   0.07, 1.64   0.18       
## Clinic_referred_from1   0.128   0.830   1.14   0.22, 5.78   0.88
# Estimation of the subdistribution hazards, adjusting for the Cohort and Breast feeding status at the last visit
crr(Surv(ftime, ftype) ~ trt+feeding_at_visit,data = EID)
## 47 cases omitted due to missing values
## 
## Variable            Coef     SE      HR     95% CI       p-value    
## trt1                -0.985   0.809   0.37   0.08, 1.82   0.22       
## feeding_at_visit1   1.84     0.655   6.31   1.75, 22.7   0.005
# Estimation of the subdistribution hazards, adjusting for the Cohort and the other variables; Full Model
crr(Surv(ftime, ftype) ~ trt+Sex+Age_in_months+ARVs_EMTCT+Infant_ARVs_EMTCT+Clinic_referred_from+feeding_at_visit,data = EID)
## 47 cases omitted due to missing values
## 
## Variable                Coef     SE      HR     95% CI       p-value    
## trt1                    -0.589   0.835   0.55   0.11, 2.85   0.48       
## Sex2                    0.852    0.923   2.34   0.38, 14.3   0.36       
## Age_in_months           0.385    0.106   1.47   1.20, 1.81   <0.001     
## ARVs_EMTCT1             -2.45    0.736   0.09   0.02, 0.37   <0.001     
## Infant_ARVs_EMTCT1      0.276    1.42    1.32   0.08, 21.5   0.85       
## Clinic_referred_from1   -1.37    1.14    0.25   0.03, 2.36   0.23       
## feeding_at_visit1       2.12     0.795   8.31   1.75, 39.4   0.008

Section D:Targeted Learning for Survival Analysis with Competing Risks

12. Iterated mean-based TMLE

A common goal is to compare the incidence of failure at a fixed time between the two treatment groups. Covariate adjustment is often desirable in this comparison to improve efficiency.This covariate adjustment may be facilitated by estimating a series of iterated covariate-conditional means.The final iterated covariate-conditional mean is marginalized over the empirical distribution of baseline covariates,to obtain an estimate of the marginal cumulative incidence.Here, we invoke the eponymous survtmle function to compute the iterated mean-based (method = “mean”) covariate-adjusted estimates of the cumulative incidence at time six (t0 = 6) in each of the treatment groups using quasi-logistic regression (formula specified via glm.ftime) to estimate the iterated means. The glm.ftime argument should be a valid right-hand-side formula specification based on colnames(adjustVars) and “trt”. Here we use a simple main terms regression.

Internally, survtmle estimates the covariate-conditional treatment probability (via glm.trt or SL.trt) and covariate-conditional censoring distribution (via glm.ctime or SL.ctime,). In the above (fit0), the treatment probability does not depend on covariates and so we did not specify a way to adjust for covariates in estimating the treatment probability.In this case, survtmle sets glm.trt = “1”, which corresponds with empirical estimates of treatment probability,and sets glm.ctime to be equivalent to the Kaplan-Meier censoring distribution estimates.

In practice, we may wish to adjust for covariates when computing estimates of the covariate-conditional treatment and censoring probabilities. In observational studies,the distribution of treatment may differ by measured covariates,while in almost any study (including randomized trials) it is possible that censoring differs by covariates. Thus, we often wish to adjust for covariates to account for measured confounders of treatment receipt and censoring. This adjustment may be accomplished using logistic regression through the “glm.trt” and “glm.ctime” arguments, respectively.The “glm.trt” argument should be a valid right-hand-side formula specification based on colnames(adjustVars).The “glm.ctime” argument should be a valid right-hand-side formula specification based on colnames(adjustVars), “trt”, and “t” used to model the hazard function for censoring. By including “trt” and “t”, the function allows censoring probabilities to depend on treatment assignment and time, respectively.

library(survtmle)
## survtmle: Targeted Learning for Survival Analysis
## Version: 1.1.1

It is important to note that the current “survtmle” distribution only supports integer-valued failure times. We shall convert the “ftype” and “trt” variables to integers prior to applying the “survtmle” function.

EID_Data$ftype<-as.integer(EID_Data$ftype)
EID_Data$trt<-as.integer(EID_Data$trt)
#EID_Data$ftime<-EID_Data$ftime

Some failure times are less than or equal zero. We need remove these observations otherwise the function will return an error. Let’s eliminate zero failure times by adding 1 to the “ftime” variable

EID_Data$ftime<-EID_Data$ftime+1
EID_Data <- EID_Data %>%
    mutate_at(c('ARVs_EMTCT'), ~replace_na(.,'0'))
EID_Data <- EID_Data %>%
    mutate_at(c('Infant_ARVs_EMTCT'), ~replace_na(.,'0'))
EID_Data <- EID_Data %>%
    mutate_at(c('feeding_at_visit'), ~replace_na(.,'0'))

Note:

Since it is not logical to assume that HIV Exposed infants belong to either the pre-COVID 19 or post-COVID-19 group by chance, we take the probability of treatment/exposure to the COVID-19 pandemic to be 1. However, the censoring probability could be confounded by covariates. In the fit below (fit0b), we eliminate “glm.trt” and include “glm.ctime

12(a): Estimation in the unbounded model

fita <- survtmle(ftime = EID_Data$ftime, ftype = EID_Data$ftype,
                 trt = EID_Data$trt, adjustVars = EID_Data[c("Sex","Clinic_referred_from","Age_in_months")],
                 glm.ctime = "trt + Sex + Clinic_referred_from + Age_in_months",
                 glm.ftime = "trt + Sex + Clinic_referred_from + Age_in_months",
                 glm.trt = "Sex + Clinic_referred_from + Age_in_months",
                 method = "mean", t0 = 1, returnModels=TRUE)
fita
## $est
##            [,1]
## 0 1 0.008326190
## 1 1 0.006509832
## 0 2 0.016694358
## 1 2 0.003154582
## 
## $var
##               0 1           1 1           0 2           1 2
## 0 1  9.137230e-05  7.662578e-06 -5.594176e-06  2.058494e-08
## 1 1  7.662578e-06  1.421580e-05 -2.652080e-07 -4.751520e-08
## 0 2 -5.594176e-06 -2.652080e-07  7.997958e-05  7.374044e-09
## 1 2  2.058494e-08 -4.751520e-08  7.374044e-09  9.088585e-06

13. Factors associated with Infant HIV Infection: Cause specific hazards TMLE

Bivariate Analysis: Infant’s Sex

haz_sex<-survtmle(ftime = EID_Data$ftime, ftype = EID_Data$ftype,trt = EID_Data$trt,adjustVars = 
                    EID_Data[c("Sex")],
                  glm.ctime = "trt + Sex",
                  glm.ftime = "trt + Sex",
                  method = "hazard",t0 = 1, returnModels = T)
## Warning in checkInputs(ftime = ftime, ftype = ftype, trt = trt, t0 = t0, :
## glm.trt and SL.trt not specified. Proceeding with glm.trt = '1'
summary(haz_sex$ftimeMod$J1)
## Generalized Linear Model of class 'speedglm':
## 
## Call:  speedglm::speedglm(formula = reg_form, data = data, family = family,      sparse = ifelse(calling_fun == "estimateTreatment", FALSE,          TRUE), trace = FALSE, method = "Cholesky") 
## 
## Coefficients:
##  ------------------------------------------------------------------ 
##             Estimate Std. Error z value  Pr(>|z|)    
## (Intercept)   -6.975     0.7381  -9.450 3.400e-21 ***
## trt           -1.185     0.8171  -1.450 1.469e-01    
## Sex2           1.247     0.8170   1.526 1.270e-01    
## 
## ------------------------------------------------------------------- 
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
## 
## ---
## null df: 6281; null deviance: 122.65;
## residuals df: 6279; residuals deviance: 117.4;
## # obs.: 6282; # non-zero weighted obs.: 6282;
## AIC: 123.399; log Likelihood: -58.69948;
## RSS: 5388.8; dispersion: 1; iterations: 10;
## rank: 3; max tolerance: 6.24e-12; convergence: TRUE.

Bivariate Analysis: Clinic from which the Infant/Mother was referred to EID Care

haz_clinic<-survtmle(ftime = EID_Data$ftime, ftype = EID_Data$ftype,trt = EID_Data$trt,adjustVars = 
                    EID_Data[c("Clinic_referred_from")],
                  glm.ctime = "trt + Clinic_referred_from",
                  glm.ftime = "trt + Clinic_referred_from",
                  method = "hazard",t0 = 1, returnModels = T)
## Warning in checkInputs(ftime = ftime, ftype = ftype, trt = trt, t0 = t0, :
## glm.trt and SL.trt not specified. Proceeding with glm.trt = '1'
summary(haz_clinic$ftimeMod$J1)
## Generalized Linear Model of class 'speedglm':
## 
## Call:  speedglm::speedglm(formula = reg_form, data = data, family = family,      sparse = ifelse(calling_fun == "estimateTreatment", FALSE,          TRUE), trace = FALSE, method = "Cholesky") 
## 
## Coefficients:
##  ------------------------------------------------------------------ 
##                       Estimate Std. Error  z value  Pr(>|z|)    
## (Intercept)             -6.270     0.4915 -12.7568 2.857e-37 ***
## trt                     -1.145     0.8515  -1.3452 1.786e-01    
## Clinic_referred_from1    0.212     0.8520   0.2489 8.035e-01    
## 
## ------------------------------------------------------------------- 
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
## 
## ---
## null df: 6281; null deviance: 122.65;
## residuals df: 6279; residuals deviance: 120.06;
## # obs.: 6282; # non-zero weighted obs.: 6282;
## AIC: 126.0637; log Likelihood: -60.03186;
## RSS: 6298.6; dispersion: 1; iterations: 10;
## rank: 3; max tolerance: 2.62e-13; convergence: TRUE.

Bivariate Analysis: Infant’s Age at first EID Care Visit

haz_age<-survtmle(ftime = EID_Data$ftime, ftype = EID_Data$ftype,trt = EID_Data$trt,adjustVars = 
                    EID_Data[c("Age_in_months")],
                  glm.ctime = "trt + Age_in_months",
                  glm.ftime = "trt + Age_in_months",
                  method = "hazard",t0 = 1, returnModels = T)
## Warning in checkInputs(ftime = ftime, ftype = ftype, trt = trt, t0 = t0, :
## glm.trt and SL.trt not specified. Proceeding with glm.trt = '1'
summary(haz_age$ftimeMod$J1)
## Generalized Linear Model of class 'speedglm':
## 
## Call:  speedglm::speedglm(formula = reg_form, data = data, family = family,      sparse = ifelse(calling_fun == "estimateTreatment", FALSE,          TRUE), trace = FALSE, method = "Cholesky") 
## 
## Coefficients:
##  ------------------------------------------------------------------ 
##               Estimate Std. Error z value  Pr(>|z|)    
## (Intercept)    -7.2198    0.58170 -12.412 2.263e-35 ***
## trt            -1.0489    0.83497  -1.256 2.090e-01    
## Age_in_months   0.3542    0.07561   4.685 2.804e-06 ***
## 
## ------------------------------------------------------------------- 
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
## 
## ---
## null df: 6281; null deviance: 122.65;
## residuals df: 6279; residuals deviance: 107.36;
## # obs.: 6282; # non-zero weighted obs.: 6282;
## AIC: 113.362; log Likelihood: -53.681;
## RSS: 4622.7; dispersion: 1; iterations: 10;
## rank: 3; max tolerance: 8.19e-13; convergence: TRUE.

Bivariate Analysis: Infant’s initiation on EMTCT ARVs at birth

haz_infant<-survtmle(ftime = EID_Data$ftime, ftype = EID_Data$ftype,trt = EID_Data$trt,adjustVars = 
                    EID_Data[c("Infant_ARVs_EMTCT")],
                  glm.ctime = "trt + Infant_ARVs_EMTCT",
                  glm.ftime = "trt + Infant_ARVs_EMTCT",
                  method = "hazard",t0 = 1, returnModels = T)
## Warning in checkInputs(ftime = ftime, ftype = ftype, trt = trt, t0 = t0, :
## glm.trt and SL.trt not specified. Proceeding with glm.trt = '1'
summary(haz_infant$ftimeMod$J1)
## Generalized Linear Model of class 'speedglm':
## 
## Call:  speedglm::speedglm(formula = reg_form, data = data, family = family,      sparse = ifelse(calling_fun == "estimateTreatment", FALSE,          TRUE), trace = FALSE, method = "Cholesky") 
## 
## Coefficients:
##  ------------------------------------------------------------------ 
##                    Estimate Std. Error z value  Pr(>|z|)    
## (Intercept)          -4.167     1.0128  -4.115 3.878e-05 ***
## trt                  -1.123     0.8215  -1.367 1.715e-01    
## Infant_ARVs_EMTCT1   -2.177     1.0815  -2.013 4.414e-02 *  
## 
## ------------------------------------------------------------------- 
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
## 
## ---
## null df: 6281; null deviance: 122.65;
## residuals df: 6279; residuals deviance: 117.67;
## # obs.: 6282; # non-zero weighted obs.: 6282;
## AIC: 123.6698; log Likelihood: -58.83489;
## RSS: 6418.5; dispersion: 1; iterations: 10;
## rank: 3; max tolerance: 3.73e-13; convergence: TRUE.

Bivariate Analysis: Mother’s Prolonged ART status

haz_mother<-survtmle(ftime = EID_Data$ftime, ftype = EID_Data$ftype,trt = EID_Data$trt,adjustVars = 
                    EID_Data[c("ARVs_EMTCT")],
                  glm.ctime = "trt + ARVs_EMTCT",
                  glm.ftime = "trt + ARVs_EMTCT",
                  method = "hazard",t0 = 1, returnModels = T)
## Warning in checkInputs(ftime = ftime, ftype = ftype, trt = trt, t0 = t0, :
## glm.trt and SL.trt not specified. Proceeding with glm.trt = '1'
summary(haz_mother$ftimeMod$J1)
## Generalized Linear Model of class 'speedglm':
## 
## Call:  speedglm::speedglm(formula = reg_form, data = data, family = family,      sparse = ifelse(calling_fun == "estimateTreatment", FALSE,          TRUE), trace = FALSE, method = "Cholesky") 
## 
## Coefficients:
##  ------------------------------------------------------------------ 
##             Estimate Std. Error z value  Pr(>|z|)    
## (Intercept)  -3.4178     0.6000  -5.696 1.226e-08 ***
## trt          -0.9282     0.8309  -1.117 2.639e-01    
## ARVs_EMTCT1  -3.3127     0.7479  -4.429 9.454e-06 ***
## 
## ------------------------------------------------------------------- 
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
## 
## ---
## null df: 6281; null deviance: 122.65;
## residuals df: 6279; residuals deviance: 107.62;
## # obs.: 6282; # non-zero weighted obs.: 6282;
## AIC: 113.6184; log Likelihood: -53.80921;
## RSS: 5615.8; dispersion: 1; iterations: 10;
## rank: 3; max tolerance: 1.58e-12; convergence: TRUE.

Bivariate Analysis: Infant’s breastfeeding status at the last recorded EID Care visit

haz_feed<-survtmle(ftime = EID_Data$ftime, ftype = EID_Data$ftype,trt = EID_Data$trt,adjustVars = 
                    EID_Data[c("feeding_at_visit")],
                  glm.ctime = "trt + feeding_at_visit",
                  glm.ftime = "trt + feeding_at_visit",
                  method = "hazard",t0 = 1, returnModels = T)
## Warning in checkInputs(ftime = ftime, ftype = ftype, trt = trt, t0 = t0, :
## glm.trt and SL.trt not specified. Proceeding with glm.trt = '1'
summary(haz_feed$ftimeMod$J1)
## Generalized Linear Model of class 'speedglm':
## 
## Call:  speedglm::speedglm(formula = reg_form, data = data, family = family,      sparse = ifelse(calling_fun == "estimateTreatment", FALSE,          TRUE), trace = FALSE, method = "Cholesky") 
## 
## Coefficients:
##  ------------------------------------------------------------------ 
##                   Estimate Std. Error z value  Pr(>|z|)    
## (Intercept)         -7.290     0.7384  -9.873 5.462e-23 ***
## trt                 -1.170     0.8174  -1.431 1.524e-01    
## feeding_at_visit1    2.107     0.8172   2.578 9.942e-03 ** 
## 
## ------------------------------------------------------------------- 
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
## 
## ---
## null df: 6281; null deviance: 122.65;
## residuals df: 6279; residuals deviance: 112.07;
## # obs.: 6282; # non-zero weighted obs.: 6282;
## AIC: 118.0654; log Likelihood: -56.03268;
## RSS: 4800.1; dispersion: 1; iterations: 10;
## rank: 3; max tolerance: 2.46e-10; convergence: TRUE.

Mutivariable Analysis

Model 1:Sex, Age

haz_sexage<-survtmle(ftime = EID_Data$ftime, ftype = EID_Data$ftype,trt = EID_Data$trt,adjustVars = 
                    EID_Data[c("Sex","Age_in_months")],
                  glm.ctime = "trt + Sex + Age_in_months",
                  glm.ftime = "trt + Sex + Age_in_months",
                  method = "hazard",t0 = 1, returnModels = T)
## Warning in checkInputs(ftime = ftime, ftype = ftype, trt = trt, t0 = t0, :
## glm.trt and SL.trt not specified. Proceeding with glm.trt = '1'
summary(haz_sexage$ftimeMod$J1)
## Generalized Linear Model of class 'speedglm':
## 
## Call:  speedglm::speedglm(formula = reg_form, data = data, family = family,      sparse = ifelse(calling_fun == "estimateTreatment", FALSE,          TRUE), trace = FALSE, method = "Cholesky") 
## 
## Coefficients:
##  ------------------------------------------------------------------ 
##               Estimate Std. Error z value  Pr(>|z|)    
## (Intercept)    -7.9721    0.84422  -9.443 3.618e-21 ***
## trt            -1.0381    0.83340  -1.246 2.129e-01    
## Sex2            1.2292    0.82197   1.495 1.348e-01    
## Age_in_months   0.3528    0.07571   4.660 3.161e-06 ***
## 
## ------------------------------------------------------------------- 
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
## 
## ---
## null df: 6281; null deviance: 122.65;
## residuals df: 6278; residuals deviance: 104.76;
## # obs.: 6282; # non-zero weighted obs.: 6282;
## AIC: 112.7583; log Likelihood: -52.37916;
## RSS: 5344; dispersion: 1; iterations: 10;
## rank: 4; max tolerance: 2.64e-11; convergence: TRUE.

Model 2: Sex, Age, Clinic

haz_sexageclin<-survtmle(ftime = EID_Data$ftime, ftype = EID_Data$ftype,trt = EID_Data$trt,adjustVars = 
                    EID_Data[c("Sex","Age_in_months","Clinic_referred_from")],
                  glm.ctime = "trt + Sex + Age_in_months + Clinic_referred_from",
                  glm.ftime = "trt + Sex + Age_in_months + Clinic_referred_from",
                  method = "hazard",t0 = 1, returnModels = T)
## Warning in checkInputs(ftime = ftime, ftype = ftype, trt = trt, t0 = t0, :
## glm.trt and SL.trt not specified. Proceeding with glm.trt = '1'
summary(haz_sexageclin$ftimeMod$J1)
## Generalized Linear Model of class 'speedglm':
## 
## Call:  speedglm::speedglm(formula = reg_form, data = data, family = family,      sparse = ifelse(calling_fun == "estimateTreatment", FALSE,          TRUE), trace = FALSE, method = "Cholesky") 
## 
## Coefficients:
##  ------------------------------------------------------------------ 
##                       Estimate Std. Error z value  Pr(>|z|)    
## (Intercept)             -7.994     0.8688  -9.201 3.537e-20 ***
## trt                     -1.260     0.8413  -1.498 1.343e-01    
## Sex2                     1.282     0.8257   1.552 1.206e-01    
## Age_in_months            0.477     0.1211   3.940 8.150e-05 ***
## Clinic_referred_from1   -1.795     1.3407  -1.339 1.806e-01    
## 
## ------------------------------------------------------------------- 
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
## 
## ---
## null df: 6281; null deviance: 122.65;
## residuals df: 6277; residuals deviance: 102.62;
## # obs.: 6282; # non-zero weighted obs.: 6282;
## AIC: 112.6166; log Likelihood: -51.30829;
## RSS: 6827.6; dispersion: 1; iterations: 10;
## rank: 5; max tolerance: 1.04e-10; convergence: TRUE.

Model 3: Sex, Age, Clinic, Infant ARVs

haz_sexageclin_infant<-survtmle(ftime = EID_Data$ftime, ftype = EID_Data$ftype,trt = EID_Data$trt,adjustVars = 
                    EID_Data[c("Sex","Age_in_months","Clinic_referred_from","Infant_ARVs_EMTCT")],
                  glm.ctime = "trt + Sex + Age_in_months + Clinic_referred_from + Infant_ARVs_EMTCT",
                  glm.ftime = "trt + Sex + Age_in_months + Clinic_referred_from + Infant_ARVs_EMTCT",
                  method = "hazard",t0 = 1, returnModels = T)
## Warning in checkInputs(ftime = ftime, ftype = ftype, trt = trt, t0 = t0, :
## glm.trt and SL.trt not specified. Proceeding with glm.trt = '1'
summary(haz_sexageclin_infant$ftimeMod$J1)
## Generalized Linear Model of class 'speedglm':
## 
## Call:  speedglm::speedglm(formula = reg_form, data = data, family = family,      sparse = ifelse(calling_fun == "estimateTreatment", FALSE,          TRUE), trace = FALSE, method = "Cholesky") 
## 
## Coefficients:
##  ------------------------------------------------------------------ 
##                       Estimate Std. Error z value  Pr(>|z|)    
## (Intercept)            -7.0942     1.3146 -5.3966 6.791e-08 ***
## trt                    -1.1562     0.8567 -1.3497 1.771e-01    
## Sex2                    1.4407     0.8678  1.6602 9.687e-02 .  
## Age_in_months           0.4464     0.1207  3.6995 2.160e-04 ***
## Clinic_referred_from1  -1.4414     1.2998 -1.1090 2.674e-01    
## Infant_ARVs_EMTCT1     -1.0772     1.2372 -0.8707 3.839e-01    
## 
## ------------------------------------------------------------------- 
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
## 
## ---
## null df: 6281; null deviance: 122.65;
## residuals df: 6276; residuals deviance: 101.99;
## # obs.: 6282; # non-zero weighted obs.: 6282;
## AIC: 113.9884; log Likelihood: -50.99419;
## RSS: 5477.5; dispersion: 1; iterations: 10;
## rank: 6; max tolerance: 3.16e-10; convergence: TRUE.

Model 4: Sex, Age, Clinic, Maternal ART

haz_sexageclin_mot<-survtmle(ftime = EID_Data$ftime, ftype = EID_Data$ftype,trt = EID_Data$trt,adjustVars = 
                    EID_Data[c("Sex","Age_in_months","Clinic_referred_from", "ARVs_EMTCT")],
                  glm.ctime = "trt + Sex + Age_in_months + Clinic_referred_from + ARVs_EMTCT",
                  glm.ftime = "trt + Sex + Age_in_months + Clinic_referred_from + ARVs_EMTCT",
                  method = "hazard",t0 = 1, returnModels = T)
## Warning in checkInputs(ftime = ftime, ftype = ftype, trt = trt, t0 = t0, :
## glm.trt and SL.trt not specified. Proceeding with glm.trt = '1'
summary(haz_sexageclin_mot$ftimeMod$J1)
## Generalized Linear Model of class 'speedglm':
## 
## Call:  speedglm::speedglm(formula = reg_form, data = data, family = family,      sparse = ifelse(calling_fun == "estimateTreatment", FALSE,          TRUE), trace = FALSE, method = "Cholesky") 
## 
## Coefficients:
##  ------------------------------------------------------------------ 
##                       Estimate Std. Error z value  Pr(>|z|)    
## (Intercept)            -6.1442     1.1226  -5.473 4.418e-08 ***
## trt                    -1.0710     0.8558  -1.252 2.107e-01    
## Sex2                    1.6428     0.9394   1.749 8.034e-02 .  
## Age_in_months           0.4307     0.1306   3.299 9.715e-04 ***
## Clinic_referred_from1  -1.9704     1.2806  -1.539 1.239e-01    
## ARVs_EMTCT1            -2.3706     0.8491  -2.792 5.238e-03 ** 
## 
## ------------------------------------------------------------------- 
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
## 
## ---
## null df: 6281; null deviance: 122.65;
## residuals df: 6276; residuals deviance: 96.09;
## # obs.: 6282; # non-zero weighted obs.: 6282;
## AIC: 108.0906; log Likelihood: -48.0453;
## RSS: 6384.1; dispersion: 1; iterations: 11;
## rank: 6; max tolerance: 2.22e-15; convergence: TRUE.

Model 5: Sex, Age, Clinic, Infant ARVs,Maternal ART

haz_sexageclin_infant_mot<-survtmle(ftime = EID_Data$ftime, ftype = EID_Data$ftype,trt = EID_Data$trt,adjustVars = 
                    EID_Data[c("Sex","Age_in_months","Clinic_referred_from","Infant_ARVs_EMTCT", "ARVs_EMTCT")],
                  glm.ctime = "trt + Sex + Age_in_months + Clinic_referred_from + Infant_ARVs_EMTCT + ARVs_EMTCT",
                  glm.ftime = "trt + Sex + Age_in_months + Clinic_referred_from + Infant_ARVs_EMTCT + ARVs_EMTCT",
                  method = "hazard",t0 = 1, returnModels = T)
## Warning in checkInputs(ftime = ftime, ftype = ftype, trt = trt, t0 = t0, :
## glm.trt and SL.trt not specified. Proceeding with glm.trt = '1'
summary(haz_sexageclin_infant_mot$ftimeMod$J1)
## Generalized Linear Model of class 'speedglm':
## 
## Call:  speedglm::speedglm(formula = reg_form, data = data, family = family,      sparse = ifelse(calling_fun == "estimateTreatment", FALSE,          TRUE), trace = FALSE, method = "Cholesky") 
## 
## Coefficients:
##  ------------------------------------------------------------------ 
##                       Estimate Std. Error z value  Pr(>|z|)    
## (Intercept)            -6.2693     1.4287 -4.3881 1.143e-05 ***
## trt                    -1.0849     0.8599 -1.2616 2.071e-01    
## Sex2                    1.5962     0.9790  1.6304 1.030e-01    
## Age_in_months           0.4360     0.1373  3.1754 1.496e-03 ** 
## Clinic_referred_from1  -2.0318     1.3665 -1.4869 1.370e-01    
## Infant_ARVs_EMTCT1      0.2072     1.4196  0.1459 8.840e-01    
## ARVs_EMTCT1            -2.4154     0.8940 -2.7018 6.896e-03 ** 
## 
## ------------------------------------------------------------------- 
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
## 
## ---
## null df: 6281; null deviance: 122.65;
## residuals df: 6275; residuals deviance: 96.07;
## # obs.: 6282; # non-zero weighted obs.: 6282;
## AIC: 110.0687; log Likelihood: -48.03437;
## RSS: 6244.7; dispersion: 1; iterations: 11;
## rank: 7; max tolerance: 5.32e-15; convergence: TRUE.

Model 6: Sex, Age, Clinic, Maternal ART,Breastfeeding status

haz_sexageclin_motfeed<-survtmle(ftime = EID_Data$ftime, ftype = EID_Data$ftype,trt = EID_Data$trt,adjustVars = 
                    EID_Data[c("Sex","Age_in_months","Clinic_referred_from", "ARVs_EMTCT","feeding_at_visit")],
                  glm.ctime = "trt + Sex + Age_in_months + Clinic_referred_from + ARVs_EMTCT + feeding_at_visit",
                  glm.ftime = "trt + Sex + Age_in_months + Clinic_referred_from + ARVs_EMTCT + feeding_at_visit",
                  method = "hazard",t0 = 1, returnModels = T)
## Warning in checkInputs(ftime = ftime, ftype = ftype, trt = trt, t0 = t0, :
## glm.trt and SL.trt not specified. Proceeding with glm.trt = '1'
summary(haz_sexageclin_motfeed$ftimeMod$J1)
## Generalized Linear Model of class 'speedglm':
## 
## Call:  speedglm::speedglm(formula = reg_form, data = data, family = family,      sparse = ifelse(calling_fun == "estimateTreatment", FALSE,          TRUE), trace = FALSE, method = "Cholesky") 
## 
## Coefficients:
##  ------------------------------------------------------------------ 
##                       Estimate Std. Error z value  Pr(>|z|)    
## (Intercept)            -7.3913     1.2012 -6.1532 7.593e-10 ***
## trt                    -0.7743     0.8886 -0.8714 3.836e-01    
## Sex2                    1.0232     0.8700  1.1761 2.396e-01    
## Age_in_months           0.4994     0.1232  4.0537 5.042e-05 ***
## Clinic_referred_from1  -1.6846     1.1335 -1.4862 1.372e-01    
## ARVs_EMTCT1            -2.5477     0.8300 -3.0696 2.143e-03 ** 
## feeding_at_visit1       2.6553     0.9315  2.8504 4.366e-03 ** 
## 
## ------------------------------------------------------------------- 
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
## 
## ---
## null df: 6281; null deviance: 122.65;
## residuals df: 6275; residuals deviance: 85.85;
## # obs.: 6282; # non-zero weighted obs.: 6282;
## AIC: 99.85082; log Likelihood: -42.92541;
## RSS: 12784; dispersion: 1; iterations: 11;
## rank: 7; max tolerance: 9.47e-14; convergence: TRUE.

Model 7: Sex, Age, Clinic, Infant ARVs, Maternal ART, Breastfeeding status

fitahaz <- survtmle(ftime = EID_Data$ftime, ftype = EID_Data$ftype,
                 trt = EID_Data$trt, adjustVars = EID_Data[c("Sex","Clinic_referred_from","Age_in_months","Infant_ARVs_EMTCT","ARVs_EMTCT","feeding_at_visit")],
                 glm.ctime = "trt + Sex + Clinic_referred_from + Age_in_months + Infant_ARVs_EMTCT + ARVs_EMTCT + feeding_at_visit",
                 glm.ftime = "trt + Sex + Clinic_referred_from + Age_in_months + Infant_ARVs_EMTCT + ARVs_EMTCT + feeding_at_visit",
                 
                 method = "hazard", t0 = 1, returnModels=TRUE)
## Warning in checkInputs(ftime = ftime, ftype = ftype, trt = trt, t0 = t0, :
## glm.trt and SL.trt not specified. Proceeding with glm.trt = '1'
summary(fitahaz$ftimeMod$J1)
## Generalized Linear Model of class 'speedglm':
## 
## Call:  speedglm::speedglm(formula = reg_form, data = data, family = family,      sparse = ifelse(calling_fun == "estimateTreatment", FALSE,          TRUE), trace = FALSE, method = "Cholesky") 
## 
## Coefficients:
##  ------------------------------------------------------------------ 
##                       Estimate Std. Error  z value  Pr(>|z|)    
## (Intercept)            -7.3783     1.4016 -5.26429 1.407e-07 ***
## trt                    -0.7729     0.8923 -0.86614 3.864e-01    
## Sex2                    1.0286     0.9214  1.11641 2.642e-01    
## Clinic_referred_from1  -1.6811     1.1488 -1.46341 1.434e-01    
## Age_in_months           0.4990     0.1244  4.01063 6.056e-05 ***
## Infant_ARVs_EMTCT1     -0.0241     1.3408 -0.01797 9.857e-01    
## ARVs_EMTCT1            -2.5416     0.8968 -2.83416 4.595e-03 ** 
## feeding_at_visit1       2.6561     0.9328  2.84754 4.406e-03 ** 
## 
## ------------------------------------------------------------------- 
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
## 
## ---
## null df: 6281; null deviance: 122.65;
## residuals df: 6274; residuals deviance: 85.85;
## # obs.: 6282; # non-zero weighted obs.: 6282;
## AIC: 101.8505; log Likelihood: -42.92525;
## RSS: 12805.5; dispersion: 1; iterations: 11;
## rank: 8; max tolerance: 1.25e-13; convergence: TRUE.

Multivariable Model Goodness of Fit Comparison: Akaike Information Criterion (AIC), RSS, DF, etc

Parameter Model 1 Model 2 Model 3 Model 4 Model 5 Model 6 Model 7
Log-likelihood -52.37 -51.31 -50.99 -48.05 -48.03 -42.93 -42.93
Residual Sum of Squares 5344.00 6827.60 5477.50 6384.10 6244.70 12784.00 12805.50
Observations 6282 6282 6282 6282 6282 6282 6282
Degrees of Freedom 6281 6281 6281 6281 6281 6281 6281
AIC 112.76 112.62 113.99 108.09 110.07 99.85 101.85

From the above Table, Model 6 has the lowest AIC value and the highest Log-likelihood value.We shall carry on with Model 6 as ‘fithaz’

fithaz<-survtmle(ftime = EID_Data$ftime, ftype = EID_Data$ftype,trt = EID_Data$trt,adjustVars = 
                    EID_Data[c("Sex","Age_in_months","Clinic_referred_from", "ARVs_EMTCT","feeding_at_visit")],
                  glm.ctime = "trt + Sex + Age_in_months + Clinic_referred_from + ARVs_EMTCT + feeding_at_visit",
                  glm.ftime = "trt + Sex + Age_in_months + Clinic_referred_from + ARVs_EMTCT + feeding_at_visit",
                  method = "hazard",t0 = 1, returnModels = T)
## Warning in checkInputs(ftime = ftime, ftype = ftype, trt = trt, t0 = t0, :
## glm.trt and SL.trt not specified. Proceeding with glm.trt = '1'
summary(fithaz$ftimeMod$J1)
## Generalized Linear Model of class 'speedglm':
## 
## Call:  speedglm::speedglm(formula = reg_form, data = data, family = family,      sparse = ifelse(calling_fun == "estimateTreatment", FALSE,          TRUE), trace = FALSE, method = "Cholesky") 
## 
## Coefficients:
##  ------------------------------------------------------------------ 
##                       Estimate Std. Error z value  Pr(>|z|)    
## (Intercept)            -7.3913     1.2012 -6.1532 7.593e-10 ***
## trt                    -0.7743     0.8886 -0.8714 3.836e-01    
## Sex2                    1.0232     0.8700  1.1761 2.396e-01    
## Age_in_months           0.4994     0.1232  4.0537 5.042e-05 ***
## Clinic_referred_from1  -1.6846     1.1335 -1.4862 1.372e-01    
## ARVs_EMTCT1            -2.5477     0.8300 -3.0696 2.143e-03 ** 
## feeding_at_visit1       2.6553     0.9315  2.8504 4.366e-03 ** 
## 
## ------------------------------------------------------------------- 
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
## 
## ---
## null df: 6281; null deviance: 122.65;
## residuals df: 6275; residuals deviance: 85.85;
## # obs.: 6282; # non-zero weighted obs.: 6282;
## AIC: 99.85082; log Likelihood: -42.92541;
## RSS: 12784; dispersion: 1; iterations: 11;
## rank: 7; max tolerance: 9.47e-14; convergence: TRUE.
# coefficients(fitahaz$ftimeMod$J1)
tp.fithaz <- timepoints(fithaz, times = seq_len(19))  # t_0 = 18

# print the object
tp.fithaz
## $est
##               t1          t2          t3          t4          t5         t6
## 0 1 0.0048202828 0.008875314 0.012389183 0.015503285 0.018310145 0.02087274
## 1 1 0.0024215192 0.004629900 0.006664568 0.008555788 0.010326970 0.01199638
## 0 2 0.0067775751 0.013346996 0.019729907 0.025942991 0.031999517 0.03791033
## 1 2 0.0009777206 0.001945440 0.002903919 0.003853795 0.004795607 0.00572982
##              t7          t8          t9         t10        t11        t12
## 0 1 0.023235659 0.025431700 0.027485782 0.029417386 0.03124210 0.03297262
## 1 1 0.013578406 0.015084504 0.016523895 0.017904089 0.01923128 0.02051063
## 0 2 0.043684484 0.049329687 0.054852600 0.060259055 0.06555422 0.07074271
## 1 2 0.006656836 0.007577008 0.008490647 0.009398029 0.01029941 0.01119500
##            t13        t14        t15        t16        t17        t18
## 0 1 0.03461946 0.03619140 0.03769590 0.03913929 0.04052701 0.04186379
## 1 1 0.02174651 0.02294264 0.02410224 0.02522811 0.02632271 0.02738823
## 0 2 0.07582870 0.08081599 0.08570804 0.09050806 0.09521903 0.09984370
## 1 2 0.01208501 0.01296964 0.01384904 0.01472337 0.01559279 0.01645742
##            t19
## 0 1 0.04315371
## 1 1 0.02842663
## 0 2 0.10438467
## 1 2 0.01731739
## 
## $var
##               t1           t2           t3           t4           t5
## 0 1 1.654503e-05 4.839176e-05 4.560346e-05 4.532798e-05 4.654452e-05
## 1 1 7.041655e-05 8.627864e-05 1.268453e-04 1.514886e-04 1.757644e-04
## 0 2 2.156742e-05 2.079303e-05 2.045204e-05 2.042525e-05 2.058985e-05
## 1 2 1.145558e-05 1.143984e-05 1.144357e-05 1.147431e-05 1.153112e-05
##               t6           t7           t8           t9          t10
## 0 1 4.966405e-05 5.387139e-05 5.912679e-05 6.620294e-05 7.016317e-05
## 1 1 2.834521e-04 2.836570e-04 3.413778e-04 3.444033e-04 3.468073e-04
## 0 2 2.092015e-05 2.137484e-05 2.196573e-05 2.258566e-05 2.325815e-05
## 1 2 1.161952e-05 1.171705e-05 1.185052e-05 1.200811e-05 1.215642e-05
##              t11          t12          t13          t14          t15
## 0 1 6.937105e-05 6.898776e-05 6.900238e-05 1.763088e-04 2.686324e-04
## 1 1 4.448937e-04 4.402236e-04 4.372478e-04 4.355286e-04 4.364386e-04
## 0 2 2.399484e-05 2.473368e-05 2.541246e-05 2.610179e-05 2.681669e-05
## 1 2 1.234037e-05 6.551036e-05 6.560608e-05 6.571289e-05 6.584721e-05
##              t16          t17          t18          t19
## 0 1 2.688449e-04 2.690909e-04 2.694440e-04 2.697537e-04
## 1 1 4.389367e-04 4.391585e-04 9.254904e-04 9.087779e-04
## 0 2 2.748662e-05 2.815891e-05 2.884002e-05 2.944542e-05
## 1 2 6.595067e-05 6.604683e-05 6.616420e-05 6.619671e-05
plot(tp.fithaz, type = "raw")

plot(tp.fithaz)

Call timepoints based on this fit

The survtmle function provides the function timepoints to compute the estimated cumulative incidence over multiple timepoints. This function is invoked after an initial call to survtmle with option returnModels = TRUE. By setting this option, the timepoints function is able to recycle fits for the conditional treatment probability, censoring distribution, and, in the case of method = “hazard”, the hazard fits. Thus, invoking timepoints is faster than making repeated calls to survtmle with different t0.Now we can call timepoints to return estimates of cumulative incidence at each time “seq_len(t_0)”.

tp.fita <- timepoints(fita, times = seq_len(19))  # t_0 = 19
tp.fita
## $est
##              t1          t2          t3          t4          t5          t6
## 0 1 0.008326190 0.015595895 0.015595895 0.015595895 0.015595895 0.015595895
## 1 1 0.006509832 0.006699801 0.006699801 0.006699801 0.006699801 0.006699801
## 0 2 0.016694358 0.020763858 0.027501088 0.032341184 0.037245864 0.049206022
## 1 2 0.003154582 0.003163791 0.003190885 0.003165478 0.003149360 0.003127804
##              t7          t8          t9         t10         t11         t12
## 0 1 0.015595895 0.015595895 0.015595895 0.015595895 0.015595895 0.015595895
## 1 1 0.006699801 0.006699801 0.006699801 0.006699801 0.006699801 0.006699801
## 0 2 0.049206022 0.054442995 0.054442995 0.054442995 0.058599052 0.057913574
## 1 2 0.003127804 0.003131106 0.003131106 0.003131106 0.003389268 0.009235901
##             t13         t14         t15         t16         t17         t18
## 0 1 0.015595895 0.035004870 0.040363331 0.040363331 0.040363331 0.040363331
## 1 1 0.006699801 0.006111573 0.005339743 0.005339743 0.005339743 0.005339743
## 0 2 0.057913574 0.057913574 0.057913574 0.057913574 0.057913574 0.128790230
## 1 2 0.009235901 0.009235901 0.009235901 0.009235901 0.009235901 0.135015120
##             t19
## 0 1 0.040363331
## 1 1 0.005339743
## 0 2 0.128782499
## 1 2 0.135244181
## 
## $var
##               t1           t2           t3           t4           t5
## 0 1 9.137230e-05 2.561452e-04 2.561452e-04 2.561452e-04 2.561455e-04
## 1 1 1.421580e-05 1.414104e-05 1.414104e-05 1.414104e-05 1.414104e-05
## 0 2 7.997958e-05 9.873305e-05 1.298444e-04 1.590920e-04 1.918718e-04
## 1 2 9.088585e-06 9.074888e-06 9.076871e-06 9.069391e-06 9.066657e-06
##               t6           t7           t8           t9          t10
## 0 1 2.561536e-04 2.565076e-04 3.231470e-04 1.240670e-03 1.839954e-03
## 1 1 1.414104e-05 1.414104e-05 1.414104e-05 1.414104e-05 1.414104e-05
## 0 2 2.782822e-04 2.782822e-04 3.253026e-04 3.253026e-04 3.253026e-04
## 1 2 9.066553e-06 9.066553e-06 9.063113e-06 9.063113e-06 9.063113e-06
##              t11          t12          t13          t14          t15
## 0 1 2.499533e-03 3.455108e-03 2.624278e-04 4.667918e-04 6.814101e-04
## 1 1 1.414104e-05 1.414104e-05 1.414104e-05 1.321218e-05 1.297481e-05
## 0 2 3.457804e-04 3.456246e-04 3.456246e-04 3.456246e-04 3.456246e-04
## 1 2 9.097782e-06 5.199292e-05 5.199292e-05 5.199292e-05 5.199292e-05
##              t16          t17          t18          t19
## 0 1 1.672649e-02 2.366163e-02 3.405483e-02 4.860499e-02
## 1 1 1.297481e-05 1.297481e-05 1.297481e-05 1.297481e-05
## 0 2 3.456246e-04 3.456246e-04 1.626114e-03 1.626053e-03
## 1 2 5.199292e-05 5.199292e-05 3.297126e-04 3.298089e-04
# Confidence intervals for estimates

ci_fita<-confint(tp.fita)
ci_fita
## $`0 1`
##              2.5 %     97.5 %
##  [1,] -0.010408881 0.02706126
##  [2,] -0.015772421 0.04696421
##  [3,] -0.015772421 0.04696421
##  [4,] -0.015772422 0.04696421
##  [5,] -0.015772440 0.04696423
##  [6,] -0.015772935 0.04696472
##  [7,] -0.015794604 0.04698639
##  [8,] -0.019636988 0.05082878
##  [9,] -0.053440190 0.08463198
## [10,] -0.068476135 0.09966792
## [11,] -0.082393145 0.11358494
## [12,] -0.099611112 0.13080290
## [13,] -0.016154784 0.04734657
## [14,] -0.007340868 0.07735061
## [15,] -0.010799243 0.09152591
## [16,] -0.213120655 0.29384732
## [17,] -0.261124953 0.34185162
## [18,] -0.321327468 0.40205413
## [19,] -0.391740893 0.47246756
## 
## $`1 1`
##               2.5 %     97.5 %
##  [1,] -0.0008799862 0.01389965
##  [2,] -0.0006705593 0.01407016
##  [3,] -0.0006705593 0.01407016
##  [4,] -0.0006705593 0.01407016
##  [5,] -0.0006705593 0.01407016
##  [6,] -0.0006705593 0.01407016
##  [7,] -0.0006705593 0.01407016
##  [8,] -0.0006705593 0.01407016
##  [9,] -0.0006705593 0.01407016
## [10,] -0.0006705593 0.01407016
## [11,] -0.0006705593 0.01407016
## [12,] -0.0006705593 0.01407016
## [13,] -0.0006705593 0.01407016
## [14,] -0.0010126134 0.01323576
## [15,] -0.0017201581 0.01239964
## [16,] -0.0017201581 0.01239964
## [17,] -0.0017201581 0.01239964
## [18,] -0.0017201581 0.01239964
## [19,] -0.0017201581 0.01239964
## 
## $`0 2`
##               2.5 %     97.5 %
##  [1,] -0.0008338551 0.03422257
##  [2,]  0.0012887723 0.04023894
##  [3,]  0.0051674410 0.04983473
##  [4,]  0.0076198263 0.05706254
##  [5,]  0.0100968723 0.06439486
##  [6,]  0.0165103095 0.08190173
##  [7,]  0.0165103095 0.08190173
##  [8,]  0.0190927960 0.08979319
##  [9,]  0.0190927960 0.08979319
## [10,]  0.0190927960 0.08979319
## [11,]  0.0221531840 0.09504492
## [12,]  0.0214759170 0.09435123
## [13,]  0.0214759170 0.09435123
## [14,]  0.0214759170 0.09435123
## [15,]  0.0214759170 0.09435123
## [16,]  0.0214759170 0.09435123
## [17,]  0.0214759170 0.09435123
## [18,]  0.0497544722 0.20782599
## [19,]  0.0497482183 0.20781678
## 
## $`1 2`
##              2.5 %      97.5 %
##  [1,] -0.002754177 0.009063340
##  [2,] -0.002740513 0.009068095
##  [3,] -0.002714065 0.009095834
##  [4,] -0.002737038 0.009067994
##  [5,] -0.002752266 0.009050986
##  [6,] -0.002773788 0.009029396
##  [7,] -0.002773788 0.009029396
##  [8,] -0.002769366 0.009031579
##  [9,] -0.002769366 0.009031579
## [10,] -0.002769366 0.009031579
## [11,] -0.002522479 0.009301015
## [12,] -0.004896638 0.023368440
## [13,] -0.004896638 0.023368440
## [14,] -0.004896638 0.023368440
## [15,] -0.004896638 0.023368440
## [16,] -0.004896638 0.023368440
## [17,] -0.004896638 0.023368440
## [18,]  0.099426115 0.170604125
## [19,]  0.099649977 0.170838385

Return aic for the Model

fita[["ftimeMod"]][["J1"]][["t1"]][["aic"]]
## [1] 41.42972

12(a)ii: Plot the Cumulative Incidence Estimates from the Unbounded Model

Because the cumulative incidence function is being invoked pointwise, it is possible that the resulting curve is not monotone. However, it is possible to show that projecting this curve onto a monotone function via isotonic regression results in an estimate with identical asymptotic properties to the pointwise estimate. Therefore, we additionally provide an option type = “iso” (the default) that provides these smoothed curves.

plot(tp.fita, type = "raw")

plot(tp.fita)

summary(fita$ftimeMod$J1)
##    Length Class    Mode
## t1 26     speedglm list

12(b): Estimation in bounded models

In certain situations, we have knowledge that the incidence of an event is bounded below/above for every strata in the population. It is possible to incorporate these bounds into the TMLE estimation procedure to ensure that any resulting estimate of cumulative incidence is compatible with these bounds

Bounds can be passed to survtmle by creating a “data.frame” that contains columns with specific names. In particular, there should be:

For example if there is only one type of failure (ftype = 1 or ftype = 0) then the bounds data.frame can contain columns “l1”, and “u1” denote the lower and upper bounds, respectively, on the iterated conditional mean (for method = “mean”) or the conditional hazard function (for method = “hazard”).

If there are two types of failure (ftype = 1, ftype = 2, or ftype = 0) then there can additionally be columns “l2” and “u2” denoting the lower and upper bounds, respectively, on the iterated conditional mean for type two failures (for method = “mean”) or the conditional cause-specific hazard function for type two failures (for method = “hazard”).

12(b) i: Constructing bounds

Bounds lower bound for outcome 1 (l1),upper bound for outcome 1 (u1) lower bound for outcome 2 (l2),upper bound for outcome 2 (u2) t = seq_len(t_0)

Note:

For the observation period, in the study population:

  1. The highest monthly rate of HIV infection = 0.06, and lowest = 0.00;
  2. The highest monthly rate of infant mortality = 0.08, and lowest = 0.00; l1 = rep(0.00, t_0), u1 = rep(0.06, t_0); l2 = rep(0.00, t_0), u2 = rep(0.08, t_0)
bf18 <- data.frame(t = seq_len(19),
                  l1 = rep(0.00, 19), u1 = rep(0.06, 19),
                  l2 = rep(0.00, 19), u2 = rep(0.08, 19)
)

bf18
##     t l1   u1 l2   u2
## 1   1  0 0.06  0 0.08
## 2   2  0 0.06  0 0.08
## 3   3  0 0.06  0 0.08
## 4   4  0 0.06  0 0.08
## 5   5  0 0.06  0 0.08
## 6   6  0 0.06  0 0.08
## 7   7  0 0.06  0 0.08
## 8   8  0 0.06  0 0.08
## 9   9  0 0.06  0 0.08
## 10 10  0 0.06  0 0.08
## 11 11  0 0.06  0 0.08
## 12 12  0 0.06  0 0.08
## 13 13  0 0.06  0 0.08
## 14 14  0 0.06  0 0.08
## 15 15  0 0.06  0 0.08
## 16 16  0 0.06  0 0.08
## 17 17  0 0.06  0 0.08
## 18 18  0 0.06  0 0.08
## 19 19  0 0.06  0 0.08

12(b) iii: Full Bounded Model

library(Rcpp)
fit18ab <- survtmle(ftime = EID_Data$ftime, ftype = EID_Data$ftype,
                 trt = EID_Data$trt, adjustVars = EID_Data[c("Sex","Clinic_referred_from","Age_in_months")],
                 glm.ctime = "trt + Sex + Clinic_referred_from + Age_in_months",
                 glm.ftime = "trt + Sex + Clinic_referred_from + Age_in_months",
                 glm.trt = "Sex + Clinic_referred_from + Age_in_months",
                 method = "mean", t0 = 1, returnModels=TRUE,bounds=bf18)
fit18ab
## $est
##            [,1]
## 0 1 0.015520068
## 1 1 0.006963061
## 0 2 0.016783956
## 1 2 0.003199820
## 
## $var
##               0 1          1 1           0 2          1 2
## 0 1  8.710949e-04 2.482888e-06 -1.610862e-05 1.371050e-07
## 1 1  2.482888e-06 1.365718e-05  4.296230e-07 8.828334e-08
## 0 2 -1.610862e-05 4.296230e-07  7.991389e-05 9.249250e-09
## 1 2  1.371050e-07 8.828334e-08  9.249250e-09 9.106529e-06

Call timepoints based on this fit

tp.fit18ab <- timepoints(fit18ab, times = seq_len(19))  # t_0 = 18

# print the object
tp.fit18ab
## $est
##              t1          t2          t3          t4          t5          t6
## 0 1 0.015520068 0.030777416 0.030777416 0.030777416 0.030777416 0.030777416
## 1 1 0.006963061 0.006174571 0.006174571 0.006174571 0.006174571 0.006174571
## 0 2 0.016783956 0.020861794 0.027808995 0.032579641 0.037460856 0.049561328
## 1 2 0.003199820 0.003222324 0.003387215 0.003295824 0.003248530 0.003298410
##              t7          t8          t9         t10         t11         t12
## 0 1 0.030777416 0.030777416 0.030777416 0.030777416 0.030777416 0.030777416
## 1 1 0.006174571 0.006174571 0.006174571 0.006174571 0.006174571 0.006174571
## 0 2 0.049561328 0.054731404 0.054731404 0.054731404 0.054039395 0.058439589
## 1 2 0.003298410 0.003385959 0.003385959 0.003385959 0.006801199 0.011217264
##             t13         t14         t15         t16         t17         t18
## 0 1 0.030777416 0.031848648 0.044912626 0.044912626 0.044912626 0.044912626
## 1 1 0.006174571 0.006910134 0.006177893 0.006177893 0.006177893 0.006177893
## 0 2 0.058439589 0.058439589 0.058439589 0.058439589 0.058439589 0.077544510
## 1 2 0.011217264 0.011217264 0.011217264 0.011217264 0.011217264 0.032682966
##             t19
## 0 1 0.044912626
## 1 1 0.006177893
## 0 2 0.077949390
## 1 2 0.033735847
## 
## $var
##               t1           t2           t3           t4           t5
## 0 1 8.710949e-04 1.118266e-03 1.118266e-03 1.118266e-03 1.118266e-03
## 1 1 1.365718e-05 1.288593e-05 1.288593e-05 1.288593e-05 1.288593e-05
## 0 2 7.991389e-05 9.864723e-05 1.295488e-04 1.588874e-04 1.916890e-04
## 1 2 9.106529e-06 9.089349e-06 9.140553e-06 9.093127e-06 9.075959e-06
##               t6           t7           t8           t9          t10
## 0 1 1.118266e-03 1.118266e-03 1.118266e-03 1.118266e-03 1.118266e-03
## 1 1 1.288593e-05 1.288593e-05 1.288593e-05 1.288593e-05 1.288593e-05
## 0 2 2.780570e-04 2.780570e-04 3.255470e-04 3.255470e-04 3.255470e-04
## 1 2 9.102358e-06 9.102358e-06 9.098804e-06 9.098804e-06 9.098804e-06
##              t11          t12          t13          t14          t15
## 0 1 1.118266e-03 1.118266e-03 1.118266e-03 1.153926e-03 1.308430e-03
## 1 1 1.288593e-05 1.288593e-05 1.288593e-05 1.373118e-05 1.318372e-05
## 0 2 3.435561e-04 3.430637e-04 3.430637e-04 3.430637e-04 3.430637e-04
## 1 2 1.193697e-05 5.928630e-05 5.928630e-05 5.928630e-05 5.928630e-05
##              t16          t17          t18          t19
## 0 1 1.308430e-03 1.308430e-03 1.308430e-03 1.308430e-03
## 1 1 1.318372e-05 1.318372e-05 1.318372e-05 1.318372e-05
## 0 2 3.430637e-04 3.430637e-04 4.106369e-04 4.108522e-04
## 1 2 5.928630e-05 5.928630e-05 6.099240e-05 6.107559e-05

Confidence intervals for estimates

ci_fit18ab<-confint(tp.fit18ab)
ci_fit18ab
## $`0 1`
##             2.5 %     97.5 %
##  [1,] -0.04232693 0.07336707
##  [2,] -0.03476474 0.09631957
##  [3,] -0.03476474 0.09631957
##  [4,] -0.03476474 0.09631957
##  [5,] -0.03476474 0.09631957
##  [6,] -0.03476474 0.09631957
##  [7,] -0.03476474 0.09631957
##  [8,] -0.03476474 0.09631957
##  [9,] -0.03476474 0.09631957
## [10,] -0.03476474 0.09631957
## [11,] -0.03476474 0.09631957
## [12,] -0.03476474 0.09631957
## [13,] -0.03476474 0.09631957
## [14,] -0.03473033 0.09842762
## [15,] -0.02598364 0.11580889
## [16,] -0.02598364 0.11580889
## [17,] -0.02598364 0.11580889
## [18,] -0.02598364 0.11580889
## [19,] -0.02598364 0.11580889
## 
## $`1 1`
##               2.5 %     97.5 %
##  [1,] -0.0002801078 0.01420623
##  [2,] -0.0008611070 0.01321025
##  [3,] -0.0008611070 0.01321025
##  [4,] -0.0008611070 0.01321025
##  [5,] -0.0008611070 0.01321025
##  [6,] -0.0008611070 0.01321025
##  [7,] -0.0008611070 0.01321025
##  [8,] -0.0008611070 0.01321025
##  [9,] -0.0008611070 0.01321025
## [10,] -0.0008611070 0.01321025
## [11,] -0.0008611070 0.01321025
## [12,] -0.0008611070 0.01321025
## [13,] -0.0008611070 0.01321025
## [14,] -0.0003526318 0.01417290
## [15,] -0.0009386164 0.01329440
## [16,] -0.0009386164 0.01329440
## [17,] -0.0009386164 0.01329440
## [18,] -0.0009386164 0.01329440
## [19,] -0.0009386164 0.01329440
## 
## $`0 2`
##               2.5 %     97.5 %
##  [1,] -0.0007370583 0.03430497
##  [2,]  0.0013951748 0.04032841
##  [3,]  0.0055007805 0.05011721
##  [4,]  0.0078741915 0.05728509
##  [5,]  0.0103248002 0.06459691
##  [6,]  0.0168788500 0.08224381
##  [7,]  0.0168788500 0.08224381
##  [8,]  0.0193679268 0.09009488
##  [9,]  0.0193679268 0.09009488
## [10,]  0.0193679268 0.09009488
## [11,]  0.0177109381 0.09036785
## [12,]  0.0221371768 0.09474200
## [13,]  0.0221371768 0.09474200
## [14,]  0.0221371768 0.09474200
## [15,]  0.0221371768 0.09474200
## [16,]  0.0221371768 0.09474200
## [17,]  0.0221371768 0.09474200
## [18,]  0.0378274508 0.11726157
## [19,]  0.0382219223 0.11767686
## 
## $`1 2`
##               2.5 %      97.5 %
##  [1,] -2.714769e-03 0.009114408
##  [2,] -2.686682e-03 0.009131331
##  [3,] -2.538412e-03 0.009312842
##  [4,] -2.614411e-03 0.009206058
##  [5,] -2.656123e-03 0.009153182
##  [6,] -2.614824e-03 0.009211643
##  [7,] -2.614824e-03 0.009211643
##  [8,] -2.526120e-03 0.009298038
##  [9,] -2.526120e-03 0.009298038
## [10,] -2.526120e-03 0.009298038
## [11,]  2.953918e-05 0.013572860
## [12,] -3.873988e-03 0.026308516
## [13,] -3.873988e-03 0.026308516
## [14,] -3.873988e-03 0.026308516
## [15,] -3.873988e-03 0.026308516
## [16,] -3.873988e-03 0.026308516
## [17,] -3.873988e-03 0.026308516
## [18,]  1.737611e-02 0.047989820
## [19,]  1.841856e-02 0.049053137

Plot smoothed cumulative incidence

Plot raw cumulative incidence

plot(tp.fit18ab, type = "raw")

Return aic for the Model

fit18ab[["ftimeMod"]][["J1"]][["t1"]][["aic"]]
## NULL