________________________________________________________________________________________________________________
Introduction
The COVID-19 pandemic caused disruptions in Uganda’s health systems including HIV exposed infant diagnosis (EID) services. Despite the disruptions, there is paucity of knowledge as to whether they affected the risk of Mother To Child HIV infection. During EID care, infants can be lost to follow-up, transferred, or even die before testing HIV positive. If an HIV exposed infant dies when they are still HIV negtive, it precludes them from the HIV infection risk group. In this case, Infant mortality is a competing event of Infant HIV infection. Because of the need to account for censoring and competing nature of EID outcomes, survival analysis approaches become handy. However, due to improvement in PMTCT services, infant HIV infection during EID care is rare. This would require a longer follow up time to obtain sufficient events for an adequately powered study in case traditional survival analysis approaches are to be used. In these settings, it is possible to improve upon existing estimators by considering estimation in a bounded statistical model. This bounded model incorporates existing scientific knowledge about the incidence of an event in the population. This study utilizes the framework of targeted minimum loss-based estimation (TMLE) to construct an estimator of cumulative incidence that respects the bounds, and thereby compare cumulative incidences of infant HIV infection before and after the COVID-19 pandemic at Kisenyi Health Centre IV in Kampala, Uganda.
Research Questions
How does EID care Infant HIV cumulative incidence at Kisenyi Health Centre IV compare before and after the COVID-19 pandemic?
Do bounded and non-bounded estimates of infant HIV cumulative incidence differ?
What factors were associated with EID care infant HIV infection at Kisenyi Health Centre IV before and after the COVID-19 pandemic?
Hypothesis
The hypothesis of this study is that due to restrictions aimed at preventing the spread of COVID-19 in Uganda such as nation-wide lockdowns, suspension of public transport and curfew, travels of mothers and their infants to health centers for HIV testing, ARV refills and attendance of follow up visits were limited. The ensued interruptions in infant feeding counseling, uptake of ART and initiation or completion of infant HIV prophylaxis could have led to an increase in the risk of mothers transmitting HIV to their children.
Study Objectives
General objective
To study EID care infant HIV infection cumulative incidence at Kisenyi Health Centre IV in Kampala, Uganda, before and after the COVID-19 pandemic, using survival analysis.
Specific objectives
To compare EID care Infant HIV cumulative incidence at Kisenyi HC IV in Kampala, Uganda, before and after the COVID-19 pandemic.
To compare bounded and non-bounded estimates of cumulative incidence at Kisenyi HC IV in Kampala, Uganda, before and after the COVID-19 pandemic.
To study factors associated with EID care Infant HIV cumulative incidence at Kisenyi HC IV in Kampala, Uganda, before and after the COVID-19 pandemic.
Analysis Plan
The HIV exposed infant characteristics shall be summarised using mean (SD) and median (IQR) for continuous variables and proportions for categorical variables. Maternal and Infant baseline characteristics shall be compared between the group before and the group after the COVID-19 pandemic, and the failure types/events observed. The Student’s t-test shall be used for continuous variables while Fisher’s exact test and the Chi-square test shall be used for categorical variables.
For comparison of cumulative incidence estimates before and after the COVID-19 pandemic, bounded and non-bounded estimates of cumulative incidence shall be obtained using Targeted minimum loss-based estimation (TMLE) and cumulative incidence curves plotted.
The bounded model shall be based on iterated means of the cumulative incidence rather than the cause specific hazards. The R package survtmle, facilitates use of TMLE to compute baseline covariate-adjusted estimates of marginal cumulative incidence in right-censored survival settings with and without competing risks. (David Benkeser & Nima Hejazi, 2019).
The 95% Confidence intervals of the obtained cumulative incidence estimates in the bounded covariate adjusted model shall be used to select for factors associated with EID care Infant HIV cumulative incidence rates at Kisenyi HC IV in Kampala, Uganda, before and after the COVID-19 pandemic.
1. Importing the EID Dataset , specifying the datatypes
library(readxl)
EID_Dataset <- read_excel("EID Dataset.xlsx",
col_types = c("text", "text", "text",
"text", "text", "numeric", "text",
"text", "numeric", "text", "numeric",
"text", "text", "text", "text", "text",
"text", "numeric", "text", "text",
"text", "numeric", "text", "text",
"text", "numeric", "text", "numeric",
"text", "numeric", "text", "text",
"text", "numeric", "text"))
Description of Variables
| Variable | Categories | Description |
|---|---|---|
| Cohort | -After -Before |
An indication of whether the infant was enrolled into care before or after the COVID-19 Pandemic. |
| Sex | 1- Male 2- Female |
Sex of the infant. |
| Age_in_months | Age of the infant at registration in Months. | |
| Clinic_referred_from | 1-EMTCT 2-OPD 3-YCC 4-Post natal 5-Other 6-Referral |
The Clinic through which the infant was enrolled into EID Care. |
| Age_NVP_initiation | Age at NVP Initiation. | |
| Age_at_cotrim | Age at Cotrimoxazole Initiation. | |
| ART_B_feeding | 0-No 1-Yes 2-Already on ART |
An indication of whether the Mother was initiated on ART during breastfeeding. |
| ARVs_EMTCT | 1-Lifelong ART 2-No ART 3-Unknown |
An indication of whether the Mother was given EMTCT ARVs during ANC and PNC. |
| Infant_ARVs_EMTCT | 1-Received NVP within 72 hours of birth until 6 or 12 weeks 2-NVPV taken 72 hours after birth 3-No ARVs were given at birth 4- Unknown: infant eMTCT regimen is not known. |
An indication of whether/how the infant was given EMTCT ARVs. |
| B_feeding_1st_DBS | 1-Exclusive Breastfeeding 2-Replacement feeding(never breastfed) 3-Mixed feeding(<6Months) 4-Complementary feeding(>6Months) 5-Weaning 6-No longer breastfeeding |
Breastfeeding practice at the 1st DBS. |
| Number_of_visits | The maximum number of visits the Infant/caregiver made during EID Care. | |
| Age_at_visit | Age of the infant at the last EID Care Visit. | |
| Age_last_visit | Age of the infant at the first EID Care Visit. | |
| feeding_at_visit | 1-Exclusive Breastfeeding 2-Replacement feeding(never breastfed) 3-Mixed feeding(<6Months) 4-Complementary feeding(>6Months) 5-Weaning 6-No longer breastfeeding |
Feeding practice of the infant at the last EID Care Visit. |
| Wasting_2 | An indication of wasting in the infant at the last EID Care Visit. | |
| final_outcome | 1-Discharged Negative 2-Referred for ART 3-Lost 4-Died Negative 5-Transferred |
Outcome of the EID Care |
2. View a few observations of the imported Dataset
head(EID_Dataset)
## Cohort Infant_No Date_registration Sex Date_of_birth Age_in_months
## 1 After 620001 14/7/2020 2 43867 1.50
## 2 After 620002 14/7/2020 2 43867 1.50
## 3 After 620003 15/7/2020 2 43896 1.50
## 4 After 620004 15/7/2020 2 43867 1.50
## 5 After 620005 16/7/2020 1 44080 1.25
## 6 After 620006 16/7/2020 1 43927 1.50
## Clinic_referred_from Date_at_NVP_initiation Age_NVP_initiation Date_at_cotrim
## 1 1 43867 1.5 14/7/2020
## 2 1 43836 1.5 14/7/2020
## 3 1 43896 1.5 15/7/2020
## 4 1 43867 1.5 15/7/2020
## 5 1 44080 1.5 21/7/2020
## 6 1 43988 1.5 16/7/2020
## Age_at_cotrim HIV_B_feeding ART_B_feeding ARVs_EMTCT Infant_ARVs_EMTCT
## 1 1.5 0 1 1 1
## 2 1.5 0 1 1 1
## 3 1.5 0 1 1 1
## 4 1.5 0 1 1 1
## 5 1.5 0 1 1 1
## 6 1.5 0 1 1 1
## _1st_PCR_test Date_1st_DBS Age_1st_DBS_001 B_feeding_1st_DBS _1st_DBS_result
## 1 1 14/7/2020 1.50 1 2
## 2 1 14/7/2020 1.50 1 2
## 3 1 15/7/2020 1.50 1 2
## 4 1 15/7/2020 1.50 1 2
## 5 1 16/7/2020 1.25 1 2
## 6 1 16/7/2020 1.50 6 2
## Date_2nd_DBS Age_2nd_DBS_001 B_feeding_2nd_DBS _2nd_DBS_result
## 1 44258 9.00 4 1
## 2 44258 9.00 4 2
## 3 44258 9.00 4 2
## 4 24/2/2021 9.00 <NA> 2
## 5 24/2/2021 8.75 6 2
## 6 24/2/2022 9.00 6 2
## result_rapid_test Number_of_visits Date_of_first_visit Age_at_visit
## 1 <NA> 7 14/7/2020 1.50
## 2 2 6 14/7/2020 1.50
## 3 2 6 15/7/2020 1.50
## 4 2 4 15/7/2020 1.50
## 5 <NA> 2 16/7/2020 1.25
## 6 <NA> 5 16/7/2020 1.50
## Date_of_last_visit Age_last_visit Cotrim_at_visit NVP_at_visit
## 1 26/5/2021 11.0 1 2
## 2 26/5/2021 11.5 2 2
## 3 26/5/2021 12.0 1 1
## 4 26/5/2021 11.0 1 2
## 5 20/7/2020 2.5 1 2
## 6 31/5/2021 12.0 1 2
## feeding_at_visit Wasting_2 final_outcome
## 1 4 14.8 1
## 2 6 14.5 1
## 3 1 15.3 3
## 4 1 14.5 3
## 5 1 NA 1
## 6 6 14.5 1
3. View Column heads of the dataset
colnames(EID_Dataset)
## [1] "Cohort" "Infant_No" "Date_registration"
## [4] "Sex" "Date_of_birth" "Age_in_months"
## [7] "Clinic_referred_from" "Date_at_NVP_initiation" "Age_NVP_initiation"
## [10] "Date_at_cotrim" "Age_at_cotrim" "HIV_B_feeding"
## [13] "ART_B_feeding" "ARVs_EMTCT" "Infant_ARVs_EMTCT"
## [16] "_1st_PCR_test" "Date_1st_DBS" "Age_1st_DBS_001"
## [19] "B_feeding_1st_DBS" "_1st_DBS_result" "Date_2nd_DBS"
## [22] "Age_2nd_DBS_001" "B_feeding_2nd_DBS" "_2nd_DBS_result"
## [25] "result_rapid_test" "Number_of_visits" "Date_of_first_visit"
## [28] "Age_at_visit" "Date_of_last_visit" "Age_last_visit"
## [31] "Cotrim_at_visit" "NVP_at_visit" "feeding_at_visit"
## [34] "Wasting_2" "final_outcome"
4. Subset the dataset to keep fewer/important variables
EID_Data<-EID_Dataset[,c(
"Cohort","Infant_No","Sex","Age_in_months","Clinic_referred_from","Age_NVP_initiation",
"Age_at_cotrim","HIV_B_feeding","ART_B_feeding","ARVs_EMTCT","Infant_ARVs_EMTCT","B_feeding_1st_DBS",
"Number_of_visits","Age_at_visit","Age_last_visit","feeding_at_visit", "Wasting_2",
"final_outcome"
)]
We need to prepare the dataset to a structure suitable for the “Survtmle” package/function. The simple data structure should contain:
A set of baseline covariates (adjustVars),
A binary treatment variable (trt),
A failure time that is a function of the treatment, adjustment variables, and a random error (ftime),
A failure type (ftype), which denotes the cause of failure (0 means no failure, 1 means failure)
5. Calculate follow up time from age (in months) at registration and age at last visit(t)
library(tidyverse)
Age_last<-transmute(EID_Data,Age_last=as.integer(Age_last_visit))
Age_first<-transmute(EID_Data,Age_first=as.integer(Age_in_months))
age_diff<-cbind(Age_last,Age_first)
time<-transmute(age_diff,ftime=Age_last-Age_first)
EID_Data<-cbind(EID_Data,time)
6. View the first observations of the new Dataset
head(EID_Data)
## Cohort Infant_No Sex Age_in_months Clinic_referred_from Age_NVP_initiation
## 1 After 620001 2 1.50 1 1.5
## 2 After 620002 2 1.50 1 1.5
## 3 After 620003 2 1.50 1 1.5
## 4 After 620004 2 1.50 1 1.5
## 5 After 620005 1 1.25 1 1.5
## 6 After 620006 1 1.50 1 1.5
## Age_at_cotrim HIV_B_feeding ART_B_feeding ARVs_EMTCT Infant_ARVs_EMTCT
## 1 1.5 0 1 1 1
## 2 1.5 0 1 1 1
## 3 1.5 0 1 1 1
## 4 1.5 0 1 1 1
## 5 1.5 0 1 1 1
## 6 1.5 0 1 1 1
## B_feeding_1st_DBS Number_of_visits Age_at_visit Age_last_visit
## 1 1 7 1.50 11.0
## 2 1 6 1.50 11.5
## 3 1 6 1.50 12.0
## 4 1 4 1.50 11.0
## 5 1 2 1.25 2.5
## 6 6 5 1.50 12.0
## feeding_at_visit Wasting_2 final_outcome ftime
## 1 4 14.8 1 10
## 2 6 14.5 1 10
## 3 1 15.3 3 11
## 4 1 14.5 3 10
## 5 1 NA 1 1
## 6 6 14.5 1 11
7. Renaming and Re-coding the Cohort and final_outcome Variables :
Cohort to trt: a binary treatment variable.i.e “0” - Unexposed/Before COVID-19 , “1” - Exposed/After the COVID-19 Pandemic.
final_outcome to ftype: which denotes the cause of failure (“0” means no failure, “1” means failure type 1/HIV Infection, “2” means failure type 2/ Infant Mortality)
EID_Data<-EID_Data %>% rename(
trt = Cohort, ftype = final_outcome
)
8. View the data structure
str(EID_Data)
## 'data.frame': 557 obs. of 19 variables:
## $ trt : chr "After" "After" "After" "After" ...
## $ Infant_No : chr "620001" "620002" "620003" "620004" ...
## $ Sex : chr "2" "2" "2" "2" ...
## $ Age_in_months : num 1.5 1.5 1.5 1.5 1.25 1.5 1.5 1.5 1.5 1 ...
## $ Clinic_referred_from: chr "1" "1" "1" "1" ...
## $ Age_NVP_initiation : num 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 ...
## $ Age_at_cotrim : num 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 ...
## $ HIV_B_feeding : chr "0" "0" "0" "0" ...
## $ ART_B_feeding : chr "1" "1" "1" "1" ...
## $ ARVs_EMTCT : chr "1" "1" "1" "1" ...
## $ Infant_ARVs_EMTCT : chr "1" "1" "1" "1" ...
## $ B_feeding_1st_DBS : chr "1" "1" "1" "1" ...
## $ Number_of_visits : num 7 6 6 4 2 5 6 8 8 7 ...
## $ Age_at_visit : num 1.5 1.5 1.5 1.5 1.25 1.5 1.5 1.5 1.5 1 ...
## $ Age_last_visit : num 11 11.5 12 11 2.5 12 12 9 13 12.5 ...
## $ feeding_at_visit : chr "4" "6" "1" "1" ...
## $ Wasting_2 : num 14.8 14.5 15.3 14.5 NA 14.5 15 14.5 14.5 14 ...
## $ ftype : chr "1" "1" "3" "3" ...
## $ ftime : int 10 10 11 10 1 11 11 8 12 11 ...
9 . Re-coding of “trt”, “ftype” and “Clinic_referred_from” variables
library(Hmisc)
EID_Data$trt<-recode(EID_Data$trt, After = "1", Before = "0")
EID_Data$ftype<-recode(EID_Data$ftype,
"1"="0", "2" = "1", "3" = "0", "4" = "2", "5" = "2","6" = "0" )
# Recoding the clinic through which the infant joined care to:
# 0. EMTCT and 1. Other
EID_Data$Clinic_referred_from<-recode(EID_Data$Clinic_referred_from,
"1"="0","2"="1","3"="1","4"="1","5"="1","6"="1")
# Categorizing the "Number_of_visits" variable
EID_Data$Visits<- cut(EID_Data$Number_of_visits,
breaks=c(0,3,6,9,12),
labels=c('<=3','3 to 5','6 to 8','8 to 11'))
# Recoding indication of whether the Mother was given EMTCT ARVs during ANC and PNC
#0. No, 1. Yes
EID_Data$ARVs_EMTCT<-recode(EID_Data$ARVs_EMTCT,
"1"="1","2"="0","3"="0","NA"="0")
# Recoding indication of whether the Infant was given EMTCT ARVs during ANC and PNC
#0. No, 1. Yes
EID_Data$Infant_ARVs_EMTCT<-recode(EID_Data$Infant_ARVs_EMTCT,
"1"="1","2"="1","3"="0","4"="0","NA"="0")
EID_Data$feeding_at_visit<-recode(EID_Data$feeding_at_visit,
"1"="1", "2" = "0", "3" = "1", "4" = "1", "5" = "0","6" = "0","NA"="0" )
10.(a) Exploratory Analysis of Categorical Variables: Proportions/Cross Tabulation
# Preparing for cross Tabulation: load the "sjplot" library
req <- substitute(require(x, character.only = TRUE))
libs<-c("sjPlot")
sapply(libs, function(x) eval(req) || {install.packages(x); eval(req)})
## sjPlot
## TRUE
(i) Proportions/Cross Tabulation: Categorical Variables/Characteristics by Cohort
sjPlot::tab_xtab(var.row =EID_Data$trt , var.col = EID_Data$ftype,
title = "Cross Tabulation of the Failure Type by Cohort",
show.row.prc = TRUE)
| trt | ftype | Total | ||
|---|---|---|---|---|
| 0 | 1 | 2 | ||
| 0 |
240 91.6 % |
6 2.3 % |
16 6.1 % |
262 100 % |
| 1 |
291 98.6 % |
2 0.7 % |
2 0.7 % |
295 100 % |
| Total |
531 95.3 % |
8 1.4 % |
18 3.2 % |
557 100 % |
χ2=15.888 · df=2 · Cramer’s V=0.169 · Fisher’s p=0.000 |
sjPlot::tab_xtab(var.row =EID_Data$trt , var.col = EID_Data$feeding_at_visit,
title = "Cross Tabulation of the Feeding Type by Cohort",
show.row.prc = TRUE)
| trt | feeding_at_visit | Total | |
|---|---|---|---|
| 0 | 1 | ||
| 0 |
137 56.4 % |
106 43.6 % |
243 100 % |
| 1 |
161 60.3 % |
106 39.7 % |
267 100 % |
| Total |
298 58.4 % |
212 41.6 % |
510 100 % |
χ2=0.652 · df=1 · φ=0.040 · p=0.419 |
sjPlot::tab_xtab(var.row =EID_Data$trt , var.col = EID_Data$Sex,
title = "Cross Tabulation of Sex by Cohort",
show.row.prc = TRUE)
| trt | Sex | Total | |
|---|---|---|---|
| 1 | 2 | ||
| 0 |
132 50.4 % |
130 49.6 % |
262 100 % |
| 1 |
155 52.5 % |
140 47.5 % |
295 100 % |
| Total |
287 51.5 % |
270 48.5 % |
557 100 % |
χ2=0.180 · df=1 · φ=0.022 · p=0.671 |
sjPlot::tab_xtab(var.row =EID_Data$trt , var.col = EID_Data$HIV_B_feeding,
title = "Cross Tabulation of HIV Testing at breastfeeding by Cohort",
show.row.prc = TRUE)
| trt | HIV_B_feeding | Total | |
|---|---|---|---|
| 0 | 1 | ||
| 0 |
253 96.9 % |
8 3.1 % |
261 100 % |
| 1 |
295 100 % |
0 0 % |
295 100 % |
| Total |
548 98.6 % |
8 1.4 % |
556 100 % |
χ2=7.140 · df=1 · φ=0.128 · Fisher’s p=0.002 |
sjPlot::tab_xtab(var.row =EID_Data$trt , var.col = EID_Data$ART_B_feeding,
title = "Cross Tabulation of ART at Breastfeeding by Cohort",
show.row.prc = TRUE)
| trt | ART_B_feeding | Total | |
|---|---|---|---|
| 0 | 1 | ||
| 0 |
252 96.2 % |
10 3.8 % |
262 100 % |
| 1 |
230 78 % |
65 22 % |
295 100 % |
| Total |
482 86.5 % |
75 13.5 % |
557 100 % |
χ2=37.973 · df=1 · φ=0.266 · p=0.000 |
sjPlot::tab_xtab(var.row =EID_Data$trt , var.col = EID_Data$ARVs_EMTCT,
title = "Cross Tabulation of Mother's EMTCT ARVs by Cohort",
show.row.prc = TRUE)
| trt | ARVs_EMTCT | Total | |
|---|---|---|---|
| 0 | 1 | ||
| 0 |
13 5 % |
249 95 % |
262 100 % |
| 1 |
5 1.7 % |
290 98.3 % |
295 100 % |
| Total |
18 3.2 % |
539 96.8 % |
557 100 % |
χ2=3.749 · df=1 · φ=0.092 · Fisher’s p=0.033 |
sjPlot::tab_xtab(var.row =EID_Data$trt , var.col = EID_Data$Infant_ARVs_EMTCT,
title = "Cross Tabulation of Infant's EMTCT ARVs by Cohort",
show.row.prc = TRUE)
| trt | Infant_ARVs_EMTCT | Total | |
|---|---|---|---|
| 0 | 1 | ||
| 0 |
11 4.2 % |
251 95.8 % |
262 100 % |
| 1 |
3 1 % |
292 99 % |
295 100 % |
| Total |
14 2.5 % |
543 97.5 % |
557 100 % |
χ2=4.507 · df=1 · φ=0.101 · Fisher’s p=0.027 |
sjPlot::tab_xtab(var.row =EID_Data$trt , var.col = EID_Data$Clinic_referred_from,
title = "Cross Tabulation of Clinic to EID Care by Cohort",
show.row.prc = TRUE)
| trt | Clinic_referred_from | Total | |
|---|---|---|---|
| 0 | 1 | ||
| 0 |
183 69.8 % |
79 30.2 % |
262 100 % |
| 1 |
276 93.6 % |
19 6.4 % |
295 100 % |
| Total |
459 82.4 % |
98 17.6 % |
557 100 % |
χ2=52.189 · df=1 · φ=0.311 · p=0.000 |
sjPlot::tab_xtab(var.row =EID_Data$trt , var.col = EID_Data$Visits,
title = "Cross Tabulation of Visits by Cohort",
show.row.prc = TRUE)
| trt | Visits | Total | |||
|---|---|---|---|---|---|
| <=3 | 3 to 5 | 6 to 8 | 8 to 11 | ||
| 0 |
45 17.2 % |
56 21.4 % |
119 45.4 % |
42 16 % |
262 100 % |
| 1 |
43 14.7 % |
109 37.2 % |
137 46.8 % |
4 1.4 % |
293 100 % |
| Total |
88 15.9 % |
165 29.7 % |
256 46.1 % |
46 8.3 % |
555 100 % |
χ2=48.145 · df=3 · Cramer’s V=0.295 · p=0.000 |
sjPlot::tab_xtab(var.row =EID_Data$trt , var.col = EID_Data$feeding_at_visit,
title = "Cross Tabulation of the Feeding Type at the last Visit and the Cohort",
show.row.prc = TRUE)
| trt | feeding_at_visit | Total | |
|---|---|---|---|
| 0 | 1 | ||
| 0 |
137 56.4 % |
106 43.6 % |
243 100 % |
| 1 |
161 60.3 % |
106 39.7 % |
267 100 % |
| Total |
298 58.4 % |
212 41.6 % |
510 100 % |
χ2=0.652 · df=1 · φ=0.040 · p=0.419 |
(ii) Proportions/Cross Tabulation: Categorical Variables/Characteristics by Failure type
sjPlot::tab_xtab(var.row =EID_Data$ftype , var.col = EID_Data$Clinic_referred_from,
title = "Cross Tabulation of the Clinic the infant came from and the Failure Type",
show.row.prc = TRUE)
| ftype | Clinic_referred_from | Total | |
|---|---|---|---|
| 0 | 1 | ||
| 0 |
442 83.2 % |
89 16.8 % |
531 100 % |
| 1 |
6 75 % |
2 25 % |
8 100 % |
| 2 |
11 61.1 % |
7 38.9 % |
18 100 % |
| Total |
459 82.4 % |
98 17.6 % |
557 100 % |
χ2=6.187 · df=2 · Cramer’s V=0.105 · Fisher’s p=0.031 |
sjPlot::tab_xtab(var.row =EID_Data$ftype , var.col = EID_Data$HIV_B_feeding,
title = "Cross Tabulation of HIV Testing at breastfeeding and the Failure Type",
show.row.prc = TRUE)
| ftype | HIV_B_feeding | Total | |
|---|---|---|---|
| 0 | 1 | ||
| 0 |
523 98.7 % |
7 1.3 % |
530 100 % |
| 1 |
7 87.5 % |
1 12.5 % |
8 100 % |
| 2 |
18 100 % |
0 0 % |
18 100 % |
| Total |
548 98.6 % |
8 1.4 % |
556 100 % |
χ2=7.217 · df=2 · Cramer’s V=0.114 · Fisher’s p=0.127 |
sjPlot::tab_xtab(var.row =EID_Data$ftype , var.col = EID_Data$ART_B_feeding,
title = "Cross Tabulation of ARVs at breastfeeding and the Failure Type",
show.row.prc = TRUE)
| ftype | ART_B_feeding | Total | |
|---|---|---|---|
| 0 | 1 | ||
| 0 |
460 86.6 % |
71 13.4 % |
531 100 % |
| 1 |
7 87.5 % |
1 12.5 % |
8 100 % |
| 2 |
15 83.3 % |
3 16.7 % |
18 100 % |
| Total |
482 86.5 % |
75 13.5 % |
557 100 % |
χ2=0.169 · df=2 · Cramer’s V=0.017 · Fisher’s p=0.877 |
sjPlot::tab_xtab(var.row =EID_Data$ftype , var.col = EID_Data$ARVs_EMTCT,
title = "Cross Tabulation of Mother's ARVs at EMTCT and the Failure Type",
show.row.prc = TRUE)
| ftype | ARVs_EMTCT | Total | |
|---|---|---|---|
| 0 | 1 | ||
| 0 |
14 2.6 % |
517 97.4 % |
531 100 % |
| 1 |
3 37.5 % |
5 62.5 % |
8 100 % |
| 2 |
1 5.6 % |
17 94.4 % |
18 100 % |
| Total |
18 3.2 % |
539 96.8 % |
557 100 % |
χ2=30.954 · df=2 · Cramer’s V=0.236 · Fisher’s p=0.001 |
sjPlot::tab_xtab(var.row =EID_Data$ftype , var.col = EID_Data$Infant_ARVs_EMTCT,
title = "Cross Tabulation of Mother's ARVs at EMTCT and the Failure Type",
show.row.prc = TRUE)
| ftype | Infant_ARVs_EMTCT | Total | |
|---|---|---|---|
| 0 | 1 | ||
| 0 |
11 2.1 % |
520 97.9 % |
531 100 % |
| 1 |
1 12.5 % |
7 87.5 % |
8 100 % |
| 2 |
2 11.1 % |
16 88.9 % |
18 100 % |
| Total |
14 2.5 % |
543 97.5 % |
557 100 % |
χ2=9.109 · df=2 · Cramer’s V=0.128 · Fisher’s p=0.042 |
sjPlot::tab_xtab(var.row =EID_Data$ftype , var.col = EID_Data$Sex,
title = "Cross Tabulation of Sex and the Failure Type",
show.row.prc = TRUE)
| ftype | Sex | Total | |
|---|---|---|---|
| 1 | 2 | ||
| 0 |
278 52.4 % |
253 47.6 % |
531 100 % |
| 1 |
2 25 % |
6 75 % |
8 100 % |
| 2 |
7 38.9 % |
11 61.1 % |
18 100 % |
| Total |
287 51.5 % |
270 48.5 % |
557 100 % |
χ2=3.550 · df=2 · Cramer’s V=0.080 · Fisher’s p=0.179 |
sjPlot::tab_xtab(var.row =EID_Data$ftype , var.col = EID_Data$feeding_at_visit,
title = "Cross Tabulation of the Feeding Type at the last Visit and the Failure Type",
show.row.prc = TRUE)
| ftype | feeding_at_visit | Total | |
|---|---|---|---|
| 0 | 1 | ||
| 0 |
295 60.7 % |
191 39.3 % |
486 100 % |
| 1 |
2 25 % |
6 75 % |
8 100 % |
| 2 |
1 6.2 % |
15 93.8 % |
16 100 % |
| Total |
298 58.4 % |
212 41.6 % |
510 100 % |
χ2=22.647 · df=2 · Cramer’s V=0.211 · Fisher’s p=0.000 |
10(b): Exploratory analysis of continuous variables: Summary Statistics
#Summary statistics for Age at registration/first visit and the Clinic of entry into EID care
# Age at Registration by cohort
tapply(EID_Data$Age_in_months, EID_Data$trt, summary)
## $`0`
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.000 1.400 1.433 1.971 1.533 13.833
##
## $`1`
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1.000 1.500 1.500 2.255 2.500 16.000
# Age at last Visit by Clinic of entry into EID
tapply(EID_Data$Age_in_months, EID_Data$Clinic_referred_from, summary)
## $`0`
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.000 1.450 1.500 1.876 1.817 12.000
##
## $`1`
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.000 1.433 1.500 3.275 3.875 16.000
# Age at last Visit by Cohort
tapply(EID_Data$Age_last_visit, EID_Data$trt, summary)
## $`0`
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1.5 8.0 13.0 12.1 17.0 45.5
##
## $`1`
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1.50 9.50 13.00 12.20 15.43 20.00
# Age at NVP Initiation by Cohort
tapply(EID_Data$Age_NVP_initiation, EID_Data$trt, summary)
## $`0`
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## 0.0000 0.0000 0.0000 0.1818 0.0000 6.0000 196
##
## $`1`
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## 0.0000 0.0000 0.0000 0.8166 1.5000 14.0000 6
# Age at Cotrimoxazole Initiation by Cohort
tapply(EID_Data$Age_at_cotrim, EID_Data$trt, summary)
## $`0`
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## 1.500 1.500 1.500 2.003 1.500 15.000 148
##
## $`1`
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## 0.000 1.500 1.500 2.092 2.000 13.000 5
10(b) ii Use the Kruskal-Wallis test to check for difference in the groups
If the p-value is less than the significance level 0.05, we can conclude that there are significant differences between the treatment groups. The test can be performed using the function “kruskal.test()”
From Multiple pairwise-comparison between groups From the output of the Kruskal-Wallis test, we may know that there is a significant difference between groups, but we don’t know which pairs of groups are different.It’s possible to use the function “pairwise.wilcox.test()” to calculate pairwise comparisons between group levels with corrections for multiple testing.
## Comparison of age at first visit in the two cohorts
kruskal.test(Age_in_months ~ trt, data = EID_Data)
##
## Kruskal-Wallis rank sum test
##
## data: Age_in_months by trt
## Kruskal-Wallis chi-squared = 120.48, df = 1, p-value < 2.2e-16
## Comparison of age at first visit in the treatment groups
kruskal.test(Age_in_months ~ ftype, data = EID_Data)
##
## Kruskal-Wallis rank sum test
##
## data: Age_in_months by ftype
## Kruskal-Wallis chi-squared = 3.1702, df = 2, p-value = 0.2049
11. Infant HIV Infection Cumulative Incidence Estimates
# Load the necessary Packages
library(dplyr)
library(tidycmprsk)
library(ggsurvfit)
library(survival)
library(gtsummary)
library(ggplot2)
# Convert the Cohort (trt) and Outcome (ftype) variables factors
EID<-EID_Data
EID$trt<-as.factor(EID$trt)
EID$ftype<-as.factor(EID$ftype)
# Split the Dataset by cohort
EID_Before<-filter(EID,trt==0)
EID_After<-filter(EID,trt==1)
# Cumulative Incidence Estimates before COVID-19
Z<-cuminc(Surv(ftime, ftype) ~ 1, data = EID_Before)
tidy(Z,time = c(0, 6,12, 18))
## # A tibble: 8 × 12
## time outcome estimate std.er…¹ conf.…² conf.…³ n.risk n.event n.cen…⁴ cum.e…⁵
## <dbl> <chr> <dbl> <dbl> <dbl> <dbl> <int> <int> <int> <int>
## 1 0 1 0.00763 0.00539 0.00154 0.0255 262 2 7 2
## 2 6 1 0.0155 0.00770 0.00517 0.0369 196 2 49 4
## 3 12 1 0.0155 0.00770 0.00517 0.0369 127 0 67 4
## 4 18 1 0.0317 0.0137 0.0121 0.0672 11 2 115 6
## 5 0 2 0.0191 0.00847 0.00721 0.0417 262 5 7 5
## 6 6 2 0.0533 0.0145 0.0297 0.0866 196 8 49 13
## 7 12 2 0.0643 0.0163 0.0373 0.101 127 2 67 15
## 8 18 2 0.0880 0.0286 0.0425 0.154 11 1 115 16
## # … with 2 more variables: cum.censor <int>, time_max <dbl>, and abbreviated
## # variable names ¹std.error, ²conf.low, ³conf.high, ⁴n.censor, ⁵cum.event
# Cumulative Incidence Estimates After COVID-19
Z<-cuminc(Surv(ftime, ftype) ~ 1, data = EID_After)
tidy(Z,time = c(0, 6,12, 18))
## # A tibble: 8 × 12
## time outcome estimate std.er…¹ conf.…² conf.…³ n.risk n.event n.cen…⁴ cum.e…⁵
## <dbl> <chr> <dbl> <dbl> <dbl> <dbl> <int> <int> <int> <int>
## 1 0 1 0.00678 0.00479 1.38e-3 0.0227 295 2 11 2
## 2 6 1 0.00678 0.00479 1.38e-3 0.0227 236 0 50 2
## 3 12 1 0.00678 0.00479 1.38e-3 0.0227 136 0 120 2
## 4 18 1 0.00678 0.00479 1.38e-3 0.0227 3 0 110 2
## 5 0 2 0.00339 0.00339 3.27e-4 0.0178 295 1 11 1
## 6 6 2 0.00339 0.00339 3.27e-4 0.0178 236 0 50 1
## 7 12 2 0.00905 0.00658 1.71e-3 0.0310 136 1 120 2
## 8 18 2 0.00905 0.00658 1.71e-3 0.0310 3 0 110 2
## # … with 2 more variables: cum.censor <int>, time_max <dbl>, and abbreviated
## # variable names ¹std.error, ²conf.low, ³conf.high, ⁴n.censor, ⁵cum.event
#Estimate of the cumulative incidence at various times by Cohort and Gray’s test to test for a difference between groups over the entire follow-up period
cuminc(Surv(ftime, ftype) ~ trt, data = EID)%>%
tbl_cuminc(
times =c(0,6,12,18),
label_header = "**Month {time}**") %>%
add_p()
| Characteristic | Month 0 | Month 6 | Month 12 | Month 18 | p-value1 |
|---|---|---|---|---|---|
| trt | 0.15 | ||||
| 0 | 0.76% (0.15%, 2.5%) | 1.5% (0.52%, 3.7%) | 1.5% (0.52%, 3.7%) | 3.2% (1.2%, 6.7%) | |
| 1 | 0.68% (0.14%, 2.3%) | 0.68% (0.14%, 2.3%) | 0.68% (0.14%, 2.3%) | 0.68% (0.14%, 2.3%) | |
| 1 Gray's Test | |||||
11 (a) Cumulative Incidence Plot: Infant HIV Infection
# Plot of HIV Infant Infection over the follow-up period
J<-cuminc(Surv(ftime, ftype) ~ trt, data = EID)
ggcuminc(
J,
outcome = c("1"),
linetype_aes = FALSE,
theme = theme_ggsurvfit_default()
)+add_confidence_interval()+add_risktable()+
geom_point()+
coord_cartesian(
xlim = c(0,18),
ylim = NULL,
)+
theme(axis.text.x = element_text(),
axis.text.y = element_text())+
xlab("Follow up Time (Months)")
11 (b) Factors Associated with EID care Infant HIV Infection
# Estimation of the subdistribution hazards, adjusting for the Cohort and Infant's Sex
crr(Surv(ftime, ftype) ~ trt+Sex,data = EID)
##
## Variable Coef SE HR 95% CI p-value
## trt1 -1.09 0.806 0.34 0.07, 1.63 0.18
## Sex2 1.20 0.808 3.32 0.68, 16.2 0.14
# Estimation of the subdistribution hazards, adjusting for the Cohort and Age at first visit
crr(Surv(ftime, ftype) ~ trt+Age_in_months,data = EID)
##
## Variable Coef SE HR 95% CI p-value
## trt1 -1.08 0.791 0.34 0.07, 1.59 0.17
## Age_in_months 0.301 0.055 1.35 1.21, 1.50 <0.001
# Estimation of the subdistribution hazards, adjusting for the Cohort and Mother's ARVs enrollment
crr(Surv(ftime, ftype) ~ trt+ARVs_EMTCT,data = EID)
##
## Variable Coef SE HR 95% CI p-value
## trt1 -0.801 0.861 0.45 0.08, 2.43 0.35
## ARVs_EMTCT1 -2.95 0.807 0.05 0.01, 0.25 <0.001
# Estimation of the subdistribution hazards, adjusting for the Cohort and Infant's ARVs Prophylaxis
crr(Surv(ftime, ftype) ~ trt+Infant_ARVs_EMTCT,data = EID)
##
## Variable Coef SE HR 95% CI p-value
## trt1 -1.01 0.805 0.36 0.07, 1.76 0.21
## Infant_ARVs_EMTCT1 -1.67 1.12 0.19 0.02, 1.70 0.14
# Estimation of the subdistribution hazards, adjusting for the Cohort and the Clinic through which the infant joined EID Care
crr(Surv(ftime, ftype) ~ trt+Clinic_referred_from,data = EID)
##
## Variable Coef SE HR 95% CI p-value
## trt1 -1.09 0.810 0.34 0.07, 1.64 0.18
## Clinic_referred_from1 0.128 0.830 1.14 0.22, 5.78 0.88
# Estimation of the subdistribution hazards, adjusting for the Cohort and Breast feeding status at the last visit
crr(Surv(ftime, ftype) ~ trt+feeding_at_visit,data = EID)
## 47 cases omitted due to missing values
##
## Variable Coef SE HR 95% CI p-value
## trt1 -0.985 0.809 0.37 0.08, 1.82 0.22
## feeding_at_visit1 1.84 0.655 6.31 1.75, 22.7 0.005
# Estimation of the subdistribution hazards, adjusting for the Cohort and the other variables; Full Model
crr(Surv(ftime, ftype) ~ trt+Sex+Age_in_months+ARVs_EMTCT+Infant_ARVs_EMTCT+Clinic_referred_from+feeding_at_visit,data = EID)
## 47 cases omitted due to missing values
##
## Variable Coef SE HR 95% CI p-value
## trt1 -0.589 0.835 0.55 0.11, 2.85 0.48
## Sex2 0.852 0.923 2.34 0.38, 14.3 0.36
## Age_in_months 0.385 0.106 1.47 1.20, 1.81 <0.001
## ARVs_EMTCT1 -2.45 0.736 0.09 0.02, 0.37 <0.001
## Infant_ARVs_EMTCT1 0.276 1.42 1.32 0.08, 21.5 0.85
## Clinic_referred_from1 -1.37 1.14 0.25 0.03, 2.36 0.23
## feeding_at_visit1 2.12 0.795 8.31 1.75, 39.4 0.008
12. Iterated mean-based TMLE
A common goal is to compare the incidence of failure at a fixed time between the two treatment groups. Covariate adjustment is often desirable in this comparison to improve efficiency.This covariate adjustment may be facilitated by estimating a series of iterated covariate-conditional means.The final iterated covariate-conditional mean is marginalized over the empirical distribution of baseline covariates,to obtain an estimate of the marginal cumulative incidence.Here, we invoke the eponymous survtmle function to compute the iterated mean-based (method = “mean”) covariate-adjusted estimates of the cumulative incidence at time six (t0 = 6) in each of the treatment groups using quasi-logistic regression (formula specified via glm.ftime) to estimate the iterated means. The glm.ftime argument should be a valid right-hand-side formula specification based on colnames(adjustVars) and “trt”. Here we use a simple main terms regression.
Internally, survtmle estimates the covariate-conditional treatment probability (via glm.trt or SL.trt) and covariate-conditional censoring distribution (via glm.ctime or SL.ctime,). In the above (fit0), the treatment probability does not depend on covariates and so we did not specify a way to adjust for covariates in estimating the treatment probability.In this case, survtmle sets glm.trt = “1”, which corresponds with empirical estimates of treatment probability,and sets glm.ctime to be equivalent to the Kaplan-Meier censoring distribution estimates.
In practice, we may wish to adjust for covariates when computing estimates of the covariate-conditional treatment and censoring probabilities. In observational studies,the distribution of treatment may differ by measured covariates,while in almost any study (including randomized trials) it is possible that censoring differs by covariates. Thus, we often wish to adjust for covariates to account for measured confounders of treatment receipt and censoring. This adjustment may be accomplished using logistic regression through the “glm.trt” and “glm.ctime” arguments, respectively.The “glm.trt” argument should be a valid right-hand-side formula specification based on colnames(adjustVars).The “glm.ctime” argument should be a valid right-hand-side formula specification based on colnames(adjustVars), “trt”, and “t” used to model the hazard function for censoring. By including “trt” and “t”, the function allows censoring probabilities to depend on treatment assignment and time, respectively.
library(survtmle)
## survtmle: Targeted Learning for Survival Analysis
## Version: 1.1.1
It is important to note that the current “survtmle” distribution only supports integer-valued failure times. We shall convert the “ftype” and “trt” variables to integers prior to applying the “survtmle” function.
EID_Data$ftype<-as.integer(EID_Data$ftype)
EID_Data$trt<-as.integer(EID_Data$trt)
#EID_Data$ftime<-EID_Data$ftime
Some failure times are less than or equal zero. We need remove these observations otherwise the function will return an error. Let’s eliminate zero failure times by adding 1 to the “ftime” variable
EID_Data$ftime<-EID_Data$ftime+1
EID_Data <- EID_Data %>%
mutate_at(c('ARVs_EMTCT'), ~replace_na(.,'0'))
EID_Data <- EID_Data %>%
mutate_at(c('Infant_ARVs_EMTCT'), ~replace_na(.,'0'))
EID_Data <- EID_Data %>%
mutate_at(c('feeding_at_visit'), ~replace_na(.,'0'))
Note:
Since it is not logical to assume that HIV Exposed infants belong to either the pre-COVID 19 or post-COVID-19 group by chance, we take the probability of treatment/exposure to the COVID-19 pandemic to be 1. However, the censoring probability could be confounded by covariates. In the fit below (fit0b), we eliminate “glm.trt” and include “glm.ctime
12(a): Estimation in the unbounded model
fita <- survtmle(ftime = EID_Data$ftime, ftype = EID_Data$ftype,
trt = EID_Data$trt, adjustVars = EID_Data[c("Sex","Clinic_referred_from","Age_in_months")],
glm.ctime = "trt + Sex + Clinic_referred_from + Age_in_months",
glm.ftime = "trt + Sex + Clinic_referred_from + Age_in_months",
glm.trt = "Sex + Clinic_referred_from + Age_in_months",
method = "mean", t0 = 1, returnModels=TRUE)
fita
## $est
## [,1]
## 0 1 0.008326190
## 1 1 0.006509832
## 0 2 0.016694358
## 1 2 0.003154582
##
## $var
## 0 1 1 1 0 2 1 2
## 0 1 9.137230e-05 7.662578e-06 -5.594176e-06 2.058494e-08
## 1 1 7.662578e-06 1.421580e-05 -2.652080e-07 -4.751520e-08
## 0 2 -5.594176e-06 -2.652080e-07 7.997958e-05 7.374044e-09
## 1 2 2.058494e-08 -4.751520e-08 7.374044e-09 9.088585e-06
13. Factors associated with Infant HIV Infection: Cause specific hazards TMLE
Bivariate Analysis: Infant’s Sex
haz_sex<-survtmle(ftime = EID_Data$ftime, ftype = EID_Data$ftype,trt = EID_Data$trt,adjustVars =
EID_Data[c("Sex")],
glm.ctime = "trt + Sex",
glm.ftime = "trt + Sex",
method = "hazard",t0 = 1, returnModels = T)
## Warning in checkInputs(ftime = ftime, ftype = ftype, trt = trt, t0 = t0, :
## glm.trt and SL.trt not specified. Proceeding with glm.trt = '1'
summary(haz_sex$ftimeMod$J1)
## Generalized Linear Model of class 'speedglm':
##
## Call: speedglm::speedglm(formula = reg_form, data = data, family = family, sparse = ifelse(calling_fun == "estimateTreatment", FALSE, TRUE), trace = FALSE, method = "Cholesky")
##
## Coefficients:
## ------------------------------------------------------------------
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -6.975 0.7381 -9.450 3.400e-21 ***
## trt -1.185 0.8171 -1.450 1.469e-01
## Sex2 1.247 0.8170 1.526 1.270e-01
##
## -------------------------------------------------------------------
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## ---
## null df: 6281; null deviance: 122.65;
## residuals df: 6279; residuals deviance: 117.4;
## # obs.: 6282; # non-zero weighted obs.: 6282;
## AIC: 123.399; log Likelihood: -58.69948;
## RSS: 5388.8; dispersion: 1; iterations: 10;
## rank: 3; max tolerance: 6.24e-12; convergence: TRUE.
Bivariate Analysis: Clinic from which the Infant/Mother was referred to EID Care
haz_clinic<-survtmle(ftime = EID_Data$ftime, ftype = EID_Data$ftype,trt = EID_Data$trt,adjustVars =
EID_Data[c("Clinic_referred_from")],
glm.ctime = "trt + Clinic_referred_from",
glm.ftime = "trt + Clinic_referred_from",
method = "hazard",t0 = 1, returnModels = T)
## Warning in checkInputs(ftime = ftime, ftype = ftype, trt = trt, t0 = t0, :
## glm.trt and SL.trt not specified. Proceeding with glm.trt = '1'
summary(haz_clinic$ftimeMod$J1)
## Generalized Linear Model of class 'speedglm':
##
## Call: speedglm::speedglm(formula = reg_form, data = data, family = family, sparse = ifelse(calling_fun == "estimateTreatment", FALSE, TRUE), trace = FALSE, method = "Cholesky")
##
## Coefficients:
## ------------------------------------------------------------------
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -6.270 0.4915 -12.7568 2.857e-37 ***
## trt -1.145 0.8515 -1.3452 1.786e-01
## Clinic_referred_from1 0.212 0.8520 0.2489 8.035e-01
##
## -------------------------------------------------------------------
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## ---
## null df: 6281; null deviance: 122.65;
## residuals df: 6279; residuals deviance: 120.06;
## # obs.: 6282; # non-zero weighted obs.: 6282;
## AIC: 126.0637; log Likelihood: -60.03186;
## RSS: 6298.6; dispersion: 1; iterations: 10;
## rank: 3; max tolerance: 2.62e-13; convergence: TRUE.
Bivariate Analysis: Infant’s Age at first EID Care Visit
haz_age<-survtmle(ftime = EID_Data$ftime, ftype = EID_Data$ftype,trt = EID_Data$trt,adjustVars =
EID_Data[c("Age_in_months")],
glm.ctime = "trt + Age_in_months",
glm.ftime = "trt + Age_in_months",
method = "hazard",t0 = 1, returnModels = T)
## Warning in checkInputs(ftime = ftime, ftype = ftype, trt = trt, t0 = t0, :
## glm.trt and SL.trt not specified. Proceeding with glm.trt = '1'
summary(haz_age$ftimeMod$J1)
## Generalized Linear Model of class 'speedglm':
##
## Call: speedglm::speedglm(formula = reg_form, data = data, family = family, sparse = ifelse(calling_fun == "estimateTreatment", FALSE, TRUE), trace = FALSE, method = "Cholesky")
##
## Coefficients:
## ------------------------------------------------------------------
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -7.2198 0.58170 -12.412 2.263e-35 ***
## trt -1.0489 0.83497 -1.256 2.090e-01
## Age_in_months 0.3542 0.07561 4.685 2.804e-06 ***
##
## -------------------------------------------------------------------
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## ---
## null df: 6281; null deviance: 122.65;
## residuals df: 6279; residuals deviance: 107.36;
## # obs.: 6282; # non-zero weighted obs.: 6282;
## AIC: 113.362; log Likelihood: -53.681;
## RSS: 4622.7; dispersion: 1; iterations: 10;
## rank: 3; max tolerance: 8.19e-13; convergence: TRUE.
Bivariate Analysis: Infant’s initiation on EMTCT ARVs at birth
haz_infant<-survtmle(ftime = EID_Data$ftime, ftype = EID_Data$ftype,trt = EID_Data$trt,adjustVars =
EID_Data[c("Infant_ARVs_EMTCT")],
glm.ctime = "trt + Infant_ARVs_EMTCT",
glm.ftime = "trt + Infant_ARVs_EMTCT",
method = "hazard",t0 = 1, returnModels = T)
## Warning in checkInputs(ftime = ftime, ftype = ftype, trt = trt, t0 = t0, :
## glm.trt and SL.trt not specified. Proceeding with glm.trt = '1'
summary(haz_infant$ftimeMod$J1)
## Generalized Linear Model of class 'speedglm':
##
## Call: speedglm::speedglm(formula = reg_form, data = data, family = family, sparse = ifelse(calling_fun == "estimateTreatment", FALSE, TRUE), trace = FALSE, method = "Cholesky")
##
## Coefficients:
## ------------------------------------------------------------------
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -4.167 1.0128 -4.115 3.878e-05 ***
## trt -1.123 0.8215 -1.367 1.715e-01
## Infant_ARVs_EMTCT1 -2.177 1.0815 -2.013 4.414e-02 *
##
## -------------------------------------------------------------------
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## ---
## null df: 6281; null deviance: 122.65;
## residuals df: 6279; residuals deviance: 117.67;
## # obs.: 6282; # non-zero weighted obs.: 6282;
## AIC: 123.6698; log Likelihood: -58.83489;
## RSS: 6418.5; dispersion: 1; iterations: 10;
## rank: 3; max tolerance: 3.73e-13; convergence: TRUE.
Bivariate Analysis: Mother’s Prolonged ART status
haz_mother<-survtmle(ftime = EID_Data$ftime, ftype = EID_Data$ftype,trt = EID_Data$trt,adjustVars =
EID_Data[c("ARVs_EMTCT")],
glm.ctime = "trt + ARVs_EMTCT",
glm.ftime = "trt + ARVs_EMTCT",
method = "hazard",t0 = 1, returnModels = T)
## Warning in checkInputs(ftime = ftime, ftype = ftype, trt = trt, t0 = t0, :
## glm.trt and SL.trt not specified. Proceeding with glm.trt = '1'
summary(haz_mother$ftimeMod$J1)
## Generalized Linear Model of class 'speedglm':
##
## Call: speedglm::speedglm(formula = reg_form, data = data, family = family, sparse = ifelse(calling_fun == "estimateTreatment", FALSE, TRUE), trace = FALSE, method = "Cholesky")
##
## Coefficients:
## ------------------------------------------------------------------
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -3.4178 0.6000 -5.696 1.226e-08 ***
## trt -0.9282 0.8309 -1.117 2.639e-01
## ARVs_EMTCT1 -3.3127 0.7479 -4.429 9.454e-06 ***
##
## -------------------------------------------------------------------
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## ---
## null df: 6281; null deviance: 122.65;
## residuals df: 6279; residuals deviance: 107.62;
## # obs.: 6282; # non-zero weighted obs.: 6282;
## AIC: 113.6184; log Likelihood: -53.80921;
## RSS: 5615.8; dispersion: 1; iterations: 10;
## rank: 3; max tolerance: 1.58e-12; convergence: TRUE.
Bivariate Analysis: Infant’s breastfeeding status at the last recorded EID Care visit
haz_feed<-survtmle(ftime = EID_Data$ftime, ftype = EID_Data$ftype,trt = EID_Data$trt,adjustVars =
EID_Data[c("feeding_at_visit")],
glm.ctime = "trt + feeding_at_visit",
glm.ftime = "trt + feeding_at_visit",
method = "hazard",t0 = 1, returnModels = T)
## Warning in checkInputs(ftime = ftime, ftype = ftype, trt = trt, t0 = t0, :
## glm.trt and SL.trt not specified. Proceeding with glm.trt = '1'
summary(haz_feed$ftimeMod$J1)
## Generalized Linear Model of class 'speedglm':
##
## Call: speedglm::speedglm(formula = reg_form, data = data, family = family, sparse = ifelse(calling_fun == "estimateTreatment", FALSE, TRUE), trace = FALSE, method = "Cholesky")
##
## Coefficients:
## ------------------------------------------------------------------
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -7.290 0.7384 -9.873 5.462e-23 ***
## trt -1.170 0.8174 -1.431 1.524e-01
## feeding_at_visit1 2.107 0.8172 2.578 9.942e-03 **
##
## -------------------------------------------------------------------
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## ---
## null df: 6281; null deviance: 122.65;
## residuals df: 6279; residuals deviance: 112.07;
## # obs.: 6282; # non-zero weighted obs.: 6282;
## AIC: 118.0654; log Likelihood: -56.03268;
## RSS: 4800.1; dispersion: 1; iterations: 10;
## rank: 3; max tolerance: 2.46e-10; convergence: TRUE.
Mutivariable Analysis
Model 1:Sex, Age
haz_sexage<-survtmle(ftime = EID_Data$ftime, ftype = EID_Data$ftype,trt = EID_Data$trt,adjustVars =
EID_Data[c("Sex","Age_in_months")],
glm.ctime = "trt + Sex + Age_in_months",
glm.ftime = "trt + Sex + Age_in_months",
method = "hazard",t0 = 1, returnModels = T)
## Warning in checkInputs(ftime = ftime, ftype = ftype, trt = trt, t0 = t0, :
## glm.trt and SL.trt not specified. Proceeding with glm.trt = '1'
summary(haz_sexage$ftimeMod$J1)
## Generalized Linear Model of class 'speedglm':
##
## Call: speedglm::speedglm(formula = reg_form, data = data, family = family, sparse = ifelse(calling_fun == "estimateTreatment", FALSE, TRUE), trace = FALSE, method = "Cholesky")
##
## Coefficients:
## ------------------------------------------------------------------
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -7.9721 0.84422 -9.443 3.618e-21 ***
## trt -1.0381 0.83340 -1.246 2.129e-01
## Sex2 1.2292 0.82197 1.495 1.348e-01
## Age_in_months 0.3528 0.07571 4.660 3.161e-06 ***
##
## -------------------------------------------------------------------
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## ---
## null df: 6281; null deviance: 122.65;
## residuals df: 6278; residuals deviance: 104.76;
## # obs.: 6282; # non-zero weighted obs.: 6282;
## AIC: 112.7583; log Likelihood: -52.37916;
## RSS: 5344; dispersion: 1; iterations: 10;
## rank: 4; max tolerance: 2.64e-11; convergence: TRUE.
Model 2: Sex, Age, Clinic
haz_sexageclin<-survtmle(ftime = EID_Data$ftime, ftype = EID_Data$ftype,trt = EID_Data$trt,adjustVars =
EID_Data[c("Sex","Age_in_months","Clinic_referred_from")],
glm.ctime = "trt + Sex + Age_in_months + Clinic_referred_from",
glm.ftime = "trt + Sex + Age_in_months + Clinic_referred_from",
method = "hazard",t0 = 1, returnModels = T)
## Warning in checkInputs(ftime = ftime, ftype = ftype, trt = trt, t0 = t0, :
## glm.trt and SL.trt not specified. Proceeding with glm.trt = '1'
summary(haz_sexageclin$ftimeMod$J1)
## Generalized Linear Model of class 'speedglm':
##
## Call: speedglm::speedglm(formula = reg_form, data = data, family = family, sparse = ifelse(calling_fun == "estimateTreatment", FALSE, TRUE), trace = FALSE, method = "Cholesky")
##
## Coefficients:
## ------------------------------------------------------------------
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -7.994 0.8688 -9.201 3.537e-20 ***
## trt -1.260 0.8413 -1.498 1.343e-01
## Sex2 1.282 0.8257 1.552 1.206e-01
## Age_in_months 0.477 0.1211 3.940 8.150e-05 ***
## Clinic_referred_from1 -1.795 1.3407 -1.339 1.806e-01
##
## -------------------------------------------------------------------
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## ---
## null df: 6281; null deviance: 122.65;
## residuals df: 6277; residuals deviance: 102.62;
## # obs.: 6282; # non-zero weighted obs.: 6282;
## AIC: 112.6166; log Likelihood: -51.30829;
## RSS: 6827.6; dispersion: 1; iterations: 10;
## rank: 5; max tolerance: 1.04e-10; convergence: TRUE.
Model 3: Sex, Age, Clinic, Infant ARVs
haz_sexageclin_infant<-survtmle(ftime = EID_Data$ftime, ftype = EID_Data$ftype,trt = EID_Data$trt,adjustVars =
EID_Data[c("Sex","Age_in_months","Clinic_referred_from","Infant_ARVs_EMTCT")],
glm.ctime = "trt + Sex + Age_in_months + Clinic_referred_from + Infant_ARVs_EMTCT",
glm.ftime = "trt + Sex + Age_in_months + Clinic_referred_from + Infant_ARVs_EMTCT",
method = "hazard",t0 = 1, returnModels = T)
## Warning in checkInputs(ftime = ftime, ftype = ftype, trt = trt, t0 = t0, :
## glm.trt and SL.trt not specified. Proceeding with glm.trt = '1'
summary(haz_sexageclin_infant$ftimeMod$J1)
## Generalized Linear Model of class 'speedglm':
##
## Call: speedglm::speedglm(formula = reg_form, data = data, family = family, sparse = ifelse(calling_fun == "estimateTreatment", FALSE, TRUE), trace = FALSE, method = "Cholesky")
##
## Coefficients:
## ------------------------------------------------------------------
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -7.0942 1.3146 -5.3966 6.791e-08 ***
## trt -1.1562 0.8567 -1.3497 1.771e-01
## Sex2 1.4407 0.8678 1.6602 9.687e-02 .
## Age_in_months 0.4464 0.1207 3.6995 2.160e-04 ***
## Clinic_referred_from1 -1.4414 1.2998 -1.1090 2.674e-01
## Infant_ARVs_EMTCT1 -1.0772 1.2372 -0.8707 3.839e-01
##
## -------------------------------------------------------------------
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## ---
## null df: 6281; null deviance: 122.65;
## residuals df: 6276; residuals deviance: 101.99;
## # obs.: 6282; # non-zero weighted obs.: 6282;
## AIC: 113.9884; log Likelihood: -50.99419;
## RSS: 5477.5; dispersion: 1; iterations: 10;
## rank: 6; max tolerance: 3.16e-10; convergence: TRUE.
Model 4: Sex, Age, Clinic, Maternal ART
haz_sexageclin_mot<-survtmle(ftime = EID_Data$ftime, ftype = EID_Data$ftype,trt = EID_Data$trt,adjustVars =
EID_Data[c("Sex","Age_in_months","Clinic_referred_from", "ARVs_EMTCT")],
glm.ctime = "trt + Sex + Age_in_months + Clinic_referred_from + ARVs_EMTCT",
glm.ftime = "trt + Sex + Age_in_months + Clinic_referred_from + ARVs_EMTCT",
method = "hazard",t0 = 1, returnModels = T)
## Warning in checkInputs(ftime = ftime, ftype = ftype, trt = trt, t0 = t0, :
## glm.trt and SL.trt not specified. Proceeding with glm.trt = '1'
summary(haz_sexageclin_mot$ftimeMod$J1)
## Generalized Linear Model of class 'speedglm':
##
## Call: speedglm::speedglm(formula = reg_form, data = data, family = family, sparse = ifelse(calling_fun == "estimateTreatment", FALSE, TRUE), trace = FALSE, method = "Cholesky")
##
## Coefficients:
## ------------------------------------------------------------------
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -6.1442 1.1226 -5.473 4.418e-08 ***
## trt -1.0710 0.8558 -1.252 2.107e-01
## Sex2 1.6428 0.9394 1.749 8.034e-02 .
## Age_in_months 0.4307 0.1306 3.299 9.715e-04 ***
## Clinic_referred_from1 -1.9704 1.2806 -1.539 1.239e-01
## ARVs_EMTCT1 -2.3706 0.8491 -2.792 5.238e-03 **
##
## -------------------------------------------------------------------
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## ---
## null df: 6281; null deviance: 122.65;
## residuals df: 6276; residuals deviance: 96.09;
## # obs.: 6282; # non-zero weighted obs.: 6282;
## AIC: 108.0906; log Likelihood: -48.0453;
## RSS: 6384.1; dispersion: 1; iterations: 11;
## rank: 6; max tolerance: 2.22e-15; convergence: TRUE.
Model 5: Sex, Age, Clinic, Infant ARVs,Maternal ART
haz_sexageclin_infant_mot<-survtmle(ftime = EID_Data$ftime, ftype = EID_Data$ftype,trt = EID_Data$trt,adjustVars =
EID_Data[c("Sex","Age_in_months","Clinic_referred_from","Infant_ARVs_EMTCT", "ARVs_EMTCT")],
glm.ctime = "trt + Sex + Age_in_months + Clinic_referred_from + Infant_ARVs_EMTCT + ARVs_EMTCT",
glm.ftime = "trt + Sex + Age_in_months + Clinic_referred_from + Infant_ARVs_EMTCT + ARVs_EMTCT",
method = "hazard",t0 = 1, returnModels = T)
## Warning in checkInputs(ftime = ftime, ftype = ftype, trt = trt, t0 = t0, :
## glm.trt and SL.trt not specified. Proceeding with glm.trt = '1'
summary(haz_sexageclin_infant_mot$ftimeMod$J1)
## Generalized Linear Model of class 'speedglm':
##
## Call: speedglm::speedglm(formula = reg_form, data = data, family = family, sparse = ifelse(calling_fun == "estimateTreatment", FALSE, TRUE), trace = FALSE, method = "Cholesky")
##
## Coefficients:
## ------------------------------------------------------------------
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -6.2693 1.4287 -4.3881 1.143e-05 ***
## trt -1.0849 0.8599 -1.2616 2.071e-01
## Sex2 1.5962 0.9790 1.6304 1.030e-01
## Age_in_months 0.4360 0.1373 3.1754 1.496e-03 **
## Clinic_referred_from1 -2.0318 1.3665 -1.4869 1.370e-01
## Infant_ARVs_EMTCT1 0.2072 1.4196 0.1459 8.840e-01
## ARVs_EMTCT1 -2.4154 0.8940 -2.7018 6.896e-03 **
##
## -------------------------------------------------------------------
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## ---
## null df: 6281; null deviance: 122.65;
## residuals df: 6275; residuals deviance: 96.07;
## # obs.: 6282; # non-zero weighted obs.: 6282;
## AIC: 110.0687; log Likelihood: -48.03437;
## RSS: 6244.7; dispersion: 1; iterations: 11;
## rank: 7; max tolerance: 5.32e-15; convergence: TRUE.
Model 6: Sex, Age, Clinic, Maternal ART,Breastfeeding status
haz_sexageclin_motfeed<-survtmle(ftime = EID_Data$ftime, ftype = EID_Data$ftype,trt = EID_Data$trt,adjustVars =
EID_Data[c("Sex","Age_in_months","Clinic_referred_from", "ARVs_EMTCT","feeding_at_visit")],
glm.ctime = "trt + Sex + Age_in_months + Clinic_referred_from + ARVs_EMTCT + feeding_at_visit",
glm.ftime = "trt + Sex + Age_in_months + Clinic_referred_from + ARVs_EMTCT + feeding_at_visit",
method = "hazard",t0 = 1, returnModels = T)
## Warning in checkInputs(ftime = ftime, ftype = ftype, trt = trt, t0 = t0, :
## glm.trt and SL.trt not specified. Proceeding with glm.trt = '1'
summary(haz_sexageclin_motfeed$ftimeMod$J1)
## Generalized Linear Model of class 'speedglm':
##
## Call: speedglm::speedglm(formula = reg_form, data = data, family = family, sparse = ifelse(calling_fun == "estimateTreatment", FALSE, TRUE), trace = FALSE, method = "Cholesky")
##
## Coefficients:
## ------------------------------------------------------------------
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -7.3913 1.2012 -6.1532 7.593e-10 ***
## trt -0.7743 0.8886 -0.8714 3.836e-01
## Sex2 1.0232 0.8700 1.1761 2.396e-01
## Age_in_months 0.4994 0.1232 4.0537 5.042e-05 ***
## Clinic_referred_from1 -1.6846 1.1335 -1.4862 1.372e-01
## ARVs_EMTCT1 -2.5477 0.8300 -3.0696 2.143e-03 **
## feeding_at_visit1 2.6553 0.9315 2.8504 4.366e-03 **
##
## -------------------------------------------------------------------
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## ---
## null df: 6281; null deviance: 122.65;
## residuals df: 6275; residuals deviance: 85.85;
## # obs.: 6282; # non-zero weighted obs.: 6282;
## AIC: 99.85082; log Likelihood: -42.92541;
## RSS: 12784; dispersion: 1; iterations: 11;
## rank: 7; max tolerance: 9.47e-14; convergence: TRUE.
Model 7: Sex, Age, Clinic, Infant ARVs, Maternal ART, Breastfeeding status
fitahaz <- survtmle(ftime = EID_Data$ftime, ftype = EID_Data$ftype,
trt = EID_Data$trt, adjustVars = EID_Data[c("Sex","Clinic_referred_from","Age_in_months","Infant_ARVs_EMTCT","ARVs_EMTCT","feeding_at_visit")],
glm.ctime = "trt + Sex + Clinic_referred_from + Age_in_months + Infant_ARVs_EMTCT + ARVs_EMTCT + feeding_at_visit",
glm.ftime = "trt + Sex + Clinic_referred_from + Age_in_months + Infant_ARVs_EMTCT + ARVs_EMTCT + feeding_at_visit",
method = "hazard", t0 = 1, returnModels=TRUE)
## Warning in checkInputs(ftime = ftime, ftype = ftype, trt = trt, t0 = t0, :
## glm.trt and SL.trt not specified. Proceeding with glm.trt = '1'
summary(fitahaz$ftimeMod$J1)
## Generalized Linear Model of class 'speedglm':
##
## Call: speedglm::speedglm(formula = reg_form, data = data, family = family, sparse = ifelse(calling_fun == "estimateTreatment", FALSE, TRUE), trace = FALSE, method = "Cholesky")
##
## Coefficients:
## ------------------------------------------------------------------
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -7.3783 1.4016 -5.26429 1.407e-07 ***
## trt -0.7729 0.8923 -0.86614 3.864e-01
## Sex2 1.0286 0.9214 1.11641 2.642e-01
## Clinic_referred_from1 -1.6811 1.1488 -1.46341 1.434e-01
## Age_in_months 0.4990 0.1244 4.01063 6.056e-05 ***
## Infant_ARVs_EMTCT1 -0.0241 1.3408 -0.01797 9.857e-01
## ARVs_EMTCT1 -2.5416 0.8968 -2.83416 4.595e-03 **
## feeding_at_visit1 2.6561 0.9328 2.84754 4.406e-03 **
##
## -------------------------------------------------------------------
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## ---
## null df: 6281; null deviance: 122.65;
## residuals df: 6274; residuals deviance: 85.85;
## # obs.: 6282; # non-zero weighted obs.: 6282;
## AIC: 101.8505; log Likelihood: -42.92525;
## RSS: 12805.5; dispersion: 1; iterations: 11;
## rank: 8; max tolerance: 1.25e-13; convergence: TRUE.
Multivariable Model Goodness of Fit Comparison: Akaike Information Criterion (AIC), RSS, DF, etc
| Parameter | Model 1 | Model 2 | Model 3 | Model 4 | Model 5 | Model 6 | Model 7 |
|---|---|---|---|---|---|---|---|
| Log-likelihood | -52.37 | -51.31 | -50.99 | -48.05 | -48.03 | -42.93 | -42.93 |
| Residual Sum of Squares | 5344.00 | 6827.60 | 5477.50 | 6384.10 | 6244.70 | 12784.00 | 12805.50 |
| Observations | 6282 | 6282 | 6282 | 6282 | 6282 | 6282 | 6282 |
| Degrees of Freedom | 6281 | 6281 | 6281 | 6281 | 6281 | 6281 | 6281 |
| AIC | 112.76 | 112.62 | 113.99 | 108.09 | 110.07 | 99.85 | 101.85 |
From the above Table, Model 6 has the lowest AIC value and the highest Log-likelihood value.We shall carry on with Model 6 as ‘fithaz’
fithaz<-survtmle(ftime = EID_Data$ftime, ftype = EID_Data$ftype,trt = EID_Data$trt,adjustVars =
EID_Data[c("Sex","Age_in_months","Clinic_referred_from", "ARVs_EMTCT","feeding_at_visit")],
glm.ctime = "trt + Sex + Age_in_months + Clinic_referred_from + ARVs_EMTCT + feeding_at_visit",
glm.ftime = "trt + Sex + Age_in_months + Clinic_referred_from + ARVs_EMTCT + feeding_at_visit",
method = "hazard",t0 = 1, returnModels = T)
## Warning in checkInputs(ftime = ftime, ftype = ftype, trt = trt, t0 = t0, :
## glm.trt and SL.trt not specified. Proceeding with glm.trt = '1'
summary(fithaz$ftimeMod$J1)
## Generalized Linear Model of class 'speedglm':
##
## Call: speedglm::speedglm(formula = reg_form, data = data, family = family, sparse = ifelse(calling_fun == "estimateTreatment", FALSE, TRUE), trace = FALSE, method = "Cholesky")
##
## Coefficients:
## ------------------------------------------------------------------
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -7.3913 1.2012 -6.1532 7.593e-10 ***
## trt -0.7743 0.8886 -0.8714 3.836e-01
## Sex2 1.0232 0.8700 1.1761 2.396e-01
## Age_in_months 0.4994 0.1232 4.0537 5.042e-05 ***
## Clinic_referred_from1 -1.6846 1.1335 -1.4862 1.372e-01
## ARVs_EMTCT1 -2.5477 0.8300 -3.0696 2.143e-03 **
## feeding_at_visit1 2.6553 0.9315 2.8504 4.366e-03 **
##
## -------------------------------------------------------------------
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## ---
## null df: 6281; null deviance: 122.65;
## residuals df: 6275; residuals deviance: 85.85;
## # obs.: 6282; # non-zero weighted obs.: 6282;
## AIC: 99.85082; log Likelihood: -42.92541;
## RSS: 12784; dispersion: 1; iterations: 11;
## rank: 7; max tolerance: 9.47e-14; convergence: TRUE.
# coefficients(fitahaz$ftimeMod$J1)
tp.fithaz <- timepoints(fithaz, times = seq_len(19)) # t_0 = 18
# print the object
tp.fithaz
## $est
## t1 t2 t3 t4 t5 t6
## 0 1 0.0048202828 0.008875314 0.012389183 0.015503285 0.018310145 0.02087274
## 1 1 0.0024215192 0.004629900 0.006664568 0.008555788 0.010326970 0.01199638
## 0 2 0.0067775751 0.013346996 0.019729907 0.025942991 0.031999517 0.03791033
## 1 2 0.0009777206 0.001945440 0.002903919 0.003853795 0.004795607 0.00572982
## t7 t8 t9 t10 t11 t12
## 0 1 0.023235659 0.025431700 0.027485782 0.029417386 0.03124210 0.03297262
## 1 1 0.013578406 0.015084504 0.016523895 0.017904089 0.01923128 0.02051063
## 0 2 0.043684484 0.049329687 0.054852600 0.060259055 0.06555422 0.07074271
## 1 2 0.006656836 0.007577008 0.008490647 0.009398029 0.01029941 0.01119500
## t13 t14 t15 t16 t17 t18
## 0 1 0.03461946 0.03619140 0.03769590 0.03913929 0.04052701 0.04186379
## 1 1 0.02174651 0.02294264 0.02410224 0.02522811 0.02632271 0.02738823
## 0 2 0.07582870 0.08081599 0.08570804 0.09050806 0.09521903 0.09984370
## 1 2 0.01208501 0.01296964 0.01384904 0.01472337 0.01559279 0.01645742
## t19
## 0 1 0.04315371
## 1 1 0.02842663
## 0 2 0.10438467
## 1 2 0.01731739
##
## $var
## t1 t2 t3 t4 t5
## 0 1 1.654503e-05 4.839176e-05 4.560346e-05 4.532798e-05 4.654452e-05
## 1 1 7.041655e-05 8.627864e-05 1.268453e-04 1.514886e-04 1.757644e-04
## 0 2 2.156742e-05 2.079303e-05 2.045204e-05 2.042525e-05 2.058985e-05
## 1 2 1.145558e-05 1.143984e-05 1.144357e-05 1.147431e-05 1.153112e-05
## t6 t7 t8 t9 t10
## 0 1 4.966405e-05 5.387139e-05 5.912679e-05 6.620294e-05 7.016317e-05
## 1 1 2.834521e-04 2.836570e-04 3.413778e-04 3.444033e-04 3.468073e-04
## 0 2 2.092015e-05 2.137484e-05 2.196573e-05 2.258566e-05 2.325815e-05
## 1 2 1.161952e-05 1.171705e-05 1.185052e-05 1.200811e-05 1.215642e-05
## t11 t12 t13 t14 t15
## 0 1 6.937105e-05 6.898776e-05 6.900238e-05 1.763088e-04 2.686324e-04
## 1 1 4.448937e-04 4.402236e-04 4.372478e-04 4.355286e-04 4.364386e-04
## 0 2 2.399484e-05 2.473368e-05 2.541246e-05 2.610179e-05 2.681669e-05
## 1 2 1.234037e-05 6.551036e-05 6.560608e-05 6.571289e-05 6.584721e-05
## t16 t17 t18 t19
## 0 1 2.688449e-04 2.690909e-04 2.694440e-04 2.697537e-04
## 1 1 4.389367e-04 4.391585e-04 9.254904e-04 9.087779e-04
## 0 2 2.748662e-05 2.815891e-05 2.884002e-05 2.944542e-05
## 1 2 6.595067e-05 6.604683e-05 6.616420e-05 6.619671e-05
plot(tp.fithaz, type = "raw")
plot(tp.fithaz)
Call timepoints based on this fit
The survtmle function provides the function timepoints to compute the estimated cumulative incidence over multiple timepoints. This function is invoked after an initial call to survtmle with option returnModels = TRUE. By setting this option, the timepoints function is able to recycle fits for the conditional treatment probability, censoring distribution, and, in the case of method = “hazard”, the hazard fits. Thus, invoking timepoints is faster than making repeated calls to survtmle with different t0.Now we can call timepoints to return estimates of cumulative incidence at each time “seq_len(t_0)”.
tp.fita <- timepoints(fita, times = seq_len(19)) # t_0 = 19
tp.fita
## $est
## t1 t2 t3 t4 t5 t6
## 0 1 0.008326190 0.015595895 0.015595895 0.015595895 0.015595895 0.015595895
## 1 1 0.006509832 0.006699801 0.006699801 0.006699801 0.006699801 0.006699801
## 0 2 0.016694358 0.020763858 0.027501088 0.032341184 0.037245864 0.049206022
## 1 2 0.003154582 0.003163791 0.003190885 0.003165478 0.003149360 0.003127804
## t7 t8 t9 t10 t11 t12
## 0 1 0.015595895 0.015595895 0.015595895 0.015595895 0.015595895 0.015595895
## 1 1 0.006699801 0.006699801 0.006699801 0.006699801 0.006699801 0.006699801
## 0 2 0.049206022 0.054442995 0.054442995 0.054442995 0.058599052 0.057913574
## 1 2 0.003127804 0.003131106 0.003131106 0.003131106 0.003389268 0.009235901
## t13 t14 t15 t16 t17 t18
## 0 1 0.015595895 0.035004870 0.040363331 0.040363331 0.040363331 0.040363331
## 1 1 0.006699801 0.006111573 0.005339743 0.005339743 0.005339743 0.005339743
## 0 2 0.057913574 0.057913574 0.057913574 0.057913574 0.057913574 0.128790230
## 1 2 0.009235901 0.009235901 0.009235901 0.009235901 0.009235901 0.135015120
## t19
## 0 1 0.040363331
## 1 1 0.005339743
## 0 2 0.128782499
## 1 2 0.135244181
##
## $var
## t1 t2 t3 t4 t5
## 0 1 9.137230e-05 2.561452e-04 2.561452e-04 2.561452e-04 2.561455e-04
## 1 1 1.421580e-05 1.414104e-05 1.414104e-05 1.414104e-05 1.414104e-05
## 0 2 7.997958e-05 9.873305e-05 1.298444e-04 1.590920e-04 1.918718e-04
## 1 2 9.088585e-06 9.074888e-06 9.076871e-06 9.069391e-06 9.066657e-06
## t6 t7 t8 t9 t10
## 0 1 2.561536e-04 2.565076e-04 3.231470e-04 1.240670e-03 1.839954e-03
## 1 1 1.414104e-05 1.414104e-05 1.414104e-05 1.414104e-05 1.414104e-05
## 0 2 2.782822e-04 2.782822e-04 3.253026e-04 3.253026e-04 3.253026e-04
## 1 2 9.066553e-06 9.066553e-06 9.063113e-06 9.063113e-06 9.063113e-06
## t11 t12 t13 t14 t15
## 0 1 2.499533e-03 3.455108e-03 2.624278e-04 4.667918e-04 6.814101e-04
## 1 1 1.414104e-05 1.414104e-05 1.414104e-05 1.321218e-05 1.297481e-05
## 0 2 3.457804e-04 3.456246e-04 3.456246e-04 3.456246e-04 3.456246e-04
## 1 2 9.097782e-06 5.199292e-05 5.199292e-05 5.199292e-05 5.199292e-05
## t16 t17 t18 t19
## 0 1 1.672649e-02 2.366163e-02 3.405483e-02 4.860499e-02
## 1 1 1.297481e-05 1.297481e-05 1.297481e-05 1.297481e-05
## 0 2 3.456246e-04 3.456246e-04 1.626114e-03 1.626053e-03
## 1 2 5.199292e-05 5.199292e-05 3.297126e-04 3.298089e-04
# Confidence intervals for estimates
ci_fita<-confint(tp.fita)
ci_fita
## $`0 1`
## 2.5 % 97.5 %
## [1,] -0.010408881 0.02706126
## [2,] -0.015772421 0.04696421
## [3,] -0.015772421 0.04696421
## [4,] -0.015772422 0.04696421
## [5,] -0.015772440 0.04696423
## [6,] -0.015772935 0.04696472
## [7,] -0.015794604 0.04698639
## [8,] -0.019636988 0.05082878
## [9,] -0.053440190 0.08463198
## [10,] -0.068476135 0.09966792
## [11,] -0.082393145 0.11358494
## [12,] -0.099611112 0.13080290
## [13,] -0.016154784 0.04734657
## [14,] -0.007340868 0.07735061
## [15,] -0.010799243 0.09152591
## [16,] -0.213120655 0.29384732
## [17,] -0.261124953 0.34185162
## [18,] -0.321327468 0.40205413
## [19,] -0.391740893 0.47246756
##
## $`1 1`
## 2.5 % 97.5 %
## [1,] -0.0008799862 0.01389965
## [2,] -0.0006705593 0.01407016
## [3,] -0.0006705593 0.01407016
## [4,] -0.0006705593 0.01407016
## [5,] -0.0006705593 0.01407016
## [6,] -0.0006705593 0.01407016
## [7,] -0.0006705593 0.01407016
## [8,] -0.0006705593 0.01407016
## [9,] -0.0006705593 0.01407016
## [10,] -0.0006705593 0.01407016
## [11,] -0.0006705593 0.01407016
## [12,] -0.0006705593 0.01407016
## [13,] -0.0006705593 0.01407016
## [14,] -0.0010126134 0.01323576
## [15,] -0.0017201581 0.01239964
## [16,] -0.0017201581 0.01239964
## [17,] -0.0017201581 0.01239964
## [18,] -0.0017201581 0.01239964
## [19,] -0.0017201581 0.01239964
##
## $`0 2`
## 2.5 % 97.5 %
## [1,] -0.0008338551 0.03422257
## [2,] 0.0012887723 0.04023894
## [3,] 0.0051674410 0.04983473
## [4,] 0.0076198263 0.05706254
## [5,] 0.0100968723 0.06439486
## [6,] 0.0165103095 0.08190173
## [7,] 0.0165103095 0.08190173
## [8,] 0.0190927960 0.08979319
## [9,] 0.0190927960 0.08979319
## [10,] 0.0190927960 0.08979319
## [11,] 0.0221531840 0.09504492
## [12,] 0.0214759170 0.09435123
## [13,] 0.0214759170 0.09435123
## [14,] 0.0214759170 0.09435123
## [15,] 0.0214759170 0.09435123
## [16,] 0.0214759170 0.09435123
## [17,] 0.0214759170 0.09435123
## [18,] 0.0497544722 0.20782599
## [19,] 0.0497482183 0.20781678
##
## $`1 2`
## 2.5 % 97.5 %
## [1,] -0.002754177 0.009063340
## [2,] -0.002740513 0.009068095
## [3,] -0.002714065 0.009095834
## [4,] -0.002737038 0.009067994
## [5,] -0.002752266 0.009050986
## [6,] -0.002773788 0.009029396
## [7,] -0.002773788 0.009029396
## [8,] -0.002769366 0.009031579
## [9,] -0.002769366 0.009031579
## [10,] -0.002769366 0.009031579
## [11,] -0.002522479 0.009301015
## [12,] -0.004896638 0.023368440
## [13,] -0.004896638 0.023368440
## [14,] -0.004896638 0.023368440
## [15,] -0.004896638 0.023368440
## [16,] -0.004896638 0.023368440
## [17,] -0.004896638 0.023368440
## [18,] 0.099426115 0.170604125
## [19,] 0.099649977 0.170838385
Return aic for the Model
fita[["ftimeMod"]][["J1"]][["t1"]][["aic"]]
## [1] 41.42972
12(a)ii: Plot the Cumulative Incidence Estimates from the Unbounded Model
Because the cumulative incidence function is being invoked pointwise, it is possible that the resulting curve is not monotone. However, it is possible to show that projecting this curve onto a monotone function via isotonic regression results in an estimate with identical asymptotic properties to the pointwise estimate. Therefore, we additionally provide an option type = “iso” (the default) that provides these smoothed curves.
plot(tp.fita, type = "raw")
plot(tp.fita)
summary(fita$ftimeMod$J1)
## Length Class Mode
## t1 26 speedglm list
12(b): Estimation in bounded models
In certain situations, we have knowledge that the incidence of an event is bounded below/above for every strata in the population. It is possible to incorporate these bounds into the TMLE estimation procedure to ensure that any resulting estimate of cumulative incidence is compatible with these bounds
Bounds can be passed to survtmle by creating a “data.frame” that contains columns with specific names. In particular, there should be:
A column named “t”,
There should additionally be columns for the lower and upper bound for each type of failure,
For example if there is only one type of failure (ftype = 1 or ftype = 0) then the bounds data.frame can contain columns “l1”, and “u1” denote the lower and upper bounds, respectively, on the iterated conditional mean (for method = “mean”) or the conditional hazard function (for method = “hazard”).
If there are two types of failure (ftype = 1, ftype = 2, or ftype = 0) then there can additionally be columns “l2” and “u2” denoting the lower and upper bounds, respectively, on the iterated conditional mean for type two failures (for method = “mean”) or the conditional cause-specific hazard function for type two failures (for method = “hazard”).
12(b) i: Constructing bounds
Bounds lower bound for outcome 1 (l1),upper bound for outcome 1 (u1) lower bound for outcome 2 (l2),upper bound for outcome 2 (u2) t = seq_len(t_0)
Note:
For the observation period, in the study population:
bf18 <- data.frame(t = seq_len(19),
l1 = rep(0.00, 19), u1 = rep(0.06, 19),
l2 = rep(0.00, 19), u2 = rep(0.08, 19)
)
bf18
## t l1 u1 l2 u2
## 1 1 0 0.06 0 0.08
## 2 2 0 0.06 0 0.08
## 3 3 0 0.06 0 0.08
## 4 4 0 0.06 0 0.08
## 5 5 0 0.06 0 0.08
## 6 6 0 0.06 0 0.08
## 7 7 0 0.06 0 0.08
## 8 8 0 0.06 0 0.08
## 9 9 0 0.06 0 0.08
## 10 10 0 0.06 0 0.08
## 11 11 0 0.06 0 0.08
## 12 12 0 0.06 0 0.08
## 13 13 0 0.06 0 0.08
## 14 14 0 0.06 0 0.08
## 15 15 0 0.06 0 0.08
## 16 16 0 0.06 0 0.08
## 17 17 0 0.06 0 0.08
## 18 18 0 0.06 0 0.08
## 19 19 0 0.06 0 0.08
12(b) iii: Full Bounded Model
library(Rcpp)
fit18ab <- survtmle(ftime = EID_Data$ftime, ftype = EID_Data$ftype,
trt = EID_Data$trt, adjustVars = EID_Data[c("Sex","Clinic_referred_from","Age_in_months")],
glm.ctime = "trt + Sex + Clinic_referred_from + Age_in_months",
glm.ftime = "trt + Sex + Clinic_referred_from + Age_in_months",
glm.trt = "Sex + Clinic_referred_from + Age_in_months",
method = "mean", t0 = 1, returnModels=TRUE,bounds=bf18)
fit18ab
## $est
## [,1]
## 0 1 0.015520068
## 1 1 0.006963061
## 0 2 0.016783956
## 1 2 0.003199820
##
## $var
## 0 1 1 1 0 2 1 2
## 0 1 8.710949e-04 2.482888e-06 -1.610862e-05 1.371050e-07
## 1 1 2.482888e-06 1.365718e-05 4.296230e-07 8.828334e-08
## 0 2 -1.610862e-05 4.296230e-07 7.991389e-05 9.249250e-09
## 1 2 1.371050e-07 8.828334e-08 9.249250e-09 9.106529e-06
Call timepoints based on this fit
tp.fit18ab <- timepoints(fit18ab, times = seq_len(19)) # t_0 = 18
# print the object
tp.fit18ab
## $est
## t1 t2 t3 t4 t5 t6
## 0 1 0.015520068 0.030777416 0.030777416 0.030777416 0.030777416 0.030777416
## 1 1 0.006963061 0.006174571 0.006174571 0.006174571 0.006174571 0.006174571
## 0 2 0.016783956 0.020861794 0.027808995 0.032579641 0.037460856 0.049561328
## 1 2 0.003199820 0.003222324 0.003387215 0.003295824 0.003248530 0.003298410
## t7 t8 t9 t10 t11 t12
## 0 1 0.030777416 0.030777416 0.030777416 0.030777416 0.030777416 0.030777416
## 1 1 0.006174571 0.006174571 0.006174571 0.006174571 0.006174571 0.006174571
## 0 2 0.049561328 0.054731404 0.054731404 0.054731404 0.054039395 0.058439589
## 1 2 0.003298410 0.003385959 0.003385959 0.003385959 0.006801199 0.011217264
## t13 t14 t15 t16 t17 t18
## 0 1 0.030777416 0.031848648 0.044912626 0.044912626 0.044912626 0.044912626
## 1 1 0.006174571 0.006910134 0.006177893 0.006177893 0.006177893 0.006177893
## 0 2 0.058439589 0.058439589 0.058439589 0.058439589 0.058439589 0.077544510
## 1 2 0.011217264 0.011217264 0.011217264 0.011217264 0.011217264 0.032682966
## t19
## 0 1 0.044912626
## 1 1 0.006177893
## 0 2 0.077949390
## 1 2 0.033735847
##
## $var
## t1 t2 t3 t4 t5
## 0 1 8.710949e-04 1.118266e-03 1.118266e-03 1.118266e-03 1.118266e-03
## 1 1 1.365718e-05 1.288593e-05 1.288593e-05 1.288593e-05 1.288593e-05
## 0 2 7.991389e-05 9.864723e-05 1.295488e-04 1.588874e-04 1.916890e-04
## 1 2 9.106529e-06 9.089349e-06 9.140553e-06 9.093127e-06 9.075959e-06
## t6 t7 t8 t9 t10
## 0 1 1.118266e-03 1.118266e-03 1.118266e-03 1.118266e-03 1.118266e-03
## 1 1 1.288593e-05 1.288593e-05 1.288593e-05 1.288593e-05 1.288593e-05
## 0 2 2.780570e-04 2.780570e-04 3.255470e-04 3.255470e-04 3.255470e-04
## 1 2 9.102358e-06 9.102358e-06 9.098804e-06 9.098804e-06 9.098804e-06
## t11 t12 t13 t14 t15
## 0 1 1.118266e-03 1.118266e-03 1.118266e-03 1.153926e-03 1.308430e-03
## 1 1 1.288593e-05 1.288593e-05 1.288593e-05 1.373118e-05 1.318372e-05
## 0 2 3.435561e-04 3.430637e-04 3.430637e-04 3.430637e-04 3.430637e-04
## 1 2 1.193697e-05 5.928630e-05 5.928630e-05 5.928630e-05 5.928630e-05
## t16 t17 t18 t19
## 0 1 1.308430e-03 1.308430e-03 1.308430e-03 1.308430e-03
## 1 1 1.318372e-05 1.318372e-05 1.318372e-05 1.318372e-05
## 0 2 3.430637e-04 3.430637e-04 4.106369e-04 4.108522e-04
## 1 2 5.928630e-05 5.928630e-05 6.099240e-05 6.107559e-05
Confidence intervals for estimates
ci_fit18ab<-confint(tp.fit18ab)
ci_fit18ab
## $`0 1`
## 2.5 % 97.5 %
## [1,] -0.04232693 0.07336707
## [2,] -0.03476474 0.09631957
## [3,] -0.03476474 0.09631957
## [4,] -0.03476474 0.09631957
## [5,] -0.03476474 0.09631957
## [6,] -0.03476474 0.09631957
## [7,] -0.03476474 0.09631957
## [8,] -0.03476474 0.09631957
## [9,] -0.03476474 0.09631957
## [10,] -0.03476474 0.09631957
## [11,] -0.03476474 0.09631957
## [12,] -0.03476474 0.09631957
## [13,] -0.03476474 0.09631957
## [14,] -0.03473033 0.09842762
## [15,] -0.02598364 0.11580889
## [16,] -0.02598364 0.11580889
## [17,] -0.02598364 0.11580889
## [18,] -0.02598364 0.11580889
## [19,] -0.02598364 0.11580889
##
## $`1 1`
## 2.5 % 97.5 %
## [1,] -0.0002801078 0.01420623
## [2,] -0.0008611070 0.01321025
## [3,] -0.0008611070 0.01321025
## [4,] -0.0008611070 0.01321025
## [5,] -0.0008611070 0.01321025
## [6,] -0.0008611070 0.01321025
## [7,] -0.0008611070 0.01321025
## [8,] -0.0008611070 0.01321025
## [9,] -0.0008611070 0.01321025
## [10,] -0.0008611070 0.01321025
## [11,] -0.0008611070 0.01321025
## [12,] -0.0008611070 0.01321025
## [13,] -0.0008611070 0.01321025
## [14,] -0.0003526318 0.01417290
## [15,] -0.0009386164 0.01329440
## [16,] -0.0009386164 0.01329440
## [17,] -0.0009386164 0.01329440
## [18,] -0.0009386164 0.01329440
## [19,] -0.0009386164 0.01329440
##
## $`0 2`
## 2.5 % 97.5 %
## [1,] -0.0007370583 0.03430497
## [2,] 0.0013951748 0.04032841
## [3,] 0.0055007805 0.05011721
## [4,] 0.0078741915 0.05728509
## [5,] 0.0103248002 0.06459691
## [6,] 0.0168788500 0.08224381
## [7,] 0.0168788500 0.08224381
## [8,] 0.0193679268 0.09009488
## [9,] 0.0193679268 0.09009488
## [10,] 0.0193679268 0.09009488
## [11,] 0.0177109381 0.09036785
## [12,] 0.0221371768 0.09474200
## [13,] 0.0221371768 0.09474200
## [14,] 0.0221371768 0.09474200
## [15,] 0.0221371768 0.09474200
## [16,] 0.0221371768 0.09474200
## [17,] 0.0221371768 0.09474200
## [18,] 0.0378274508 0.11726157
## [19,] 0.0382219223 0.11767686
##
## $`1 2`
## 2.5 % 97.5 %
## [1,] -2.714769e-03 0.009114408
## [2,] -2.686682e-03 0.009131331
## [3,] -2.538412e-03 0.009312842
## [4,] -2.614411e-03 0.009206058
## [5,] -2.656123e-03 0.009153182
## [6,] -2.614824e-03 0.009211643
## [7,] -2.614824e-03 0.009211643
## [8,] -2.526120e-03 0.009298038
## [9,] -2.526120e-03 0.009298038
## [10,] -2.526120e-03 0.009298038
## [11,] 2.953918e-05 0.013572860
## [12,] -3.873988e-03 0.026308516
## [13,] -3.873988e-03 0.026308516
## [14,] -3.873988e-03 0.026308516
## [15,] -3.873988e-03 0.026308516
## [16,] -3.873988e-03 0.026308516
## [17,] -3.873988e-03 0.026308516
## [18,] 1.737611e-02 0.047989820
## [19,] 1.841856e-02 0.049053137
Plot smoothed cumulative incidence
Plot raw cumulative incidence
plot(tp.fit18ab, type = "raw")
Return aic for the Model
fit18ab[["ftimeMod"]][["J1"]][["t1"]][["aic"]]
## NULL