Multiply robust estimation of principal natural (in)direct effects across principal strata

Definition: Principal Natural Indirect and Direct Effects

Let \(X\) denote a vector of baseline covariates, \(Z\in\{0,1\}\) denote a treatment, \(D\in\{0,1\}\) denote a post-treatment event variable, \(M\) denote a mediator, and \(Y\) denote the outcome of interest. We assume \(M\) and \(Y\) are measured after \(D\). Under the potential outcomes framework, we define \(D_{z}\), \(M_{zd}\), and \(Y_{zdm}\) as the potential values of \(D\), \(M\), and \(Y\), respectively. Specifically, \(D_z\) is the value of \(D\) had \(Z=z\), \(M_{zd}\) is the value of \(M\) had \(Z=z\) and \(D=d\), and \(Y_{zdm}\) is the value of \(Y\) had \(Z=z\), \(D=d\), and \(M=m\). Additionally, we define \(Y_{zM_{z'}}\) as the potential outcome if the treatment were set to \(z\) and \(M\) were set to its natural value under \(Z=z'\). We can define four principal strata based on the joint potential values of \(D\), denoted by \(U=D_1D_0\):

Principal Stratum (\(U\))	group name
\(11\)	always takers
\(10\)	compliers
\(00\)	never takers
\(01\)	defiers

Not all four principal strata exist. We assume standard monotonicity to rule out the defiers stratum. Under the one-sided noncompliance problem, we can assume strong monotonicity to rule out both always takers and defiers strata.

For a given principal strata \(U=d_1d_0\), we aim to decompose the principal causal effect (\(\text{PCE}_{d_1d_0}\)) into a principal natural indirect effect (\(\text{PNIE}_{d_1d_0}\)) and a principal natural direct effect (\(\text{PNDE}_{d_1d_0}\)): \[ \underbrace{\mathbb{E}\left[Y_{1M_1}-Y_{0M_0}|U=d_1d_0\right]}_{\text{PCE}_{d_1d_0}} = \underbrace{\mathbb{E}\left[Y_{1M_1}-Y_{1M_0}|U=d_1d_0\right]}_{\text{PNIE}_{d_1d_0}} + \underbrace{\mathbb{E}\left[Y_{1M_0}-Y_{0M_0}|U=d_1d_0\right]}_{\text{PNDE}_{d_1d_0}}, \] where \(d_1d_0\in\{11,10,00\}\) under the standard monotonicity and \(d_1d_0\in\{10,00\}\) under the strong monotonicity. Once we know \(\text{PCE}_{d_1d_0}\), \(\text{PNIE}_{d_1d_0}\), and \(\text{PNDE}_{d_1d_0}\) for all \(d_1d_0\), we can further calculate the intention-to-treat effect (ITT), the intention-to-treat natural indirect effect (ITT-NIE), and the intention-to-treat natural direct effect (ITT-NDE): \[ \text{ITT} = \sum_{d_1d_0} e_{d_1d_0}\times\text{PCE}_{d_1d_0}, \quad \text{ITT-NIE} = \sum_{d_1d_0} e_{d_1d_0}\times\text{PNIE}_{d_1d_0}, \quad \text{ITT-NDE} = \sum_{d_1d_0} e_{d_1d_0}\times\text{PNDE}_{d_1d_0}. \]

Estimation

Singly robust estimators

We propose four singly robust estimators, \(\widehat{\tau}^{\text{a}}\), \(\widehat{\tau}^{\text{b}}\), \(\widehat{\tau}^{\text{c}}\), and \(\widehat{\tau}^{\text{d}}\), for all \(\tau\in\{\text{PCE}_{d_1d_0},\text{PNIE}_{d_1d_0},\text{PNDE}_{d_1d_0},\text{ITT},\text{ITT-NIE},\text{ITT-NDE}\}\). We specify the following parametric models for the four nuisance functions \(f(Z|X)\), \(f(D|Z,X)\), \(f(M|Z,D,X)\), and \(\mathbb{E}[Y|Z,D,M,X]\):

Index	Nuisance function	parametric model
1	\(f(Z\|X)\)	logistic regression
2	\(f(D\|Z,X)\)	logistic regression
3	\(f(M\|Z,D,X)\)	logistic regression
4	\(\mathbb{E}[Y\|Z,D,M,X]\)	linear (or logistic) regression for continuous (or binary) outcome

In the initial psmediate package, we assume \(M\) to be a binary variable; we will incorporate the scenario with a continuous mediator later. All singly robust estimators are sensitive to model misspecification, where \(\widehat{\tau}^{\text{a}}\) is only consistent when models 1, 2, 3 are correctly specified, \(\widehat{\tau}^{\text{b}}\) is only consistent when models 1, 3, 4 are correctly specified, \(\widehat{\tau}^{\text{c}}\) is only consistent when models 1, 2, 4 are correctly specified, and \(\widehat{\tau}^{\text{d}}\) is only consistent when models 2, 3, 4 are correctly specified.

Multiply robust estimator

Leveraging the efficient influence function, we further propose a multiply robust estimator \(\widehat{\tau}^{\text{mr}}\), which is consistent under four types of misspecifications and is fully efficient when all nuisance models are correctly specified. Specifically, under suitable regularity conditions, we show \(\widehat{\tau}^{\text{mr}}\) is consistent and asymptotically normal under either one of the four scenarios regarding correct (\(\checkmark\)) specification of parametric working models:

Nuisance model	Scenario 1	Scenario 2	Scenario 3	Scenario 4
\(f(Z\|X)\)	\(\checkmark\)	\(\checkmark\)	\(\checkmark\)
\(f(D\|Z,X)\)	\(\checkmark\)		\(\checkmark\)	\(\checkmark\)
\(f(M\|Z,D,X)\)	\(\checkmark\)	\(\checkmark\)		\(\checkmark\)
\(\mathbb{E}[Y\|Z,D,M,X]\)		\(\checkmark\)	\(\checkmark\)	\(\checkmark\)

Nonparametric efficient estimator

To minimize the risk of model misspecification, we can also consider using modern machine learning algorithms to estimate the nuisance functions. Leveraging the efficient influence function, we further propose a nonparametric efficient estimator \(\widehat{\tau}^{\text{np}}\), which is same to \(\widehat{\tau}^{\text{mr}}\) but instead uses machine learning algorithms to estimate the nuisance functions. We show that \(\widehat{\tau}^{\text{np}}\) is consistent, asymptotically normal, and fully efficient if the machine learning estimates converge to the truth with a \(n^{-1/4}\)-rate. In the psmediate package, we use Super Learner, an ensemble learner with multiple user-specified machine learning algorithms, to estimate the nuisance functions.

Basic Syntax

The data-fitting function is mediate_par and mediate_np, where the first performs the singly robust estimators and the multiply robust estimator, and the second performs the nonparametric efficient estimator. We can call mediate_par by

mediate_par(data,estimator,Xname,Yname,Zname,Dname,Mname,monotonicity,B)

We require input the following arguments:

data: data set
estimator: For singly robust estimation, set it to a, b, c, d for \(\widehat{\tau}^{\text{a}}\), \(\widehat{\tau}^{\text{b}}\), \(\widehat{\tau}^{\text{c}}\), and \(\widehat{\tau}^{\text{d}}\), respectively. For multiply robust estimation, set it to mr.
Xname: names of the baseline covariates
Yname: name of the outcome
Zname: name of the treatment
Dname: name of the post-treatment event
Mname: name of the mediator
monotonicity: strong or standard monotonicity assumption (default standard)
B: number of iterations for the nonparametric bootstrap procedure

The output of mediate_par includes the point estimate, standard error, and 95% confidence interval for all \(\tau\in\{\text{PCE}_{d_1d_0},\text{PNIE}_{d_1d_0},\text{PNDE}_{d_1d_0},\text{ITT},\text{ITT-NIE},\text{ITT-NDE}\}\), where the standard error and confidence interval are estimated based on bootstrap.

Similarly, we can call mediate_np to implement the nonparametric efficient estimator as follows

mediate_np(data,Xname,Yname,Zname,Dname,Mname,monotonicity,learners,V)

In addition to the arguments data, Xname, Yname, Zname, Dname, Mname, and monotonicity, which are also used in mediate_par, one should also specify

learners: A list of individual learners (see SuperLearner package for more details. The default is GLM and random forest)
V: number of folds for the cross-fitting procedure (default 5)

The output of mediate_np also includes the point estimate, standard error, and 95% confidence interval.

An Illustrative Example

Please library the psmediate package if needed.

library("psmediate")

Data illustration

We first simulate a data with \(n=500\) observations.

set.seed(12345)
data=data_gen(n=500)
head(round(data,5))
#>   Z D M        Y       X1       X2       X3       X4
#> 1 1 0 1 16.06238 -0.17462  1.67751 -1.42032  0.58553
#> 2 1 0 0 11.86668 -0.67062  0.07947 -2.46694  0.70947
#> 3 1 1 1 13.28386  0.50743 -0.85643  0.48472 -0.10930
#> 4 0 0 0  8.50631  1.24743 -0.77878 -0.93797 -0.45350
#> 5 1 1 1 15.41818 -1.24828 -0.38094  3.33073  0.60589
#> 6 0 0 0  6.95509 -1.93472 -1.89736 -0.16295 -1.81796

In the data dataset, we include four baseline covariates X1, X2, X3, X4, a treatment Z, a post-treatment event D, a binary mediator M, and a continuous outcome Y. The standard monotonicity assumption is satisfied in the data generation process.

Implement the singly robust estimators

Below we implement \(\widehat \tau^{\text{a}}\), where \(\widehat \tau^{\text{b}}\), \(\widehat \tau^{\text{c}}\), and \(\widehat \tau^{\text{d}}\) can be similarly obtained.

mediate_par(data,
            estimator="a",
            Xname=c("X1","X2","X3","X4"),
            Yname="Y",
            Zname="Z",
            Dname="D",
            Mname="M",
            monotonicity="standard",
            B=200)
#>                          Point        SE   CI_lower  CI_upper
#> compliers: PNIE      1.5716887 1.8237686 -1.6876724  4.990598
#> compliers: PNDE      0.8153168 1.8733406 -2.7822961  4.464478
#> compliers: PCE       2.3870055 0.7096206  1.3427050  3.972104
#> never-takers: PNIE  -0.9473416 1.6498550 -4.2798347  2.416744
#> never-takers: PNDE   4.0388962 1.8568245  1.0058930  7.834798
#> never-takers: PCE    3.0915545 0.9736954  1.1122801  4.709319
#> always-takers: PNIE  0.3791346 2.4931473 -6.5201149  3.855533
#> always-takers: PNDE  3.8195939 2.7352813  0.3401836 11.256819
#> always-takers: PCE   4.1987285 0.9000576  2.5700091  6.141375
#> ITT-NIE              0.3273924 1.2064703 -2.5748371  2.299341
#> ITT-NDE              2.9083307 1.3445827  0.6841327  5.939568
#> ITT                  3.2357231 0.4923666  2.2787302  4.147970

Implement the multiply robust estimator

Below we implement \(\widehat \tau^{\text{mr}}\):

mediate_par(data,
            estimator="mr",
            Xname=c("X1","X2","X3","X4"),
            Yname="Y",
            Zname="Z",
            Dname="D",
            Mname="M",
            monotonicity="standard",
            B=200)
#>                         Point        SE  CI_lower  CI_upper
#> compliers: PNIE     1.4319931 0.2157780 0.9837236 1.8844884
#> compliers: PNDE     0.8205567 0.2269874 0.3850943 1.3142462
#> compliers: PCE      2.2525498 0.1871024 1.9005139 2.6188963
#> never-takers: PNIE  0.6068290 0.1596446 0.3292504 0.9146045
#> never-takers: PNDE  3.2386480 0.2167260 2.8211902 3.6838223
#> never-takers: PCE   3.8454771 0.2141267 3.4378292 4.2932870
#> always-takers: PNIE 1.1465900 0.2225495 0.7023467 1.5292411
#> always-takers: PNDE 2.6355692 0.2797952 2.1225002 3.1774441
#> always-takers: PCE  3.7821592 0.2500347 3.2843404 4.3005191
#> ITT-NIE             1.0600455 0.1418286 0.7522176 1.3361535
#> ITT-NDE             2.2420239 0.1799812 1.9207468 2.6141226
#> ITT                 3.3020694 0.1566654 2.9795227 3.6032101

Implement the nonparametric efficient estimator

Below we implement \(\widehat \tau^{\text{np}}\):

mediate_np(data,
           Xname=c("X1","X2","X3","X4"),
           Yname="Y",
           Zname="Z",
           Dname="D",
           Mname="M",
           monotonicity="standard",
           learners=c("SL.glm", "SL.ranger"),V=2)
#>                         point        SE  CI_lower CI_upper
#> compliers: PNIE     1.4944701 0.2929630 0.9202626 2.068677
#> compliers: PNDE     0.7910519 0.3120969 0.1793419 1.402762
#> compliers: PCE      2.2855219 0.2017982 1.8899976 2.681046
#> never-takers: PNIE  0.5609334 0.1690324 0.2296299 0.892237
#> never-takers: PNDE  3.2904226 0.1998585 2.8986999 3.682145
#> never-takers: PCE   3.8513560 0.2076416 3.4443785 4.258333
#> always-takers: PNIE 1.3796536 0.2978555 0.7958569 1.963450
#> always-takers: PNDE 2.4553844 0.4281954 1.6161215 3.294647
#> always-takers: PCE  3.8350381 0.2756097 3.2948431 4.375233
#> ITT-NIE             1.1326948 0.1822785 0.7754290 1.489961
#> ITT-NDE             2.2272546 0.2469746 1.7431843 2.711325
#> ITT                 3.3599494 0.1664414 3.0337242 3.686175