class: center, top, .title-slide, title-slide .title[ # Practical Insights into Causal Methods ] .subtitle[ ## Covariate Adjustment under General Interference ] .author[ ### Ralph Møller Trane ] .institute[ ### University of Wisconsin–Madison
] .date[ ### 2023-12-11.small[(last compiled: 2023-12-10)] ] --- # Overview **Part I: Problem & Setup** 1. Setup 2. Assumptions 3. Unadjusted Estimation and Asymptotic Result **Part II: Covariate Adjustments** 1. Why bother? 2. AIPW Estimator 3. ANCOVA Estimator **Part III: Practical Considerations** 1. Variance estimation using OLS 2. Testing of Sharp Null Hypothesis using ANCOVA 3. Data Application `$$\newcommand{\E}[1]{\mathbb{E}\left[#1\right]} \newcommand{\Var}[1]{\text{Var}\left[#1\right]} \newcommand{\Cov}[1]{\text{Cov}\left[#1\right]} \newcommand{\R}{\mathbb{R}} \newcommand{\N}{\mathbb{N}} \newcommand{\G}{\mathcal{G}} \newcommand{\F}{\mathcal{F}} \newcommand{\iid}{\overset{\text{iid}}{\sim}} \newcommand{\tauHT}{\hat{\tau}^\text{HT}} \newcommand{\tauHA}{\hat{\tau}^\text{HA}} \newcommand{\Yhat}[1]{\hat{Y}_{i,n}^{#1}} \newcommand{\Ytilde}[1]{\tilde{Y}_{i}^{#1}} \newcommand{\tauAIPW}{\hat{\tau}_\text{AIPW}} \newcommand{\tauHTAIPW}{\hat{\tau}^\text{HT}_\text{AIPW}} \newcommand{\tauHAAIPW}{\hat{\tau}^\text{HA}_\text{AIPW}} \newcommand{\tauBarDIR}{\bar{\tau}_\text{DIR}} \newcommand{\tauDIR}{\tau_\text{DIR}} \newcommand{\tauHatDIR}{\hat{\tau}_\text{DIR}^\text{HT}} \newcommand{\betaANCOVA}{\hat{\beta}^{\text{ANCOVA}}} \newcommand{\betaANCOVAA}{\hat{\beta}^{\text{ANCOVA}2}} \newcommand{\one}[1]{\mathbb{1}\left[#1\right]} \newcommand{\bm}[1]{\boldsymbol{#1}} \renewcommand{\tilde}{\widetilde}$$` --- layout: true # Part I: Setup --- **Potential Outcomes in the Presence of General Interference** We are interested in a *population* of subjects `\(i = 1, ..., n\)` Each subject is randomly assigned a *treatment* `\(W_i \in \{0,1\}\)` as `\(W_i \overset{\text{iid}}{\sim} \text{Bernoulli}(\pi)\)` `\((\pi \in (0,1))\)`, i.e. we consider only randomized controlled trials (RCTs). For each subject, an *outcome* `\(Y_i\)` is observed. 
This observed outcome is one of many *potential outcomes* `\(Y_i(\boldsymbol{w}) \in \mathbb{R}\)`, which we assume exist and are **fixed** for all possible treatment assignment vectors `\(\boldsymbol{w} \in \{0,1\}^n\)`. Assume a *generalized SUTVA*, i.e. `\(Y_i = \sum_{\boldsymbol{w}' \in \{0,1\}^n} 1[\boldsymbol{W} = \boldsymbol{w}'] Y_i(\boldsymbol{w}')\)`. We use `\(Y_i(w_i, \boldsymbol{w}_{-i})\)` to denote the potential outcome for subject `\(i\)` when subject `\(i\)` is assigned `\(w_i\)` and the others are assigned `\(\boldsymbol{w}_{-i} \in \{0,1\}^{n-1}\)`. **Interference Graph** In parallel to the potential outcomes, we introduce the notion of an *interference graph*. An interference graph `\(\G\)` consists of nodes (vertices), one for each of the `\(n\)` subjects, and an edge (adjacency) matrix `\(\boldsymbol{E} = \{E_{ij}\}_{i,j=1}^n\)`, where `\(E_{ij} = 1\)` if `\(W_j\)` influences `\(Y_i(\boldsymbol{W})\)`, and `\(E_{ij} = 0\)` otherwise. .medium[ (Formally, `\(Y_i(w, \boldsymbol{w}_{-i}) = Y_i(w, \boldsymbol{w}_{-i}')\)` for all `\(\boldsymbol{w}_{-i}, \boldsymbol{w}_{-i}' \in \{0,1\}^{n-1}\)` where `\(w_j = w_j'\)` for all `\(j\)` with `\(E_{ij} = 1\)`.) ] --- **No Interference Example**: the effect of a drug (for example, aspirin) on disease status (headache). No interference seems reasonable, so `\(E_{ij} = 0\)`. Two potential outcomes for each individual because `\(Y_i(w_i, \boldsymbol{w}_{-i}) = Y_i(w_i)\)`. An intuitive estimand: Average Treatment Effect (ATE) = `\(\frac{1}{n} \sum_{i=1}^n \left(Y_i(1) - Y_i(0)\right)\)`. --- **Interference Example**: the effect of vaccination status on disease status. No interference is unlikely. `\(E_{ij} = 1\)` if `\(i\)` and `\(j\)` often spend time together. Here, the previous definition of the ATE is not exactly useful. 
<a name=cite-hudgens_causalinferenceinterference_2008></a>[Hudgens and Halloran (2008)](https://www.tandfonline.com/doi/full/10.1198/016214508000000292) provide a nice generalization of the ATE: `\begin{align} \bar{\tau}_\text{DIR} = \frac{1}{n} \sum_{i=1}^n \left(\mathbb{E}[Y_i(1, \boldsymbol{W}_{-i})] - \mathbb{E}[Y_i (0, \boldsymbol{W}_{-i})]\right). \end{align}` When potential outcomes are considered fixed, the expectation is over treatment assignment. If there is no interference, `\(\bar{\tau}_\text{DIR} = \text{ATE}\)` because `\(\mathbb{E}[Y_i(1, \boldsymbol{W}_{-i})] = \mathbb{E}[Y_i(1)] = Y_i(1)\)`. We could consider multiple causal estimands, for example the indirect effect of treating a larger or smaller part of the population. We defer this question to another time. --- ## Estimand As hinted at on the previous slide, we are interested in estimating what we will refer to as the *direct effect*: `\begin{equation} \bar{\tau}_\text{DIR} = \frac{1}{n} \sum_{i=1}^n \mathbb{E}[Y_i(1, \boldsymbol{W}_{-i}) - Y_i (0, \boldsymbol{W}_{-i})]. \end{equation}` Nice result: an unbiased estimator of `\(\bar{\tau}_\text{DIR}\)` is the well-known difference-in-means (aka Hájek) estimator: $$ `\begin{equation} \tauHA = \sum_{i=1}^n \left(\frac{Y_i W_i}{\sum_{j=1}^n W_j} - \frac{Y_i(1-W_i)}{\sum_{j=1}^n (1-W_j)}\right) \end{equation}` $$ --- layout: true # Part I: Assumptions --- [<a name=cite-li_randomgraphasymptotics_2022></a>[Li and Wager (2022)](https://doi.org/10.1214/22-AOS2191)](#referencescont) show how viewing the interference graph as a random draw from a graphon can help get asymptotic results. We follow in their footsteps. We consider the following Assumptions on the interference graph. 
.medium[ **Assumption 1**: `\(E_{ij} = E_{ji}\)` **Assumption 2**: `\(\mathcal{G}\)` is randomly generated as follows: * each unit has latent position `\(U_i \overset{\text{iid}}{\sim} \text{Uniform}(0,1)\)` * `\(G_n: [0,1]^2 \mapsto [0,1]\)` is a symmetric function * `\(P(E_{ij} = 1 | U_i, U_j) = G_n(U_i, U_j)\)`, `\(i < j\)` `\(G_n\)` is called a *graphon*. **Assumption 3**: The graphon `\(G_n(U_i, U_j)\)` is given by `\(\min\{1, \rho_n G(U_i, U_j)\}\)` where * `\(G(\cdot, \cdot): [0,1]^2 \mapsto \R^+ \cup \{0\}\)` is symmetric * `\(0 < \rho_n \le 1\)` such that either `\(\rho_n = 1\)` or `\(\rho_n \to 0\)` and `\(\rho_n n \to \infty\)` ] .medium[ **Additional Graphon Assumptions**: 1. `\(\exists\ c_l > 0\)` s.t. `\(c_l \le g_1(u) = \int_0^1 \min\{1, G(u, t)\} dt\ \forall u \in [0,1]\)`. 2. The graphon has finite second moment, i.e. `\(\E{G(U_1, U_2)^k} \le c_u^k, k=1,2\)`. 3. The sparsity controlling sequence `\(\rho_n\)` satisfies `\(\lim \inf \log \rho_n / \log n > -1\)`. ] --- Also need a few Assumptions on the potential outcomes. **Assumption 4**: of the potential outcomes and treatments we assume * consistency: `\(Y_i = Y_i(\bm{W})\)` * complete ignorability: `\(\bm{W} \perp \bm{Y}(\bm{w})\)` * positivity: `\(0 < \pi < 1\)` **Assumption 5**: `\(Y_i(w, \boldsymbol{w}_{-i}) = f(w, M_i / N_i, \bm{X}_i, U_i; \epsilon_i)\)` where `\(M_i = \sum_{j \neq i} E_{ij} W_j\)` and `\(N_i = \sum_{j\neq i}E_{ij}\)`. (This is similar to the stratified interference assumption made by [Hudgens and Halloran (2008)](https://www.tandfonline.com/doi/full/10.1198/016214508000000292)) **Assumption 6**: `\(f\)` is three-times differentiable, and `\(|f|, |f'|, |f''|, |f'''| \le B\)` where the derivative is taken with respect to `\(M_i / N_i\)`. 
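---

To make Assumptions 2, 3 and 5 concrete, here is a minimal simulation sketch in plain Python. It draws latent positions `\(U_i\)`, edges with probability `\(\min\{1, \rho_n G(U_i, U_j)\}\)`, and Bernoulli(`\(\pi\)`) treatments, then computes the treated-neighbour fraction `\(M_i / N_i\)`. The graphon `example_graphon`, the function name `simulate_exposure`, and all parameter values are hypothetical choices for illustration only; they are not taken from the talk.

```python
import random


def example_graphon(u, v):
    # Hypothetical G(u, v), chosen only for illustration: symmetric and
    # bounded away from zero, so every unit has a positive expected degree.
    return 0.5 + u * v


def simulate_exposure(n, pi, rho, graphon, seed=0):
    """Draw one graph and one treatment vector; return (w, deg, frac)."""
    rng = random.Random(seed)
    u = [rng.random() for _ in range(n)]                   # U_i ~ Uniform(0, 1)
    w = [1 if rng.random() < pi else 0 for _ in range(n)]  # W_i ~ Bernoulli(pi)
    deg = [0] * n      # N_i: number of neighbours of unit i
    treated = [0] * n  # M_i: number of treated neighbours of unit i
    for i in range(n):
        for j in range(i + 1, n):
            p_edge = min(1.0, rho * graphon(u[i], u[j]))
            if rng.random() < p_edge:  # symmetric edge, E_ij = E_ji = 1
                deg[i] += 1
                deg[j] += 1
                treated[i] += w[j]
                treated[j] += w[i]
    # M_i / N_i, defined only for units with at least one neighbour
    frac = [m / d for m, d in zip(treated, deg) if d > 0]
    return w, deg, frac


w, deg, frac = simulate_exposure(n=300, pi=0.5, rho=1.0, graphon=example_graphon)
mean_abs_dev = sum(abs(f - 0.5) for f in frac) / len(frac)
print(f"mean |M_i/N_i - pi| = {mean_abs_dev:.3f}")
```

Rerunning with larger `n` should show `mean_abs_dev` shrinking, which is the concentration of `\(M_i / N_i\)` around `\(\pi\)` that the Taylor expansion argument relies on.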
--- Consequence of Assumptions 5 and 6: by Taylor's theorem, $$ `\begin{aligned} Y_i(w, \bm{w}_{-i}) &= f(w, \pi, \bm{X}_i, U_i; \epsilon_i) \\ &\quad \quad + f'(w, \pi, \bm{X}_i, U_i; \epsilon_i)\left(\frac{M_i}{N_i} - \pi\right) \\ &\quad \quad \quad + \frac{1}{2}f''(w, \pi, \bm{X}_i, U_i; \epsilon_i)\left(\frac{M_i}{N_i} - \pi\right)^2 \\ &\quad \quad \quad \quad + \frac{1}{6}f'''(w, \pi_i^*, \bm{X}_i, U_i; \epsilon_i)\left(\frac{M_i}{N_i} - \pi\right)^3 \end{aligned}` $$ for some `\(\pi_i^*\)` between `\(\pi\)` and `\(M_i/N_i\)`. Since `\(M_i/N_i \to_p \pi\)`, `\(Y_i(w, \bm{w}_{-i}) \approx f(w, \pi, \bm{X}_i, U_i; \epsilon_i)\)` which means outcomes asymptotically behave as if independent. This is key to everything. --- name: li-and-wager-clt Central Limit Theorem ([Li and Wager, 2022](https://doi.org/10.1214/22-AOS2191)): under Assumptions 1-6, $$ \sqrt{n}\left(\tauHA - \tauBarDIR \right) \to_d N\left(0, \pi(1-\pi)\left(\text{Var}[R_i + Q_i] + (\E{Q_i})^2\right)\right) $$ where $$ `\begin{aligned} R_i &= \frac{f(1, \pi, \bm{X}_i, U_i; \epsilon_i)}{\pi} + \frac{f(0, \pi, \bm{X}_i, U_i; \epsilon_i)}{1-\pi}, \\ Q_i &= \left . \E{\frac{G(U_i, U_j)(f'(1, \pi, \bm{X}_j, U_j; \epsilon_j) - f'(0, \pi, \bm{X}_j, U_j; \epsilon_j))}{\E{G(U_i, U_j) | U_j}} \right | U_i}. \end{aligned}` $$ Notes: <ol> <li style="margin: 12px 0;"> When no interference, asymptotic variance \(=\Var{R_i}=\) variance of difference-in-means estimator from previous work.</li> <li style="margin: 12px 0;"> Independent of \(\rho_n\), meaning asymptotic result does <strong>not</strong> depend on sparsity of graph. 
</li> <li style="margin: 12px 0;"> Unless \(R_i\) and \(Q_i\) are strongly negatively correlated, \(2\Cov{R_i, Q_i} + \Var{Q_i} + (\E{Q_i})^2 > 0\), so the variance is inflated!</li> </ol> --- Central Limit Theorem ([Li and Wager, 2022](https://doi.org/10.1214/22-AOS2191)): under Assumptions 1-6, $$ \sqrt{n}\left(\tauHA - \tauBarDIR \right) \to_d N\left(0, \pi(1-\pi)\left(\text{Var}[R_i + Q_i] + (\E{Q_i})^2\right)\right) $$ where $$ `\begin{aligned} R_i &= \frac{f(1, \pi, \bm{X}_i, U_i; \epsilon_i)}{\pi} + \frac{f(0, \pi, \bm{X}_i, U_i; \epsilon_i)}{1-\pi}, \\ Q_i &= \left . \E{\frac{G(U_i, U_j)(f'(1, \pi, \bm{X}_j, U_j; \epsilon_j) - f'(0, \pi, \bm{X}_j, U_j; \epsilon_j))}{\E{G(U_i, U_j) | U_j}} \right | U_i}. \end{aligned}` $$ Notes: <ol start="4"> <li style="margin: 12px 0;"> interpretable asymptotic variance expression </li> <li style="margin: 12px 0;"> clear what is contributed by graph and what is contributed by potential outcomes </li> <li> nice decomposition into "no-interference" and "interference" parts </li> </ol> -- Disclaimer: does NOT give a clear path to consistent variance estimation. --- layout: true # Part II: Covariate Adjustments --- Why bother? * For RCTs with no interference present, <a name=cite-hernandez_adjustmentstrongpredictors_2006></a>[Hernández Steyerberg et al. (2006)](http://www.liebertpub.com/doi/10.1089/neu.2006.23.1295) suggest covariate adjustment can lead to up to 25% reduction in required sample size. <a name=cite-hernandez_randomizedcontrolledtrials_2006></a>[Hernández Eijkemans et al. (2006)](https://linkinghub.elsevier.com/retrieve/pii/S1047279705003248) show similar reductions in required sample size for RCTs with time-to-event endpoints. * In a recently updated guidance, the U.S. FDA suggests always adjusting for covariates when analyzing RCTs <a name=cite-foodanddrugadministrationcenterfordrugevaluationandresearch_adjustingcovariatesrandomized_2023></a>([U.S. 
FDA, 2023](#bib-foodanddrugadministrationcenterfordrugevaluationandresearch_adjustingcovariatesrandomized_2023)). * When interference is present, the variance is often inflated (see [this slide](#li-and-wager-clt)). --- layout: true # Part II: AIPW Estimator --- Consider the Augmented Inverse Probability Weighted (AIPW) estimator: $$ `\begin{aligned} \tauAIPW &= \frac{1}{n}\sum_{i=1}^n \left( \frac{W_i\left(Y_i - \Yhat{(1, \bm{w}_{-i})}\right)}{\hat{\pi}} + \Yhat{(1, \bm{w}_{-i})} \right) - \frac{1}{n} \sum_{i=1}^n \left(\frac{(1-W_i)\left(Y_i - \Yhat{(0, \bm{w}_{-i})}\right)}{1 - \hat{\pi}} + \Yhat{(0, \bm{w}_{-i})} \right) \\ &= \tauHA + \frac{1}{n}\sum_{i=1}^n \left(1 - \frac{W_i}{\hat{\pi}} \right) \Yhat{(1, \bm{w}_{-i})} - \frac{1}{n} \sum_{i=1}^n \left(1-\frac{(1-W_i)}{1 - \hat{\pi}}\right) \Yhat{(0, \bm{w}_{-i})} \end{aligned}` $$ Thoroughly examined in the no-interference setting. We study `\(\tauAIPW\)` under Assumptions 1-6. -- Note: concurrent work by <a name=cite-emmenegger_treatmenteffectestimation_2023></a>[Emmenegger Spohn et al. (2023)](http://arxiv.org/abs/2206.14591) also considers `\(\tauAIPW\)`. Main differences: .pull-left[ * They consider a fixed network * Structural Equation Models for data generation * They consider observational studies and estimate `\(\hat{\pi}\)` based on observed covariates * They use sample splitting for estimation ] .pull-right[ * We consider a random network * Assumptions 4-6 * We consider randomized experiments and use `\(\hat{\pi} = \frac{1}{n} \sum_{i=1}^n W_i\)` * We do not... ] --- Our main result today: Central Limit Theorem for AIPW estimator. 
Under Assumptions 1-6, and `\(\Yhat{(w, \bm{w}_{-i})}\)` that are "well-behaved" and converge to `\(\Ytilde{(w)}\)` "fast enough", $$ `\begin{aligned} \sqrt{n}\left(\tauAIPW - \tauBarDIR \right) &\to_d N\left(0, \pi(1-\pi)\left(\text{Var}[R_i - A_i + Q_i] + (\E{Q_i})^2\right)\right), \end{aligned}` $$ where $$ `\begin{aligned} R_i &= \frac{f(1, \pi, \bm{X}_i, U_i; \epsilon_i)}{\pi} + \frac{f(0, \pi, \bm{X}_i, U_i; \epsilon_i)}{1-\pi}, \\ A_i &= \frac{\Ytilde{(1)}}{\pi} + \frac{\Ytilde{(0)}}{1-\pi} \\ Q_i &= \left . \E{\frac{G(U_i, U_j)(f'(1, \pi, \bm{X}_j, U_j; \epsilon_j) - f'(0, \pi, \bm{X}_j, U_j; \epsilon_j))}{\E{G(U_i, U_j) | U_j}} \right | U_i}. \end{aligned}` $$ --- Compare to <a name=cite-lunceford_stratificationweightingpropensity_2004></a>[Lunceford and Davidian (2004)](https://onlinelibrary.wiley.com/doi/10.1002/sim.1903) who consider * No interference * i.e. `\(Q_i = 0\)` * use `\(\widehat{\mathbb{E}}\left[Y_i(w) | \bm{X}_i\right]\)` for `\(\Yhat{(w, \bm{w}_{-i})}\)` with `\(\widehat{\mathbb{E}}\left[Y_i(w) | \bm{X}_i\right] \to_p \E{Y_i(w) | \bm{X}_i}\)` "fast enough", then asymptotic variance above `\(=\)` asymptotic variance derived by [Lunceford and Davidian (2004)](https://onlinelibrary.wiley.com/doi/10.1002/sim.1903). No interference result "special case" of the above. --- layout: true # Part II: ANCOVA Estimator --- We will consider ANCOVA estimator for covariate adjustments because * simple * well-known * broadly used in many fields -- * linear regression "acceptable" for covariate adjustment in RCTs according to FDA -- Specifically, consider the linear regression of `\(Y_i - \bar{Y}\)` on `\(W_i - \bar{W}\)`, `\(\bm{X}_i - \bar{\bm{X}}\)`, and `\((\bm{X}_i - \bar{\bm{X}})(W_i - \bar{W})\)`. Then $$ `\begin{aligned} \betaANCOVAA= \left\{1 - \frac{n^2}{n_0n_1}(n^{-1}d_2)^T D^{-1}(n^{-1}d_2)\right\}^{-1} \left\{\tauHA - \frac{n}{n_0n_1} d_2^T D^{-1} \begin{pmatrix} \widehat{\Sigma}_{\bm{X}Y} \\ \widehat{\Sigma}_{\bm{X}YW} \end{pmatrix} \right\}. 
\end{aligned}` $$ -- Under no interference, <a name=cite-tsiatis_covariateadjustmenttwosample_2008></a>[Tsiatis Davidian et al. (2008)](https://onlinelibrary.wiley.com/doi/10.1002/sim.3113) show that ANCOVA `\(\approx\)` AIPW asymptotically. When interaction terms between treatment and covariates are included, ANCOVA is always at least as efficient as `\(\tauHA\)`. -- Both are also true under Assumptions 1-6. --- We show that, asymptotically, `\(\betaANCOVAA \approx \tauAIPW\)` with `\begin{equation} \Yhat{(w, \bm{w}_{-i})} = \left[\widehat{\Sigma}_{\bm{X}Y} + \frac{(1-2\pi)}{\pi(1-\pi)}\widehat{\Sigma}_{\bm{X}YW}\right]^T \hat{\Sigma}_{\bm{XX}}^{-1} \bm{X}_i \end{equation}` which converges to `\begin{equation} \Ytilde{(w)} = \left[\pi \Sigma^{(0)}_{\bm{X}f} + (1-\pi)\Sigma^{(1)}_{\bm{X}f}\right]^T \Sigma_{\bm{XX}}^{-1} \bm{X}_i \end{equation}` --- Consequence of main result: Under Assumptions 1-6, when `\(\bm{X}_i\)` consists of iid baseline covariates, $$ `\begin{aligned} \sqrt{n}\left(\betaANCOVAA - \tauBarDIR \right) \to_d N\left(0, \pi(1-\pi)\left(\Var{\tauHA} + \Var{A_i} - 2\Cov{A_i, R_i}\right)\right), \end{aligned}` $$ where .medium[ $$ `\begin{aligned} \Var{\tauHA} &= \Var{R_i + Q_i} + (\E{Q_i})^2, \\ R_i &= \frac{f(1, \pi, \boldsymbol{X}_i, U_i; \epsilon_i)}{\pi} + \frac{f(0, \pi, \boldsymbol{X}_i, U_i; \epsilon_i)}{1-\pi}, \\ A_i &= \left(\frac{1}{\pi} + \frac{1}{1-\pi}\right)\left[\pi \Sigma^{(0)}_{\bm{X}f} + (1-\pi)\Sigma^{(1)}_{\bm{X}f}\right]^T\Sigma_{\bm{XX}}^{-1} \bm{X}_i, \\ Q_i &= \left . \E{\frac{G(U_i, U_j)(f'(1, \pi, \bm{X}_j, U_j; \epsilon_j) - f'(0, \pi, \bm{X}_j, U_j; \epsilon_j))}{\E{G(U_i, U_j) | U_j}} \right | U_i} \end{aligned}` $$ ] -- Conclusion: ANCOVA is still useful under the interference structure given by Assumptions 1-6! When data are analyzed with ANCOVA as if no interference were present, the estimator is consistent and asymptotically normal. 
Always asymptotically more efficient than unadjusted estimator, just like in the no-interference case. --- layout: true # Part III: Practical Considerations --- -- **Variance estimation using OLS?** Many reasons to ask if this would work. Two of them are: * If a researcher were to analyze data under the assumption of no interference, what can we say about inferences made? -- * It would be really nice if it did... -- Unfortunately, it does not. * Estimates sometimes conservative, sometimes not. * Decisions made by researcher might heavily influence the performance. * Depends on functional form of the model, and "strength" of covariates --- **Testing of Sharp Null Hypothesis using ANCOVA** While variance estimation in general is really hard, things simplify if we consider sharp null hypothesis. Say we want to test `\(H_0: f(1, \pi, X_i, U_i; \epsilon_i) = f(0, \pi, X_i, U_i; \epsilon_i) \quad \forall i\)`. Under `\(H_0\)`, `\(f'(1, \pi, X_i, U_i; \epsilon_i) = f'(0, \pi, X_i, U_i; \epsilon_i)\)`, so `\(Q_i = 0\)`. I.e. asymptotic variance of `\(\betaANCOVAA\)` simplifies to `\begin{equation} \pi(1-\pi)\left(\frac{1}{\pi} + \frac{1}{1-\pi}\right)^2 \left\{\Var{f(1, \pi, X_i, U_i; \epsilon_i)} - \Cov{f(1, \pi, X_i, U_i; \epsilon_i), X_i}^2\Var{X_i}^{-1} \right\} \end{equation}` We can consistently estimate this! --- **Testing of Sharp Null Hypothesis using ANCOVA** The question is: how is the finite sample performance? Answer: it depends... -- * Sometimes seems to be spot on! * When off, seems to be biased in a way that it mostly provides estimates larger than asymp. variance, but smaller than finite sample variance --- **Data Application** Field experiment presented by <a name=cite-cai_socialnetworksdecision_2015></a>[Cai Janvry et al. (2015b)](https://pubs.aeaweb.org/doi/10.1257/app.20130442). * Randomized experiment based on the introduction of a new weather insurance policy specifically aimed at rice farmers in China. 
* Two rounds of information sessions were held across 185 villages in rural China. * Households randomized to attend one of * a simple 20 minute long session * an intensive 45 minute long session * either attend the first or second round of sessions * [Cai Janvry et al. (2015b)](https://pubs.aeaweb.org/doi/10.1257/app.20130442) estimate both direct and indirect treatment effects due to knowledge sharing across social bonds We simplify the treatment to a binary "intensive session" status indicator, and discard the timing of the session. For data, see <a name=cite-cai_replicationdatasocial_2015></a>[Cai Janvry et al. (2015a)](https://www.openicpsr.org/openicpsr/project/113593/version/V1/view). --- **Data Application**

| Estimator | Direct Effect Estimate | SE Estimator | SE |
|---|---|---|---|
| Horvitz-Thompson | 0.07262 | Naive Bootstrap | 0.01890 |
| Hájek | 0.07335 | Naive | 0.01416 |
| Hájek | 0.07335 | Naive Bootstrap | 0.01440 |
| ANCOVA | 0.06237 | Naive Bootstrap | 0.04256 |
| ANCOVA | 0.06237 | OLS | 0.04316 |
| ANCOVA w/ ppt | 0.05655 | Naive Bootstrap | 0.04247 |
| ANCOVA w/ ppt | 0.05655 | OLS | 0.04321 |
| AIPW | 0.07154 | Naive Bootstrap | 0.02077 |
| AIPW w/ ppt | 0.07117 | Naive Bootstrap | 0.02149 |
| AIPW using SS | 0.06903 | Emmenegger | 0.02454 |
| AIPW w/ ppt using SS | 0.06748 | Emmenegger | 0.02452 |

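---

The slides do not spell out how the "Naive Bootstrap" standard errors were computed. As a rough sketch, assuming it means resampling units with replacement as if they were independent (that is, ignoring the interference graph), the Hájek estimate and such a standard error could be computed as below. The data are synthetic and generated without interference; they are not the Cai et al. data, and the names `hajek` and `naive_bootstrap_se` are hypothetical.

```python
import random
import statistics


def hajek(y, w):
    """Difference in means between treated (w = 1) and control (w = 0) units."""
    treated = [yi for yi, wi in zip(y, w) if wi == 1]
    control = [yi for yi, wi in zip(y, w) if wi == 0]
    return statistics.fmean(treated) - statistics.fmean(control)


def naive_bootstrap_se(y, w, n_boot=500, seed=0):
    """Bootstrap SE that resamples units as if independent (assumption)."""
    rng = random.Random(seed)
    n = len(y)
    draws = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        yb = [y[i] for i in idx]
        wb = [w[i] for i in idx]
        if 0 < sum(wb) < n:  # skip resamples where one arm is empty
            draws.append(hajek(yb, wb))
    return statistics.stdev(draws)


# Synthetic illustration only: true effect 0.2, no interference.
rng = random.Random(42)
n = 1000
w = [1 if rng.random() < 0.5 else 0 for _ in range(n)]
y = [0.3 + 0.2 * wi + rng.gauss(0.0, 0.5) for wi in w]

estimate = hajek(y, w)
se = naive_bootstrap_se(y, w)
print(f"Hajek estimate = {estimate:.4f}, naive bootstrap SE = {se:.4f}")
```

Because the resampling ignores the graph, this standard error targets the no-interference variance and need not capture the extra `\(Q_i\)` terms in the asymptotic variance.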
--- layout: false name: references # References <a name=bib-lunceford_stratificationweightingpropensity_2004></a>[Lunceford, J. K. and M. Davidian](#cite-lunceford_stratificationweightingpropensity_2004) (2004). "Stratification and Weighting via the Propensity Score in Estimation of Causal Treatment Effects: A Comparative Study". In: _Statistics in Medicine_ 23.19. DOI: [10.1002/sim.1903](https://doi.org/10.1002%2Fsim.1903). URL: [https://onlinelibrary.wiley.com/doi/10.1002/sim.1903](https://onlinelibrary.wiley.com/doi/10.1002/sim.1903) (visited on Apr. 28, 2023). <a name=bib-hernandez_randomizedcontrolledtrials_2006></a>[Hernández, A. V., M. J. Eijkemans, et al.](#cite-hernandez_randomizedcontrolledtrials_2006) (2006). "Randomized Controlled Trials With Time-to-Event Outcomes: How Much Does Prespecified Covariate Adjustment Increase Power?" In: _Annals of Epidemiology_ 16.1. DOI: [10.1016/j.annepidem.2005.09.007](https://doi.org/10.1016%2Fj.annepidem.2005.09.007). URL: [https://linkinghub.elsevier.com/retrieve/pii/S1047279705003248](https://linkinghub.elsevier.com/retrieve/pii/S1047279705003248) (visited on Nov. 13, 2023). <a name=bib-hernandez_adjustmentstrongpredictors_2006></a>[Hernández, A. V., E. W. Steyerberg, et al.](#cite-hernandez_adjustmentstrongpredictors_2006) (2006). "Adjustment for Strong Predictors of Outcome in Traumatic Brain Injury Trials: 25% Reduction in Sample Size Requirements in the IMPACT Study". In: _Journal of Neurotrauma_ 23.9. DOI: [10.1089/neu.2006.23.1295](https://doi.org/10.1089%2Fneu.2006.23.1295). URL: [http://www.liebertpub.com/doi/10.1089/neu.2006.23.1295](http://www.liebertpub.com/doi/10.1089/neu.2006.23.1295) (visited on Nov. 13, 2023). <a name=bib-hudgens_causalinferenceinterference_2008></a>[Hudgens, M. G. and M. E. Halloran](#cite-hudgens_causalinferenceinterference_2008) (2008). "Toward Causal Inference With Interference". In: _Journal of the American Statistical Association_ 103.482. DOI: [10.1198/016214508000000292](https://doi.org/10.1198%2F016214508000000292). 
URL: [https://www.tandfonline.com/doi/full/10.1198/016214508000000292](https://www.tandfonline.com/doi/full/10.1198/016214508000000292) (visited on Feb. 22, 2022). <a name=bib-tsiatis_covariateadjustmenttwosample_2008></a>[Tsiatis, A. A., M. Davidian, et al.](#cite-tsiatis_covariateadjustmenttwosample_2008) (2008). "Covariate Adjustment for Two-Sample Treatment Comparisons in Randomized Clinical Trials: A Principled yet Flexible Approach". In: _Statistics in Medicine_ 27.23. DOI: [10.1002/sim.3113](https://doi.org/10.1002%2Fsim.3113). URL: [https://onlinelibrary.wiley.com/doi/10.1002/sim.3113](https://onlinelibrary.wiley.com/doi/10.1002/sim.3113) (visited on Dec. 22, 2022). --- name: referencescont # References (cont.) <a name=bib-cai_replicationdatasocial_2015></a>[Cai, J., A. D. Janvry, et al.](#cite-cai_replicationdatasocial_2015) (2015a). "Replication Data for: Social Networks and the Decision to Insure". Version 1. In: _ICPSR - Interuniversity Consortium for Political and Social Research_. DOI: [10.3886/E113593V1](https://doi.org/10.3886%2FE113593V1). URL: [https://www.openicpsr.org/openicpsr/project/113593/version/V1/view](https://www.openicpsr.org/openicpsr/project/113593/version/V1/view) (visited on Oct. 16, 2023). <a name=bib-cai_socialnetworksdecision_2015></a>[Cai, J., A. D. Janvry, et al.](#cite-cai_socialnetworksdecision_2015) (2015b). "Social Networks and the Decision to Insure". In: _American Economic Journal: Applied Economics_ 7.2. DOI: [10.1257/app.20130442](https://doi.org/10.1257%2Fapp.20130442). URL: [https://pubs.aeaweb.org/doi/10.1257/app.20130442](https://pubs.aeaweb.org/doi/10.1257/app.20130442) (visited on May. 04, 2023). <a name=bib-li_randomgraphasymptotics_2022></a>[Li, S. and S. Wager](#cite-li_randomgraphasymptotics_2022) (2022). "Random Graph Asymptotics for Treatment Effect Estimation under Network Interference". In: _The Annals of Statistics_ 50.4. DOI: [10.1214/22-AOS2191](https://doi.org/10.1214%2F22-AOS2191). (Visited on Mar. 
28, 2023). <a name=bib-emmenegger_treatmenteffectestimation_2023></a>[Emmenegger, C., M. Spohn, et al.](#cite-emmenegger_treatmenteffectestimation_2023) (2023). _Treatment Effect Estimation with Observational Network Data Using Machine Learning_. arXiv: [2206.14591 [math, stat]](https://arxiv.org/abs/2206.14591). URL: [http://arxiv.org/abs/2206.14591](http://arxiv.org/abs/2206.14591) (visited on Sep. 13, 2023). preprint. <a name=bib-foodanddrugadministrationcenterfordrugevaluationandresearch_adjustingcovariatesrandomized_2023></a>[U.S. FDA](#cite-foodanddrugadministrationcenterfordrugevaluationandresearch_adjustingcovariatesrandomized_2023) (2023). "Adjusting for Covariates in Randomized Clinical Trials for Drugs and Biological Products - Guidance for Industry". --- count: false # Appendix **Sample splitting?** When no interference is present, sample splitting is a useful tool for creating AIPW estimators. However, it relies on independence of data, which we do not have under interference. [Emmenegger Spohn et al. (2023)](http://arxiv.org/abs/2206.14591) suggest a sample splitting algorithm for use when interference is present. Pro: this allows for consistent variance estimation, both through plugin and bootstrap method. Con: you need some level of sparsity in the network for the sample splitting to work. -- Question: how does sample splitting affect the AIPW estimator? --- count: false # Appendix **Sample splitting?** Observations: * For some of the graphons considered, often the sample splitting fails * For the graphons that do work ( `\(G_3\)` and `\(G_6\)`), often sample splitting leads to similar bias and variance * When not similar, generally sample splitting hurts the performance