Heterogeneous Treatment Effects

Author

João C. L. Camargos and Jody Oetzel

1 Replication Task

In this exercise, we will follow what was done by Liu, Liu and Xu (2025) and we will replicate the study by Noble (2024). to estimate heterogeneous treatment effects with binary variables. The data set to replicate the method for a continuous treatment wasn`t available on any dataverse, leading to the impossibility of replicating their results.

The analysis investigates how legislators’ partisan alignment with the president influences the extent to which they reference the president in their congressional speeches. The goal is to estimate a Conditional Marginal Effect (CME) — that is, how the marginal effect of being in the opposition changes across levels of local presidential support.

Variable Role Description
pres_ref Outcome (Y) Frequency or share of a legislator’s floor speeches that reference the president.
opp Treatment (D) Indicator variable equal to 1 if the legislator is from the president’s opposing party, and 0 otherwise.
pres_vote_margin Moderator (X) District-level presidential vote margin — measures local support for the president, moderating the opposition effect.
prev_vote Control (Z₁) Legislator’s vote share in the previous election (captures electoral strength).
majority Control (Z₂) Indicator for whether the legislator’s party currently holds the chamber majority.
leadership Control (Z₃) Indicator for holding a formal leadership position in Congress.
seniority Control (Z₄) Years of congressional service; captures experience and status.
tot_speech Control (Z₅) Total number of speeches delivered; normalizes activity level.

1.1 Instaling necessary packages and functions

Show Answer
#devtools::install_github('xuyiqing/interflex')

library(tidyverse)
library(interflex)
library(sjPlot)

set.seed(12345678)

1.2 Setting up the communication with Python

Unlike the other algorithms that we work so far in class, the DoubleML implementation really on the scikit-learn and doubleml libraries developed in Python. Therefore, we need to set a communication between our R environment with a Python environment. To do so, we will use the package reticulate to set up the co

Show Answer
library(reticulate)

# Install Miniconda to be operated within reticulate
#reticulate::install_miniconda()  # se já tiver, pode usar force=TRUE

#cb <- reticulate::conda_binary("auto")

# Accpet the 3 indicated channels
#system2(cb, c("tos","accept","--override-channels","--channel","https://repo.anaconda.com/pkgs/main"))
#system2(cb, c("tos","accept","--override-channels","--channel","https://repo.anaconda.com/pkgs/r"))
#system2(cb, c("tos","accept","--override-channels","--channel","https://repo.anaconda.com/pkgs/msys2"))

## 1) Creatre and select the correct virtual environment (may need to reinitialize R)
#reticulate::conda_create("interflex", packages = "python=3.10")
reticulate::use_condaenv("interflex", required = TRUE)

## 2) Confirm if Python is active
#py_config()  # check if the sys.executable shows .../r-miniconda/envs/interflex/python.exe

## 3) Install interflex via condda-forge
#conda_install("interflex", packages = "patsy", channel = "conda-forge")

## 5) (Install dependencies for interflex DML
#conda_install("interflex",
#  packages = c("numpy","pandas","scikit-learn","scipy","doubleml"),
#  channel = "conda-forge"
#)

1.3 Data Cleaning

Download the data set.

Show Answer
onedrive_url <- "https://1drv.ms/x/c/0f7d0a7f95ac8880/ES771P57XopDsmmHJk_U9WgBdEUnvydUJ2H5xnB2_GP0aA?e=oNxF4I"

#house_presapp <- read.csv(url(onedrive_url))

house_presapp <- read_csv('dataverse_files/df_senate_presapp.csv')


# Unpacking the dataset (bugged for some unknown reason)

house_presapp <- house_presapp |> 
  ungroup() |>                     
  select(pres_ref, opp, pres_vote_margin, everything()) |> 
  mutate(
    pres_ref         = {x <- pres_ref;         if (is.data.frame(x)) x[[1]] else x},
    opp              = {x <- opp;              if (is.data.frame(x)) x[[1]] else x},
    pres_vote_margin = {x <- pres_vote_margin; if (is.data.frame(x)) x[[1]] else x}
  ) |> 
  mutate(
    pres_ref         = if (is.list(pres_ref))         unlist(pres_ref)         else pres_ref,
    opp              = if (is.list(opp))              unlist(opp)              else opp,
    pres_vote_margin = if (is.list(pres_vote_margin)) unlist(pres_vote_margin) else pres_vote_margin
  ) |> 
  mutate(
    pres_ref         = as.numeric(pres_ref),
    pres_vote_margin = as.numeric(pres_vote_margin),
    opp              = as.factor(opp)
  )

house_presapp <- house_presapp[which(house_presapp[,'pres_vote_margin'] <=16.5 & 
                                         house_presapp[,'pres_vote_margin'] >= -4.1) ,]

tab_df(head(house_presapp))
pres_ref opp pres_vote_margin congress president pparty bioguide_id lname fname party state majority seniority prev_vote leadership les tot_speech pres_named pres_pct in_cycle mean_app mean_app_margin mean_net_app app_lt50 net_app_lt0
98 1 4.36 94 ford R A000017 abourezk james D SD 1 2 57 0 0.91 793 20 12.36 0 44.28 -5.72 4.17 1 0
127 1 16.50 94 ford R B000254 bayh birch D IN 1 7 52 0 0.27 653 53 19.45 0 44.28 -5.72 4.17 1 0
59 0 12.11 94 ford R B000272 beall john R MD 0 3 51 0 0.71 459 11 12.85 1 44.28 -5.72 4.17 1 0
50 1 10.32 94 ford R B000444 biden joseph D DE 1 2 51 0 0.12 339 22 14.75 0 44.28 -5.72 4.17 1 0
14 1 13.43 94 ford R B001077 burdick quentin D ND 1 8 62 0 2.57 208 2 6.73 1 44.28 -5.72 4.17 1 0
169 1 13.60 94 ford R B001210 byrd robert D WV 1 9 78 1 0.26 3793 26 4.46 1 44.28 -5.72 4.17 1 0

1.4 Estimating CMEs

Formally, the CME is defined as:

\[\theta(x) = \mathbb{E}\big[Y_i(1) - Y_i(0) \mid X_i = x\big]\]

where:

  • \(Y_i(1)\) and \(Y_i(0)\) represent the expected number of presidential references when the legislator is and is not in the opposition, respectively,

  • and \(X_i\) (here, the presidential vote margin) moderates how that difference varies across political contexts.

First, let’s estimate in the classical way teaches in 784 using the Kernel estimator.

Show Answer
out.kernel <- interflex(
  estimator = "kernel",
  data = as.data.frame(house_presapp),
  nboots = 2000,
  Y = "pres_ref",
  D = "opp",
  X = "pres_vote_margin",
  Z = c("prev_vote","majority","leadership","seniority","tot_speech"),
  treat.type = "discrete",
  na.rm = TRUE
  )
Baseline group not specified; choose treat = 0 as the baseline group. 
Cross-validating bandwidth ... 
..............................Optimal bw=0.4091.
Number of evaluation points:50
Show Answer
plot(out.kernel)

Now let’s try to implement the Lasso-AIPW estimator. Given that the code described in the book is not fully implemented on GitHub, this estimator is not available yet. In order to demonstrate how the code would work, there is the sample of the arguments bellow. We basically use the same code as the one in the kernel estimator and but change the name to aipw.

Show Answer
# Not work
est.aipw <- interflex(
  estimator ='aipw',
  data = as.data.frame(house_presapp),
  nboots = 2000,
  Y = "pres_ref",
  D = "opp",
  X = "pres_vote_margin",
  Z = c("prev_vote","majority","leadership","seniority","tot_speech"),
  treat.type = "discrete",
  na.rm = TRUE
  )

Now let’s implement the DoubleML approach to estimate the CME First, we will implement the neural networks within the dml framework. To do so we set the estimator argument as dml and add two additional components, the model.y which refers to the outcome model, and model.t which refers to the propensity score model. Both of them will recieve the value of nn, which indicates neural networks. We further need to say how the variance is gonna be computed by the argument vartype, set as bootstrap, and if we want parallel computation (if your computer has multiple cores).

Show Answer
## DML estimator with neural network machine learning method
out.dml.nn <- interflex(
  estimator = 'dml',
  nboots = 2000,
  data = as.data.frame(house_presapp),
  model.y = 'nn', model.t = 'nn',
  treat.type = 'discrete',
  Y = "pres_ref",
  D = "opp",
  X = "pres_vote_margin",
  Z = c("prev_vote","majority","leadership","seniority","tot_speech"),
  na.rm = T,
  vartype = "bootstrap", 
  parallel = TRUE 
  )
Baseline group not specified; choose treat = 0 as the baseline group. 
Show Answer
plot(out.dml.nn)

Now let’s do the same for a random forest estimator. To do so, we basically change the model.y and model.t arguments to rf. Note that, since both arguments are independent, they don`t need to be set as the same, allowing multiple combinations of machine learning algorithms.

Show Answer
## DML estimator with random forest machine learning method
out.dml.rf <- interflex (
  estimator = 'dml',
  nboots = 2000,
  data = as.data.frame(house_presapp),
  model.y = 'rf', model.t = 'rf',
  treat.type = 'discrete',
  Y = "pres_ref",
  D = "opp",
  X = "pres_vote_margin",
  Z = c("prev_vote","majority","leadership","seniority","tot_speech"),
  na.rm = T,
  vartype = "bootstrap", 
  parallel = TRUE 
  )
Baseline group not specified; choose treat = 0 as the baseline group. 
Show Answer
plot(out.dml.rf)

To end, we will do the same for the histogram gradient boosting, changing the arguments to hgb.

Show Answer
## DML estimator with histogram gradient boosting machine learning method
out.dml.hgb <- interflex (
  estimator = 'dml',
  nboots = 2000,
  data = as.data.frame(house_presapp),
  model.y = 'hgb', model.t = 'hgb',
  treat.type = 'discrete',
  Y = "pres_ref",
  D = "opp",
  X = "pres_vote_margin",
  Z = c("prev_vote","majority","leadership","seniority","tot_speech"),
  na.rm = T,
  vartype = "bootstrap", 
  parallel = TRUE 
  )
Baseline group not specified; choose treat = 0 as the baseline group. 
Show Answer
plot(out.dml.hgb)

To end, we will compare those three models by ploting their output together.

Show Answer
ggsave(filename = 'example.png',gridExtra::grid.arrange(
  plot(out.kernel, main = "Kernel Estimator",ylab = 'Marginal Effect',xlab=''),
  plot(out.dml.nn, main = "NN Estimator",xlab = '',ylab=''),
  plot(out.dml.rf, main = "RF Estimator",ylab = 'Marginal Effect',xlab='Moderator'),
  plot(out.dml.hgb, main = "HGB Estimator",xlab = 'Moderator',ylab='')
), units = 'px', width = 2400, height = 1800)