Let’s load the data, look at the format, and work through a simple implementation.
dat = read.csv("example_data.csv")
head(dat)
Each row is a timepoint/“trial”. The columns are as follows:

- id is the subject ID.
- trial is the timepoint.
- X is a binary indicator of availability. If your data were collected with a static regime (i.e., treatment was randomized marginally at each timepoint/“trial” with a so-called “open-loop” design), then set X=1 for every row.
- A is a binary indicator of whether treatment was delivered on that timepoint.
- Y is the outcome. We simulated these data such that Y is continuous, but our framework handles all outcome data types (e.g., binary, count).
- weights contains the treatment probabilities; for completeness, we show below how this column’s values were calculated.
- Group is the (baseline-randomized) treatment arm variable. Micro-randomized trials in mobile health sometimes have just one treatment arm, so this column is not strictly necessary. However, when you analyze optogenetics data, for example, it is often critical to incorporate into your analysis.

Our weights column contains the treatment probabilities,
\(\mathbb{P}(A_t = a_t \mid X_t)\)
(these actually are the inverse of the weights used in the
method): the probability that an individual receives the treatment that
they did on timepoint/trial \(t\),
given their availability status, \(X_t\). Let’s say we ran an experiment in which
an animal received optogenetic stimulation on 80% of the
timepoints/trials when they pressed a lever (\(X_t=1\)), but if they did not press the lever
(\(X_t=0\)), they never received
stimulation. On timepoints where \(X_t =
1\), the weights are \(W_t =
0.8\) if \(A_t=1\) (i.e., they
received stimulation), and \(W_t =
0.2\) if \(A_t=0\) (i.e., they
did not get stimulated). If \(X_t=0\)
(they were not available to get stimulated), then \(W_t = 1\) no matter what. The
weights were included in the loaded dataset, but for
explanatory purposes, we include the code below which calculates these
weights for you (with an example 80% treatment probability).
treat_prob = 0.8 # P(A = 1 | X = 1): experimental treatment probability
dat$weights = 1 # initialize to 1, the correct weight for unavailable (X = 0) trials
dat$weights = ifelse(dat$X == 1 & dat$A == 1, treat_prob, dat$weights) # available and treated: weight = treat_prob
dat$weights = ifelse(dat$X == 1 & dat$A == 0, 1 - treat_prob, dat$weights) # available and untreated: weight = 1 - treat_prob
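As an optional sanity check (a sketch, assuming the column names above and that the example 80% probability matches how the weights in the loaded data were generated), one can confirm the weights follow this pattern:
stopifnot(all(dat$weights[dat$X == 0] == 1)) # unavailable trials should always have weight 1
table(dat$weights, dat$X) # available trials should only show treat_prob and 1 - treat_prob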
Let’s start with the classic blip effect: “what’s the change in the mean counterfactual outcome for treatment on the most recent timepoint – what would have happened on average if everyone on timepoint \(t\) received treatment (if available) vs. if no one had received treatment?” The left plot shows a setting where a blip effect exists in blue, and a setting where there is no blip effect in red.
To test this, we first construct the treatment sequence (for Policy 0 and Policy 1) matrix depicted in the figure above. This sequence matrix will have \(\Gamma = 1\) columns and \(q = 2\) rows, where \(q\) is the number of unique treatment sequences (policies) we want to estimate the causal effect of. For the blip effect, \(\Gamma=1\) because we are only looking at the effect of the last timepoint, and \(q=2\) because we want to know what happens for treatment (Policy 1 in the figure) or no-treatment (Policy 0 in the figure) on the most recent timepoint only.
seq.mat = matrix(c(1,0), nrow = 2)
This can be read in chronological order from left to right: each column is a
treatment opportunity indicator, with the right-most column corresponding to the
most recent timepoint \(t\). Here there is a single column, d_1, indicating
treatment (1) or no treatment (0) on the most recent timepoint \(t\); with
\(\Gamma=2\) below, d_1 will index timepoint \(t-1\) and d_2 will index timepoint \(t\).
Now let’s use our code to construct the necessary dataset and the inverse
probability of treatment weights (IPW). We specify the names of the
variables. If we have collected data across multiple days/sessions, we
should specify that so the function is not confused by, for example,
having multiple timepoint 1s from different sessions. In our case, we
only have one session so we leave this as NA.
source("seqEx.R")
dat.seqex = seqEx(data = dat,
A_name = "A",
Y_name = "Y",
X_name = "X",
W_name = "weights",
id_name = "id",
trial_name = "trial",
session_name = NA,
seq.mat = seq.mat)
head(dat.seqex)
If we looked further, we’d see some timepoints were removed, and some
repeated. At this point, each observation (row) is not super
interpretable. But we can see we now have a variable d_1,
our treatment variable that we can use as a covariate in a regression.
The w column contains the IP weights.
How do we answer our question? Well looking back at Figure 1, we want
to test whether the mean (counterfactual) outcome is different between a
policy in which you were treated on the last timepoint vs. were not
treated on the last timepoint: \(\mathbb{E}[Y(\boldsymbol{a}^{(1)})] -
\mathbb{E}[Y(\boldsymbol{a}^{(0)})]\). This reduces to testing
whether the coefficient for d_1 is significantly different
than 0.
To test this, let’s fit a weighted GEE. Since outcomes are
continuous, the projection parameter(s) (the History-Restricted Marginal
Structural Model regression coefficients) can be estimated with a
weighted linear regression with a robust covariance estimator. If one
has data with a binary outcome, they can apply our projection parameter
code in the GitHub repository. Alternatively, if you have count or
binary data, you can fit a GLM and use the same robust covariance
estimators below to yield GEE estimates. In our paper, we evaluated
HC (Heteroscedasticity-Consistent) covariance matrix estimators, so we use those
for inference here. Since our sample size is fairly large, \(n=100\), we use
type="HC0", but we recommend type="HC3" for sample sizes below \(n=50\).
Finally, let’s test this in the ‘treatment’ group (arm) first to keep things
simple. We will deal with between-arm effects below.
dat.seqex.trt <- dat.seqex[dat.seqex$Group == "Treatment",] # select rows for treatment group only
mod = lm(Y ~ d_1, data = dat.seqex.trt, weights = w) # fit model in the treatment arm only
cov.matrix = sandwich::vcovCL(mod, type = "HC0", cluster = ~ id) # robust sandwich estimator for variance
lmtest::coeftest(x=mod, vcov = cov.matrix) # hypothesis testing of coefficients with robust variance estimator
##
## t test of coefficients:
##
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.311777 0.011590 113.1819 < 2.2e-16 ***
## d_1 0.114970 0.012356 9.3051 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The d_1 coefficient shows a statistically significant
increase in the mean outcome when treatment is received on the most recent
timepoint compared to no treatment (averaging over all
timepoints prior to the most recent one). That is, there is a
significant “Blip Effect.”
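If effect sizes are of interest, confidence intervals based on the same robust covariance can be obtained with lmtest::coefci (a sketch, assuming a reasonably recent version of lmtest); the commented lines also show where type="HC3" would be swapped in for smaller samples:
lmtest::coefci(x = mod, vcov. = cov.matrix) # 95% CIs using the robust (clustered, HC0) covariance
# for smaller samples (n < 50), one could instead use:
# cov.matrix = sandwich::vcovCL(mod, type = "HC3", cluster = ~ id)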
Now let’s move on to testing the effect of treatment sequences. We can use these to see if having received, say, two treatments in the last two timepoints causes a greater effect on the mean outcome than one treatment in the past two timepoints. The process is the same as before, but now we define our sequence matrix to capture all combinations of treatment in the past \(\Gamma=2\) timepoints.
The left plot shows a setting where a dose response effect exists in red, and a setting where the effect is constant from successive treatments (in blue).
delta = 2 # Gamma: length of the treatment-sequence window
seq.mat = expand.grid( rep(list(0:1), delta)) # all 2^Gamma possible treatment sequences
colnames(seq.mat) = paste0('d_', 1:delta)
print(seq.mat)
## d_1 d_2
## 1 0 0
## 2 1 0
## 3 0 1
## 4 1 1
Let’s construct our dataset in the same way above, but using our new
seq.mat for the dose response model.
source("seqEx.R")
dat.seqex = seqEx(data = dat,
A_name = "A",
Y_name = "Y",
X_name = "X",
W_name = "weights",
id_name = "id",
trial_name = "trial",
session_name = NA,
seq.mat = seq.mat)
head(dat.seqex)
In this case we have 4 policies (i.e., treatment rules or “regimes”) that are of potential interest: one policy of dose 0, two policies of dose 1, one policy of dose 2. In the figure above, we summarize the policies simply by their dose and find an effect that averages across, for example, all policies of dose 1. To do this, we need to construct a variable in our dataset that labels these appropriately.
dat.seqex = as.data.frame(dat.seqex)
var_nams = paste0("d_", 1:delta) # variable names
dat.seqex$dose = rowSums(dat.seqex[,var_nams]) # add up the total number of treated trials
print(dat.seqex[c(1,30,70,110), c(var_nams, "dose")]) # print a few representative cases
## d_1 d_2 dose
## 1 0 0 0
## 30 1 0 1
## 70 0 1 1
## 110 1 1 2
We can see our dose variable is just the total number of
times in the last \(\Gamma=2\)
timepoints with treatment. Let’s set dose as a factor
variable so that we can test whether there are differences. We adjust
the model now to include dose as the covariate/treatment
variable. In particular, the model is as follows: \[
\mathbb{E} [Y_t(\boldsymbol{d}_{\Gamma, t})] = \beta_0 + \beta_1
I(\sigma_{t -1}(d_{t - 1}) + \sigma_{t}(d_{t}) = 1) + \beta_2
I(\sigma_{t -1}(d_{t - 1}) + \sigma_{t}(d_{t}) = 2),\] where
\(\sigma_{j}(d_j) \in \{0,1\}\) is an
indicator of a treatment opportunity at time \(j\) under rule \(d_j\).
dat.seqex$dose = as.factor(dat.seqex$dose) # set as factor
dat.seqex.trt <- dat.seqex[dat.seqex$Group == "Treatment",] # select rows for treatment group only
# fit model
mod = lm(Y ~ dose, data = dat.seqex.trt, weights = w) # fit model
# hypothesis testing
cov.matrix = sandwich::vcovCL(mod, type = "HC0", cluster = ~ id) # robust sandwich estimator for variance
lmtest::coeftest(x=mod, vcov = cov.matrix) # hypothesis testing of coefficients with robust variance estimator
##
## t test of coefficients:
##
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.209093 0.026574 45.4992 < 2.2e-16 ***
## dose1 0.175685 0.025913 6.7799 1.234e-11 ***
## dose2 0.218778 0.028880 7.5753 3.730e-14 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
This tests the contrasts outlined on the far right of the plot: the difference in the mean (counterfactual) outcome comparing:
1. Dose 0 vs. Dose 1
2. Dose 0 vs. Dose 2
From the output, it is clear there is a significant (increasing) dose response effect. We could also test other pairs (e.g., Dose 1 vs. Dose 2) than those tested here, as sketched below.
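One way to test Dose 1 vs. Dose 2 directly is to re-fit the same model with dose 1 as the reference level (a sketch; we copy dose into a new column, dose.ref1, so the original factor is left untouched):
dat.seqex.trt$dose.ref1 = relevel(dat.seqex.trt$dose, ref = "1") # dose 1 as the reference level
mod.12 = lm(Y ~ dose.ref1, data = dat.seqex.trt, weights = w) # same weighted regression
cov.matrix.12 = sandwich::vcovCL(mod.12, type = "HC0", cluster = ~ id) # robust sandwich estimator
lmtest::coeftest(x = mod.12, vcov = cov.matrix.12) # the "dose.ref12" row is the Dose 1 vs. Dose 2 contrast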
Similarly, we could model dose with a linear trend.
Let’s fit that to emphasize that once we construct the dataset for the
inputted policies of interest, we can summarize the treatment and model
it in a variety of ways (albeit with different modeling tradeoffs). The
only tweak here is that we set dose to be a continuous
variable so we can model it with a single coefficient.
dat.seqex$dose = as.numeric(as.character(dat.seqex$dose)) # convert the factor back to a numeric 0/1/2 dose (as.numeric() alone would return the factor level codes 1/2/3)
dat.seqex.trt <- dat.seqex[dat.seqex$Group == "Treatment",] # select rows for treatment group only
# fit model
mod = lm(Y ~ dose, data = dat.seqex.trt, weights = w) # fit model
# hypothesis testing
cov.matrix = sandwich::vcovCL(mod, type = "HC0", cluster = ~ id) # robust sandwich estimator for variance
lmtest::coeftest(x=mod, vcov = cov.matrix) # hypothesis testing of coefficients with robust variance estimator
##
## t test of coefficients:
##
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.132351 0.034970 32.3802 < 2.2e-16 ***
## dose 0.109681 0.013869 7.9083 2.737e-15 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
This linear effect estimate is significant.
Let’s test one more effect and then move on to effect modification.
Next, let’s test whether the effect of a single dose persists or dissipates across timepoints.
Here we set \(\Gamma=2\) again, and we test these effects relative to a sequence with no treatments in the last \(\Gamma=2\) timepoints.
# construct treatment sequence matrix
delta = 2 # Gamma: length of the treatment-sequence window
seq.mat = rbind( rep(0, delta), # the no-treatment sequence
                 diag(delta) )  # plus each single-treatment sequence
var_nams = paste0("d_", 1:delta) # variable names
colnames(seq.mat) = var_nams
print(seq.mat)
# construct dataset
dat.seqex = seqEx(data = dat,
A_name = "A",
Y_name = "Y",
X_name = "X",
W_name = "weights",
id_name = "id",
trial_name = "trial",
session_name = NA,
seq.mat = seq.mat)
dat.seqex = as.data.frame(dat.seqex)
# summarize treatment sequence by when most recent dose occurred
sum.fn <- function(x){
  ifelse(sum(x) > 0,
         which(x == 1), # position (1 or 2) of the single treated timepoint in the window
         0)             # 0 = no treatment in the window
}
dat.seqex$trt.seq = apply(dat.seqex[,var_nams], 1, sum.fn)
head(dat.seqex[c(1,20,50), c(var_nams, "trt.seq")])
We can see that trt.seq encodes on what trial (among the last \(\Gamma =2\) timepoints) the single
treatment occurred. We set trt.seq=0 if there are no
treatments (our baseline value) since we will set trt.seq
to a factor below. As above, now we can just use trt.seq as
our treatment variable in a regression to test the effects. In this
case, the HR-MSM is as follows: \[
\mathbb{E} [Y_t(\boldsymbol{d}_{\Gamma, t})] = \beta_0 + \beta_1
I(\sigma_{t -1}(d_{t - 1}) = 1) + \beta_2 I(\sigma_{t}(d_{t}) =
1),\] where we restrict to \(\boldsymbol{d}_{\Gamma,
t}\) such that \(\sigma_{t -1}(d_{t -
1}) + \sigma_{t}(d_{t}) \leq 1\).
dat.seqex$trt.seq = as.factor(dat.seqex$trt.seq) # set as factor variable
dat.seqex.trt <- dat.seqex[dat.seqex$Group == "Treatment",] # select rows for treatment group only
# fit model
mod = lm(Y ~ trt.seq, data = dat.seqex.trt, weights = w) # fit model
# hypothesis testing
cov.matrix = sandwich::vcovCL(mod, type = "HC0", cluster = ~ id) # robust sandwich estimator for variance
lmtest::coeftest(x=mod, vcov = cov.matrix) # hypothesis testing of coefficients with robust variance estimator
##
## t test of coefficients:
##
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.2090930 0.0265739 45.4992 <2e-16 ***
## trt.seq1 0.0070049 0.0269684 0.2597 0.7951
## trt.seq2 0.3439222 0.0275021 12.5053 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
It appears that this treatment has an effect dissipation profile: the more recent the treatment, the larger the effect estimate (trt.seq2 indicates treatment on the most recent trial whereas trt.seq1 indicates treatment two timepoints ago). Again, we could instead compare different pairs (e.g., perhaps we want to compare treatment 2 timepoints ago with treatment 1 trial ago).
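A minimal sketch of that comparison (variable names like mod.lag are ours), re-fitting with trt.seq = 1 (treatment at timepoint \(t-1\)) as the reference level:
dat.seqex.trt$trt.seq.ref1 = relevel(dat.seqex.trt$trt.seq, ref = "1") # treatment at t-1 as the reference level
mod.lag = lm(Y ~ trt.seq.ref1, data = dat.seqex.trt, weights = w) # same weighted regression
cov.matrix.lag = sandwich::vcovCL(mod.lag, type = "HC0", cluster = ~ id) # robust sandwich estimator
lmtest::coeftest(x = mod.lag, vcov = cov.matrix.lag) # the "trt.seq.ref12" row compares treatment at t vs. at t-1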
Now let’s move on to effect modification to show how to incorporate treatment groups (arms) or other effect modifiers.
For optogenetics studies, we often want to know whether the effect of an optogenetic stimulation (or sequence of stimulations) is significantly different between the treatment (opsin) and (no-opsin) control groups (arms). This reduces to testing interaction terms between the treatment arm (\(\texttt{Group}\)) and the treatment sequence variables constructed above. Let’s see one example based on the dose response analysis from above, again with \(\Gamma=2\). Formally, since the outcome is continuous, the model is \[\mathbb{E} [Y_t(\boldsymbol{d}_{\Gamma, t}) \mid G = g] = \beta_0 + \beta_1 I(\sigma_{t - 1}(d_{t - 1}) + \sigma_{t}(d_{t}) = 1) + \beta_2 I(\sigma_{t - 1}(d_{t - 1}) + \sigma_{t}(d_{t}) = 2) + \beta_3 g + \beta_4 \, g \, I(\sigma_{t - 1}(d_{t - 1}) + \sigma_{t}(d_{t}) = 1) + \beta_5 \, g \, I(\sigma_{t - 1}(d_{t - 1}) + \sigma_{t}(d_{t}) = 2),\] where \(g\) indicates the Treatment arm. Since we already saw how to construct the dataset, let’s jump right into the analysis.
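(If running this document from top to bottom, note that dat.seqex currently holds the effect-dissipation dataset; as a small sketch, the lines below rebuild the dose-response dataset and its dose variable exactly as in the dose-response section above, so the fit below runs as written.)
delta = 2
seq.mat = expand.grid(rep(list(0:1), delta)) # all 2^Gamma treatment sequences
colnames(seq.mat) = paste0("d_", 1:delta)
dat.seqex = seqEx(data = dat, A_name = "A", Y_name = "Y", X_name = "X",
                  W_name = "weights", id_name = "id", trial_name = "trial",
                  session_name = NA, seq.mat = seq.mat)
dat.seqex = as.data.frame(dat.seqex)
dat.seqex$dose = rowSums(dat.seqex[, paste0("d_", 1:delta)]) # total treated trials in the window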
dat.seqex$dose = as.factor(dat.seqex$dose) # set as factor variable
dat.seqex$Group = as.factor(dat.seqex$Group)
dat.seqex$Group = relevel(dat.seqex$Group, ref = "Control") # set Control as baseline
# fit model
mod = lm(Y ~ dose*Group, data = dat.seqex, weights = w) # fit model
# hypothesis testing
cov.matrix = sandwich::vcovCL(mod, type = "HC0", cluster = ~ id) # robust sandwich estimator for variance
lmtest::coeftest(x=mod, vcov = cov.matrix) # hypothesis testing of coefficients with robust variance estimator
##
## t test of coefficients:
##
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.4009807 0.0232813 60.1763 < 2.2e-16 ***
## dose1 0.0053460 0.0219977 0.2430 0.8080
## dose2 -0.0014178 0.0231048 -0.0614 0.9511
## GroupTreatment -0.1918877 0.0352795 -5.4391 5.389e-08 ***
## dose1:GroupTreatment 0.1703390 0.0339410 5.0187 5.226e-07 ***
## dose2:GroupTreatment 0.2201956 0.0369286 5.9627 2.502e-09 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The interaction terms (e.g., dose1:GroupTreatment) show
that the dose response effect is significantly larger in the Treatment
arm compared to the Control arm. The results for the main effects
dose1 and dose2 indicate that there is no
significant effect of the treatment (either dose 1 or dose 2) within the
Control arm (since we set the Control arm as the reference level).
More generally, one could include any variable as an effect modifier that was measured before trial \(t-\Gamma+1\), such as behavior on trial \(t-\Gamma\), trial number, or baseline covariates (e.g., age, sex).
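As a sketch, such a covariate would enter the same weighted regression as a modifier; here age is a hypothetical baseline column that is not in the example dataset, so the lines are left commented:
# mod.em = lm(Y ~ dose * age, data = dat.seqex, weights = w) # "age" is hypothetical (not in example_data.csv)
# cov.em = sandwich::vcovCL(mod.em, type = "HC0", cluster = ~ id) # robust sandwich estimator
# lmtest::coeftest(x = mod.em, vcov = cov.em) # the dose:age rows test effect modification by age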
Why go through the above estimation procedure? Why not just compare simple summaries across groups (arms)? It turns out that in closed-loop designs (i.e., dynamic regimes), “treatment-confounder feedback” can completely obscure the strong effects we observed above. Let’s see this by plotting estimates of the average outcome at each trial (Plot A) or pooled across trials/timepoints (Plot B).
# plot macro effects
library(dplyr)
library(ggplot2)
library(latex2exp)
myColors <- c("green", "#E69F00", "grey", "purple")
myColors2 <- c("#ca0020", "#0868ac", "#E69F00", "darkgreen", "darkgrey")
plot2 <- dat %>% as_tibble() %>%
dplyr::mutate(group = as.factor(Group)) %>%
group_by(Group, trial) %>%
summarise(Y_mean = mean(Y)) %>%
dplyr::filter(trial != 1) %>%
ggplot(aes( y = Y_mean, x = trial, group = Group)) +
geom_point(aes(color = Group), size = 0.75, alpha = 0.5) +
geom_smooth(aes(color = Group)) +
ylab('Outcome Variable - Y') +
xlab("Trial Number") +
scale_fill_manual(values = myColors2) +
scale_color_manual(values = myColors2) +
theme_classic(base_size = 12) +
theme( plot.title = element_text(hjust = 0.5, color="black", size=rel(0.75), face="bold"),
axis.text=element_text(color="black", size=rel(1)),
axis.title = element_text(color="black", size=rel(1.25)),
legend.key.size = unit(1.5, "line"), # added in to increase size
legend.text = element_text(color="black", size = rel(1)),
legend.title = element_text(face="bold", color="black", size = rel(1)),
strip.text = element_text(color="black", size = rel(1))) +
labs(tag = "A")
plot3 <- dat %>% as_tibble() %>%
dplyr::mutate(group = as.factor(Group)) %>%
dplyr::filter(trial != 1) %>% # no observation first trial
group_by(Group, id) %>%
summarise(Y_mean = mean(Y)) %>%
ggplot(aes( y = Y_mean, x = Group, group = Group)) +
geom_boxplot(aes(color = Group), lwd = 0.75) +
ylab('Outcome Variable - Y') +
xlab("Group") +
scale_fill_manual(values = myColors2) +
scale_color_manual(values = myColors2) +
theme_classic(base_size = 12) +
theme( plot.title = element_text(hjust = 0.5, color="black", size=rel(0.75), face="bold"),
axis.text=element_text(color="black", size=rel(1)),
axis.title = element_text(color="black", size=rel(1.25)),
legend.key.size = unit(1.5, "line"), # added in to increase size
legend.text = element_text(color="black", size = rel(1)),
legend.title = element_text(face="bold", color="black", size = rel(1)),
strip.text = element_text(color="black", size = rel(1))) +
labs(tag = "B")
# plot both
ggpubr::ggarrange(plot2, plot3,
ncol=2, nrow=1,
common.legend = TRUE,
legend="bottom")
As we can see, treatment effects are entirely obscured if estimated with standard methods. The history-restricted marginal structural model that we use explicitly accounts for the closed-loop design and thus avoids the pitfalls of treatment-confounder feedback.