Introduction

  • Recognize what to control for in DAGs

  • Estimation methods

  • Steps to causal inference

    • Define causal effects

    • Specify models

    • Specify and defend assumptions

Causal effects and confounding

Causal effects

A: treatment

  • exposure/intervention of interest

  • Random variable

  • lowercase a refers to a realization of var A

  • Treatment status of subject i

Y: outcome

  • Could be continuous

  • Could be time to event

  • Could be multidimensional

Potential outcomes

  • Ya for a continuous treatment variable

  • Y1 and Y0 for binary

    • e.g. Y1: time until the individual would get the flu if they received the vaccine

    • Y0: time until the individual would get the flu if they did not receive the vaccine

Observed outcomes

  • Observed outcome Y is the outcome under treatment a subject received, i.e. Y = YA

  • Causal inference is a missing data problem: which potential outcome we observe is decided by the treatment received

Independence

  • Assuming that the treatment given to one subject does not affect the outcome of another subject (no interference)

Causal effects

  • A has a causal effect on Y if Y1 ≠ Y0

  • Can't determine the causal effect of ibuprofen for an individual without knowing their outcome under no treatment

  • However can make estimates on population level (average causal effects)

  • E(Y1 - Y0)

    • Average outcome if everyone had been treated versus if no one had been treated

    • Typically not equal to E(Y|A = 1) - E(Y|A = 0) (two subpopulations) unless we randomize

    • Pearl uses notation E(Y|do(A = 1)) - E(Y|do(A = 0))

    • do notation means “set” the treatment; it is equivalent to the potential-outcome notation above

Other causal effects

  • Causal risk ratio, e.g. E(Y1)/E(Y0)

  • Causal effect within a subpopulation: E(Y1 - Y0|V = v)

  • Effect of removing an exposure among the exposed, a hypothetical “what if”: E(Y1 - Y0|A = 1)

  • Quantile causal effects: \(F_{Y^1}^{-1}(p) - F_{Y^0}^{-1}(p)\), the difference between corresponding quantiles of the two potential-outcome distributions

Valid examples

  • Comparing the same treatment across two separate groups is not a valid causal contrast without strong assumptions

  • Nor is comparing separate subpopulations valid, even when their treatments differ

Confounding

  • When there are variables that affect the treatment decision and outcome

    • Also possible in randomized trials when there is noncompliance

    • Creates a problem because there is a lack of comparability between groups

Example

  • Severe injury affects the likelihood of getting surgery, but not deterministically

  • Treatment will bump up Ya

  • We will see separate distributions for a = 0 and a = 1, but each curve might get pulled in one direction by the confounder L

Types of confounding

  • Controllable confounding

  • No confounding

  • Confounding not controllable using standard methods but controllable using others (e.g. instrumental variables)

    • Could use a natural randomizer

    • Difference-in-differences tries to difference out unmeasured confounding variables

  • Uncontrollable confounding

    • Can do sensitivity analysis

Controllable confounding

  • L is set of baseline (pre-treatment) covariates

  • If we control for confounding we can assume any other differences are random

  • Controllable if there is overt (not hidden) bias

    • f(Y0, Y1 |L,A) = f(Y0, Y1 |L)

    • f(A|L, Y0, Y1) = f(A|L)

Ignorability

  • Basically means conditional randomization

  • Given L, treatment assignment does not depend on the potential outcomes

Graphical Models

  • dag terminology

    • Nodes are variables; directed edges represent direct effects, and paths are routes along which information can flow from one variable to another
library(dagitty)
library(ggdag)

#simple chain: x -> z -> y
dagified <- dagify(z ~ x,
                   y  ~ z,
                   exposure = "x",
                   outcome = "y")
dag <- tidy_dagitty(dagified)

ggdag(dag, layout = "circle") +
  theme(panel.background = element_blank(), panel.grid.major = element_blank(), panel.grid.minor = element_blank())

  • In this DAG:

    • X is a parent of Z and an ancestor of Y

    • Y is a child of Z and a descendant of X

    • This is an example of a chain

Information flow

  • Forks

    • A \(\leftarrow\) Z \(\rightarrow\) B

    • A and B are not independent, since information flows from Z into both

    • A and B are associated with each other

Paths that do not induce association

  • In cases of a collider G

    • A \(\rightarrow\) G \(\leftarrow\) B

    • A and B both affect G

    • A and B are marginally independent (the collider blocks the path)

Blocking

  • Paths can be blocked by conditioning on nodes in the path

  • This stops the flow of association along the path

  • e.g. temperature -> icy sidewalks -> slipping

  • We can block the association between temperature and slipping by conditioning on icy sidewalks

D-separation

A path is d-separated by a set of nodes C if:

  • it contains a chain (D\(\rightarrow\)E\(\rightarrow\)F) and the middle node is in C

    OR

  • it contains a fork (D\(\leftarrow\)E\(\rightarrow\)F) and the middle node is in C

    OR

  • it contains an inverted fork (collider, D\(\rightarrow\)E\(\leftarrow\)F) and neither the middle node nor any of its descendants is in C
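
A quick way to sanity-check these rules is dagitty's dseparated() (a sketch; dagitty is loaded in the DAG code above, and the node names mirror the rules):

chain    <- dagitty("dag{ D -> E -> F }")
fork     <- dagitty("dag{ D <- E -> F }")
collider <- dagitty("dag{ D -> E <- F }")

#chain and fork: conditioning on the middle node E d-separates D and F
dseparated(chain, "D", "F", "E")       #TRUE
dseparated(fork,  "D", "F", "E")       #TRUE

#collider: D and F start out d-separated; conditioning on E connects them
dseparated(collider, "D", "F", list()) #TRUE
dseparated(collider, "D", "F", "E")    #FALSE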

Backdoor paths

Backdoor paths from treatment A to outcome Y are paths from A to Y that begin with an arrow pointing into A

  • These confound the relationship between A and Y

  • Need to be blocked to make causal assumptions about effect of A on Y

Backdoor path criterion

Set of variables L is sufficient to control for confounding if:

  • It blocks all backdoor paths from treatment to the outcome

  • It does not include any descendants of treatment

This is the backdoor path criterion

  • Backdoor paths can be identified as those that start by going against the causal flow (through an arrow into treatment)
#backdoor example: w and z confound the effect of a on y
dagified <- dagify(y ~ a,
                   a  ~ w,
                   y ~ w,
                   w ~ z,
                   a ~ z,
                   exposure = "a",
                   outcome = "y")
dag <- tidy_dagitty(dagified)

ggdag(dag, layout = "circle") +
  theme(panel.background = element_blank(), panel.grid.major = element_blank(), panel.grid.minor = element_blank())

In this example we can block the backdoor paths of information from a to y by controlling for W alone, or for W and Z (verified with dagitty after the bullets below)

  • Can also control for other risk factors for y for the sake of efficiency gains; this does not bias the causal estimate

  • Conditioning on a collider opens a path between the variables leading into the collider (ex 4 in slides)
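
We can check the adjustment sets with dagitty (a sketch; adjustmentSets() uses the exposure and outcome already declared in dagify() above):

#minimal sufficient adjustment sets for the DAG above; should return { w }
adjustmentSets(dagified)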

#example 4: x3 is a collider (x1 -> x3 <- x2) and also a child of a
dagified <- dagify(y ~ a,
                   y  ~ x2,
                   x3 ~ x2,
                   x3 ~ x1,
                   a ~ x1,
                   x3 ~ a,
                   y ~ x3,
                   exposure = "a",
                   outcome = "y")
dag <- tidy_dagitty(dagified)

ggdag(dag, layout = "circle") +
  theme(panel.background = element_blank(), panel.grid.major = element_blank(), panel.grid.minor = element_blank())

Here we need to condition on x3 to block the backdoor path from x1, but conditioning on x3 (a collider) opens another backdoor path, so we must also condition on x2.

Strategies

  • Adjust for all pre-treatment variables

    • controlling for instruments can inflate standard errors
  • Select minimum set using backdoor criterion

  • Use disjunctive cause criterion

    • Requires knowing which variables directly affect the treatment decision or the outcome, but not the full DAG

    • Control for every variable that is a potential cause of the treatment, the outcome, or both

  • Debate on data-driven approach

    • Could discard many weak confounders

    • inferential challenge: getting standard errors right after data-driven selection

Identifiability and estimation

Identification

Identifying assumptions

  • Ignorability

    • If we condition on L, treatment A tells us nothing about the potential outcomes

  • Positivity

    • Every subject has a positive probability of receiving each treatment level, given L

  • Consistency

    • The observed outcome equals the potential outcome under the treatment received: Y = Ya when A = a

Special case: standardization

E(Ya) = \(\int E(Y \mid A = a, L = l)\, dF_L(l)\); for a binary L this reduces to \(\sum_{l = 0}^{1} E(Y \mid A = a, L = l)\, P(L = l)\)

Parametric G-Formula

Could estimate E(Ya) by using models for E(Y|A = a,L) and f(L)

E(Ya) = \(\int E(Y \mid A = a, L; \theta)\, f(L; \eta)\, dL\)

Example

  • Can fit a logistic regression for Y|A,L

  • Get predicted values of Y for each Li with A set to a

  • Average over the empirical distribution of L

  • Do this for a = 0 and a = 1 and take the difference (see the sketch below)
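
A minimal sketch of these steps, assuming a data frame df with binary outcome y, binary treatment a, and covariates l1 and l2 (hypothetical names; the RHC example later does this on real data):

#fit the outcome model for Y|A,L
fit <- glm(y ~ a + l1 + l2, family = binomial(), data = df)

#predict for every subject with treatment set to 1, then to 0
df1 <- transform(df, a = 1)
df0 <- transform(df, a = 0)

#average over the empirical distribution of L and take the difference
mean(predict(fit, df1, type = "response")) -
  mean(predict(fit, df0, type = "response"))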

Weighting

Standardization approach

Averaging over the marginal distribution of L

  • downweights subjects whose L value is overrepresented in their treatment arm and upweights those whose L value is underrepresented

Propensity score

The propensity score is the conditional probability of treatment given the covariates: \(e(L) = P(A = 1 \mid L)\)

IPTW

  • Can weight by the inverse of the probability of treatment received

    • If P(A = 1|L) is high we expect that and we want that case to be weighted down
    • If P(A = 1|L) is low we don’t expect that and we want that case to be weighted up
  • This creates a pseudo-population in which the confounders are balanced between treated and control, removing the confounding

  • Once we have balance, weighted sample means give the estimates we need

  • Basically “erasing an arrow” from confounding, but you have to get the right Ls to weight to
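
In symbols (a standard form, matching the toy calculation in the R examples below): \(\hat{E}(Y^a) = \frac{1}{n}\sum_{i=1}^{n} \frac{I(A_i = a)\,Y_i}{P(A = a \mid L_i)}\), and the ATE estimate is \(\hat{E}(Y^1) - \hat{E}(Y^0)\).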

Stabilized weights

If the probability of a particular treatment given L is small, then the weight will be large; the use of stabilized weights can help.
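
In symbols (a standard form; the numerator is the marginal probability of the treatment actually received): \(SW_i = \frac{P(A = A_i)}{P(A = A_i \mid L_i)}\). This is how sw is computed in the RHC example below.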

What we know about IPTW

  • large weights lead to large SEs

  • weights can be very large when positivity is nearly violated

  • We can use trimming or weight truncation to deal with large weights

  • weight truncation means setting a maximum allowed weight (see the sketch below)

  • augmented IPTW (AIPTW) is doubly robust and more efficient than plain IPTW
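
A minimal sketch of weight truncation, assuming a weight vector wt like the one computed in the examples below (the 99th-percentile cap is an arbitrary choice):

#cap extreme weights at their 99th percentile
wt_trunc <- pmin(wt, quantile(wt, 0.99))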

Matching

Matching is an alternative to standardization and IPTW estimation

  • Idea is to make the data more like what you would expect from a randomized trial

  • Match treated subjects to control subjects on the confounders

  • Advantages of matching

    • controlling for confounders is achieved at the design phase, without looking at the outcome

    • Matching will reveal lack of overlap in the covariate distributions

    • Once data are matched, they are essentially treated as if from an RCT

Distance

  • Matching requires calculating a distance between Li and Lj

    • can do this with exact matching

    • Mahalanobis distance

    • Mahalanobis distance with a caliper (if the distance is too large, we force a non-match)

    • propensity score: compare estimated probabilities of treatment for matching

    • propensity score with a caliper

  • Can combine measures, such as Mahalanobis distance within a propensity score caliper, or KNN matching (see the sketch below)
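
A sketch of these distance calculations, assuming a numeric confounder matrix L with rows i and j and a vector of propensity scores e (hypothetical names; base R's mahalanobis() returns the squared distance):

S <- cov(L)                                #covariance of the confounders
d_ij <- sqrt(mahalanobis(L[i,], L[j,], S)) #Mahalanobis distance between i and j

#propensity score caliper: forbid matches whose scores differ too much
#(0.2 SD of the score is a common rule of thumb)
caliper <- 0.2*sd(e)
d_ij <- ifelse(abs(e[i]-e[j]) > caliper, Inf, d_ij)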

Propensity score matching

  • If there is poor overlap we would have to extrapolate, and we would lose a lot of data if we trimmed

  • For subjects i and j, we can then create a distance matrix from their propensity scores ei and ej

Criticisms

  • The propensity score is based on treatment, so we might weight heavily a covariate strongly related to the probability of treatment even though it is only weakly related to the outcome

Some match types

  • Nearest Neighbor matching

    • Computationally fast but not optimal
  • Optimal matching

    • The size of the population limits the feasibility of fully optimal matching
  • Sparse optimal matching

    • Somewhat stratified

    • Optimal matching within blocks

  • After matching you can analyze outcome data

R examples

Toy example

library(sandwich)
library(knitr)   #for kable()

#simulate confounded data: L affects both treatment A and outcome Y
n<-200
L<-rbinom(n,1,0.5)
A<-rbinom(n,1,(0.3+0.3*L))
Y<-rnorm(n,(35+10*L+20*A),5)
mydata<-data.frame(cbind(A,L,Y))

kable(head(mydata))
A L Y
1 0 65.93717
0 1 44.00917
1 1 67.89592
0 0 43.63981
0 1 43.23738
1 1 70.90363

Let’s look at the diff of means

#unadjusted
#mean of Y for treated subjects
mean(Y[A==1])
## [1] 63.88793
#mean of Y for untreated subjects
mean(Y[A==0])
## [1] 38.31022
#difference
mean(Y[A==1])-mean(Y[A==0])
## [1] 25.57771
We can fit a regression here.
reg<-lm(Y~A+L,data=mydata)
summary(reg)
## 
## Call:
## lm(formula = Y ~ A + L, data = mydata)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -13.6897  -3.1837   0.1299   3.6035  11.2976 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  34.5494     0.5490   62.94   <2e-16 ***
## A            20.4373     0.7932   25.77   <2e-16 ***
## L            11.4975     0.7932   14.50   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 5.004 on 197 degrees of freedom
## Multiple R-squared:  0.8846, Adjusted R-squared:  0.8834 
## F-statistic: 754.9 on 2 and 197 DF,  p-value: < 2.2e-16
Let’s compute the standardized estimate.
#can do by hand since one binary covariate
pr.l <- prop.table(table(mydata$L))
kable(pr.l)
Var1 Freq
0 0.465
1 0.535
tab.out <- aggregate(Y ~ A + L, mydata, mean)
kable(tab.out)
A L Y
0 0 34.47951
1 0 55.22618
0 1 46.19054
1 1 66.41428
Conditional means

We can compute conditional means and weight them by the distribution of the covariate. This corrects for the confounding bias in the unadjusted comparison.

((mean(mydata$Y[mydata$A==1 & mydata$L==1]) - 
    mean(mydata$Y[mydata$A==0 & mydata$L==1]))*pr.l[2]) + 
  (mean(mydata$Y[mydata$A==1 & mydata$L==0]) - 
     mean(mydata$Y[mydata$A==0 & mydata$L==0]))*pr.l[1]
##       1 
## 20.4669
We can also calculate propensity scores:
p.score <- glm(A ~ as.factor(L), data=mydata, family=binomial)
p.a <- ifelse(mydata$A == 0, 1 - predict(p.score, type = "response"),
                  predict(p.score, type = "response"))
kable(table(p.a))
p.a Freq
0.225806451613194 21
0.327102803738384 35
0.672897196261616 72
0.774193548386806 72

We get four values, one for each combination of treatment received (A = 0, 1) and covariate level (L = 0, 1).

Next we can generate IP weights
mydata$w <- 1/p.a
kable(table(mydata$w))
Var1 Freq
1.29166666666715 72
1.48611111111126 72
3.05714285714224 35
4.42857142856573 21
summary(mydata$w)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   1.292   1.292   1.486   2.000   3.057   4.429
sd(mydata$w)
## [1] 1.046915
Let’s estimate ATE
mean(mydata$w*as.numeric(mydata$A==1)*mydata$Y) - 
  mean(mydata$w*as.numeric(mydata$A==0)*mydata$Y)
## [1] 20.4669
Creating a marginal structural model
msm <- lm(Y  ~ A, data = mydata, weights = w)
Now with inference

Weights create a pseudo population:

SE <-sqrt(diag(vcovHC(msm, type="HC0"))) # robust standard errors
beta <- coef(msm)
lcl <- beta-1.96*SE 
ucl <- beta+1.96*SE
cbind(beta, lcl, ucl)[2,]
##     beta      lcl      ucl 
## 20.46690 18.10841 22.82540

Real data

Let’s load our data:
library(tableone)
library(Matching)
library(geepack)
library(stats)


load(url("https://urldefense.proofpoint.com/v2/url?u=http-3A__biostat.mc.vanderbilt.edu_wiki_pub_Main_DataSets_rhc.sav&d=DwIGAg&c=lEzKI_JJakPtcnbAQ6Q5xQ&r=ADBnQF_uxktNSAkWyomgLeFb-eOkP5LRzvDAXj-4ESQ&m=GJ1JdJTpKzJvLWlb0bGX87lB6Gx5rPtIV9m0KvRo6AI&s=639__Kuc3BOyiS-CFnTwBH3AVQyrBK3GMJStuyusjrk&e="))

kable(head(rhc))

(Wide table omitted: rhc has 60+ columns per patient, including cat1/cat2 (diagnosis categories), death, swang1 (the RHC treatment indicator), age, sex, meanbp1, and ptid.)
Then we create binary indicators out of cat1 and the other categorical variables, and build a new dataset:
ARF<-as.numeric(rhc$cat1=='ARF')
CHF<-as.numeric(rhc$cat1=='CHF')
Cirr<-as.numeric(rhc$cat1=='Cirrhosis')
colcan<-as.numeric(rhc$cat1=='Colon Cancer')
Coma<-as.numeric(rhc$cat1=='Coma')
COPD<-as.numeric(rhc$cat1=='COPD')
lungcan<-as.numeric(rhc$cat1=='Lung Cancer')
MOSF<-as.numeric(rhc$cat1=='MOSF w/Malignancy')
sepsis<-as.numeric(rhc$cat1=='MOSF w/Sepsis')
female<-as.numeric(rhc$sex=='Female')
died<-as.numeric(rhc$death=='Yes')
age<-rhc$age
treatment<-as.numeric(rhc$swang1=='RHC')
meanbp1<-rhc$meanbp1


mydata <-data.frame(cbind(ARF,CHF,Cirr,colcan,Coma,lungcan,MOSF,sepsis,
              age,female,meanbp1,treatment,died))

kable(head(mydata))
ARF CHF Cirr colcan Coma lungcan MOSF sepsis age female meanbp1 treatment died
0 0 0 0 0 0 0 0 70.25098 0 41 0 0
0 0 0 0 0 0 0 1 78.17896 1 63 1 1
0 0 0 0 0 0 1 0 46.09198 1 57 1 0
1 0 0 0 0 0 0 0 75.33197 1 55 0 1
0 0 0 0 0 0 0 1 67.90997 0 65 1 1
0 0 0 0 0 0 0 0 86.07794 1 115 0 0
Make a vector of covariate names and print a summary table stratified by treatment, with standardized mean differences:
xvars<-c("ARF","CHF","Cirr","colcan","Coma","lungcan","MOSF","sepsis",
         "age","female","meanbp1")
table1<- CreateTableOne(vars=xvars,strata="treatment", data=mydata, test=FALSE)

print(table1,smd=TRUE)
##                      Stratified by treatment
##                       0             1             SMD   
##   n                    3551          2184               
##   ARF (mean (SD))      0.45 (0.50)   0.42 (0.49)   0.059
##   CHF (mean (SD))      0.07 (0.25)   0.10 (0.29)   0.095
##   Cirr (mean (SD))     0.05 (0.22)   0.02 (0.15)   0.145
##   colcan (mean (SD))   0.00 (0.04)   0.00 (0.02)   0.038
##   Coma (mean (SD))     0.10 (0.29)   0.04 (0.20)   0.207
##   lungcan (mean (SD))  0.01 (0.10)   0.00 (0.05)   0.095
##   MOSF (mean (SD))     0.07 (0.25)   0.07 (0.26)   0.018
##   sepsis (mean (SD))   0.15 (0.36)   0.32 (0.47)   0.415
##   age (mean (SD))     61.76 (17.29) 60.75 (15.63)  0.061
##   female (mean (SD))   0.46 (0.50)   0.41 (0.49)   0.093
##   meanbp1 (mean (SD)) 84.87 (38.87) 68.20 (34.24)  0.455
Fit an outcome model (logistic regression of death on treatment and the covariates):
outmodel<-glm(died~treatment+ARF+CHF+Cirr+colcan+Coma+lungcan+MOSF+
               sepsis+age+female+meanbp1,
             family=binomial(),data=mydata)
We can get predicted values for each person as if treated and as if untreated:
#now get predicted values for each person if they had been treated
mydata1<-mydata
mydata1$treatment<-1
pred1<-predict(outmodel,mydata1,type='response')

#get predicted values for each person if they had not been treated
mydata0<-mydata
mydata0$treatment<-0
pred0<-predict(outmodel,mydata0,type='response')

mean(pred1)
## [1] 0.6797657
mean(pred0)
## [1] 0.6298412
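The standardized (parametric g-formula) estimate of the causal risk difference is then the difference of these two means, about 0.05:

#risk difference: standardized mean under treatment minus under no treatment
mean(pred1)-mean(pred0)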
Propensity Scores
psmodel<-glm(treatment~ARF+CHF+Cirr+colcan+Coma+lungcan+MOSF+
               sepsis+age+female+meanbp1,
             family=binomial(),data=mydata)

#show coefficients etc
summary(psmodel)
## 
## Call:
## glm(formula = treatment ~ ARF + CHF + Cirr + colcan + Coma + 
##     lungcan + MOSF + sepsis + age + female + meanbp1, family = binomial(), 
##     data = mydata)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -1.7013  -1.0097  -0.6336   1.1814   2.4791  
## 
## Coefficients:
##               Estimate Std. Error z value Pr(>|z|)    
## (Intercept) -0.7299670  0.1997692  -3.654 0.000258 ***
## ARF          1.2931956  0.1487784   8.692  < 2e-16 ***
## CHF          1.6804704  0.1715672   9.795  < 2e-16 ***
## Cirr         0.5234506  0.2181458   2.400 0.016416 *  
## colcan       0.0295468  1.0985361   0.027 0.978542    
## Coma         0.7013451  0.1854937   3.781 0.000156 ***
## lungcan     -0.0869570  0.5039331  -0.173 0.863000    
## MOSF         1.3046587  0.1772705   7.360 1.84e-13 ***
## sepsis       2.0433604  0.1545437  13.222  < 2e-16 ***
## age         -0.0031374  0.0017289  -1.815 0.069567 .  
## female      -0.1697903  0.0583574  -2.909 0.003620 ** 
## meanbp1     -0.0109824  0.0008217 -13.366  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 7621.4  on 5734  degrees of freedom
## Residual deviance: 6983.6  on 5723  degrees of freedom
## AIC: 7007.6
## 
## Number of Fisher Scoring iterations: 4
#create propensity score
pscore<-psmodel$fitted.values
Create weights from pscore

We can output quick histograms to make sure there is enough overlap.

#unstabilized weights
wt<-ifelse(treatment==1,1/pscore,1/(1-pscore))
hist(wt)

summary(wt)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   1.046   1.405   1.721   2.001   2.280  21.606
#stabilized weights
#probability of treatment (unconditional)
probt<-mean(treatment)
sw<-ifelse(treatment==1,probt/pscore,(1-probt)/(1-pscore))
hist(sw)

summary(sw)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.4917  0.7579  0.8990  1.0004  1.0997  8.2279
Do a weighted regression
weighted.mean(died[treatment==1],wt[treatment==1])
## [1] 0.6820133
weighted.mean(died[treatment==0],wt[treatment==0])
## [1] 0.6304638
#now do weighted regression (get same estimates)
id<-1:length(died)
msm1 <- geeglm(as.numeric(died) ~ treatment, 
               id = id,
               weights = wt,
               family = binomial(link=logit))
summary(msm1)
## 
## Call:
## geeglm(formula = as.numeric(died) ~ treatment, family = binomial(link = logit), 
##     weights = wt, id = id)
## 
##  Coefficients:
##             Estimate Std.err   Wald Pr(>|W|)    
## (Intercept)  0.53421 0.03602 219.95  < 2e-16 ***
## treatment    0.22883 0.06486  12.45 0.000419 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Estimated Scale Parameters:
##             Estimate Std.err
## (Intercept)        1 0.01235
## 
## Correlation: Structure = independenceNumber of clusters:   5735   Maximum cluster size: 1
head(msm1$fitted.values)
##     [,1]
## 1 0.6305
## 2 0.6820
## 3 0.6820
## 4 0.6305
## 5 0.6820
## 6 0.6305
#using stabilized weights
msm2 <- geeglm(as.numeric(died) ~ treatment, 
               id = id,
               weights = sw,
               data=mydata,
               family = binomial(link=logit))
summary(msm2)
## 
## Call:
## geeglm(formula = as.numeric(died) ~ treatment, family = binomial(link = logit), 
##     data = mydata, weights = sw, id = id)
## 
##  Coefficients:
##             Estimate Std.err  Wald Pr(>|W|)    
## (Intercept)   0.5342  0.0360 219.9  < 2e-16 ***
## treatment     0.2288  0.0649  12.4  0.00042 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Estimated Scale Parameters:
##             Estimate Std.err
## (Intercept)        1  0.0106
## 
## Correlation: Structure = independenceNumber of clusters:   5735   Maximum cluster size: 1
head(msm2$fitted.values)
##    [,1]
## 1 0.630
## 2 0.682
## 3 0.682
## 4 0.630
## 5 0.682
## 6 0.630

Matching example

greedymatch<-Match(Tr=treatment,M=1,X=mydata[xvars],replace=FALSE)
matched<-mydata[unlist(greedymatch[c("index.treated","index.control")]), ]

#get table 1 for matched data with standardized differences
matchedtab1<-CreateTableOne(vars=xvars, strata ="treatment", 
                            data=matched, test = FALSE)
print(matchedtab1, smd = TRUE)
##                      Stratified by treatment
##                       0             1             SMD   
##   n                    2184          2184               
##   ARF (mean (SD))      0.42 (0.49)   0.42 (0.49)   0.006
##   CHF (mean (SD))      0.10 (0.29)   0.10 (0.29)  <0.001
##   Cirr (mean (SD))     0.02 (0.15)   0.02 (0.15)  <0.001
##   colcan (mean (SD))   0.00 (0.02)   0.00 (0.02)  <0.001
##   Coma (mean (SD))     0.04 (0.20)   0.04 (0.20)  <0.001
##   lungcan (mean (SD))  0.00 (0.05)   0.00 (0.05)  <0.001
##   MOSF (mean (SD))     0.07 (0.26)   0.07 (0.26)  <0.001
##   sepsis (mean (SD))   0.24 (0.43)   0.32 (0.47)   0.177
##   age (mean (SD))     61.53 (16.15) 60.75 (15.63)  0.049
##   female (mean (SD))   0.44 (0.50)   0.41 (0.49)   0.042
##   meanbp1 (mean (SD)) 73.12 (34.28) 68.20 (34.24)  0.144
#outcome analysis
y_trt<-matched$died[matched$treatment==1]
y_con<-matched$died[matched$treatment==0]
mean(y_trt)
## [1] 0.68
mean(y_con)
## [1] 0.626
#pairwise difference
diffy<-y_trt-y_con

#paired t-test
t.test(diffy)
## 
##  One Sample t-test
## 
## data:  diffy
## t = 4, df = 2000, p-value = 9e-05
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
##  0.0271 0.0810
## sample estimates:
## mean of x 
##     0.054
#McNemar test
kable(table(y_trt,y_con))
y_trt \ y_con 0 1
0 303 395
1 513 973
mcnemar.test(matrix(c(973,513,395,303),2,2))
## 
##  McNemar's Chi-squared test with continuity correction
## 
## data:  matrix(c(973, 513, 395, 303), 2, 2)
## McNemar's chi-squared = 20, df = 1, p-value = 1e-04

Causal Mediation

This is the case where an intervening variable (a mediator) lies on the causal pathway and affects the outcome, but is not a confounder.

Direct and indirect effects

dag3 <- dagify(y ~ x,
       y ~ x + m, 
       m ~ x)
dag_med <- tidy_dagitty(dag3)

ggdag(dag_med, layout = circle) +
  theme(panel.background = element_blank(), panel.grid.major = element_blank(), panel.grid.minor = element_blank())

Traditional approach is to fit two regression models (Baron and Kenny)

Controlled direct effects

  • Set the treatment to a and fix the mediator at a chosen value m for everyone, giving potential outcomes \(Y^{a,m}\); e.g. CDE(m) = \(E(Y^{1,m} - Y^{0,m})\)

Natural direct effects

  • Set the treatment to a versus 0 and compare the potential outcomes with the mediator held at the value it would naturally take under treatment 0: NDE = \(E(Y^{1,M^0} - Y^{0,M^0})\)

Decomposition
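
The total effect splits into natural direct and indirect effects (a standard identity in the notation above, since \(Y^1 = Y^{1,M^1}\) and \(Y^0 = Y^{0,M^0}\)):

\(E(Y^1 - Y^0) = \underbrace{E(Y^{1,M^0} - Y^{0,M^0})}_{\text{NDE}} + \underbrace{E(Y^{1,M^1} - Y^{1,M^0})}_{\text{NIE}}\)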

Cross-world counterfactual

  • Usually unobservable

  • Requires integrating over a different distribution

    • Average the outcome over the mediator distribution of the untreated