Causal inference introduction

Introduction

Definition

The overall (average) causal effect compares two scenarios for the same population: everyone is in the treatment group versus everyone is in the non-treatment group. The mean of the causal effect is E(Ya1-Ya0), which equals E(Ya1)-E(Ya0). The target trial is one of the core concepts of the causal inference framework.
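As a minimal sketch of this definition, the snippet below computes the average causal effect from a small table of counterfactual outcomes. The six individuals and their values are hypothetical; with real data only one of the two counterfactual outcomes is observed for each person, so this calculation is not available directly.

```python
from fractions import Fraction

# Hypothetical counterfactual outcomes (Ya0, Ya1) for six individuals.
# In practice only one of the two is ever observed for each person.
counterfactuals = [(0, 1), (1, 1), (0, 0), (1, 0), (0, 1), (0, 0)]

n = len(counterfactuals)
mean_y_a0 = Fraction(sum(y0 for y0, _ in counterfactuals), n)
mean_y_a1 = Fraction(sum(y1 for _, y1 in counterfactuals), n)

# Average causal effect: E(Ya1) - E(Ya0), which equals E(Ya1 - Ya0).
average_causal_effect = mean_y_a1 - mean_y_a0
mean_individual_effect = Fraction(sum(y1 - y0 for y0, y1 in counterfactuals), n)

print(average_causal_effect, mean_individual_effect)
```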

Randomized experiments

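In an ideal randomized experiment the groups are interchangeable: the risk p(Y=1|A=0) that group A would show if left untreated equals the risk p(Y=1|A=0) that group B would show if left untreated, so it does not matter which group receives the treatment. For an ideal randomized experiment, therefore, association is causation. A conditionally randomized experiment can be seen as a combination of marginally randomized experiments, one within each level of L (two when L is dichotomous). That is, p(Ya=1|A=1,L=1) = p(Ya=1|A=0,L=1), where Ya is the counterfactual outcome under treatment level a: once L is fixed, the treatment group and the control group are interchangeable.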

For conditionally randomized experiments, there are usually two families of methods for calculating the causal effect: stratification/matching, and standardization/inverse probability (IP) weighting.

The first (stratification) calculates the causal effect within each subset defined by L, because each subset is a marginally randomized experiment.
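A minimal sketch of a stratified analysis, assuming a dichotomous L and made-up counts (they are not data from any study): the risk difference is computed separately within each level of L.

```python
from fractions import Fraction

# Hypothetical counts: (L, A) -> (number of subjects, number with Y = 1).
counts = {
    (0, 0): (4, 1), (0, 1): (4, 1),
    (1, 0): (3, 2), (1, 1): (9, 6),
}

def risk(l, a):
    """Observed risk p(Y=1 | L=l, A=a)."""
    n, events = counts[(l, a)]
    return Fraction(events, n)

# Within each level of L the comparison is a marginally randomized experiment,
# so the stratum-specific association is read as the stratum-specific effect.
risk_difference = {l: risk(l, 1) - risk(l, 0) for l in (0, 1)}
print(risk_difference)
```

With these made-up counts both stratum-specific risk differences are zero; other counts would give one conditional effect per level of L rather than a single population-wide number.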

The idea of standardization (a weighted average) is very simple. Assume p(Y=1|L=0,A=1)=1/4 and p(Y=1|L=1,A=1)=2/3, that 40% of people have L=0, and that 60% have L=1. The weighted result is then:

  # p(Ya1=1): the rate of Y if everyone were treated
  1/4 * 0.40 + 2/3 * 0.60 == 0.5

## [1] TRUE

The second family is standardization and inverse probability weighting, which are mathematically equivalent. Inverse probability weighting uses the probability of receiving treatment A given the covariate L; standardization uses the probability of the outcome Y given A and L, averaged over the strata of L with each stratum weighted by its share of the population.
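The equivalence can be checked numerically. The sketch below uses hypothetical counts chosen to reproduce the figures above (risks of 1/4 and 2/3 among the treated, 40% with L=0); the counts for the untreated and the function names are assumptions made for illustration.

```python
from fractions import Fraction

# Hypothetical counts: (L, A) -> (number of subjects, number with Y = 1).
# Chosen so that p(Y=1|L=0,A=1) = 1/4, p(Y=1|L=1,A=1) = 2/3 and p(L=0) = 0.4.
counts = {
    (0, 0): (4, 1), (0, 1): (4, 1),
    (1, 0): (3, 2), (1, 1): (9, 6),
}
total = sum(n for n, _ in counts.values())

def n_in_stratum(l):
    return counts[(l, 0)][0] + counts[(l, 1)][0]

def standardized_risk(a):
    """Sum over l of p(Y=1 | A=a, L=l) * p(L=l)."""
    return sum(
        Fraction(counts[(l, a)][1], counts[(l, a)][0])
        * Fraction(n_in_stratum(l), total)
        for l in (0, 1)
    )

def ip_weighted_risk(a):
    """Weight each subject with A=a by 1 / p(A=a | L), then divide by N."""
    weighted_events = sum(
        counts[(l, a)][1] / Fraction(counts[(l, a)][0], n_in_stratum(l))
        for l in (0, 1)
    )
    return weighted_events / total

print(standardized_risk(1), ip_weighted_risk(1))
```

Exact fractions are used instead of floats so that the two estimates can be compared with plain equality.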

The goal of matching is to construct a subgroup of the population in which the distribution of the variable L is the same in the treated and non-treated groups.
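A minimal sketch of one-to-one matching on a dichotomous L, assuming hypothetical subjects and a hypothetical helper `match_one_to_one`: each untreated subject is paired with one unused treated subject that has the same value of L.

```python
import random

# Hypothetical subjects as (id, L, A); the counts are made up for illustration.
subjects = (
    [(i, 0, 0) for i in range(0, 4)] + [(i, 0, 1) for i in range(4, 8)]
    + [(i, 1, 0) for i in range(8, 11)] + [(i, 1, 1) for i in range(11, 20)]
)

def match_one_to_one(subjects, seed=0):
    """Pair each untreated subject with one unused treated subject with the same L."""
    rng = random.Random(seed)
    treated_pool = {}
    for sid, l, a in subjects:
        if a == 1:
            treated_pool.setdefault(l, []).append((sid, l, a))
    matched = []
    for sid, l, a in subjects:
        if a == 0 and treated_pool.get(l):
            pool = treated_pool[l]
            partner = pool.pop(rng.randrange(len(pool)))
            matched.extend([(sid, l, a), partner])
    return matched

def share_l1(rows, a):
    """Proportion with L = 1 among subjects whose treatment is a."""
    group = [l for _, l, arm in rows if arm == a]
    return sum(group) / len(group)

matched = match_one_to_one(subjects)
print(share_l1(subjects, 0), share_l1(subjects, 1))  # before matching
print(share_l1(matched, 0), share_l1(matched, 1))    # after matching
```

In the matched subset the distribution of L is the same in both groups, which is the stated goal; the price is that unmatched treated subjects are discarded.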

Observational studies

If an observational study can be viewed as a conditionally randomized trial, then we can use the methods introduced in the previous sections. For each causal effect that we wish to calculate in an observational study, we need to describe: (1) what the target trial is; and (2) how to emulate the target trial with observational data.

Here, we refer to the following three conditions as identifiability conditions or identifiability assumptions.

`Interchangeability`

- Within each level of the measured variables L, the treated and the untreated are interchangeable, as they would be in a conditionally randomized experiment.

`Positivity`

- Each treatment group is guaranteed to contain some subjects: within every level of L, the probability of receiving each treatment level is greater than zero.

`Consistency and Counterfactual outcomes`

- We need to specify the precise version a of treatment A. If we want to study the effects of exercise, we need to carefully define exercise duration, intensity, form, and so on. Continuously refining our question until it is no longer ambiguous under existing knowledge is one of the basic elements of causal inference.
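Of the three conditions, positivity is the one that can be inspected directly in the data. A minimal sketch, assuming the data are available as (L, A) pairs and using a hypothetical helper `positivity_violations`:

```python
from collections import Counter

def positivity_violations(rows, treatment_levels=(0, 1)):
    """Return the (L, A) combinations that never occur in the data.

    rows is an iterable of (L, A) pairs. An empty result means every
    treatment level is observed within every observed level of L.
    """
    rows = list(rows)
    cell_counts = Counter(rows)
    strata = {l for l, _ in rows}
    return sorted(
        (l, a) for l in strata for a in treatment_levels
        if cell_counts[(l, a)] == 0
    )

# Hypothetical data: nobody with L = 2 is untreated.
rows = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 1), (2, 1)]
print(positivity_violations(rows))
```

An empty cell in a finite sample does not prove that the probability is zero in the population, so a result like this is a prompt to look closer rather than a verdict.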

When these conditions cannot be assumed to hold, the observed data can still be used to make non-causal predictions. An observed association, such as the one between obesity and mortality, can also give rise to hypotheses and studies that explore the underlying mechanisms.

Effect modification/ interaction

Effect modification

If the mean value of the causal effect of A on Y is not the same under different values of V, then we say that V is a modifier of the causal effect of A on Y. Before we describe the interaction of two variables, we first identify effect modifiers.

The size of the causal effect of treatment A depends on the distribution of modifiers in the population.
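To see why, note that the population effect is the average of the stratum-specific effects weighted by the share of each stratum. The sketch below assumes a dichotomous modifier V and made-up stratum-specific risk differences.

```python
from fractions import Fraction

# Hypothetical stratum-specific average causal effects (risk differences) of A on Y.
effect_given_v = {0: Fraction(-1, 10), 1: Fraction(1, 5)}

def population_effect(share_v1):
    """Average causal effect in a population where a fraction share_v1 has V = 1."""
    return effect_given_v[1] * share_v1 + effect_given_v[0] * (1 - share_v1)

effect_half = population_effect(Fraction(1, 2))  # 50% of the population has V = 1
effect_most = population_effect(Fraction(4, 5))  # 80% of the population has V = 1
print(effect_half, effect_most)
```

The same treatment therefore has a different average effect in two populations that differ only in how common V=1 is.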

Methods for analyzing effect modification fall into two categories: 1) standardization and inverse probability weighting can be used to calculate both marginal and conditional effects; 2) stratified analysis and matching calculate only conditional effects in subsets of the population.

Interaction

Interaction is defined in terms of the effects of 2 interventions whereas effect modification is defined in terms of the effect of one intervention varying across strata of a second variable. Effect modification can be present with no interaction; interaction can be present with no effect modification.

If treatment E is assigned randomly, the two concepts of interaction and effect modification coincide.
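One common way to make the definition of interaction concrete is on the additive scale: interaction is present when the effect of A under a joint intervention that sets E=1 differs from the effect of A when E is set to 0. The counterfactual risks below are hypothetical.

```python
from fractions import Fraction

# Hypothetical counterfactual risks p(Y = 1) under joint interventions (a, e).
risk = {
    (0, 0): Fraction(1, 10), (1, 0): Fraction(3, 10),
    (0, 1): Fraction(2, 10), (1, 1): Fraction(7, 10),
}

effect_of_a_when_e1 = risk[(1, 1)] - risk[(0, 1)]
effect_of_a_when_e0 = risk[(1, 0)] - risk[(0, 0)]

# Additive interaction: the effect of A depends on the level E is set to.
interaction_contrast = effect_of_a_when_e1 - effect_of_a_when_e0
has_additive_interaction = interaction_contrast != 0
print(interaction_contrast, has_additive_interaction)
```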

When two different dichotomous treatments A and E are involved, the combined treatment has 4 possible values, so each subject has 4 counterfactual outcomes. With a dichotomous outcome, this gives a total of 16 (2^4) different response types.
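The count can be verified by enumeration. This sketch assumes a dichotomous outcome and lists every possible assignment of an outcome to the four joint treatment values.

```python
from itertools import product

# The four joint treatment values (a, e) for dichotomous A and E.
joint_treatments = list(product((0, 1), repeat=2))

# A response type assigns a dichotomous counterfactual outcome to each joint
# treatment value, so there are 2 ** 4 possible types.
response_types = [
    dict(zip(joint_treatments, outcomes))
    for outcomes in product((0, 1), repeat=len(joint_treatments))
]
print(len(joint_treatments), len(response_types))
```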

Graphical representation of causal effects

If one variable causes the other, or if two variables share a common cause, the two variables are generally associated; otherwise they are (marginally) independent of each other.

In a causal directed acyclic graph, once the direct causes of a variable A are controlled for, A is independent of every other variable in the graph that it does not cause (conditional independence).

Mediator A → B → Y: a box around the variable B indicates that B is conditioned on, which blocks the path A → B → Y.
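Blocking can be illustrated with exact probabilities. The sketch below builds the joint distribution implied by a chain A → B → Y, using made-up conditional probabilities, and compares the association between A and Y before and after conditioning on B.

```python
from fractions import Fraction
from itertools import product

half = Fraction(1, 2)

def p_b_given_a(b, a):
    # Hypothetical: B = 1 with probability 3/4 if A = 1, else 1/4.
    p1 = Fraction(3, 4) if a == 1 else Fraction(1, 4)
    return p1 if b == 1 else 1 - p1

def p_y_given_b(y, b):
    # Hypothetical: Y = 1 with probability 4/5 if B = 1, else 1/5.
    p1 = Fraction(4, 5) if b == 1 else Fraction(1, 5)
    return p1 if y == 1 else 1 - p1

# Joint distribution implied by the chain A -> B -> Y with p(A=1) = 1/2.
joint = {
    (a, b, y): half * p_b_given_a(b, a) * p_y_given_b(y, b)
    for a, b, y in product((0, 1), repeat=3)
}

def risk(a, b=None):
    """p(Y=1 | A=a), or p(Y=1 | A=a, B=b) when b is given."""
    keep = [k for k in joint if k[0] == a and (b is None or k[1] == b)]
    return sum(joint[k] for k in keep if k[2] == 1) / sum(joint[k] for k in keep)

marginal_difference = risk(1) - risk(0)
conditional_differences = [risk(1, b) - risk(0, b) for b in (0, 1)]
print(marginal_difference, conditional_differences)
```

A and Y are associated marginally, and the association vanishes within each level of B.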

Based on existing subject-matter expertise, we can draw the causal structure among the variables and then identify possible sources of association between interventions and outcomes. Causal diagrams, however, do not make it easy to represent how an effect is modified (effect modification, interaction effect, and mediator…). An augmented cause-effect diagram may be a possible approach.

Bias

Confounding bias: when the intervention and the outcome share a common cause, the measure of association will generally differ from the measure of effect. Many epidemiologists use the term “confounding” to refer to this situation (fork path).
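A minimal sketch of confounding with exact probabilities, assuming a fork A ← L → Y with made-up parameters and no arrow from A to Y, so that the true causal effect of A is null:

```python
from fractions import Fraction
from itertools import product

def bern(p1, value):
    """Probability of `value` for a binary variable with p(value = 1) = p1."""
    return p1 if value == 1 else 1 - p1

# Hypothetical fork A <- L -> Y. There is no arrow from A to Y.
p_l1 = Fraction(1, 2)
p_a1_given_l = {0: Fraction(1, 5), 1: Fraction(4, 5)}
p_y1_given_l = {0: Fraction(1, 10), 1: Fraction(1, 2)}

joint = {
    (l, a, y): bern(p_l1, l) * bern(p_a1_given_l[l], a) * bern(p_y1_given_l[l], y)
    for l, a, y in product((0, 1), repeat=3)
}

def risk(a, l=None):
    """p(Y=1 | A=a), or p(Y=1 | A=a, L=l) when l is given."""
    keep = [k for k in joint if k[1] == a and (l is None or k[0] == l)]
    return sum(joint[k] for k in keep if k[2] == 1) / sum(joint[k] for k in keep)

associational_rd = risk(1) - risk(0)                     # crude association
stratum_rds = [risk(1, l) - risk(0, l) for l in (0, 1)]  # within levels of L
print(associational_rd, stratum_rds)
```

The crude risk difference is far from zero even though A has no effect on Y; within levels of the common cause L it disappears.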

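Selection bias: when the analysis conditions on a common effect of the intervention and the outcome (or of their causes), an association arises that is not a causal effect. Bias derived from this structure is what most epidemiologists call “selection bias” (collider path).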
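A minimal sketch of this structure with exact probabilities, assuming a collider A → C ← Y in which A and Y are independent and the selection mechanism for C is made up:

```python
from fractions import Fraction
from itertools import product

half = Fraction(1, 2)

def p_c1(a, y):
    # Hypothetical selection mechanism: C = 1 is more likely when A = 1 or Y = 1.
    return Fraction(9, 10) if (a == 1 or y == 1) else Fraction(1, 10)

# Collider A -> C <- Y: A and Y are independent coin flips, C is their common effect.
joint = {}
for a, y, c in product((0, 1), repeat=3):
    p_c = p_c1(a, y) if c == 1 else 1 - p_c1(a, y)
    joint[(a, y, c)] = half * half * p_c

def risk(a, c=None):
    """p(Y=1 | A=a), or p(Y=1 | A=a, C=c) when c is given."""
    keep = [k for k in joint if k[0] == a and (c is None or k[2] == c)]
    return sum(joint[k] for k in keep if k[1] == 1) / sum(joint[k] for k in keep)

rd_everyone = risk(1) - risk(0)            # whole population
rd_selected = risk(1, c=1) - risk(0, c=1)  # restricted to C = 1
print(rd_everyone, rd_selected)
```

In the whole population there is no association between A and Y; restricting the analysis to C=1 creates one.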

Measurement bias

Confounding

Effect modification and interaction

(a) Effect modification by G of the effect of medication E on D. (b) DAG representing the independent effect of exposure E on outcome D and the modification of this effect by G.

Interaction

(a) interaction. (b) DAG representing the independent effects of exposures E and G and their combined effect E × G (interaction effect) on D. (c) Potential confounding affecting an interaction effect.

Pure interaction

(a) ‘Pure’ interaction in which neither G nor E has a direct effect on D and they only have an effect when present together. (b) ‘Pure’ interaction in which the causal effect of exposures E and G on outcome D is only present for specific joint values of E and G. Otherwise, it is effect modification.

Selection bias

Mediator

Figure: Variable roles: A = exposure or treatment; Y = outcome; L = confounder; R = risk factor for Y; M = mediator; C = collider; E = effect of Y; I = instrument; U = unmeasured confounder; P = proxy of U; N = noise variable.

Think about the role of each variable in the model first:

- Ideally, include confounders to reduce bias.

- Consider including risk factors for the outcome for greater precision.

- Avoid instruments (IV), colliders, mediators, effects of the outcome, and noise variables.

- If something is unmeasured, consider adding a proxy (with caution).

- If you do not have subject-area expertise, talk to experts.

- Pre-process the data: check for sparse binary variables and highly collinear variables.
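These rules of thumb can be written down as a small triage step before modeling. The sketch below is only an illustration: the variable names, the role labels, and the helper `triage` are hypothetical, and assigning a role to each variable still requires subject-area knowledge.

```python
# Role labels follow the figure legend; the variable names are hypothetical.
INCLUDE = {"confounder"}
CONSIDER = {"risk factor", "proxy"}
AVOID = {"instrument", "collider", "mediator", "effect of outcome", "noise"}

def triage(roles):
    """Split candidate covariates into include / consider / avoid lists by role."""
    plan = {"include": [], "consider": [], "avoid": []}
    for name, role in roles.items():
        if role in INCLUDE:
            plan["include"].append(name)
        elif role in CONSIDER:
            plan["consider"].append(name)
        elif role in AVOID:
            plan["avoid"].append(name)
        else:
            raise ValueError(f"unknown role for {name!r}: {role!r}")
    return plan

roles = {
    "age": "confounder",
    "smoking": "risk factor",
    "biomarker": "mediator",
    "distance_to_clinic": "instrument",
    "hospitalized": "collider",
}
plan = triage(roles)
print(plan)
```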



2023-05-16
