Compositional form of joint densities: \(f(x_1, x_2, \dots, x_p) = f(x_1)f(x_2|x_1)f(x_3|x_1, x_2) \cdots f(x_p|x_1, \dots, x_{p-1})\)
Conditional Independence: if \(f(x_1|x_2, x_3) = f(x_1|x_2)\), then \(X_1\) is conditionally independent of \(X_3\) given \(X_2\)
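As a small worked case of how conditional independence simplifies the compositional form, take \(p = 3\) and suppose \(X_3\) is independent of \(X_1\) given \(X_2\). This chain structure is an illustrative assumption, not something stated above.

\[f(x_1, x_2, x_3) = f(x_1)f(x_2|x_1)f(x_3|x_1, x_2) = f(x_1)f(x_2|x_1)f(x_3|x_2)\]

The last factor no longer depends on \(x_1\), so each term conditions on at most one variable.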
A graph \(G = (V, E)\) consists of two sets \(V\) and \(E\). The elements of \(V\) are called the vertices and the elements of \(E\) the edges of \(G\). Each edge is a pair of vertices. For instance, the sets \(V = \{1, 2, 3, 4, 5\}\) and \(E = \{\{1, 2\}, \{2, 3\}, \{3, 4\}, \{4, 5\}\}\) define a graph with 5 vertices and 4 edges.
Adjacency: If \(uv \in E(G)\), then \(u\) and \(v\) are said to be adjacent, in which case we also say that \(u\) is connected to \(v\) or that \(u\) is a neighbour of \(v\). If \(uv \notin E(G)\), then \(u\) and \(v\) are nonadjacent (not connected, non-neighbours).
Neighbourhood: The neighbourhood of a vertex \(v \in V(G)\), denoted \(N(v)\), is the set of vertices adjacent to \(v\), i.e. \(N(v) = \{u \in V(G) \mid vu \in E(G)\}\). The closed neighbourhood of \(v\) is denoted and defined as follows: \(N[v] = N(v) \cup \{v\}\).
Diameter: The diameter of a connected graph, denoted \(\mathrm{diam}(G)\), is \(\max_{a, b \in V(G)} \mathrm{dist}(a, b)\), where \(\mathrm{dist}(a, b)\) is the length of a shortest path between \(a\) and \(b\).
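The definitions above can be checked on the five-vertex example graph. The following is a minimal base R sketch. The helper names `neighbourhood` and `closed_neighbourhood` and the use of Floyd-Warshall for distances are choices made here for illustration, not part of the notes.

```r
# Example graph from the text: a path on 5 vertices
V <- 1:5
E <- rbind(c(1, 2), c(2, 3), c(3, 4), c(4, 5))

# Symmetric adjacency matrix: adj[u, v] is TRUE when uv is an edge
adj <- matrix(FALSE, length(V), length(V), dimnames = list(V, V))
adj[E] <- TRUE
adj[E[, 2:1]] <- TRUE

# N(v): vertices adjacent to v; N[v]: N(v) together with v itself
neighbourhood <- function(v) V[adj[v, ]]
closed_neighbourhood <- function(v) sort(union(neighbourhood(v), v))

# All-pairs shortest path lengths (Floyd-Warshall), then the largest one
dist_mat <- ifelse(adj, 1, Inf)
diag(dist_mat) <- 0
for (k in V) for (i in V) for (j in V) {
  dist_mat[i, j] <- min(dist_mat[i, j], dist_mat[i, k] + dist_mat[k, j])
}
diameter <- max(dist_mat)

neighbourhood(2)         # 1 3
closed_neighbourhood(2)  # 1 2 3
diameter                 # 4 (the distance between vertices 1 and 5)
```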
Directed graphs: Draw a node for each variable and an arrow from \(X_j\) into \(X_i\) if \(X_j\) appears in the conditioning set of the conditional distribution of \(X_i\)
Joint distribution: \(f(x_1, x_2, \dots, x_p) = \prod_{j = 1}^{p} f(x_j | \mathrm{parents}(x_j))\)
Markov property: \(X_j\) is independent of all its non-descendants given \(\mathrm{parents}(X_j)\)
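To see the factorization and the Markov property at work, consider a hypothetical three-node DAG with arrows \(X_1 \to X_3\) and \(X_2 \to X_3\) (an example chosen here, not taken from the notes). Neither \(X_1\) nor \(X_2\) has parents, so

\[f(x_1, x_2, x_3) = f(x_1)f(x_2)f(x_3|x_1, x_2)\]

Integrating out \(x_3\) gives \(f(x_1, x_2) = f(x_1)f(x_2)\). This agrees with the Markov property: \(X_2\) is a non-descendant of \(X_1\), and \(X_1\) has no parents to condition on, so the two are marginally independent.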
Colliders: A collider is a node/variable \(Z\) in a DAG that sits on an undirected path between two other variables \(X_j\) and \(X_i\), where both edges of the path that meet at \(Z\) have arrows pointing into \(Z\)
“d-separated”: If every path between \(D\) and \(Y\) contains at least one variable that is blocked given \(S\), then \(D\) and \(Y\) are conditionally independent given \(S\) and we say that they are “d-separated” by \(S\) (or that \(S\) “d-separates” them). The \(d\) refers to “directed”
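A collider blocks a path until it is conditioned on, and a short simulation makes that visible. The sketch below uses base R only. The linear-Gaussian data-generating process and the variable names `x`, `y`, `z` are assumptions made for illustration.

```r
set.seed(1)
n <- 10000
x <- rnorm(n)
y <- rnorm(n)            # independent of x by construction
z <- x + y + rnorm(n)    # collider: x -> z <- y

# With nothing conditioned on, the path x -> z <- y is blocked at z
cor_marginal <- cor(x, y)

# Conditioning on z (here by linear adjustment) unblocks the path;
# under this design the population coefficient on x is -0.5
coef_given_z <- unname(coef(lm(y ~ x + z))["x"])

c(marginal = cor_marginal, given_z = coef_given_z)
```

The marginal correlation should sit near zero, while the coefficient on `x` after adjusting for `z` should be clearly negative, which is the dependence induced by conditioning on a collider.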
Regression adjustment: Delete the causal paths from \(D\) to \(Y\) and, in the resulting graph \(G^-\), determine whether \(Y\) and \(D\) are independent given the adjustment set \(S\)
library(ggdag)
library(tidyverse)
theme_set(theme_dag())
# Randomized Experiment (vii)
dagify(Y ~ D) %>% ggdag()
# Three-variable directed graph; the cycle y -> a -> x -> y means it is not a DAG
dagify(y ~ x, x ~ a, a ~ y) %>% ggdag()
ggdag is more specifically concerned with structural causal models (SCMs): DAGs that portray causal assumptions about a set of variables. Beyond being useful conceptions of the problem we’re working on (which they are), this also allows us to lean on the well-developed links between graphical causal paths and statistical associations. Causal DAGs are mathematically grounded, but they are also consistent and easy to understand. Thus, when we’re assessing the causal effect between an exposure and an outcome, drawing our assumptions in the form of a DAG can help us pick the right model without having to know much about the math behind it. Another way to think about DAGs is as non-parametric structural equation models (SEM): we are explicitly laying out paths between variables, but in the case of a DAG, it doesn’t matter what form the relationship between two variables takes, only its direction. The rules underpinning DAGs are consistent whether the relationship is a simple, linear one, or a more complicated function.
Let’s say we’re looking at the relationship between smoking and cardiac arrest. We might assume that smoking causes changes in cholesterol, which causes cardiac arrest:
smoking_ca_dag <- dagify(
  cardiacarrest ~ cholesterol,
  cholesterol ~ smoking + weight,
  smoking ~ unhealthy,
  weight ~ unhealthy,
  labels = c(
    "cardiacarrest" = "Cardiac\n Arrest",
    "smoking" = "Smoking",
    "cholesterol" = "Cholesterol",
    "unhealthy" = "Unhealthy\n Lifestyle",
    "weight" = "Weight"
  ),
  latent = "unhealthy",
  exposure = "smoking",
  outcome = "cardiacarrest"
)
ggdag(smoking_ca_dag, text = FALSE, use_labels = "label")
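One way a DAG helps pick the right model is by listing which observed variables need to be adjusted for. The sketch below rebuilds the same smoking graph directly with the dagitty package (which ggdag is built on) and asks for minimal adjustment sets. Writing the graph in dagitty syntax instead of reusing `smoking_ca_dag` is a choice made here to keep the example self-contained.

```r
library(dagitty)

# Same edges as the smoking DAG above, written in dagitty syntax
g <- dagitty("dag {
  unhealthy -> smoking
  unhealthy -> weight
  smoking -> cholesterol
  weight -> cholesterol
  cholesterol -> cardiacarrest
}")
latents(g)   <- "unhealthy"      # unobserved, so it cannot be adjusted for
exposures(g) <- "smoking"
outcomes(g)  <- "cardiacarrest"

# Minimal sets of observed variables that close the backdoor paths
sets <- adjustmentSets(g)
print(sets)
```

The only backdoor path runs from smoking through the unobserved lifestyle variable to weight and on to cholesterol, so adjusting for weight is expected to suffice. Cholesterol lies on the causal path and should not appear in the set.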
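Consider the graph \(G^-\) obtained by deleting the causal link from the treatment \(D\) to \(Y\) and adding an edge from \(D\) to each child of \(Y\). A set \(S\) is a “proper conditioning set” if it d-separates \(D\) and \(Y\) in \(G^-\); the added edges make every child of \(Y\) a collider between \(D\) and \(Y\), so conditioning on such a post-treatment variable is ruled out.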
# Initial causal DAG
dagify(
  xd ~ xy,
  d ~ xd,
  y ~ d,
  y ~ xy,
  b ~ y,
  labels = c(
    "d" = "Treatment",
    "y" = "Response",
    "b" = "Post\n Treatment"
  )
) %>%
  ggdag(use_labels = "label")
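The graph \(G^-\) for this DAG can be written out and queried for d-separation. The sketch below uses the dagitty package. It follows one reading of the construction, in which the edge from `d` to `y` is deleted and an edge from `d` to `b` (the child of `y`) is added. That reading is an assumption made here.

```r
library(dagitty)

# G^-: the DAG above with d -> y deleted and d -> b added (b is the child of y)
g_minus <- dagitty("dag {
  xy -> xd
  xd -> d
  xy -> y
  y -> b
  d -> b
}")

# Is d d-separated from y under each candidate conditioning set?
sep_none  <- dseparated(g_minus, "d", "y", list())
sep_xy    <- dseparated(g_minus, "d", "y", "xy")
sep_xd    <- dseparated(g_minus, "d", "y", "xd")
sep_xy_b  <- dseparated(g_minus, "d", "y", c("xy", "b"))

c(none = sep_none, xy = sep_xy, xd = sep_xd, xy_and_b = sep_xy_b)
```

With no adjustment the path through `xd` and `xy` stays open. Either `xy` or `xd` alone blocks it. Adding the post-treatment variable `b` opens the collider path from `d` through `b` to `y`, so that set fails.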
# Set sample size
n <- 1000
# Treatment effect
tau <- 3
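# Bernoulli draws for the covariates: xy affects the outcome, xd affects treatment
xy <- rbinom(n, 1, 0.5)
xd <- rbinom(n, 1, 0.5 * xy + 0.25)
# Treatment assignment depends on xd only
D <- rbinom(n, 1, 0.5 * xd + 0.25)
# Outcome model, and a post-treatment variable b that depends on the sign of y
y <- tau * D - xy + rnorm(n)
b <- rbinom(n, 1, 0.9 - 0.7 * (y < 0))
# Naive estimate of tau: raw difference in means between treated and untreated
tau.naive <- mean(y[D == 1]) - mean(y[D == 0])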
print(tau.naive)
## [1] 2.818068
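The gap between the naive estimate and \(\tau = 3\) can be worked out from the simulation design itself. Writing \(X_Y\) and \(X_D\) for `xy` and `xd` (notation introduced here), the outcome equation gives

\[E[Y|D = 1] - E[Y|D = 0] = \tau - \left(E[X_Y|D = 1] - E[X_Y|D = 0]\right)\]

From the draw probabilities, \(P(X_D = 1|D = 1) = 0.75\), \(P(X_Y = 1|X_D = 1) = 0.75\) and \(P(X_Y = 1|X_D = 0) = 0.25\), and \(X_Y\) is independent of \(D\) given \(X_D\), so

\[E[X_Y|D = 1] = 0.75 \cdot 0.75 + 0.25 \cdot 0.25 = 0.625, \qquad E[X_Y|D = 0] = 0.75 \cdot 0.25 + 0.25 \cdot 0.75 = 0.375\]

The naive difference in means therefore targets \(3 - 0.25 = 2.75\) rather than 3, and the printed value differs from 2.75 only by sampling noise.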
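# Adjusted estimate of tau: conditional treatment effects within each level of xy,
# averaged with weights P(xy = 1) and P(xy = 0)
tau.xy <- mean(xy == 1) * (mean(y[D == 1 & xy == 1]) - mean(y[D == 0 & xy == 1])) +
  mean(xy == 0) * (mean(y[D == 1 & xy == 0]) - mean(y[D == 0 & xy == 0]))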
print(tau.xy)
## [1] 3.052023
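The same adjustment can be carried out with a linear regression. The sketch below re-simulates the design with a fixed seed and a larger sample (both choices made here, so the numbers will not match the output above) and adjusts for either `xy` or `xd`, each of which blocks the backdoor path from `D` to `y`.

```r
set.seed(42)
n <- 5000
tau <- 3

# Same data-generating process as above
xy <- rbinom(n, 1, 0.5)
xd <- rbinom(n, 1, 0.5 * xy + 0.25)
D  <- rbinom(n, 1, 0.5 * xd + 0.25)
y  <- tau * D - xy + rnorm(n)

# Coefficient on D after adjusting for xy, and alternatively for xd
tau_adj_xy <- unname(coef(lm(y ~ D + xy))["D"])
tau_adj_xd <- unname(coef(lm(y ~ D + xd))["D"])

c(adjust_xy = tau_adj_xy, adjust_xd = tau_adj_xd)
```

Both coefficients should land close to 3. Adjusting for `xd` is noisier because `xd` does not explain any of the variation in `y` beyond what it shares with `xy`.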