05/05/2020

Key takeaways

  • MR-BRT is a meta-regression tool created by Sasha and Peng, refactored for GBD 2020
  • The R package mirrors the syntax of the Python package; use help() for R documentation and py_help() for Python documentation
  • Functions: MRData() formats the data, MRBRT() and MRBeRT() fit models, and create_draws() makes predictions; see the workflow sketch after this list
  • Find full examples at https://rpubs.com/rsoren/mrbrt_examples_gbd2020, and find these slides at https://rpubs.com/rsoren/mrbrt_gbd2020
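
A hedged end-to-end sketch of that workflow, following the syntax in the linked examples (the data frames df and df_pred and the columns y, y_se, x1, and study_id are illustrative, and argument names may differ across package versions):

library(mrbrt001, lib.loc = "/ihme/code/mscm/R/packages/")

# 1. Format the data (illustrative column names)
dat <- MRData()
dat$load_df(
  data = df, col_obs = "y", col_obs_se = "y_se",
  col_covs = list("x1"), col_study_id = "study_id"
)

# 2. Fit the model: a fixed effect on x1 plus a random intercept
mod <- MRBRT(
  data = dat,
  cov_models = list(
    LinearCovModel("intercept", use_re = TRUE),
    LinearCovModel("x1")
  )
)
mod$fit_model()

# 3. Predict with uncertainty, using sampled betas and gammas
dat_pred <- MRData()
dat_pred$load_df(data = df_pred, col_covs = list("x1"))
samples <- mod$sample_soln(sample_size = 1000L)
draws <- mod$create_draws(
  data = dat_pred,
  beta_samples = samples[[1]],
  gamma_samples = samples[[2]],
  random_study = FALSE
)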

Similarities with other R packages

Think of MR-BRT as a combination of…

  • Linear regression like lm() – not glm()
  • Mixed models with lme4
  • Meta-analysis with metafor
  • Splines with gam
  • Bayesian priors like INLA

Also includes:

  • Z-covariates
  • “Ensemble knots” for splines
  • Outlier trimming; Lasso variable selection
  • “The ratio model” for comparing exposure ranges

A work in progress

  • Compared to the GBD 2019 version, it has more functionality, but the syntax is more complicated

  • Lasso variable selection is new

  • Still working out versioning – most current is library(mrbrt001, lib.loc = "/ihme/code/mscm/R/packages/") on R 3.6.3

  • MR-BRT’s niche is functionality, not speed

MR-BRT at IHME

  • Evidence scoring

  • Not for crosswalks; we made a separate package for that (https://rpubs.com/rsoren/572598)

  • Meta-regression of all sorts (e.g. COVID analysis, GBD, cost-effectiveness, etc.)

  • #friends_of_mr_brt and #mscm-office-hour Slack channels

From linear regression to mixed-effects meta-regression with Z-covariates

Linear regression [1]

\(y_i = \beta_0 + \beta_1 x_i + \varepsilon_i\)

\(\varepsilon_i \sim N(0, \sigma^2)\)

Linear regression [2]

## 
## Call:
## lm(formula = y1 ~ x1, data = df_sim1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -11.299  -7.471   1.136   4.241  17.189 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)  
## (Intercept)   1.0783     2.4045   0.448    0.656  
## x1            0.9871     0.4203   2.349    0.023 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 7.672 on 48 degrees of freedom
## Multiple R-squared:  0.1031, Adjusted R-squared:  0.0844 
## F-statistic: 5.517 on 1 and 48 DF,  p-value: 0.023
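
For reference, output of this shape can be generated with a simulation like the one below (the seed and coefficients are illustrative, not the original df_sim1 simulation):

set.seed(123)  # illustrative; not the seed behind the output above
df_sim1 <- data.frame(x1 = runif(50, 0, 10))
df_sim1$y1 <- 1 + 1 * df_sim1$x1 + rnorm(50, sd = 8)  # true beta0 = 1, beta1 = 1
summary(lm(y1 ~ x1, data = df_sim1))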

Regression vs. meta-regression vs. meta-analysis

Meta-regression is a regression where the dependent variable has uncertainty, or “measurement error”, with a known variance \(\sigma^2_i\) for each observation

  • Synthesize results from several studies into a single estimate
  • Doing a regression on regression results (hence “meta”)

Meta-regression is a meta-analysis with covariates, a.k.a. “moderators”

  • Fixed effects meta-analysis assumes that studies converge on one true effect size
  • Random effects meta-analysis allows for variation in the true effects; “between-study heterogeneity”
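
For comparison, both flavors in metafor (yi and sei are illustrative column names for the effect sizes and their standard errors):

library(metafor)
fit_fe <- rma(yi = yi, sei = sei, data = df, method = "FE")    # fixed effect: one true effect size
fit_re <- rma(yi = yi, sei = sei, data = df, method = "REML")  # random effects: estimates tau^2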

Linear regression vs. GLMs [1]

MR-BRT is not a generalized linear model; you need to manually transform the dependent variable into the appropriate space (e.g. log, logit) before modeling

For example, a GLM estimates the log of the expectation of \(y\):

\(E(y|x) = \exp(\beta_0 + \beta_1 x) \iff \log(E(y|x)) = \beta_0 + \beta_1 x\),

whereas transforming \(y\) estimates the expectation of \(log(y)\):

\(E[\log(y)|x] = \beta_0 + \beta_1 x\).

This is why a GLM like logistic regression can use 0s and 1s, but logit-transforming 0s and 1s doesn’t work.
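
A quick numeric illustration of the gap between the two targets (by Jensen’s inequality; for a lognormal \(y\) with \(\mu = 0\) and \(\sigma = 1\), \(\log E[y] = \sigma^2/2 = 0.5\) while \(E[\log y] = 0\)):

set.seed(1)
y <- rlnorm(1e6, meanlog = 0, sdlog = 1)
log(mean(y))  # ~0.5: what a log-link GLM targets, log(E[y|x])
mean(log(y))  # ~0.0: what modeling log(y) targets, E[log(y)|x]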

Linear regression vs. GLMs [2]

When transforming the dependent variable, you also need to transform its standard error

COMMON MISTAKE! Do not do this…

df$y_log <- log(df$y) # fine
df$y_se_log <- log(df$y_se) # wrong!

Instead, use the delta method:

library(crosswalk, lib.loc = "/ihme/code/mscm/R/packages/")
df[, c("y_log", "y_se_log")] <- delta_transform(
  mean = df$y, sd = df$y_se, transformation = "linear_to_log"
)
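
For intuition, the first-order delta method for the log transform amounts to dividing the standard error by the mean, since \(\mathrm{Var}(\log y) \approx (1/y)^2 \, \mathrm{Var}(y)\); a hand-rolled equivalent of "linear_to_log" (to first order):

df$y_log <- log(df$y)
df$y_se_log <- df$y_se / df$y  # delta method: SE(log y) ~= SE(y) / y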

Additive models / splines

\(y_i = \beta_0 + f(x_i) + \varepsilon_i\)

MR-BRT hyperparameters: number of knots, knot location, monotonicity, convexity/concavity, priors, linearity in the tails
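
A hedged sketch of how these hyperparameters might map onto a spline covariate, using argument names from the linked mrbrt001 examples (they may differ across versions):

cov_spline <- LinearCovModel(
  alt_cov = "x1",
  use_spline = TRUE,
  spline_knots = array(seq(0, 1, length.out = 4)),  # number of knots
  spline_knots_type = "frequency",                  # knot location: data quantiles
  spline_degree = 3L,
  spline_r_linear = TRUE,                           # linearity in the right tail
  spline_l_linear = FALSE,
  prior_spline_monotonicity = "increasing",         # monotonicity constraint
  prior_spline_convexity = "convex"                 # convexity/concavity constraint
)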

Mixed effects regression [1]

\(y_{ij} = \beta_0 + \beta_1 x_{ij} + u_j + \varepsilon_{ij}\)

\(\varepsilon_{ij} \sim N(0, \sigma^2) \qquad u_j \sim N(0, \gamma)\)

Mixed effects regression [2]

How the model sees the world…

\(\beta_0\) = 2.09; \(\beta_1\) = 0.89; \(\varepsilon_{ij} \sim N(0, 0.96^2)\); \(u_j \sim N(0, 11.0^2)\)
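
The corresponding random-intercept fit in lme4, with assumed names (df_sim2, y2, x2, study_id):

library(lme4)
mod_re <- lmer(y2 ~ x2 + (1 | study_id), data = df_sim2)
summary(mod_re)  # reports beta_0, beta_1, and the variances of u_j and the residual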

Mixed effects meta-regression with Z-covariates

\(y_{ij} = (\beta_0 + u_{0j}) + x_{ij}(\beta_1 + u_{1j}) + \varepsilon_{ij}\)

Equivalently, you can think of Z-covariates as part of a linear predictor of the between-study variance:

Between-study heterogeneity = \(\gamma_0 + \gamma_1 z_1\)
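
A hedged sketch of specifying Z-covariates in mrbrt001, assuming (per the linked examples) that use_re = TRUE marks a covariate as contributing a gamma term:

mod_z <- MRBRT(
  data = dat,
  cov_models = list(
    LinearCovModel("intercept", use_re = TRUE),  # gamma_0: baseline heterogeneity
    LinearCovModel("z1", use_re = TRUE)          # gamma_1: heterogeneity varying with z1
  )
)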

Question: Should we incorporate between-study heterogeneity into the uncertainty of an estimated effect size?

Recap so far

  • In the MR-BRT framework, standard errors appear as: 1) uncertainty around the dependent variable in a meta-regression, and 2) uncertainty around estimated betas (posterior distribution)

  • Between-study heterogeneity: variation in true effects; its expected value stays the same as \(n\) increases; often denoted \(\tau^2\) (metafor) or \(\gamma\) (MR-BRT)

  • Z-covariates: predictors of the magnitude of between-study heterogeneity, a.k.a. “variance covariates”; corresponds to the idea of random slopes in lme4