Introduction to MR-BRT

GBD training, July 2021

Key takeaways

MR-BRT is a meta-regression tool created by Sasha and Peng
The R package mirrors the syntax of the Python package; use py_help() to see Python docstrings
Functions: MRData formats the data, MRBRT and MRBeRT are for running models, and predict is for making predictions
For uncertainty, create_draws uses fit-refit (slower) and core$other_sampling$sample_simple_lme_beta uses asymptotic statistics (faster but only accurate for certain models)
Find full examples in the Health Metrics Toolbox: hub.ihme.washington.edu/display/MSCA/Math+Sciences+Team

Similarities with other R packages

Think of MR-BRT as a combination of…

Linear regression like lm() – not glm()
Mixed models with lme4
Meta-analysis with metafor
Splines with gam
Bayesian priors like INLA

Also includes:

“Ensemble knots” for splines
Outlier trimming and automated covariate selection
“The ratio model” for comparing exposure ranges
Bayesian spline cascade

MR-BRT at IHME

Evidence scoring
Not for crosswalks; we made a separate package for that (https://rpubs.com/rsoren/572599)
Meta-regression of all sorts (e.g. COVID analysis, GBD, cost-effectiveness, etc.)
#friends_of_mr_brt and #mscm-office-hour Slack channels
library(mrbrt001, lib.loc = "/ihme/code/mscm/R/packages/") on R version 3; library(mrbrt002, lib.loc = "/ihme/code/mscm/Rv4/packages/") on R version 4

From linear regression to mixed-effects meta-regression with Z-covariates

Linear regression [1]

$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$

$\varepsilon_i \sim N(0, \sigma^2)$

Linear regression [2]

## 
## Call:
## lm(formula = y1 ~ x1, data = df_sim1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -11.299  -7.471   1.136   4.241  17.189 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)  
## (Intercept)   1.0783     2.4045   0.448    0.656  
## x1            0.9871     0.4203   2.349    0.023 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 7.672 on 48 degrees of freedom
## Multiple R-squared:  0.1031, Adjusted R-squared:  0.0844 
## F-statistic: 5.517 on 1 and 48 DF,  p-value: 0.023
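
The output above comes from `lm(y1 ~ x1, data = df_sim1)` with 50 observations. As a rough sketch of that workflow, the block below simulates a data set of the same size and fits the same formula. The seed, true coefficients, and noise level are assumptions chosen for illustration, so the estimates will not reproduce the numbers shown above.

```r
# Hypothetical simulation; true_b0, true_b1, and noise_sd are assumed values,
# not the ones used to create the original df_sim1.
set.seed(1)
n <- 50
true_b0 <- 1
true_b1 <- 1
noise_sd <- 8

df_sim1 <- data.frame(x1 = runif(n, min = 0, max = 10))
df_sim1$y1 <- true_b0 + true_b1 * df_sim1$x1 + rnorm(n, mean = 0, sd = noise_sd)

# Ordinary least squares: y_i = beta_0 + beta_1 * x_i + epsilon_i
fit1 <- lm(y1 ~ x1, data = df_sim1)
coef_table <- summary(fit1)$coefficients  # estimates, SEs, t values, p-values
print(coef_table)
```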

Regression vs. meta-regression vs. meta-analysis

Meta-regression is a regression in which each observation of the dependent variable comes with its own known uncertainty, or “measurement error”, expressed as a variance $\sigma^2_i$

Synthesize results from several studies into a single estimate
Doing a regression on regression results (meta)
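
To make "regression with known $\sigma^2_i$" concrete, here is a minimal base-R sketch of inverse-variance weighted least squares. All data values are made up, and this is the simplest fixed-effects version with no between-study heterogeneity; it is not the estimator MR-BRT uses.

```r
# Made-up study-level estimates, their known standard errors, and one moderator
y  <- c(0.10, 0.30, 0.35, 0.65, 0.45, 0.80)
se <- c(0.10, 0.15, 0.08, 0.20, 0.12, 0.25)
x  <- c(0, 1, 1, 3, 2, 4)

# Weighted least squares with weights 1 / sigma_i^2
X <- cbind(intercept = 1, x = x)
W <- diag(1 / se^2)
xtwx_inv <- solve(t(X) %*% W %*% X)
beta_hat <- xtwx_inv %*% t(X) %*% W %*% y

# Because the variances are treated as known, the covariance of beta_hat
# is (X' W X)^{-1}; no residual variance is estimated from the data
beta_se <- sqrt(diag(xtwx_inv))
print(cbind(estimate = drop(beta_hat), se = beta_se))

# lm() with the same weights gives identical point estimates
fit_wls <- lm(y ~ x, weights = 1 / se^2)
print(coef(fit_wls))
```

The standard errors reported by `summary(fit_wls)` will generally differ from `beta_se`, because `lm()` rescales the weights by an estimated residual variance instead of taking the study variances as known.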

Meta-regression is a meta-analysis with covariates, a.k.a. “moderators”

Fixed effects meta-analysis assumes that studies converge on one true effect size
Random effects meta-analysis allows for variation in the true effects; “between-study heterogeneity”
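
The difference between the two assumptions can be sketched in a few lines of base R. The study values are made up, and the between-study variance uses the textbook DerSimonian-Laird moment estimator, swapped in here for simplicity; MR-BRT estimates heterogeneity differently.

```r
# Hypothetical study-level effect sizes and their standard errors
y  <- c(0.10, 0.30, 0.35, 0.65, 0.45)
se <- c(0.10, 0.15, 0.08, 0.20, 0.12)

# Fixed effects: inverse-variance weighted mean, assuming one true effect
w_fe   <- 1 / se^2
est_fe <- sum(w_fe * y) / sum(w_fe)
se_fe  <- sqrt(1 / sum(w_fe))

# DerSimonian-Laird moment estimate of the between-study variance (tau^2)
k    <- length(y)
q    <- sum(w_fe * (y - est_fe)^2)            # Cochran's Q
c_dl <- sum(w_fe) - sum(w_fe^2) / sum(w_fe)
tau2 <- max(0, (q - (k - 1)) / c_dl)

# Random effects: weights include tau^2, so studies are weighted more evenly
w_re   <- 1 / (se^2 + tau2)
est_re <- sum(w_re * y) / sum(w_re)
se_re  <- sqrt(1 / sum(w_re))

print(c(fixed = est_fe, random = est_re, tau2 = tau2))
print(c(se_fixed = se_fe, se_random = se_re))
```

Because `tau2` is added to every study's variance, the random-effects standard error can never be smaller than the fixed-effects one.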

Linear regression vs. GLMs [1]

MR-BRT is not a generalized linear model; need to manually transform the dependent variable into the appropriate space (e.g. log, logit)

For example, a GLM estimates the log of the expectation of $y$:

$E(y|x) = \exp(\beta_0 + \beta_1 x) \iff \log(E(y|x)) = \beta_0 + \beta_1 x$,

whereas transforming $y$ estimates the expectation of $\log(y)$:

$E[\log(y)|x] = \beta_0 + \beta_1 x$.

This is why a GLM like logistic regression can use 0s and 1s, but logit-transforming 0 and 1 doesn’t work.
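
Both points can be checked quickly in base R. The sketch below uses made-up positive values for `y`; it shows that the logit of exactly 0 or 1 is infinite, and that the mean of $\log(y)$ is not the same quantity as the log of the mean of $y$.

```r
# Logit of exactly 0 or 1 is infinite, so such observations cannot be transformed
logit_bounds <- qlogis(c(0, 1))
print(logit_bounds)  # -Inf  Inf

# Jensen's inequality: the mean of log(y) is at most the log of the mean of y
y <- c(1, 2, 4, 8, 16)          # made-up positive values
mean_of_log <- mean(log(y))     # what a linear model on log(y) targets
log_of_mean <- log(mean(y))     # what a log-link GLM targets
print(c(mean_of_log = mean_of_log, log_of_mean = log_of_mean))
```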

Linear regression vs. GLMs [2]

When transforming the dependent variable, also need to transform its standard error

COMMON MISTAKE! Do not do this…

# df$y_log <- log(df$y) # fine
# df$y_se_log <- log(df$y_se) # wrong!

Instead, use the delta method:

# library(crosswalk, lib.loc = "/ihme/code/mscm/R/packages/")
# df[, c("y_log", "y_se_log")] <- delta_transform(
#   mean = df$y, sd = df$y_se, transformation = "linear_to_log"
# )
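
For intuition, the first-order delta method multiplies the standard error by the absolute derivative of the transformation at the mean: $1/y$ for log and $1/(y(1-y))$ for logit. The helpers below are a hand-written sketch with hypothetical names (`delta_log`, `delta_logit`) and made-up data; they are not the `crosswalk` package's implementation.

```r
# Hypothetical helpers illustrating the first-order delta method;
# these are not functions from the crosswalk package.
delta_log <- function(mean, se) {
  # d/dy log(y) = 1 / y
  data.frame(mean_log = log(mean), se_log = se / mean)
}

delta_logit <- function(mean, se) {
  # d/dy logit(y) = 1 / (y * (1 - y))
  data.frame(mean_logit = qlogis(mean), se_logit = se / (mean * (1 - mean)))
}

df <- data.frame(y = c(0.2, 0.5, 0.8), y_se = c(0.02, 0.05, 0.04))  # made-up values
df_log <- delta_log(df$y, df$y_se)
df_logit <- delta_logit(df$y, df$y_se)
print(cbind(df, df_log, df_logit))

# The common mistake yields negative "standard errors" whenever y_se < 1
wrong_se <- log(df$y_se)
print(wrong_se)
```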

Additive models / splines

$y_i= \beta_0 +f(x_i)+\varepsilon_i$

MR-BRT hyperparameters: number of knots, knot location, monotonicity, convexity/concavity, priors, linearity in the tails
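
To illustrate what "number of knots" and "knot location" control, here is a small sketch using `bs()` from the `splines` package that ships with R, fitted with `lm()`. The data and knot positions are assumptions, and this plain B-spline fit has no shape constraints, priors, or linear tails, so it only mimics the simplest part of an MR-BRT spline.

```r
library(splines)  # part of the standard R distribution

set.seed(2)
x <- seq(0, 10, length.out = 200)
y <- sin(x) + rnorm(length(x), sd = 0.3)   # made-up nonlinear signal

# Cubic B-spline with three interior knots (knot locations are an assumption)
knots <- c(2.5, 5, 7.5)
fit_spline <- lm(y ~ bs(x, knots = knots, degree = 3))
fit_linear <- lm(y ~ x)

# Compare residual sums of squares: the spline can bend, the line cannot
rss <- c(linear = sum(resid(fit_linear)^2), spline = sum(resid(fit_spline)^2))
print(rss)
```

Three interior knots with a cubic basis give six spline coefficients plus the intercept; adding or moving knots changes where the fitted curve is allowed to bend.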

Mixed effects regression [1]

$y_{ij} = \beta_0 + \beta_1 x_{ij} + u_j + \varepsilon_{ij}$

$\varepsilon_{ij} \sim N(0, \sigma^2), \qquad u_j \sim N(0, \gamma)$

Mixed effects regression [2]

How the model sees the world…

$\beta_0$ = 2.09; $\beta_1$ = 0.89; $\varepsilon_{ij} \sim N(0, 0.96^2)$; $u_j \sim N(0, 11.0^2)$
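
One way to read these numbers is as a recipe for generating data. The sketch below simulates from a random-intercept model using the values above; the number of groups, observations per group, range of `x`, and the name `df_sim2` are assumptions for illustration.

```r
set.seed(3)
n_groups <- 10
n_per_group <- 20

beta0 <- 2.09; beta1 <- 0.89   # fixed effects from the slide
sd_eps <- 0.96                 # within-group residual SD
sd_u <- 11.0                   # between-group SD (square root of gamma)

group <- rep(seq_len(n_groups), each = n_per_group)
u <- rnorm(n_groups, mean = 0, sd = sd_u)            # one intercept shift per group
x <- runif(n_groups * n_per_group, min = 0, max = 10)
y <- beta0 + beta1 * x + u[group] + rnorm(length(x), mean = 0, sd = sd_eps)

df_sim2 <- data.frame(group = group, x = x, y = y)

# Residuals from the fixed part only: their group means approximate u_j
resid_fixed <- df_sim2$y - (beta0 + beta1 * df_sim2$x)
u_hat <- tapply(resid_fixed, df_sim2$group, mean)
print(round(cbind(u_true = u, u_approx = u_hat), 2))
```

With a between-group SD of 11.0 and a residual SD of 0.96, almost all of the spread in `y` at a given `x` comes from which group an observation belongs to.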

Mixed effects meta-regression with Z-covariates

$y_{ij} = (\beta_0 + u_{0j}) + x_{ij}(\beta_1 + u_{1j}) + \varepsilon_{ij}$

Equivalently, can think of Z-covariates as part of a linear predictor of the variance:

Between-study heterogeneity = $\gamma_0 + \gamma_1z_1$
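
As a hedged aside on where such a variance predictor comes from: if one assumes the two random effects are independent, with $u_{0j} \sim N(0, \gamma_0)$ and $u_{1j} \sim N(0, \gamma_1)$ (an assumption made here for illustration, not stated in the slides), the variance of the random part of the model works out as

$\mathrm{Var}(u_{0j} + x_{ij} u_{1j}) = \mathrm{Var}(u_{0j}) + x_{ij}^2 \, \mathrm{Var}(u_{1j}) = \gamma_0 + \gamma_1 x_{ij}^2$

Under that assumption the heterogeneity is quadratic in the covariate. For a 0/1 indicator, $x_{ij}^2 = x_{ij}$, so the expression reduces to the linear form $\gamma_0 + \gamma_1 z_1$ with $z_1 = x_{ij}$.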

Question: Should we incorporate between-study heterogeneity into the uncertainty of an estimated effect size?

Recap so far

In the MR-BRT framework, standard errors appear as: 1) uncertainty around the dependent variable in a meta-regression, and 2) uncertainty around estimated betas (posterior distribution)
Between-study heterogeneity: variation in true effects; unlike sampling error, its expected value does not shrink as $n$ increases; often denoted by the Greek letter tau (metafor) or gamma (MR-BRT)
Z-covariates: predictors of the magnitude of between-study heterogeneity, a.k.a. “variance covariates”; corresponds to the idea of random slopes in lme4

Find full examples in the Health Metrics Toolbox: hub.ihme.washington.edu/display/MSCA/Math+Sciences+Team