GBD training, July 2021

Key takeaways

  • MR-BRT is a meta-regression tool created by Sasha and Peng
  • The R package mirrors the syntax of the Python package; use py_help() to see Python docstrings
  • Functions: MRData formats the data, MRBRT and MRBeRT are for running models, and predict is for making predictions
  • For uncertainty, create_draws uses fit-refit (slower) and core$other_sampling$sample_simple_lme_beta uses asymptotic statistics (faster but only accurate for certain models)
  • Find full examples in the Health Metrics Toolbox: hub.ihme.washington.edu/display/MSCA/Math+Sciences+Team

Similarities with other R packages

Think of MR-BRT as a combination of…

  • Linear regression like lm() – not glm()
  • Mixed models with lme4
  • Meta-analysis with metafor
  • Splines with gam
  • Bayesian priors like INLA

Also includes:

  • “Ensemble knots” for splines
  • Outlier trimming and automated covariate selection
  • “The ratio model” for comparing exposure ranges
  • Bayesian spline cascade

MR-BRT at IHME

  • Evidence scoring

  • Not for crosswalks; we made a separate package for that (https://rpubs.com/rsoren/572599)

  • Meta-regression of all sorts (e.g. COVID analysis, GBD, cost-effectiveness, etc.)

  • #friends_of_mr_brt and #mscm-office-hour Slack channels

  • To load the package: library(mrbrt001, lib.loc = "/ihme/code/mscm/R/packages/") on R version 3; library(mrbrt002, lib.loc = "/ihme/code/mscm/Rv4/packages/") on R version 4

From linear regression to mixed-effects meta-regression with Z-covariates

Linear regression [1]

\(y_i = \beta_0 + \beta_1 x_i + \varepsilon_i\)

\(\varepsilon_i \sim N(0, \sigma^2)\)
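
A minimal sketch of this model in base R, using simulated data. The data frame name df_sim, the seed, and the parameter values are hypothetical and chosen only for illustration:

```r
# Simulate data from y_i = beta_0 + beta_1 * x_i + e_i (illustrative values only)
set.seed(1)
n <- 200
beta_0 <- 1
beta_1 <- 2
sigma <- 0.5

df_sim <- data.frame(x = runif(n, min = 0, max = 10))
df_sim$y <- beta_0 + beta_1 * df_sim$x + rnorm(n, mean = 0, sd = sigma)

# Fit by ordinary least squares; coef() returns the estimates of beta_0 and beta_1
fit_lm <- lm(y ~ x, data = df_sim)
coef(fit_lm)

# sigma() returns the residual standard error, an estimate of sigma
sigma(fit_lm)
```

With enough observations, the estimates should land close to the values used to simulate the data.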

Linear regression [2]

## 
## Call:
## lm(formula = y1 ~ x1, data = df_sim1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -11.299  -7.471   1.136   4.241  17.189 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)  
## (Intercept)   1.0783     2.4045   0.448    0.656  
## x1            0.9871     0.4203   2.349    0.023 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 7.672 on 48 degrees of freedom
## Multiple R-squared:  0.1031, Adjusted R-squared:  0.0844 
## F-statistic: 5.517 on 1 and 48 DF,  p-value: 0.023

Regression vs. meta-regression vs. meta-analysis

Meta-regression is a regression where each observation of the dependent variable has its own known uncertainty, or “measurement error”, with variance \(\sigma^2_i\)

  • Synthesize results from several studies into a single estimate
  • Doing a regression on regression results (meta)

Meta-regression is a meta-analysis with covariates, a.k.a. “moderators”

  • Fixed effects meta-analysis assumes that studies converge on one true effect size
  • Random effects meta-analysis allows for variation in the true effects; “between-study heterogeneity”
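
The difference between the two can be sketched in base R. This is an illustration only: the effect sizes and standard errors are made up, and the between-study variance uses the DerSimonian-Laird moment estimator, a common textbook choice that is not claimed to be what MR-BRT does internally:

```r
# Hypothetical study-level effect sizes and their standard errors
y  <- c(0.10, 0.30, 0.35, 0.65, 0.45)
se <- c(0.10, 0.15, 0.08, 0.20, 0.12)

# Fixed-effect estimate: inverse-variance weighted mean
w_fe  <- 1 / se^2
mu_fe <- sum(w_fe * y) / sum(w_fe)
se_fe <- sqrt(1 / sum(w_fe))

# DerSimonian-Laird moment estimate of the between-study variance (tau^2)
k    <- length(y)
Q    <- sum(w_fe * (y - mu_fe)^2)
C    <- sum(w_fe) - sum(w_fe^2) / sum(w_fe)
tau2 <- max(0, (Q - (k - 1)) / C)

# Random-effects estimate: each study's weight also includes tau^2
w_re  <- 1 / (se^2 + tau2)
mu_re <- sum(w_re * y) / sum(w_re)
se_re <- sqrt(1 / sum(w_re))

c(fixed = mu_fe, random = mu_re, tau2 = tau2, se_fixed = se_fe, se_random = se_re)
```

Because the random-effects weights add tau2 to every study's variance, the pooled standard error can never be smaller than the fixed-effect one.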

Linear regression vs. GLMs [1]

MR-BRT is not a generalized linear model; you need to manually transform the dependent variable into the appropriate space (e.g. log, logit)

For example, a GLM with a log link estimates the log of the expectation of \(y\):

\(E[y|x] = \exp(\beta_0 + \beta_1 x) \iff \log(E[y|x]) = \beta_0 + \beta_1 x\),

whereas transforming \(y\) estimates the expectation of \(\log(y)\):

\(E[\log(y)|x] = \beta_0 + \beta_1 x\).

This is why a GLM like logistic regression can use 0s and 1s, but logit-transforming 0 and 1 doesn’t work.
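
A quick base R check of both points, using made-up values:

```r
# Hypothetical positive outcome values
y <- c(1, 2, 4, 8, 16)

# Log of the expectation: the quantity a log-link GLM targets
log_of_mean <- log(mean(y))

# Expectation of the log: the quantity a linear model of log(y) targets
mean_of_log <- mean(log(y))

c(log_of_mean = log_of_mean, mean_of_log = mean_of_log)

# The logit of exactly 0 or 1 is not finite, so it cannot serve as a dependent variable
logit_01 <- qlogis(c(0, 1))
logit_01
```

The two summaries differ whenever \(y\) varies, so results from a log-link GLM and from a model of log-transformed data are not interchangeable.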

Linear regression vs. GLMs [2]

When transforming the dependent variable, you also need to transform its standard error

COMMON MISTAKE! Do not do this…

# df$y_log <- log(df$y) # fine
# df$y_se_log <- log(df$y_se) # wrong!

Instead, use the delta method:

# library(crosswalk, lib.loc = "/ihme/code/mscm/R/packages/")
# df[, c("y_log", "y_se_log")] <- delta_transform(
#   mean = df$y, sd = df$y_se, transformation = "linear_to_log"
# )
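
For intuition, the first-order delta method can also be written out by hand in base R. This is a sketch with hypothetical values, and it is not claimed to match the internals of delta_transform:

```r
# Hypothetical means and standard errors in linear space
df <- data.frame(y = c(0.20, 0.05, 0.50), y_se = c(0.02, 0.01, 0.05))

# First-order delta method: SE[g(y)] is approximately |g'(y)| * SE(y)

# For g = log, g'(y) = 1 / y
df$y_log <- log(df$y)
df$y_se_log <- df$y_se / df$y

# For g = logit, g'(y) = 1 / (y * (1 - y))
df$y_logit <- qlogis(df$y)
df$y_se_logit <- df$y_se / (df$y * (1 - df$y))

df
```

Note that log(df$y_se) would be negative for every row here, which cannot be a standard error.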

Additive models / splines

\(y_i= \beta_0 +f(x_i)+\varepsilon_i\)

MR-BRT hyperparameters: number of knots, knot location, monotonicity, convexity/concavity, priors, linearity in the tails
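
To illustrate the general idea of \(f(x_i)\), here is a minimal sketch using the splines package that ships with R. The simulated relationship and knot locations are hypothetical, and this does not use MR-BRT's own spline options (ensemble knots, shape constraints, priors):

```r
library(splines)  # included with standard R installations

set.seed(2)
n <- 300
df_spl <- data.frame(x = runif(n, min = 0, max = 10))
# Hypothetical nonlinear relationship plus noise
df_spl$y <- 1 + sin(df_spl$x / 2) + rnorm(n, mean = 0, sd = 0.2)

# Cubic B-spline basis with three interior knots at hypothetical locations
fit_spline <- lm(y ~ bs(x, knots = c(2.5, 5, 7.5), degree = 3), data = df_spl)

# Predict the fitted curve on a grid spanning the observed range of x
x_grid <- seq(min(df_spl$x), max(df_spl$x), length.out = 50)
y_hat <- predict(fit_spline, newdata = data.frame(x = x_grid))
```

Changing the number or location of the knots changes how flexible the fitted curve is, which is why these are treated as hyperparameters.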

Mixed effects regression [1]

\(y_{ij} = \beta_0 + \beta_1 x_{ij} + u_j + \varepsilon_{ij}\)

\(\varepsilon_{ij} \sim N(0, \sigma^2) \qquad u_j \sim N(0, \gamma)\)

Mixed effects regression [2]

How the model sees the world…

\(\beta_0\) = 2.09; \(\beta_1\) = 0.89; \(\varepsilon_{ij} \sim N(0, 0.96^2)\); \(u_j \sim N(0, 11.0^2)\)
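
A sketch of data generated under these parameter values. The number of studies, observations per study, range of x, and seed are hypothetical. The optional fit uses lme4, which is only run if it is installed:

```r
set.seed(3)
n_study <- 30        # hypothetical number of studies
n_per_study <- 20    # hypothetical observations per study

# Parameter values as listed above
beta_0 <- 2.09
beta_1 <- 0.89
sd_eps <- 0.96
sd_u <- 11.0

df_me <- data.frame(study = rep(seq_len(n_study), each = n_per_study))
df_me$x <- runif(nrow(df_me), min = 0, max = 10)

# One random intercept per study, shared by all of that study's observations
u <- rnorm(n_study, mean = 0, sd = sd_u)
df_me$y <- beta_0 + beta_1 * df_me$x + u[df_me$study] +
  rnorm(nrow(df_me), mean = 0, sd = sd_eps)

# Fit a random-intercept model if lme4 is available
if (requireNamespace("lme4", quietly = TRUE)) {
  fit_me <- lme4::lmer(y ~ x + (1 | study), data = df_me)
  print(lme4::fixef(fit_me))    # estimates of beta_0 and beta_1
  print(lme4::VarCorr(fit_me))  # estimated SDs of u_j and of the residual
}
```

Because the spread of the random intercepts is much larger than the residual spread, most of the variation in y comes from differences between studies rather than from noise within them.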

Mixed effects meta-regression with Z-covariates

\(y_{ij} = (\beta_0 + u_{0j}) + x_{ij}(\beta_1 + u_{1j}) + \varepsilon_{ij}\)

Equivalently, can think of Z-covariates as part of a linear predictor of the variance:

Between-study heterogeneity = \(\gamma_0 + \gamma_1z_1\)
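
A small simulation can make this concrete. It assumes a binary Z-covariate and independent random effects, and the variance values are hypothetical. For a 0/1 covariate, \(z^2 = z\), so the variance of \(u_{0j} + z_j u_{1j}\) is \(\gamma_0 + \gamma_1 z_j\):

```r
set.seed(4)
n_study <- 20000   # large, so the empirical variances are stable
gamma_0 <- 0.5     # hypothetical variance of the random intercepts
gamma_1 <- 1.5     # hypothetical variance of the random slopes on z

# Binary Z-covariate: half the studies have z = 0, half have z = 1
z <- rep(c(0, 1), each = n_study / 2)

# Independent random effects for each study
u_0 <- rnorm(n_study, mean = 0, sd = sqrt(gamma_0))
u_1 <- rnorm(n_study, mean = 0, sd = sqrt(gamma_1))

# Study-specific deviation from the fixed-effect prediction
re <- u_0 + z * u_1

# Empirical between-study variance within each level of z
v <- tapply(re, z, var)
v
```

Studies with z = 1 should show a variance near gamma_0 + gamma_1, and studies with z = 0 a variance near gamma_0.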

Question: Should we incorporate between-study heterogeneity into the uncertainty of an estimated effect size?

Recap so far

  • In the MR-BRT framework, standard errors appear as: 1) uncertainty around the dependent variable in a meta-regression, and 2) uncertainty around estimated betas (posterior distribution)

  • Between-study heterogeneity: variation in true effects; its expected value does not change as \(n\) increases; often denoted by the Greek letter tau (metafor) or gamma (MR-BRT)

  • Z-covariates: predictors of the magnitude of between-study heterogeneity, a.k.a. “variance covariates”; they correspond to the idea of random slopes in lme4

Find full examples in the Health Metrics Toolbox: hub.ihme.washington.edu/display/MSCA/Math+Sciences+Team