Marginal effects, with marginaleffects

In our final lab, I wanted to give a lasting endorsement of a toolkit you should probably use every time you estimate a model for a paper or a talk.

Model coefficients are really hard to interpret. You probably don’t understand your model’s findings until you’ve converted them into a table or a figure.

A motivating example

Perhaps we’re looking to model turnout in the 2012 election as a function of a respondent’s race, education, and age, and their interactions.

library(tidyverse)
library(magrittr)
library(marginaleffects)

t2 <- "https://github.com/thomasjwood/ps7160/raw/master/anes_2012_turnout.rds" %>% 
  url %>% 
  readRDS

# logistic regression of turnout, interacting age with race and education
gm1 <- glm(
  turnout ~ age * (race + educ),
  data = t2, 
  family = binomial()
) 

gm1 %>% 
  summary
## 
## Call:
## glm(formula = turnout ~ age * (race + educ), family = binomial(), 
##     data = t2)
## 
## Coefficients:
##                       Estimate Std. Error z value Pr(>|z|)    
## (Intercept)          -0.400481   0.225266  -1.778 0.075435 .  
## age                   0.021853   0.005890   3.710 0.000207 ***
## raceBlack             0.015305   0.290610   0.053 0.957998    
## raceHispanic         -0.521592   0.279403  -1.867 0.061928 .  
## raceOther             1.288741   0.536648   2.401 0.016330 *  
## raceAsian            -0.773851   1.064491  -0.727 0.467245    
## educSome college      0.532496   0.253191   2.103 0.035454 *  
## educBA or more        1.163176   0.307336   3.785 0.000154 ***
## age:raceBlack         0.008375   0.008835   0.948 0.343150    
## age:raceHispanic      0.006219   0.008983   0.692 0.488733    
## age:raceOther        -0.042793   0.013844  -3.091 0.001995 ** 
## age:raceAsian         0.023571   0.036685   0.643 0.520533    
## age:educSome college  0.007811   0.007778   1.004 0.315300    
## age:educBA or more    0.001068   0.008903   0.120 0.904547    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 2126.8  on 1705  degrees of freedom
## Residual deviance: 1959.1  on 1692  degrees of freedom
## AIC: 1987.1
## 
## Number of Fisher Scoring iterations: 4

Even armed with our intuitions, gosh, this is imposing. The age effect is large and positive (roughly half a percentage point per year of age, among Whites with only a HS education), with only the “Other” race category showing a significant age interaction.

From this regression table we might tell a story of modest variation in the age and education effects. Or we could actually check!

gm1 %>% 
  predictions(
    # predicted turnout probabilities over a grid of ages, every race,
    # and the lowest and highest education levels
    newdata = datagrid(
      age = 18:65,
      race = t2$race %>% levels,
      educ = t2$educ %>% levels %>% extract(c(1, 3))
    )
  ) %>% 
  tidy %>% 
  ggplot(
    aes(
      age, 
      ymin = conf.low,
      ymax = conf.high,
      fill = educ,
      color = educ
    )
  ) +
  geom_ribbon(
    alpha = .4,
    outline.type = "full"
  ) +
  facet_wrap(~race) +
  theme(
    legend.position = c(5/6, 1/4)
  )

Gosh, is it even a question that, with ~20 extra lines, we learned far more from our model than the regression table communicated? The age slope for the “Other” racial group flipped!

marginaleffects is a perfect example of what makes R great, and of what people who focus on computational concerns (speed, memory efficiency, distributed computing, etc.)1 miss: the real advantage of R is the community. Specifically, the number of methodologists in the academy, especially in the social sciences, building all these incredible tools.

Vincent Arel-Bundock, a Michigan PhD and associate professor at the Université de Montréal, is a young scholar who’s been incredibly assiduous in updating this package.2 A new model comes out, or a new R implementation of an existing model is released, and Vincent adapts his package to work with it. Incredibly, at present, marginaleffects supports 120 model types!

marginaleffects terminology

A few pieces of statistical terminology are necessary to discuss the package:

  • predictions are outcomes, or expected values, for some value of the predictor variables

  • comparisons are differences in outcomes for some values of the predictor variables

  • slopes are first derivatives of the regression equation with respect to a predictor of interest

Of vital importance: marginaleffects will let you estimate unit-level quantities, where the estimates vary with each observed value of the predictors (all three verbs are demonstrated at the unit level in the sketch below).
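
A minimal sketch with the turnout model from above; each call returns one row per respondent in the estimation data:

predictions(gm1) %>% 
  head   # fitted Pr(turnout) for each respondent

comparisons(gm1, variables = "educ") %>% 
  head   # education contrasts, for each respondent

slopes(gm1, variables = "age") %>% 
  head   # d Pr(turnout) / d age, for each respondent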

Separately, the package will also let you marginalize, or average, over different levels of a predictive covariate. Imagine you interact an experimental treatment with partisanship, but you want to report an overall treatment effect before you estimate partisan effects. Should the average effect reflect the empirically observed distribution of partisanship in your data? Should it assume the party distribution recorded in a state’s voter file, or a perfectly balanced one, or some other quantity? That’s the choice that averaging over lets you make.
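
To make this concrete, here is a minimal sketch with the turnout model: the first call averages the unit-level age slopes over the covariates as they were actually observed; the second averages over a balanced grid, weighting every race-by-education cell equally (the grid_type = "balanced" option requires a recent marginaleffects release; check your version).

# average over the empirical distribution of the covariates
avg_slopes(gm1, variables = "age")

# average over a balanced grid of the categorical covariates
avg_slopes(
  gm1,
  variables = "age",
  newdata = datagrid(grid_type = "balanced")
)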

An important clarification might help. Marginal has two distinct meanings in this context:

  1. A “marginal effect” refers to the effect of a minuscule (or marginal) change in a regressor on the outcome (that is, a slope, or first derivative).

  2. In “marginal means”, we’re marginalizing across values of the predictive covariates.

A simple example might motivate this distinction.

lm1 <- lm(mpg ~ hp * wt * am, data = mtcars)

predictions(lm1)
## 
##  Estimate Std. Error     z Pr(>|z|)     S 2.5 % 97.5 %
##      22.5      0.884 25.44   <0.001 471.7  20.8   24.2
##      20.8      1.194 17.42   <0.001 223.3  18.5   23.1
##      25.3      0.709 35.66   <0.001 922.7  23.9   26.7
##      20.3      0.704 28.75   <0.001 601.5  18.9   21.6
##      17.0      0.712 23.88   <0.001 416.2  15.6   18.4
## --- 22 rows omitted. See ?print.marginaleffects ---
##      29.6      1.874 15.80   <0.001 184.3  25.9   33.3
##      15.9      1.311 12.13   <0.001 110.0  13.3   18.5
##      19.4      1.145 16.95   <0.001 211.6  17.2   21.7
##      14.8      2.017  7.33   <0.001  42.0  10.8   18.7
##      21.5      1.072 20.02   <0.001 293.8  19.4   23.6
## Type: response

versus

lm1 %>% 
  avg_predictions(
    variables = "am"
  )
## 
##  am Estimate Std. Error    z Pr(>|z|)     S 2.5 % 97.5 %
##   0     19.3       1.02 19.0   <0.001 265.4  17.3   21.3
##   1     19.3       1.55 12.4   <0.001 115.6  16.2   22.3
## 
## Type: response
lm1 %>% 
  avg_predictions(
    variables = list(
      wt = c(2.0, 3.0, 4.0)
    )
  )
## 
##  wt Estimate Std. Error    z Pr(>|z|)     S 2.5 % 97.5 %
##   2     23.0       1.29 17.9   <0.001 234.3  20.4   25.5
##   3     19.0       0.72 26.4   <0.001 508.6  17.6   20.4
##   4     15.1       1.32 11.4   <0.001  98.2  12.5   17.7
## 
## Type: response
avg_predictions(
  lm1,
  variables = list(
    hp = c(80, 120, 160),
    wt = c(2.5, 3.5)
  )
)
## 
##   hp  wt Estimate Std. Error    z Pr(>|z|)     S 2.5 % 97.5 %
##   80 2.5     24.3      1.278 19.0   <0.001 264.9  21.8   26.8
##   80 3.5     19.0      1.253 15.2   <0.001 170.7  16.6   21.5
##  120 2.5     22.4      0.976 22.9   <0.001 383.8  20.5   24.3
##  120 3.5     18.0      0.978 18.4   <0.001 248.6  16.1   19.9
##  160 2.5     20.5      0.974 21.0   <0.001 322.8  18.5   22.4
##  160 3.5     16.9      0.805 21.1   <0.001 324.9  15.4   18.5
## 
## Type: response
lm1 %>% 
  avg_predictions(
    variables = list(wt = c(2.5, 3.5, 4.5)),
    newdata = subset(mtcars, am == 1)
  )
## 
##   wt Estimate Std. Error     z Pr(>|z|)     S 2.5 % 97.5 %
##  2.5     22.6      0.883 25.58  < 0.001 477.1 20.86   24.3
##  3.5     16.5      2.048  8.07  < 0.001  50.3 12.51   20.5
##  4.5     10.5      3.927  2.66  0.00778   7.0  2.76   18.1
## 
## Type: response

Comparisons

The real game changer for marginaleffects is when people want to report experimental effects. There’s a huge problem with the way much experimental work gets done: people want to analyze experimental differences, but they end up reporting experimental means!
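
To see the difference on the toy mtcars model from above, compare what avg_predictions() and avg_comparisons() report: the former returns the two group means, the latter the difference itself, with its own standard error.

# the two group means, one per transmission type (what we reported above)
avg_predictions(lm1, variables = "am")

# the experimental quantity we usually care about: the average difference
# in expected mpg between the two transmission types
avg_comparisons(lm1, variables = "am")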

Applied Example

TESS grant recipients have their data posted on Roper within two years of collection. Here we’re going to test a set of survey experiments asking whether whites’ racial attitudes condition their Covid-19 public health attitudes. The codebook is here.

t2 <- "https://github.com/thomasjwood/ps7160/raw/refs/heads/master/roper_31120363_long.rds" %>% 
  url %>% 
  readRDS

It’s a two by one vignette experiment.


We’re going to check seven outcome variables: Q14 through Q20.

# nest the long data by DV, then cross with four model specifications
t3 <- t2 %>% 
  group_by(dv) %>% 
  nest %>% 
  cross_join(
    expand_grid(
      frm = c(
        "dv_val ~ P_EXP",
        "dv_val ~ P_EXP * P_PARTYI",
        "dv_val ~ P_EXP * EDUC4",
        "dv_val ~ P_EXP * EDUC4 * P_PARTYI"
      )
    )
  ) %>% 
  mutate(
    # extract the moderators (everything but the outcome and the treatment)
    # from each formula, to pass to avg_comparisons()'s `by` argument
    by_args = map(
      frm, 
      \(x) {
        by_vars <- all.vars(as.formula(x)) %>% 
          str_remove_all("dv_val|P_EXP") %>% 
          stringi::stri_omit_empty()
        if (length(by_vars) == 0) NULL else by_vars
      }
    ),
    # fit an lm() for every DV-formula pair
    mods = map2(
      data, 
      frm, 
      \(df, f_str) lm(as.formula(f_str), data = df)
    ),
    # average comparisons for the experimental contrast,
    # conditional on the moderators when the formula includes them
    comps = pmap(
      list(mods, data, by_args),
      \(m, df, b) {
        if (!is.null(b)) {
          avg_comparisons(m, variables = "P_EXP", newdata = df, by = b)
        } else {
          avg_comparisons(m, variables = "P_EXP", newdata = df)
        }
      }
    )
  )

With our nice long nested tibble, estimating the 28 models (seven DVs by four specifications) was a snap: the mods and comps columns above already hold a fitted model and its averaged comparisons for every DV-formula pair.

Did any of these DVs move with the vignettes?

library(modelsummary)

t3 %>% 
  filter(
    frm %>% equals("dv_val ~ P_EXP")
  ) %>% 
  use_series(mods) %>% 
  set_attr(
    "names", 
    t3 %>% 
      filter(
        frm %>% equals("dv_val ~ P_EXP")
      ) %>% 
      use_series(dv)
  ) %>% 
  modelsummary(
    gof_map = "nobs", 
    stars = TRUE
  )
                           Q14        Q15        Q16        Q17        Q18        Q19        Q20
(Intercept)                2.917***   3.199***   2.615***   2.948***   3.540***   2.920***   3.721***
                           (0.079)    (0.075)    (0.079)    (0.078)    (0.076)    (0.071)    (0.073)
P_EXPExperimental group B  -0.065     0.003      -0.057     -0.014     0.024      0.050      -0.038
                           (0.110)    (0.104)    (0.110)    (0.108)    (0.106)    (0.099)    (0.102)
Num.Obs.                   591        588        587        588        590        590        590
+ p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001

Hmmm. Well, gosh. So much for a racial Covid backlash. Maybe… party ID differences?

t3 %>%
  filter(
    frm %>% 
      equals("dv_val ~ P_EXP * P_PARTYI")
  ) %>%
  use_series(mods) %>%
  set_attr(
    "names",
    t3 %>%
      filter(
        frm %>% equals("dv_val ~ P_EXP * P_PARTYI")
      ) %>%
      use_series(dv)
  ) %>%
  modelsummary(
    gof_map = "nobs",
    stars = TRUE
  )
                                                 Q14        Q15        Q16        Q17        Q18        Q19        Q20
(Intercept)                                      2.038***   3.962***   1.792***   3.764***   4.349***   3.528***   4.425***
                                                 (0.108)    (0.105)    (0.109)    (0.112)    (0.108)    (0.102)    (0.108)
P_EXPExperimental group B                        0.032      -0.032     -0.021     -0.220     -0.182     0.171      -0.100
                                                 (0.150)    (0.146)    (0.152)    (0.156)    (0.150)    (0.142)    (0.150)
P_PARTYIIndependent                              0.637**    -0.450*    0.636**    -0.578**   -0.563**   -0.435*    -0.820***
                                                 (0.201)    (0.195)    (0.205)    (0.208)    (0.203)    (0.190)    (0.201)
P_PARTYIRepublican                               1.722***   -1.518***  1.643***   -1.579***  -1.597***  -1.168***  -1.304***
                                                 (0.146)    (0.143)    (0.149)    (0.152)    (0.147)    (0.139)    (0.147)
P_EXPExperimental group B × P_PARTYIIndependent  0.168      -0.280     0.242      0.333      0.322      -0.164     0.395
                                                 (0.286)    (0.278)    (0.291)    (0.297)    (0.288)    (0.272)    (0.286)
P_EXPExperimental group B × P_PARTYIRepublican   -0.260     0.162      -0.200     0.370+     0.387+     -0.188     0.081
                                                 (0.203)    (0.198)    (0.206)    (0.211)    (0.204)    (0.193)    (0.204)
Num.Obs.                                         565        562        561        562        564        564        564
+ p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001

Mmmm. I struggle to do the mental arithmetic of summing coefficients to three decimal places.
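
Happily, the comps column we built earlier already holds these conditional contrasts, and we can also ask avg_comparisons() for them directly. A minimal sketch, assuming the dv column holds the labels Q14 through Q20, as in the tables above:

q14 <- t3 %>% 
  filter(
    dv == "Q14",
    frm == "dv_val ~ P_EXP * P_PARTYI"
  )

# the experimental contrast within each party group: one estimate,
# standard error, and interval per level, no hand arithmetic required
avg_comparisons(
  q14$mods[[1]],
  variables = "P_EXP",
  newdata = q14$data[[1]],
  by = "P_PARTYI"
)

The same machinery handles even a dreaded three-way interaction, which would be hopeless by hand: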

t3 %>%
  filter(
    frm %>% 
      equals("dv_val ~ P_EXP * EDUC4 * P_PARTYI")
  ) %>%
  use_series(mods) %>%
  set_attr(
    "names",
    t3 %>%
      filter(
        frm %>% equals("dv_val ~ P_EXP * EDUC4 * P_PARTYI")
      ) %>%
      use_series(dv)
  ) %>%
  modelsummary(
    # drop every coefficient that doesn't involve the experimental contrast
    coef_omit = "^(?!.*P_EXPE).*$",
    gof_map = "nobs",
    stars = TRUE
  )
                                                                      Q14        Q15        Q16        Q17        Q18        Q19        Q20
P_EXPExperimental group B                                             0.092      0.275      -0.987*    0.118      0.065      0.810+     0.451
                                                                      (0.455)    (0.443)    (0.454)    (0.481)    (0.458)    (0.429)    (0.454)
P_EXPExperimental group B × EDUC4Some college                         -0.248     -0.112     0.969+     -0.353     -0.091     -0.405     -0.589
                                                                      (0.510)    (0.498)    (0.510)    (0.538)    (0.514)    (0.481)    (0.509)
P_EXPExperimental group B × EDUC4BA or above                          0.003      -0.526     1.056*     -0.387     -0.428     -0.937*    -0.576
                                                                      (0.504)    (0.491)    (0.504)    (0.532)    (0.508)    (0.476)    (0.503)
P_EXPExperimental group B × P_PARTYIIndependent                       0.565      -0.446     1.643*     0.458      -0.176     -1.063+    -0.310
                                                                      (0.673)    (0.655)    (0.672)    (0.707)    (0.678)    (0.635)    (0.672)
P_EXPExperimental group B × P_PARTYIRepublican                        -0.268     -0.404     1.071+     -0.182     0.230      -0.553     -0.366
                                                                      (0.549)    (0.539)    (0.549)    (0.581)    (0.554)    (0.518)    (0.548)
P_EXPExperimental group B × EDUC4Some college × P_PARTYIIndependent   -0.122     -0.027     -1.372+    -0.088     0.700      0.843      0.935
                                                                      (0.790)    (0.769)    (0.790)    (0.829)    (0.797)    (0.745)    (0.788)
P_EXPExperimental group B × EDUC4BA or above × P_PARTYIIndependent    -1.104     0.392      -1.990*    -0.300     0.428      1.273      0.518
                                                                      (0.859)    (0.836)    (0.857)    (0.901)    (0.865)    (0.810)    (0.857)
P_EXPExperimental group B × EDUC4Some college × P_PARTYIRepublican    0.213      0.320      -1.180+    0.714      0.010      0.058      0.404
                                                                      (0.629)    (0.617)    (0.629)    (0.664)    (0.634)    (0.594)    (0.628)
P_EXPExperimental group B × EDUC4BA or above × P_PARTYIRepublican     -0.029     0.906      -1.552*    0.502      0.197      0.507      0.382
                                                                      (0.636)    (0.623)    (0.635)    (0.671)    (0.641)    (0.600)    (0.635)
Num.Obs.                                                              565        562        561        562        564        564        564
+ p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001

So: we can do all of this much more simply, with one nice plot!

t3 %>% 
  filter(
    # drop the three-way interaction models
    frm %>% 
      str_count("\\*") %>% 
      equals(2) %>% 
      not
  ) %>% 
  select(
    dv, comps
  ) %>% 
  unnest(comps) %>% 
  mutate(
    # flag rows that came from the unconditioned model (no moderator columns)
    unconditioned = case_when(
      P_PARTYI %>% 
        is.na &
      EDUC4 %>% 
        is.na  ~ "Unconditioned", 
      .default = NA
    )
  ) %>% 
  pivot_longer(
    P_PARTYI:unconditioned, 
    values_drop_na = T
  ) %>% 
  mutate(
    across(
      name:value,
      fct_inorder
    )
  ) %>% 
  ggplot(
    aes(
      xmin = conf.low,
      xmax = conf.high,
      x = estimate,
      y = value
    )
  ) +
  geom_vline(
    xintercept = 0,
    linetype = "dashed",
    linewidth = .3
  ) +
  geom_pointrange(
    position = position_dodge(orientation = "y")
  ) +
  facet_grid(
    name ~ dv, 
    scales = "free_y",
    space = "free_y"
  ) +
  theme(
    legend.position = "none"
  )


  1. In an age of historically cheap memory, almost all social science data analytics tasks can fit in a computer’s memory.↩︎

  2. As of this morning, he’s made 3 commits in the last 24 hours.↩︎