library(metafor)
library(rotl)
library(ape)
library(orchaRd)
One way to assess the robustness of a meta-analytic model is by re-running the model without one or more effect sizes and then comparing the results. A common approach is to re-run the model multiple times, each time removing the effect sizes from a different study. This allows you to see how the overall effect estimate changes when each study is excluded.
This is a leave-one-out analysis (Nakagawa et al. 2023). In that example it leaves out one study at a time, but you can leave out any group you think can have a large influence on the overall effect estimate. In ecology and evolution meta-analyses, it can be useful to check how the model changes when certain species are left out.
There are two functions for this: leave_one_out() and
orchard_leave1out(). The first one is to run the analysis,
and the second one is to plot the results.
Using the data from English and Uller (2016), here is how to run a leave-one-out and plot the results.
First, calculate the effect sizes and fit the model.
data(english)
# Create a new column with 'Author, year' format
english$author_year <- paste(english$Author, english$Year)
# We need to calculate the effect sizes, in this case d
english <- escalc(measure = "SMD",
n1i = NStartControl,
sd1i = SD_C,
m1i = MeanC,
n2i = NStartExpt,
sd2i = SD_E,
m2i = MeanE,
var.names = c("SMD", "vSMD"),
data = english)
english_MA <- rma.mv(yi = SMD,
V = vSMD,
random = list(~1 | StudyNo,
~1 | EffectID),
data = english)
leave_one_out() will re-run the model without one study
at a time. The group argument is used to specify the
grouping variable. In this case, we use the author_year
variable:
loo_test <- leave_one_out(english_MA, group = "author_year")
loo_test
## name estimate lowerCL upperCL lowerPR upperPR
## 1 Barrett 2009 0.036840076 -0.10843060 0.1821108 -0.6473088 0.7209890
## 2 Cheney 1983 0.073052597 -0.07361977 0.2197250 -0.6701897 0.8162949
## 3 Langley-Evans 2006 0.065451475 -0.08499050 0.2158934 -0.6997229 0.8306259
## 4 Lynch 1983 0.050987848 -0.09708246 0.1990582 -0.7041133 0.8060890
## 5 Partadiredja 2010 0.056172007 -0.09075380 0.2030978 -0.6947198 0.8070639
## 6 Sayer 2001 0.045146034 -0.09978415 0.1900762 -0.6985299 0.7888220
## 7 Sun 2009 0.073594316 -0.07177262 0.2189612 -0.6670227 0.8142114
## 8 Tu 2003 0.055826503 -0.09278047 0.2044335 -0.6991665 0.8108195
## 9 Zajistchek 2009 0.067691642 -0.08301816 0.2184014 -0.6945776 0.8299609
## 10 Mestre 2012 0.005780273 -0.10955147 0.1211120 -0.6096331 0.6211937
## 11 Saasatamoinen 2013 0.044986574 -0.11003772 0.2000109 -0.7483799 0.8383530
## 12 Saasatamoinen 2010 0.066376765 -0.08545305 0.2182066 -0.6957402 0.8284937
## 13 Ozanne 2004 0.055386829 -0.09963743 0.2104111 -0.6577480 0.7685216
## 14 Zwaan 1991 0.060321715 -0.09092165 0.2115651 -0.7197851 0.8404285
## 15 Boggs 2005 0.040534796 -0.10134201 0.1824116 -0.6897129 0.7707825
## 16 May 2015 0.079633784 -0.06038675 0.2196543 -0.6481933 0.8074608
## 17 Dmitriew 2009 0.066844417 -0.08318506 0.2168739 -0.6970592 0.8307480
## 18 Heidinger 2012 0.067753930 -0.08004524 0.2155531 -0.6831354 0.8186432
## 19 Houslay 2015 0.063864061 -0.08664902 0.2143771 -0.6997483 0.8274764
## 20 Kelly 2014 0.073890324 -0.07409090 0.2218715 -0.6724212 0.8202019
We can plot the results with orchard_leave1out(). This
function takes the output of leave_one_out() and creates an
orchard plot. The y-axis shows which study was left out.
orchard_leave1out(leave1out = loo_test,
xlab = "SMD",
ylab = "Study left out",
trunk.size = 1.2,
branch.size = 1.5,
alpha = 0.08,
legend.pos = "top.out")
The dashed red lines show the 95% confidence interval for the
overall effect size in the original model. You can removed them by
setting ci_lines = FALSE. To change the color, use
ci_lines_color = "blue", or whatever color you
want.
The empty points, called ‘ghost points’, show the effect sizes
left-out of that model. You can removed them by setting
ghost_points = FALSE.
The rest of the arguments are the same as in
orchard_plot().
There are cases in which the meta-analytic model is more complicated.
leave_one_out() can handle some common cases that try to
account for non-independence:
Variance-covariance matrix for sampling variances, implemented
with metafor::vcalc()
Phylogenetic matrix, calculated with
ape::vcv()
Robust variance estimation, implemented with
metafor::robust()
To demonstrate how this works, we will use a subset of the Pottier data and fit a fictitious model.
data("pottier")
pottier_subset <- pottier[1:20, ]
# ----------------------------------------------------
# Get the phylo tree using `rotl`
unique_names <- unique(pottier_subset$search_string)
taxa <- rotl::tnrs_match_names(unique_names)
tree <- rotl::tol_induced_subtree(taxa$ott_id)
tree$tip.label <- rotl::strip_ott_ids(tree$tip.label)
# Compute the phylogenetic matrix using `ape` package
tree <- ape::compute.brlen(tree)
phylo_matrix <- ape::vcv(tree, corr = TRUE)
# ----------------------------------------------------
# Get the variance-covariance matrix
VCV <- metafor::vcalc(vi = Var_dARR,
cluster = study_ID,
obs = es_ID,
rho = 0.5,
data = pottier_subset)
# Fictitious model just for the example
res <- rma.mv(yi = dARR,
V = VCV,
test = "t",
random = list(~1 | study_ID,
~1 | es_ID,
~1 | species_ID,
~1 | phylogeny),
R = list(phylogeny = phylo_matrix),
data = pottier_subset)
robust_res <- robust(res, cluster = study_ID)
mod_results(robust_res, group = "study_ID")
## name estimate lowerCL upperCL lowerPR upperPR
## 1 Intrcpt 0.08411874 -0.2896782 0.4579157 -0.8585258 1.026763
To include those correlations in the leave-one-out analysis, we need
to pass some extra arguments to the leave_one_out()
function, so they can be re-calculated in each iteration. To see the
details of the arguments, run ?leave_one_out().
# vcalc_args must be a list with the arguments for vcalc. They must be
# passed as strings, except for rho, which is a number.
vcalc_args <- list(vi = "Var_dARR",
cluster = "study_ID",
obs = "es_ID",
rho = 0.5)
# phylo_args must be a list with the arguments for vcv. They must be
# the phylogenetic tree object, and a string with the name of the
# species names column that is used as random effect.
phylo_args <- list(tree = tree,
species_colname = "phylogeny")
# robust_args must be a list with the arguments for robust. For the moment,
# it only accepts cluster as a string, and clubSandwich as TRUE or FALSE.
robust_args <- list(cluster = "study_ID")
loo_test_phylo <- leave_one_out(res,
group = "phylogeny",
vcalc_args = vcalc_args,
phylo_args = phylo_args,
robust_args = robust_args)
loo_test_phylo
## name estimate lowerCL upperCL lowerPR upperPR
## 1 Amalosia_lesueurii 0.09479403 -0.4587654 0.6483534 -1.1824405 1.3720285
## 2 Tor_putitora -0.01347685 -0.3880420 0.3610883 -0.8813393 0.8543856
## 3 Labeo_rohita 0.13494970 -0.2041980 0.4740974 -0.7711812 1.0410806
## 4 Danio_rerio 0.04235149 -0.4895781 0.5742810 -1.1710464 1.2557494
## 5 Propylea_japonica 0.11006970 -0.4321737 0.6523131 -1.1294438 1.3495833
orchard_leave1out(leave1out = loo_test_phylo,
xlab = "dARR",
ylab = "Species left out",
trunk.size = 1.2,
branch.size = 1.5,
alpha = 0.2,
legend.pos = "top.out")
Note that this time the leave-one-out was done by species, not by study. You can leave out one study at a time and still use phylogeny:
loo_test_stdy <- leave_one_out(res,
group = "study_ID",
vcalc_args = vcalc_args,
phylo_args = phylo_args,
robust_args = robust_args)
loo_test_stdy
## name estimate lowerCL upperCL lowerPR upperPR
## 1 1 0.09479403 -0.4587654 0.6483534 -1.1824405 1.3720285
## 2 2 -0.01347685 -0.3880420 0.3610883 -0.8813393 0.8543856
## 3 4 0.13494970 -0.2041980 0.4740974 -0.7711812 1.0410806
## 4 5 0.04235149 -0.4895781 0.5742810 -1.1710464 1.2557494
## 5 6 0.11006970 -0.4321737 0.6523131 -1.1294438 1.3495833
Finally, you can use the output from leave_one_out() to plot the results using ggplot2 if you like:
library(ggplot2)
loo_test_phylo$mod_table |>
ggplot(aes(x = name, y = estimate, ymin = lowerCL, ymax = upperCL)) +
geom_pointrange() +
coord_flip()