Sat Nov 15 20:52:22 2025
Background
In the Meeting of Litters project, we collected leaf litter from 33 global change experiments in arctic, subarctic and alpine ecosystems throughout the globe. We had the following global change experiments in our project:
| Experiment | Amount |
|---|---|
| Warming | 24 experiments |
| Nutrient enrichment | 16 experiments |
| CO2 enrichment | 2 experiments |
| UV-B irradiation | 4 experiments |
| Watering | 4 experiments |
| Shading | 3 experiments |
| Ambient / Control | 33 experiments |
All litter was processed in our lab in Amsterdam and samples were analysed for chemical compounds (litter carbon, litter nitrogen, litter phenols) and litter bags were prepared. All litter bags were incubated in a common environment: a litter bed in Northern Sweden.
Aims
In this assignment, we will:
- Evaluate the assumptions of the meta-analysis and;
- Run a meta-analysis on parts of the dataset.
While the current set-up is not a classic set-up for a meta-analysis, because it is based on materials collected from experiments described in various publications instead of re-evaluating data from publications, we may still evaluate whether it complies to the assumptions and principles of a meta-analysis.
1. Does this set-up accurately deal with the criticisms that might be raised for running a meta-analysis?
In order words: is the set-up appropriate for a meta-analysis based on the three criteria outlined in the previous assignments? Please provide arguments.
The three criteria referred to are 1) publication bias, 2) selection bias, and 3) heterogeneity from combining incomparable studies.
Publication bias
Instead of relying on published literature where significant or positive
results are more likely to appear in journals, the researchers directly
collected leaf litter samples from all participating experiments
regardless of their original findings. This means that experiments
showing null results or negative effects are equally represented in the
dataset, avoiding the “file drawer problem” - where unpublished negative
results remain hidden.
Selection bias
The researchers made efforts to include all global change experiments in
arctic, subarctic and alpine ecosystems by approaching all such
experiments to participate. This systematic inclusion criterion reduces
arbitrary selection. However, there is still a potential for a selection
bias, given geographic gaps that could introduce bias: there hardly
information from Siberia and no data from the Southern Hemisphere. These
geographic limitations mean that the results may not fully capture the
global diversity of alpine and subarctic ecosystem responses. These
potential biases must be explicitly accounted for when interpreting and
generalizing the findings.
Running the meta-analysis
Traditional meta-analyses face the “apples to oranges” problem of
different protocols, measurement techniques and environmental conditions
making direct comparisons problematic. The Meeting of Litters project
tackles this through a common garden approach: all litter samples were
analyzed using standardized protocols in the same Amsterdam laboratory,
and all decomposition measurements occurred in a single common
environment in Northern Sweden. This is a major advantage that may
remove an important source of between-study heterogeneity. Like other
meta-analyses, the project still averages many responses across
different species, functional types and original experimental
conditions, which must be done carefully by directly linking treatment
to ambient conditions for each experiment. This will be achieved by
choosing an appropriate effect size and running the model properly. The
common protocol substantially reduces methodological variation, so if no
consistent pattern emerges across global change treatments, we can more
confidently conclude there is genuinely no generic response—important
information for climate models—rather than suspecting methodological
differences obscured real patterns. However, assessing heterogeneity
remains crucial before expressing such conclusions, as remaining
variability must still be evaluated even with standardized
protocols.
There is no time to fully explore the complete dataset, therefore we selected a subset of the data available for the purpose of this assignment; see the data file. Indicated are:
a) Site at which the litter was collected;
b) Species;
c) Plant functional Type (PFT) or growth form to which a species belongs (e.g. tree, shrub, herb);
d) The means, standard deviation and number of observations for a number of trait-treatment combinations.
Selected traits are %C of the litter, %N of the litter, C/N ratio of the litter and the measured decomposition constant k at the common litter bed (i.e. a measure of how quickly did the litter decompose). These traits may be considered potential response variables. As treatments, i.e. explanatory variables, we selected only the ambient, warming and fertilization treatments to allow interpretation within the context of this assignment. The headers of the data file indicate the combination of trait, measure (i.e. mean value, standard deviation or the number of observations available) and treatment (i.e. ambient, fertilization treatment or warming treatment) and is assumed to be self-explanatory.
2. Phrase the research question and hypothesis that you will try to answer with this particular dataset.
There are various research questions possible, either focusing on C, N, C/N or on the decomposition constant k, on the effects of warming vs fertilization or on the modification of responses due to different Plant functional Types or combinations of those. I chose: “Does fertilization consistently affect %N in the litter?”, which is one of the simpler questions. The hypothesis is that fertilization should increase %N, assuming that N is the limiting factor for growth.
Research Question: Does warming consistently affect the decomposition constant k across different plant functional types in arctic, subarctic and alpine ecosystems?
Hypothesis: Warming should increase decomposition rates (higher k values) because elevated temperatures accelerate microbial activity and enzymatic processes that break down organic matter. However, the magnitude of this effect may vary across plant functional types, with recalcitrant tissues from woody species (shrubs, trees) showing weaker responses compared to more labile herbaceous litter, due to differences in chemical composition and structural complexity that create varying degrees of temperature sensitivity in decomposition processes.
3. Read the meta-analysis text file into R and explore your data
In addition to the various examples of explorations from previous assumptions, you may also consider the following as a way to summarize and visualise patterns quickly:
# loading data
litter <- read.table("datafile-meta-analysis.txt", sep="\t", header=TRUE)
# overview of data
dim(litter)## [1] 84 39
## Sitecode Speciescode PFT Nmean_amb
## Length:84 Length:84 Length:84 Min. :0.3875
## Class :character Class :character Class :character 1st Qu.:0.7655
## Mode :character Mode :character Mode :character Median :0.9348
## Mean :0.9809
## 3rd Qu.:1.1187
## Max. :1.9039
##
## Nstdev_amb Nobs_amb Cmean_amb Cstdev_amb
## Min. :0.00000 Min. : 1.000 Min. :37.81 Min. :0.0000
## 1st Qu.:0.00000 1st Qu.: 1.000 1st Qu.:45.17 1st Qu.:0.0000
## Median :0.05410 Median : 3.500 Median :46.90 Median :0.2916
## Mean :0.07636 Mean : 2.988 Mean :47.44 Mean :0.4664
## 3rd Qu.:0.13261 3rd Qu.: 4.000 3rd Qu.:49.99 3rd Qu.:0.7627
## Max. :0.40812 Max. :11.000 Max. :56.06 Max. :3.0003
##
## Cobs_amb CNmean_amb CNstdev_amb CNobs_amb
## Min. : 1.000 Min. : 23.38 Min. : 0.000 Min. : 1.000
## 1st Qu.: 1.000 1st Qu.: 42.84 1st Qu.: 0.000 1st Qu.: 1.000
## Median : 3.500 Median : 51.60 Median : 2.603 Median : 3.500
## Mean : 2.988 Mean : 54.11 Mean : 4.216 Mean : 2.988
## 3rd Qu.: 4.000 3rd Qu.: 62.92 3rd Qu.: 7.066 3rd Qu.: 4.000
## Max. :11.000 Max. :113.94 Max. :18.257 Max. :11.000
##
## kmean_amb kstdev_amb kobs_amb Nmean_fert
## Min. :29.09 Min. : 0.000 Min. :1.000 Min. :0.6000
## 1st Qu.:46.54 1st Qu.: 1.041 1st Qu.:2.000 1st Qu.:0.9959
## Median :56.01 Median : 4.623 Median :4.000 Median :1.2970
## Mean :60.31 Mean : 4.549 Mean :3.548 Mean :1.3239
## 3rd Qu.:68.34 3rd Qu.: 6.593 3rd Qu.:4.250 3rd Qu.:1.6050
## Max. :99.58 Max. :17.075 Max. :7.000 Max. :2.2967
## NA's :41
## Nstdev_fert Nobs_fert Cmean_fert Cstdev_fert
## Min. :0.0000 Min. : 1.000 Min. :40.51 Min. :0.0000
## 1st Qu.:0.0000 1st Qu.: 1.500 1st Qu.:45.73 1st Qu.:0.0000
## Median :0.1371 Median : 4.000 Median :48.94 Median :0.4363
## Mean :0.1656 Mean : 3.442 Mean :48.45 Mean :0.4318
## 3rd Qu.:0.2181 3rd Qu.: 4.000 3rd Qu.:50.39 3rd Qu.:0.6071
## Max. :0.9703 Max. :11.000 Max. :56.71 Max. :1.9815
## NA's :41 NA's :41 NA's :41 NA's :41
## Cobs_fert CNmean_fert CNstdev_fert CNobs_fert
## Min. : 1.000 Min. :21.45 Min. : 0.000 Min. : 1.000
## 1st Qu.: 1.000 1st Qu.:32.46 1st Qu.: 0.000 1st Qu.: 1.000
## Median : 4.000 Median :37.21 Median : 3.756 Median : 4.000
## Mean : 3.419 Mean :41.16 Mean : 4.984 Mean : 3.419
## 3rd Qu.: 4.000 3rd Qu.:48.59 3rd Qu.: 8.089 3rd Qu.: 4.000
## Max. :11.000 Max. :76.58 Max. :15.685 Max. :11.000
## NA's :41 NA's :41 NA's :41 NA's :41
## kmean_fert kstdev_fert kobs_fert Nmean_warm
## Min. :30.18 Min. : 0.000 Min. :1.000 Min. :0.3775
## 1st Qu.:47.80 1st Qu.: 1.060 1st Qu.:2.000 1st Qu.:0.6278
## Median :57.84 Median : 5.040 Median :4.000 Median :0.7514
## Mean :57.12 Mean : 4.868 Mean :3.395 Mean :0.8345
## 3rd Qu.:64.25 3rd Qu.: 7.093 3rd Qu.:4.000 3rd Qu.:0.9593
## Max. :93.40 Max. :18.053 Max. :6.000 Max. :1.9127
## NA's :41 NA's :41 NA's :41 NA's :21
## Nstdev_warm Nobs_warm Cmean_warm Cstdev_warm
## Min. :0.00000 Min. : 1.000 Min. :39.82 Min. :0.0000
## 1st Qu.:0.00000 1st Qu.: 1.000 1st Qu.:45.05 1st Qu.:0.0000
## Median :0.05475 Median : 3.000 Median :47.13 Median :0.2941
## Mean :0.08799 Mean : 2.794 Mean :47.79 Mean :0.4257
## 3rd Qu.:0.12534 3rd Qu.: 4.000 3rd Qu.:50.31 3rd Qu.:0.7323
## Max. :0.44695 Max. :11.000 Max. :55.47 Max. :1.8114
## NA's :21 NA's :21 NA's :21 NA's :21
## Cobs_warm CNmean_warm CNstdev_warm CNobs_warm
## Min. : 1.000 Min. : 25.32 Min. : 0.000 Min. : 1.000
## 1st Qu.: 1.000 1st Qu.: 52.05 1st Qu.: 0.000 1st Qu.: 1.000
## Median : 3.000 Median : 64.42 Median : 4.005 Median : 3.000
## Mean : 2.778 Mean : 64.74 Mean : 5.919 Mean : 2.778
## 3rd Qu.: 4.000 3rd Qu.: 76.73 3rd Qu.: 9.523 3rd Qu.: 4.000
## Max. :11.000 Max. :114.91 Max. :26.622 Max. :11.000
## NA's :21 NA's :21 NA's :21 NA's :21
## kmean_warm kstdev_warm kobs_warm
## Min. :25.79 Min. : 0.000 Min. :1.000
## 1st Qu.:44.15 1st Qu.: 0.000 1st Qu.:1.000
## Median :56.62 Median : 2.864 Median :2.000
## Mean :60.51 Mean : 3.535 Mean :2.829
## 3rd Qu.:71.42 3rd Qu.: 4.862 3rd Qu.:4.000
## Max. :99.85 Max. :17.095 Max. :7.000
## NA's :14 NA's :14 NA's :14
## [1] 723
# to be able run the scatterplot matrix as designed above
library(car)
scatterplotMatrix(~kmean_amb + kmean_warm + kmean_fert + CNmean_amb + CNmean_warm + CNmean_fert,
diagonal="histogram", data=litter) # this scatter plot analyses only two responses! decomposition constant k and %N in the litter
# and only two treatments! warming and fertilization; for other combinations, statement must be adaptedFormal meta-analysis can be run with the package ‘metafor’ of R (basically, the only other software that allows this is called ‘metawin’, a specialised software tool). Install the ‘metafor’ package and activate the package.
INITIAL DATA EXPLORATION:
The dataset contains 84 observations across 39 variables representing
litter traits under ambient, warming, and fertilization treatments. A
total of 723 missing values are present,
disproportionately concentrated in fertilization treatment
variables (41 NAs for most fertilization measures),
indicating that fertilization treatments were not applied at all
experimental sites. Warming treatment data are more
complete with only 14-21 NAs, providing robust coverage
for the primary research question. Ambient treatment data
appear nearly complete, establishing reliable baseline
comparisons.
Summary statistics reveal that decomposition rates under
ambient conditions (kmean_amb: median 56.01, mean 60.31)
and warming (kmean_warm: median 56.62, mean
60.51) are remarkably similar, suggesting
minimal overall warming effect. However, substantial variation exists
within both treatments (kmean_amb range: 29.09-99.58; kmean_warm range:
25.79-99.85), indicating considerable heterogeneity in
decomposition responses across species and sites.
Substrate quality metrics show that C/N ratios increase
under warming (CNmean_amb median 51.60 vs. CNmean_warm
median 64.42), suggesting warming may alter litter chemistry in ways
that could offset direct temperature effects on decomposition.
Nitrogen content shows slight decline under
warming (Nmean_amb median 0.9348 vs. Nmean_warm median
0.7514), consistent with the C/N ratio increase and potentially
reflecting nutrient resorption or altered allocation patterns under
warming conditions.
The scatterplot matrix reveals several critical patterns for evaluating
the hypothesis. The kmean_amb versus kmean_warm relationship
shows points clustering near the diagonal line with moderate
scatter, indicating that warming neither
consistently increases nor decreases decomposition rates*
across the dataset. Some observations fall above the diagonal (warming
increases k) while others fall below (warming decreases k),
demonstrating response heterogeneity that likely reflects variation
across plant functional types or site conditions. The
relationship between C/N ratios and decomposition rates is
clearly negative across all treatments—litters with higher
C/N ratios (more recalcitrant) decompose more slowly, consistent with
expectations for woody versus herbaceous tissue differences.
Notably, the warming treatment shifts many observations
toward higher C/N values, potentially counteracting direct
temperature-stimulation of decomposition through substrate quality
degradation. The kmean_fert versus kmean_amb comparison shows similar
scatter patterns to the warming comparison, suggesting that both global
change treatments produce variable responses rather than consistent
directional shifts. Visual inspection suggests no strong
consistent warming effect across the entire dataset,
though substantial scatter indicates that
responses may differ systematically by functional type—directly
addressing the hypothesis that recalcitrant woody tissues and labile
herbaceous tissues show differential temperature sensitivity.
4. Which effect size metric do you consider most appropriate or feasible, given the number of replicates, to evaluate differences in means?
The first step of any meta-analysis is to calculate the appropriate effect size. Replicates are the number of independent measurements or observations taken for each experimental unit under the same conditions. Think of lnRR, hedges’d or other measures that are used a metric for meta-analyses. If in doubt, you may check: https://bookdown.org/robcrystalornelas/meta-analysis_of_ecological_data/choosing-an-effect-size.html
Higher replication gives more reliable estimates of means and standard deviations, which directly affects:
- Hedge’s g calculation - uses sample sizes to
correct for small-sample bias
- Variance of the effect size - more replicates =
lower uncertainty
- Statistical power - more replicates = better ability to detect true effects
# checking how many complete cases there are for warming vs ambient comparison
complete <- sum(complete.cases(litter[, c("kmean_amb", "kstdev_amb", "kobs_amb",
"kmean_warm", "kstdev_warm", "kobs_warm")]))
complete## [1] 70
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1.000 2.000 4.000 3.548 4.250 7.000
hist(litter$kobs_amb, main="Distribution of ambient replicates", xlab="n")
summary(litter$kobs_warm)## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## 1.000 1.000 2.000 2.829 4.000 7.000 14
CHOICE OF EFFECT SIZE METRIC:
Given the replication structure of the dataset, the log
response ratio (lnRR) is the clearly appropriate effect size
metric for this analysis. Of the 84 total observations,
only 70 complete cases are available with both warming
and ambient decomposition measurements, and the replication distribution
is heavily skewed toward single observations, with approximately
36% of ambient measurements and 46% of warming
measurements having only n=1. This asymmetry
in data quality is further evident in the median replication values,
where warming treatments have a median of only 2
replicates compared to 4 for ambient
conditions.
Using Hedge’s g would require excluding all observations
where either ambient or warming treatment has n=1 (since
standard deviation cannot be calculated), resulting in the loss
of approximately 40-50% of the dataset and leaving only 35-40
observations with calculable effect sizes — a severe reduction in
statistical power that would compromise the ability to detect patterns
across plant functional types.
In contrast, lnRR requires only mean values for
calculation and can therefore retain all 70
complete cases, maximizing sample size for evaluating whether
warming effects on decomposition are consistent or vary systematically
across functional types as hypothesized. The cost of losing 30-35
observations far outweighs any benefit from the precision
weighting that Hedge’s g would provide, particularly when the
research question focuses on detecting directional consistency rather
than precise effect magnitude estimation.
5. Calculate the effect size for each site-species combination indicating the difference in your response variable for a selected treatment compared to ambient conditions.
So, calculate your effect size for either C, N, C/N or decomposition
constant k identifying the effect of (one of) your selected treatment(s)
compared to ambient conditions. Do these calculations only for those
treatments and traits relevant to your research question.
Perform the calculations with “escalc” (= effect size calculator) in
metafor, in which YOUR_METHOD most likely equals “ROM” or “SMD”. In this
case, Response ratios refers to “ROM” and Hedge’s g to “SMD”; see also
“metaforRdocumentation.pdf” for a more extensive discussion on possible
metrics.
# subsetting dataset to columns of interest for better overview
effect <- subset(litter, select=c("Sitecode", "Speciescode", "PFT", "kmean_amb", "kstdev_amb", "kobs_amb",
"kmean_warm", "kstdev_warm", "kobs_warm"))
# or: effect <- litter[, c("Sitecode", "Speciescode", "PFT", "kmean_amb", "kstdev_amb", "kobs_amb", "kmean_warm", "kstdev_warm", "kobs_warm")]
# calculating effect size (overwrites sub-dataset with two additional columns yi and vi)
effect <- escalc(measure="ROM", m1i=kmean_warm, m2i= kmean_amb, sd1i=kstdev_warm, sd2i=kstdev_amb,
n1i=kobs_warm, n2i=kobs_amb, data=effect)
# first calculates the effect size estimate itself, which gets stored in a column called yi (essentially the natural logarithm of the ratio of the two means for each study)
# then calculates the variance of your effect size estimate, which gets stored in a column called vi (this is why we need the standard deviations) The output of this analysis is a data frame to which the following elements have been added to your original data file:
- yi value of the effect size or outcome measure
- vi corresponding (estimated) sampling variance
which can be seen/analysed with e.g.:
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## -0.580087 -0.051294 0.002413 0.007411 0.074368 0.370870 14
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## 0.0000000 0.0004258 0.0034663 0.0074997 0.0067957 0.0786346 14
# visualizing distribution of effect sizes and their variances across all observations
par(mfrow=c(1,2))
hist(effect$yi, main="Distribution of effect sizes", xlab="n") # distribution of warming effects
hist(effect$vi, main="Distribution of effect size variances", xlab="n") # more of a diagnostic check# redirect user to data table in R: View(effect)
# saving effect object (with effect sizes, variance and raw variables) to "output.txt" in working directory
write.table(effect, "METAoutput.txt", sep="\t", row.names=FALSE) # row.names prevents R from adding extra column with row numbersPresumably, depending your question, you might have to repeat this for several trait-treatment combinations. If you do so, you will have to rename vi and yi to avoid they get overwritten. When would you actually do this?
1. If you want to compare multiple effects (e.g., does warming affect chemistry AND decomposition?)
2. If your research question is broader than just one trait-treatment combo
3. If you want to build a comprehensive picture of global change effects
CALCULATION OF EFFECT SIZE:
Independent of the effect size metric chosen, the calculations produce
yi (effect size) and vi (variance) values for each site-species-PFT
combination, with actual values varying based on the metric and
treatment selected. Using natural log of response ratio (lnRR) for
warming effects on decomposition constant k, the yi values
show highly variable responses ranging from -0.580 to
0.371, with the distribution centered
approximately around zero.
This indicates that warming does not consistently increase or
decrease decomposition rates across the dataset — some
species-site combinations show accelerated decomposition under warming
(positive yi) while others show reduced rates (negative yi), suggesting
substantial heterogeneity in warming responses that may relate to
differences in plant functional types, substrate quality, or site
conditions.
The histogram reveals an approximately symmetric
distribution of effect sizes, though not perfectly normal,
with most observations clustered near zero and a
slight positive skew. The variance
distribution is heavily right-skewed with most vi
values concentrated near zero, indicating that the majority of
effect size estimates have relatively high precision, though a few
observations show considerably higher uncertainty due to small
sample sizes or high variability in decomposition
measurements.
The “effect” data frame contains various rows in which the variance equals zero…
6. Why are those values zero and what do you decide to do with those species-site combinations for which this happens? Provide arguments.
There are multiple solutions to dealing with this! The simplest would be excluding studies with only one observation, thereby removing zero variances altogether, though this discards nearly half the dataset in this case. More appropriate alternatives in practice include attributing the maximum observed variance to single-observation studies to downweight their influence on the overall effect estimate, or using imputation methods to estimate missing variances based on the dataset’s variance structure across similar observations.
You can eliminate effect sizes with zero variance by making another subset:
# subsetting to remove studies with only one observation
subeffect <- subset(effect, vi>0)
# getting new summary of effect sizes and their variances
summary(subeffect$yi)## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -0.265988 -0.034843 0.009227 0.021822 0.074517 0.370870
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 2.134e-06 1.608e-03 4.658e-03 8.898e-03 8.718e-03 7.863e-02
# visualizing new distribution of effect sizes and their variances
par(mfrow=c(1,2))
hist(subeffect$yi, main="Distribution of effect sizes (adjusted)", xlab="n") # distribution of warming effects
hist(subeffect$vi, main="Distribution of effect size variances (adjusted)", xlab="n")Answer: When only one
replicate exists for a treatment, standard
deviation cannot be calculated, resulting in
zero variance (vi=0) for the effect size
estimate. This creates a critical methodological problem
because observations with zero variance are treated as
completely certain in meta-analysis, receiving infinite
weight in subsequent models despite actually representing
highly uncertain measurements based on single data
points. This artificial certainty introduces substantial bias
into effect size estimation and heterogeneity assessment.
Two solutions exist: omitting
observations with zero variance, or imputing missing
variances using relationships between variance and other
dataset characteristics (for instance, variance often correlates
strongly with means in ecological data).
For this analysis, the more straightforward approach was chosen
by subsetting to retain only observations where vi>0, which
automatically excluded all single-replicate observations. This reduced
the dataset from 70 to 59 observations with
calculable variances. The adjusted effect size distribution (yi) remains
centered near zero with moderate spread (range: -0.265 to 0.371),
continuing to show no consistent directional warming effect across
observations, while the adjusted variance distribution remains
right-skewed but with improved reliability since all retained
observations now have properly estimated uncertainty. Retaining
zero-variance observations would have artificially inflated the
precision of the overall effect estimate and created
strong bias in diagnostic plots like funnel plots,
where zero-variance observations would cluster unrealistically at the
apex, masking true patterns of heterogeneity and potential publication
bias.
7. Define and run the appropriate rma.uni model for your research question.
To analyse patterns in the effect size measure, we apply “rma.uni”. This function allows to internally call “escalc”. In this case, however, we will use the outputs of the effect size calculations directly, i.e. the vi and yi variables (or a subset of the vi and yi data, if needed).
There are various methods to analyze patterns in effect size:
- rma.uni (univariate random/mixed-effects model): Use when each observation has one independent effect size with its variance; standard approach for most meta-analyses where observations are independent (i.e. most common)
- rma.mv (multivariate/multilevel random-effects model): Use when observations are non-independent due to multiple effect sizes per study, hierarchical nesting structures (e.g. species within sites, sites within regions), or multiple correlated outcomes measured on the same experimental units
- rma.glmm (generalized linear mixed-effects model): Use for non-normal data such as binary outcomes or count data, particularly when calculating effect sizes like odds ratios directly from raw contingency tables
- rma.peto (Peto’s method): Specialized method for odds ratios in medical meta-analyses with rare events; less commonly used in ecological research
See the documentation “metafor Rdocumentation.pdf” for rma.uni to select and formulate the most appropriate model given your question and the current experimental design. Consider whether and which variables to include and how!
Available methods in metafor:
- REML (default): Restricted Maximum Likelihood - standard for random-effects models
- ML: Maximum Likelihood
- DL: DerSimonian-Laird (older, simpler method)
- FE: Fixed-effects model
- Several others (EB, HS, SJ, etc.)
Note: If there is a relationship between the observed outcomes and the chosen predictor, then this usually implies asymmetry in the funnel plot and may be an indication of publication bias.
# overall random-effects model (no moderators), tests if there's a consistent overall warming effect
output <- rma.uni(yi, vi, data=subeffect, method="REML") # standard for random-effects models
# R automatically uses method="REML" behind the scenes --> made explicit for better understanding
# plotting meta-analysis outcomes (including forest and funnel plots)
plot.rma.uni(output)# evaluating funnel plot asymmetry with Egger's regression test (or similar), which mathematically tests whether there's a relationship between effect size and study precision
regtest(output)##
## Regression Test for Funnel Plot Asymmetry
##
## Model: mixed-effects meta-regression model
## Predictor: standard error
##
## Test for Funnel Plot Asymmetry: z = -0.3033, p = 0.7617
## Limit Estimate (as sei -> 0): b = 0.0294 (CI: -0.0122, 0.0711)
The plot.rma.uni function produces four diagnostic plots that serve distinct analytical purposes. The forest plot displays individual effect sizes for each observation with confidence intervals alongside the overall pooled effect estimate, revealing the distribution and consistency of warming responses across species-site combinations. The funnel plot tests for publication bias by plotting effect sizes against their standard errors, where a symmetric inverted funnel shape indicates absence of bias.
The radial (Galbraith) plot provides an alternative visualization of the relationship between effect sizes and precision, facilitating identification of outlier observations that deviate substantially from the overall pattern. The standardized residuals plot evaluates model fit by showing which observations have residuals exceeding ±2 standard deviations, indicating potential influential or problematic data points warranting closer examination.
CONCLUSION ON PUBLICATION BIAS:
The funnel plot appears reasonably
symmetric with effect sizes distributed evenly on both
sides of the pooled estimate across varying levels of precision. The
formal Egger’s regression test confirms this visual
assessment, showing no significant funnel plot asymmetry
(z=-0.3033, P=0.7617). This indicates no
evidence of publication bias or small-study effects
in the dataset.
However, this interpretation requires important
contextualization: the Meeting of Litters project
fundamentally differs from traditional meta-analyses by
collecting actual samples from all participating
experiments rather than relying on published literature,
inherently eliminating conventional publication bias.
Any observed asymmetry would therefore more likely
reflect true heterogeneity in warming responses across functional types
or sites rather than selective reporting of
significant results. The funnel plot does reveal
several observations with notably higher standard errors
(lower precision), reflecting the dataset’s variable
replication structure, but these are distributed symmetrically rather
than clustering asymmetrically.
9. Summarize the statistical results with a Baujat plot
This shows the estimated model coefficients, corresponding standard error (se), test statistic(zval), p-value(pval), lower bound of the confidence interval(ci.lb) and upper bound of the confidence interval(ci.ub). The first summary also provides estimates on heterogeneity, which is an important asset of your meta-analysis.
Baujat plot = identifies observations that disproportionately contribute to overall heterogeneity (x-axis) and substantially influence the pooled effect estimate (y-axis)
##
## Random-Effects Model (k = 59; tau^2 estimator: REML)
##
## tau^2 (estimated amount of total heterogeneity): 0.0050 (SE = 0.0015)
## tau (square root of estimated tau^2 value): 0.0706
## I^2 (total heterogeneity / total variability): 95.74%
## H^2 (total variability / sampling variability): 23.49
##
## Test for Heterogeneity:
## Q(df = 58) = 672.8547, p-val < .0001
##
## Model Results:
##
## estimate se zval pval ci.lb ci.ub
## 0.0242 0.0123 1.9647 0.0495 0.0001 0.0484 *
##
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
CONCLUSION ON BAUJAT PLOT:
The Baujat plot shows that observations 38, 7, and 16 emerge
as particularly influential, showing both high
contributions to heterogeneity and considerable
influence on the overall result. Observation 38 stands out most
prominently, positioned far from other observations in terms of overall
influence. These represent specific species-site-PFT combinations where
warming effects deviate substantially from the typical pattern. Ideally,
these observations would warrant detailed investigation by
returning to original field data and laboratory records to
verify measurement accuracy, experimental protocol adherence, and data
entry correctness. However, given that this is a standardized
common-garden experiment with consistent methodology, exclusion
would only be justified if clear protocol violations or measurement
errors were documented, not simply because these observations
show extreme responses. These influential observations
reinforce that warming effects are genuinely
heterogeneous across the dataset, with certain
species-site combinations responding dramatically differently than
others. This biological variability is precisely what the hypothesis
about differential responses across functional types predicts,
suggesting these observations may represent ecologically
meaningful variation (e.g. particularly sensitive or resistant
functional types) rather than data quality issues.
10. Evaluate your choice of the model and interpret the pattern, including the role and interpretation of the intercept in your analysis.
INTERPRETATION OF RANDOM-EFFECTS MODEL:
The random-effects model is the appropriate choice for
this analysis because the observations represent different
species from different sites under varying environmental
conditions, making it biologically implausible that all
observations share a single true warming effect. The model assumes that
each observation estimates its own true effect size, and these true
effects are themselves drawn from a distribution, since ecologically,
different species and sites may respond differently to warming. The
model results reveal a small but statistically significant
overall positive warming effect on decomposition rates (intercept=0.024,
P=0.0495), indicating that warming increases
decomposition by approximately 2.4% on average across all
observations. In this simple model, the intercept represents the
overall mean effect size across all species, sites and
functional types. However, the extremely high heterogeneity
(I²=95.74%, Q=672.85, P<0.0001) demonstrates that this
simple model is inadequate for describing the data
structure. The I² value indicates that 95.74% of
observed variation reflects true differences in warming effects
rather than sampling error, while the highly significant
Q-test confirms substantial heterogeneity beyond what random chance
would produce. This massive heterogeneity reveals that the
overall positive effect masks enormous variation, as some observations
show strong positive warming responses while others show negative
responses or no effect at all. The model is clearly too
simplistic because it treats all observations as
estimating variations of the same underlying effect, while
ignoring systematic sources of variation such as plant
functional type or site characteristics. The forest plot confirms this,
showing individual effect sizes scattered widely across both positive
and negative values with no clear central tendency (moderating variables
are missing!).
11. Create a model with the mods-statement within rma.uni that provides a better description of the patterns (i.e. deals better with the heterogeneity).
There might be different populations within the data that, when grouped together, obscure the significant and/or magnitude of the true global effect. In this case, it would be sensible to include Sitecode (to better account for variation within and between sites) and/or PFT as moderators.
# first creating a model with a moderator for different plant species
outputPFT <- rma.uni(yi, vi, mods=~PFT, data=subeffect, method="REML") # PFT as moderator (modeled by)
coef.rma(outputPFT) # only gives the final coefficients of the meta-analysis## intrcpt PFTEVERGR PFTFORB PFTGRASS PFTSEDGE
## 0.019388396 -0.020438907 -0.005396829 -0.045343732 0.097138049
# then creating a model with a moderator for different test sites
outputSite <- rma.uni(yi, vi, mods=~Sitecode, data=subeffect, method="REML") # PFT as moderator (modeled by)
coef.rma(outputSite)## intrcpt SitecodeAuokulu SitecodeDARTpadd SitecodeFinse_1
## 0.0220514343 0.1244466313 0.0779561658 -0.2880395893
## SitecodeFinse_2 SitecodeKilphigh SitecodeLatheath SitecodeLatmeadw
## 0.0701466283 -0.1506693638 -0.0287048693 0.0570729431
## SitecodeLatsedge SitecodePaddus SitecodeRuby SitecodeSlahtta96
## -0.0353866380 0.0001403612 -0.0120831890 -0.0570303799
## SitecodeStorwood SitecodeTib_HGM SitecodeTib_HGSH SitecodeTib_LGM
## -0.0352989350 -0.0023566424 0.0126408335 -0.0106161996
## SitecodeTib_LGSH SitecodeToolik SitecodeWyomgraz SitecodeWyomungr
## 0.0113823529 -0.0689070079 -0.0153201773 0.0015121698
# and finally creating a model with moderators for both different species and different test sites
outputPFTSite <- rma.uni(yi, vi, mods=~PFT+Sitecode, data=subeffect, method="REML") # PFT as moderator (modeled by)
coef.rma(outputPFTSite) # to evaluate differential responses taking both into account## intrcpt PFTEVERGR PFTFORB PFTGRASS
## 0.049855490 -0.015844464 0.002918136 -0.046722904
## PFTSEDGE SitecodeAuokulu SitecodeDARTpadd SitecodeFinse_1
## 0.124765197 0.077355813 0.048980206 -0.269120741
## SitecodeFinse_2 SitecodeKilphigh SitecodeLatheath SitecodeLatmeadw
## 0.040510986 -0.181336785 -0.022271883 -0.018025598
## SitecodeLatsedge SitecodePaddus SitecodeRuby SitecodeSlahtta96
## -0.187955891 -0.064969793 -0.077415346 -0.068989971
## SitecodeStorwood SitecodeTib_HGM SitecodeTib_HGSH SitecodeTib_LGM
## -0.055346578 -0.045584813 -0.040021106 -0.050317240
## SitecodeTib_LGSH SitecodeToolik SitecodeWyomgraz SitecodeWyomungr
## -0.035004244 -0.096711063 0.003598671 0.020431018
VALUE AND INTERPRETATION OF MIXED-EFFECTS
MODEL:
To assess whether moderator variables better explain the observed
heterogeneity, three models were compared: PFT alone,
Sitecode alone, and PFT and Sitecode combined.
Model performance is evaluated through: Test of Moderators (QM, indicating whether moderators significantly predict effect sizes), residual heterogeneity (I² and QE, showing unexplained variation), and variance explained (R²).
Plant Functional Type (PFT) as moderator:
This model shows that PFT is a significant predictor of warming
responses (QM=17.96, P=0.0013), explaining a moderate
33.61% of the observed heterogeneity. Sedges show a
significantly stronger positive warming response (estimate=0.097,
P=0.0044) compared to deciduous species (the reference category), while
evergreens, forbs and grasses do not differ significantly. This
partially supports the hypothesis that tissue type matters, though not
in the expected woody-versus-herbaceous dichotomy. Still, residual
heterogeneity remains extremely high (I²=93.48%, QE
significant), so it is insufficient to explain response
variation.
Site as moderator: The site model reveals
substantial geographic variation in warming responses,
with coefficients ranging from -0.288 (Finse_1, strong negative
response) to +0.124 (Auokulu, positive response). This
demonstrates that location substantially influences
decomposition responses, reflecting differences in climate
context, soil conditions, and moisture regimes across sites. Site also
accounts for spatial pseudoreplication—observations
from the same site respond more similarly because they share unmeasured
environmental conditions.
PFT + Site combined as moderators: This
combined model provides the most comprehensive approach,
simultaneously accounting for biological differences (functional type)
and environmental context (site). Both moderators are treated
as random factors, appropriately recognizing that while generic patterns
are the focus, site-level variation must be accounted for to avoid
pseudoreplication. The combination reduces residual
heterogeneity beyond either moderator alone, though
significant residual heterogeneity persists (QE remains
significant), indicating that additional unmeasured
factors—such as specific climate variables, detailed litter chemistry,
or microbial community composition—continue to drive variation in
warming responses.
12. What do you conclude about your research question?
Research question: Does warming consistently affect the decomposition constant k across different plant functional types in arctic, subarctic and alpine ecosystems?
CONCLUSION AND TAKEAWAYS:
Answer: No, warming does not
consistently affect decomposition rates across plant functional types or
sites. While there is a small overall positive effect
(approximately 2.4% increase in decomposition under warming), this
average masks enormous heterogeneity in responses. Plant
functional type partially explains this variation, with sedges showing
significantly stronger positive responses to warming than other
functional types, but the majority of variation (66% after
accounting for PFT) remains unexplained. Site location
also contributes to response variation, reflecting differences in
environmental context. The combined model
accounting for both PFT and site provides the best
description of the data, yet substantial
residual heterogeneity persists, indicating that warming
effects depend on multiple interacting factors beyond functional type
classification.
Limitations: The high residual
heterogeneity even in the best model suggests that
_several important factors remain unaccounted for_,
including specific climate variables* (temperature
range, moisture regime, precipitation patterns), detailed litter
chemistry beyond C/N ratios (lignin content, phenolic compounds),
microbial community composition and potential interactions between
warming and other global change factors. Geographic
representation is also incomplete, with minimal data from
Siberia and no Southern Hemisphere observations, constraining
generalizability.