library(tidyverse)
library(readxl)
library(metafor)
## Loading required package: Matrix
##
## Attaching package: 'Matrix'
## The following objects are masked from 'package:tidyr':
##
## expand, pack, unpack
## Loading required package: metadat
## Loading required package: numDeriv
##
## Loading the 'metafor' package (version 4.2-0). For an
## introduction to the package please type: help(metafor)
library(here)
## here() starts at /Users/mcfrank/Projects/online_devo_metaanalysis
library(assertthat)
##
## Attaching package: 'assertthat'
## The following object is masked from 'package:tibble':
##
## has_name
source(here("scripts/compute_es.R"))
knitr::opts_chunk$set(
echo=F, warning=F,
message=F, sanitize = T)
d <- readxl::read_xlsx(here("Effect size coding (online testing MA).xlsx"))
d <- d |>
rowwise() |>
mutate(short_cite = paste(str_c(str_split(authors, pattern = ",")[[1]][1],
collapse = " "), ":",
paste(str_split(main_title, pattern = " ")[[1]][1:2],
collapse = " ")))
d_calc <- d |>
ungroup() |>
mutate(idx = 1:n()) |>
group_by(idx) |>
nest() |>
mutate(data = map(data, function (df) {
es <- compute_es(participant_design = df$participant_design,
x_1 = df$x_1, x_2 = df$x_2,
SD_1 = df$SD_1, SD_2 = df$SD_2, n_1 = df$n_1, n_2 = df$n_2,
t = df$t, f = df$F, d = df$d, d_var = df$d_var)
bind_cols(df,es)
})) |>
unnest(cols = c(data)) |>
filter(!is.na(d_calc))
d_calc$Age <- d_calc$mean_age_1 - mean(d_calc$mean_age_1)
d_calc <- d_calc |>
mutate(response_mode = case_when(response_mode == "non-looking" ~ "Verbal",
response_mode == "looking" ~ "Looking")) %>%
mutate(expt_condition = case_when(expt_condition == "online" ~ "Online",
expt_condition == "in-person" ~ "In-person")) %>%
mutate(method = case_when(method == "moderated" ~ "Moderated",
method == "unmoderated" ~ "Unmoderated"))
Let’s explore nesting effects.
This is a standard 2-level random effects meta analysis.
##
## Mixed-Effects Model (k = 211; tau^2 estimator: REML)
##
## logLik deviance AIC BIC AICc
## -169.3656 338.7313 344.7313 354.7583 344.8483
##
## tau^2 (estimated amount of residual heterogeneity): 0.2118 (SE = 0.0252)
## tau (square root of estimated tau^2 value): 0.4602
## I^2 (residual heterogeneity / unaccounted variability): 87.60%
## H^2 (unaccounted variability / sampling variability): 8.06
## R^2 (amount of heterogeneity accounted for): 0.00%
##
## Test for Residual Heterogeneity:
## QE(df = 209) = 1127.8165, p-val < .0001
##
## Test of Moderators (coefficient 2):
## QM(df = 1) = 2.1086, p-val = 0.1465
##
## Model Results:
##
## estimate se zval pval ci.lb ci.ub
## intrcpt 0.5320 0.1115 4.7721 <.0001 0.3135 0.7505 ***
## mods -0.1018 0.0701 -1.4521 0.1465 -0.2393 0.0356
##
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
We can use rma.mv
to produce the same result by nesting
effects within idx
, which is our unique identifier.
##
## Multivariate Meta-Analysis Model (k = 211; method: REML)
##
## logLik Deviance AIC BIC AICc
## -169.3656 338.7313 344.7313 354.7583 344.8483
##
## Variance Components:
##
## estim sqrt nlvls fixed factor
## sigma^2 0.2118 0.4602 211 no as.factor(idx)
##
## Test for Residual Heterogeneity:
## QE(df = 209) = 1127.8165, p-val < .0001
##
## Test of Moderators (coefficient 2):
## QM(df = 1) = 2.1086, p-val = 0.1465
##
## Model Results:
##
## estimate se zval pval ci.lb ci.ub
## intrcpt 0.5320 0.1115 4.7721 <.0001 0.3135 0.7505 ***
## mods -0.1018 0.0701 -1.4521 0.1465 -0.2393 0.0356
##
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
If we nest index within experiment, that gives us study-level grouping.
##
## Multivariate Meta-Analysis Model (k = 211; method: REML)
##
## logLik Deviance AIC BIC AICc
## -108.8491 217.6983 225.6983 239.0676 225.8943
##
## Variance Components:
##
## estim sqrt nlvls fixed
## sigma^2.1 0.2075 0.4556 48 no
## sigma^2.2 0.0423 0.2056 211 no
## factor
## sigma^2.1 as.factor(experiment_ID)
## sigma^2.2 as.factor(experiment_ID)/as.factor(idx)
##
## Test for Residual Heterogeneity:
## QE(df = 209) = 1127.8165, p-val < .0001
##
## Test of Moderators (coefficient 2):
## QM(df = 1) = 7.9859, p-val = 0.0047
##
## Model Results:
##
## estimate se zval pval ci.lb ci.ub
## intrcpt 0.6663 0.0949 7.0211 <.0001 0.4803 0.8523 ***
## mods -0.1146 0.0405 -2.8259 0.0047 -0.1940 -0.0351 **
##
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Note, in our analysis, we haven’t been nesting idx
explicitly. Does this matter?
##
## Multivariate Meta-Analysis Model (k = 211; method: REML)
##
## logLik Deviance AIC BIC AICc
## -136.7293 273.4586 279.4586 289.4856 279.5757
##
## Variance Components:
##
## estim sqrt nlvls fixed factor
## sigma^2 0.2240 0.4733 48 no as.factor(experiment_ID)
##
## Test for Residual Heterogeneity:
## QE(df = 209) = 1127.8165, p-val < .0001
##
## Test of Moderators (coefficient 2):
## QM(df = 1) = 10.1144, p-val = 0.0015
##
## Model Results:
##
## estimate se zval pval ci.lb ci.ub
## intrcpt 0.6007 0.0824 7.2912 <.0001 0.4392 0.7622 ***
## mods -0.0848 0.0267 -3.1803 0.0015 -0.1370 -0.0325 **
##
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
YES - it does and this was a mistake. We were assuming that there was no random distribution at the level of effects within an experiment.
We could also nest all of this within paper if we wanted, for another level of grouping.
##
## Multivariate Meta-Analysis Model (k = 211; method: REML)
##
## logLik Deviance AIC BIC AICc
## -102.0764 204.1529 214.1529 230.8646 214.4485
##
## Variance Components:
##
## estim sqrt nlvls fixed
## sigma^2.1 0.1604 0.4005 30 no
## sigma^2.2 0.0400 0.1999 48 no
## sigma^2.3 0.0418 0.2044 211 no
## factor
## sigma^2.1 as.factor(paper_ID)
## sigma^2.2 as.factor(paper_ID)/as.factor(experiment_ID)
## sigma^2.3 as.factor(paper_ID)/as.factor(experiment_ID)/as.factor(idx)
##
## Test for Residual Heterogeneity:
## QE(df = 209) = 1127.8165, p-val < .0001
##
## Test of Moderators (coefficient 2):
## QM(df = 1) = 7.9691, p-val = 0.0048
##
## Model Results:
##
## estimate se zval pval ci.lb ci.ub
## intrcpt 0.7492 0.1058 7.0795 <.0001 0.5418 0.9566 ***
## mods -0.1141 0.0404 -2.8230 0.0048 -0.1933 -0.0349 **
##
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
So that’s a random effect structure that accounts for much of the grouping within our dataset. If you look at the variance components, all of these seem to be doing something, which is good.
The only issue is that sometimes the same infants are in multiple conditions and sometimes not. (Meaning there is another level of grouping). How often does this happen?
## # A tibble: 211 × 4
## # Groups: idx [211]
## paper_ID experiment_ID same_infant idx
## <dbl> <dbl> <chr> <int>
## 1 203 1 1_online 1
## 2 203 1 1_inperson 2
## 3 12 2 12_online 3
## 4 12 2 12_inperson 4
## 5 12 2 12_online 5
## 6 12 2 12_inperson 6
## 7 12 3 12_online 7
## 8 12 3 12_inperson 8
## 9 13 4 13_online_1 9
## 10 13 4 13_inperson_1 10
## # ℹ 201 more rows
Answer - quite a lot. So we need to deal with this grouping.
The problem is that sometimes same_infant
is within
experiment, and sometimes it cross-cuts experiments.
Paper 358 has the same six groups of kids with 9 different measures.
So lots of different groups of kids doing different measures. We want
same_infant
nested within experiment_id
here.
But then something like paper 12 has 2 experiments but the same samples did all of them.
How do we deal with this? You can try a specification like this one - but it seems not to work at all - it creates a fourth level of grouping and the third and forth are identical from a levels/variance perspective.
##
## Multivariate Meta-Analysis Model (k = 211; method: REML)
##
## logLik Deviance AIC BIC AICc
## -102.0764 204.1529 216.1529 236.2069 216.5687
##
## Variance Components:
##
## estim sqrt nlvls fixed
## sigma^2.1 0.1604 0.4005 30 no
## sigma^2.2 0.0400 0.1999 48 no
## sigma^2.3 0.0209 0.1446 211 no
## sigma^2.4 0.0209 0.1446 211 no
## factor
## sigma^2.1 as.factor(paper_ID)
## sigma^2.2 as.factor(paper_ID)/as.factor(experiment_ID)
## sigma^2.3 as.factor(paper_ID)/as.factor(experiment_ID)/as.factor(idx)
## sigma^2.4 as.factor(paper_ID)/as.factor(experiment_ID)/as.factor(idx)/as.factor(same_infant)
##
## Test for Residual Heterogeneity:
## QE(df = 209) = 1127.8165, p-val < .0001
##
## Test of Moderators (coefficient 2):
## QM(df = 1) = 7.9691, p-val = 0.0048
##
## Model Results:
##
## estimate se zval pval ci.lb ci.ub
## intrcpt 0.7492 0.1058 7.0795 <.0001 0.5418 0.9566 ***
## mods -0.1141 0.0404 -2.8230 0.0048 -0.1933 -0.0349 **
##
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
What if we just use same_infant
and not
experiment_ID
? This has 90 levels instead of 48 but also
seems not completely appropriate.
##
## Multivariate Meta-Analysis Model (k = 211; method: REML)
##
## Variance Components:
##
## estim sqrt nlvls fixed
## sigma^2.1 0.1822 0.4269 30 no
## sigma^2.2 0.0604 0.2458 90 no
## sigma^2.3 0.0199 0.1412 211 no
## factor
## sigma^2.1 as.factor(paper_ID)
## sigma^2.2 as.factor(paper_ID)/as.factor(same_infant)
## sigma^2.3 as.factor(paper_ID)/as.factor(same_infant)/as.factor(idx)
##
## Test for Residual Heterogeneity:
## QE(df = 209) = 1127.8165, p-val < .0001
##
## Test of Moderators (coefficient 2):
## QM(df = 1) = 2.4661, p-val = 0.1163
##
## Model Results:
##
## estimate se zval pval ci.lb ci.ub
## intrcpt 0.7503 0.1407 5.3340 <.0001 0.4746 1.0261 ***
## mods -0.1132 0.0721 -1.5704 0.1163 -0.2546 0.0281
##
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
I believe based on all this exploration that metafor
can
fit highly nested random effects structures but not
crossed random effect structures. This is problematic because
we do have crossing of random effects.
We can pretend we don’t and just nest experiment within same-infant but this is incorrect. The model will still fit, but I think then basically what is happening is that we are actually uniqifying experiments within same infant, e.g., not taking advantage of the idea that there is a shared experimental effect size even if there are two groups of infants.
##
## Multivariate Meta-Analysis Model (k = 211; method: REML)
##
## logLik Deviance AIC BIC AICc
## -87.1585 174.3171 186.3171 206.3711 186.7329
##
## Variance Components:
##
## estim sqrt nlvls fixed
## sigma^2.1 0.1820 0.4267 30 no
## sigma^2.2 0.0477 0.2185 90 no
## sigma^2.3 0.0161 0.1268 107 no
## sigma^2.4 0.0170 0.1303 211 no
## factor
## sigma^2.1 as.factor(paper_ID)
## sigma^2.2 as.factor(paper_ID)/as.factor(same_infant)
## sigma^2.3 as.factor(paper_ID)/as.factor(same_infant)/as.factor(experiment_ID)
## sigma^2.4 as.factor(paper_ID)/as.factor(same_infant)/as.factor(experiment_ID)/as.factor(idx)
##
## Test for Residual Heterogeneity:
## QE(df = 209) = 1127.8165, p-val < .0001
##
## Test of Moderators (coefficient 2):
## QM(df = 1) = 2.2175, p-val = 0.1365
##
## Model Results:
##
## estimate se zval pval ci.lb ci.ub
## intrcpt 0.7423 0.1407 5.2740 <.0001 0.4664 1.0181 ***
## mods -0.1074 0.0722 -1.4891 0.1365 -0.2489 0.0340
##
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
We can also do the same thing with experiment_ID
inside
same_infant
- also inappropriate but in the opposite
direction.
##
## Multivariate Meta-Analysis Model (k = 211; method: REML)
##
## logLik Deviance AIC BIC AICc
## -88.7365 177.4729 189.4729 209.5269 189.8888
##
## Variance Components:
##
## estim sqrt nlvls fixed
## sigma^2.1 0.1631 0.4038 30 no
## sigma^2.2 0.0282 0.1680 48 no
## sigma^2.3 0.0530 0.2302 107 no
## sigma^2.4 0.0156 0.1248 211 no
## factor
## sigma^2.1 as.factor(paper_ID)
## sigma^2.2 as.factor(paper_ID)/as.factor(experiment_ID)
## sigma^2.3 as.factor(paper_ID)/as.factor(experiment_ID)/as.factor(same_infant)
## sigma^2.4 as.factor(paper_ID)/as.factor(experiment_ID)/as.factor(same_infant)/as.factor(idx)
##
## Test for Residual Heterogeneity:
## QE(df = 209) = 1127.8165, p-val < .0001
##
## Test of Moderators (coefficient 2):
## QM(df = 1) = 0.6717, p-val = 0.4125
##
## Model Results:
##
## estimate se zval pval ci.lb ci.ub
## intrcpt 0.6565 0.1295 5.0708 <.0001 0.4028 0.9103 ***
## mods -0.0513 0.0626 -0.8196 0.4125 -0.1740 0.0714
##
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
I tend to think this last model is actually the best one. The reason
is that there are more same_infant
markers grouped within
experiments than cutting across. So paper 358 is more representative
than paper 12. You can see this because there are 48 experiments, with
107 same_infant
groups within them. Whereas if we nest the
other way, there are 90 same_infant
groups with 107
experiments within them. So I think we get more bang for our
buck with this grouping.
For kicks - compare to the same model without the paper ID and the idx. Does it do the same thing?
##
## Multivariate Meta-Analysis Model (k = 211; method: REML)
##
## logLik Deviance AIC BIC AICc
## -100.8466 201.6933 209.6933 223.0626 209.8893
##
## Variance Components:
##
## estim sqrt nlvls fixed
## sigma^2.1 0.1991 0.4462 48 no
## sigma^2.2 0.0618 0.2486 107 no
## factor
## sigma^2.1 as.factor(experiment_ID)
## sigma^2.2 as.factor(experiment_ID)/as.factor(same_infant)
##
## Test for Residual Heterogeneity:
## QE(df = 209) = 1127.8165, p-val < .0001
##
## Test of Moderators (coefficient 2):
## QM(df = 1) = 0.6839, p-val = 0.4083
##
## Model Results:
##
## estimate se zval pval ci.lb ci.ub
## intrcpt 0.5703 0.1189 4.7967 <.0001 0.3373 0.8033 ***
## mods -0.0510 0.0617 -0.8270 0.4083 -0.1720 0.0699
##
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The answer is, very close - not numerically identical but similar. Maybe there’s just not enough variance to partition into four different levels.