Preliminaries and data prep

Meta-analysis inclusion spreadsheet at

Data file at https://docs.google.com/spreadsheets/d/1OIrWIsqjuXVBQYdu90AjJ9b–15GfwTjwvqtOhq11oo/edit#gid=0

library(tidyverse)
## Warning: package 'tidyr' was built under R version 4.1.2
## Warning: package 'readr' was built under R version 4.1.2
## Warning: package 'dplyr' was built under R version 4.1.2
library(googledrive)
library(readxl)
library(metafor)
## Loading required package: Matrix
## 
## Attaching package: 'Matrix'
## The following objects are masked from 'package:tidyr':
## 
##     expand, pack, unpack
## 
## Loading the 'metafor' package (version 3.0-2). For an
## introduction to the package please type: help(metafor)
library(assertthat)
## 
## Attaching package: 'assertthat'
## The following object is masked from 'package:tibble':
## 
##     has_name
library(glue)
## Warning: package 'glue' was built under R version 4.1.2
library(wesanderson)

source("scripts/compute_es.R")

Download data.

f <- googledrive::as_dribble("https://docs.google.com/spreadsheets/d/1OIrWIsqjuXVBQYdu90AjJ9b--15GfwTjwvqtOhq11oo/edit#gid=0")
## ! Using an auto-discovered, cached token.
##   To suppress this message, modify your code or options to clearly consent to
##   the use of a cached token.
##   See gargle's "Non-interactive auth" vignette for more details:
##   <https://gargle.r-lib.org/articles/non-interactive-auth.html>
## ℹ The googledrive package is using a cached token for
##   'michaelcfrank@gmail.com'.
googledrive::drive_download(f, overwrite = TRUE)
## File downloaded:
## • 'Effect size coding (online testing MA)'
##   <id: 1OIrWIsqjuXVBQYdu90AjJ9b--15GfwTjwvqtOhq11oo>
## Saved locally as:
## • 'Effect size coding (online testing MA).xlsx'
d <- readxl::read_xlsx("Effect size coding (online testing MA).xlsx")

Data prep and descriptives

d <- d |>
  rowwise() |>
  mutate(short_cite = paste(str_c(str_split(authors, pattern = ",")[[1]][1], 
                                  collapse = " "), ":",
                            paste(str_split(main_title, pattern = " ")[[1]][1:2], 
                                  collapse = " ")))
DT::datatable(d)

How many effect sizes do we have?

d |>
  ungroup() |>
  summarise(n = n(), 
            n_es = sum(!is.na(d)), 
            n_es_var = sum(!is.na(d_var)))
## # A tibble: 1 × 3
##       n  n_es n_es_var
##   <int> <int>    <int>
## 1   106   104      104

Number of papers and experiments.

d |>
  group_by(experiment_ID) |>
  count() %>%
  group_by(n) %>%
  count() %>%
  rename(n_experiments = n, 
         n_papers = nn)
## Storing counts in `nn`, as `n` already present in input
## ℹ Use `name = "new_name"` to pick a new name.
## # A tibble: 6 × 2
## # Groups:   n_experiments [6]
##   n_experiments n_papers
##           <int>    <int>
## 1             2       27
## 2             3        1
## 3             4        1
## 4             5        1
## 5             8        1
## 6            32        1

Age distribution.

ggplot(d %>% 
         group_by(same_infant) %>% 
         summarise(mean_age_1 = mean(mean_age_1)), 
       aes(x = mean_age_1/12)) + 
  geom_histogram(binwidth = 1) +
  xlab("Mean age (years)") +
  ylab("N groups")

Attempt to perform effect size computation. Note that compute_es.R is from MetaLab and was used in Bergmann et al. (2018), Child Dev.

d_calc <- d |>
  ungroup() |>
  mutate(idx = 1:n()) |>
  group_by(idx) |>
  nest() |>
  mutate(data = map(data, function (df) {
    es <- compute_es(participant_design = df$participant_design, 
                     x_1 = df$x_1, x_2 = df$x_2, 
                     SD_1 = df$SD_1, SD_2 = df$SD_2, n_1 = df$n_1, n_2 = df$n_2,
                     t = df$t, f = df$F, d = df$d, d_var = df$d_var)
    
    bind_cols(df,es)
  })) |>
  unnest(cols = c(data)) |>
  filter(!is.na(d_calc))

Check effect size computation against hand-computed where available.

ggplot(d_calc, 
       aes(x = d, y = d_calc)) + 
  geom_pointrange(aes(ymin = d_calc - d_var_calc, ymax = d_calc + d_var_calc)) +
  geom_errorbarh(aes(xmin = d - d_var, xmax = d + d_var)) +
  geom_abline(lty =2) + 
  ggrepel::geom_label_repel(aes(label = short_cite))
## Warning: ggrepel: 97 unlabeled data points (too many overlaps). Consider
## increasing max.overlaps

This confirms that – when we have a hand-computed ES – it matches the compute_es.R ES.

Main analysis

Confirmatory meta-analysis

from prereg:

Following Chuey et al. (2021), we will use a random effects meta-analytic model with a further grouping factor for each experiment pair (lab vs. online) to account for methodological grouping, e.g.: effect size ~ online_offline + (1 | experiment).

Note that we group by experimental paradigm (experiment_ID) in our preregistration, but this step does not eliminate the dependency between multiple measurements from the same children (e.g., both eye-tracking and video coding of the same data from the same children) if they contribute multiple effect sizes. In order to deal with this, we include a random effect grouping term (same_infant).

main_mod <- rma.mv(yi = d_calc, 
                   V = d_var_calc,
                   mods = as.factor(expt_condition),
                   random = ~1 | as.factor(experiment_ID) + as.factor(same_infant), 
                   slab = short_cite, 
                   data = d_calc)
summary(main_mod)
## 
## Multivariate Meta-Analysis Model (k = 104; method: REML)
## 
##   logLik  Deviance       AIC       BIC      AICc 
## -84.3549  168.7098  176.7098  187.2096  177.1221   
## 
## Variance Components:
## 
## outer factor: as.factor(same_infant)   (nlvls = 54)
## inner factor: as.factor(experiment_ID) (nlvls = 31)
## 
##             estim    sqrt  fixed 
## tau^2      0.1890  0.4347     no 
## rho        0.9148             no 
## 
## Test for Residual Heterogeneity:
## QE(df = 102) = 541.8296, p-val < .0001
## 
## Test of Moderators (coefficient 2):
## QM(df = 1) = 1.4082, p-val = 0.2354
## 
## Model Results:
## 
##          estimate      se     zval    pval    ci.lb   ci.ub 
## intrcpt    0.8057  0.2135   3.7733  0.0002   0.3872  1.2243  *** 
## mods      -0.1571  0.1324  -1.1867  0.2354  -0.4165  0.1024      
## 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
forest(main_mod)

Plot of effect size differences (subtracted SMD).

d_diff <- d_calc |>
  group_by(paper_ID, experiment_ID) %>%
  summarise(d_diff = mean(d_calc[expt_condition == "online"], na.rm=TRUE) - 
              mean(d_calc[expt_condition == "in-person"], na.rm=TRUE), 
            age = mean(mean_age_1)) 
## `summarise()` has grouped output by 'paper_ID'. You can override using the
## `.groups` argument.
# need to be careful with this because this merge can add rows
d_diff <- d_diff |>
  left_join(d_calc |> 
              ungroup() |>
              select(paper_ID, experiment_ID, short_cite) |> 
              distinct())
## Joining, by = c("paper_ID", "experiment_ID")
ggplot(d_diff, aes(x = short_cite, y = d_diff)) + 
  geom_point(alpha = .5, width = .25, height = 0) + 
  coord_flip() +
  geom_hline(yintercept = 0, lty = 2) + 
  ylab("SMD difference (online - in-person)") + 
  xlab("")
## Warning: Ignoring unknown parameters: width, height

Confirmatory moderators

We will consider the following moderators in independent, separate moderated models: * The type of online method:Moderated / Synchronous method (e.g., using Zoom/Skype with an experimenter) vs.Unmoderated / Asynchronous (e.g. using platform such as LookIt) * Response Mode: Looking (e.g., looking time, looking location, eye-tracking, lookaways, etc.) vs.Non-looking (e.g. Verbal response, pointing, etc) * Mean age of the children participated in the study (months)

Moderated vs. synchronous

The structure of this model is that we create an additional dummy-coded predictor, which is 1 for unmoderated online experiments and 0 otherwise.

moderated_mod <- rma.mv(yi = d_calc, 
                        V = d_var_calc, 
                        mods = ~ as.factor(expt_condition) 
                        + as.factor(method),
                        random = ~1 | as.factor(experiment_ID) + 
                          as.factor(same_infant), 
                        slab = short_cite, 
                        data = d_calc)
summary(moderated_mod)
## 
## Multivariate Meta-Analysis Model (k = 104; method: REML)
## 
##   logLik  Deviance       AIC       BIC      AICc 
## -83.8246  167.6492  177.6492  190.7248  178.2808   
## 
## Variance Components:
## 
## outer factor: as.factor(same_infant)   (nlvls = 54)
## inner factor: as.factor(experiment_ID) (nlvls = 31)
## 
##             estim    sqrt  fixed 
## tau^2      0.1950  0.4416     no 
## rho        0.9171             no 
## 
## Test for Residual Heterogeneity:
## QE(df = 101) = 541.2447, p-val < .0001
## 
## Test of Moderators (coefficients 2:3):
## QM(df = 2) = 1.3813, p-val = 0.5013
## 
## Model Results:
## 
##                                  estimate      se     zval    pval    ci.lb 
## intrcpt                            0.6506  0.0980   6.6400  <.0001   0.4585 
## as.factor(expt_condition)online   -0.1571  0.1541  -1.0194  0.3080  -0.4591 
## as.factor(method)unmoderated      -0.0012  0.1863  -0.0066  0.9947  -0.3663 
##                                   ci.ub 
## intrcpt                          0.8426  *** 
## as.factor(expt_condition)online  0.1449      
## as.factor(method)unmoderated     0.3638      
## 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Sanity check of these estimated with unweighted means. Obviously incorrect, but if this looks different, we should debug.

d_calc |>
  group_by(expt_condition, method) |>
  summarise(d = mean(d_calc), 
            n = n())
## `summarise()` has grouped output by 'expt_condition'. You can override using
## the `.groups` argument.
## # A tibble: 3 × 4
## # Groups:   expt_condition [2]
##   expt_condition method          d     n
##   <chr>          <chr>       <dbl> <int>
## 1 in-person      moderated   0.642    50
## 2 online         moderated   0.473    29
## 3 online         unmoderated 0.432    25

Looking time vs. non-looking time

Same approach as above, except that we need an interaction term because some of the in-person effect sizes have a looking time measure as well.

looking_mod <- rma.mv(yi = d_calc, 
                      V = d_var_calc, 
                      mods = ~ as.factor(expt_condition) 
                      * as.factor(response_mode),
                      random = ~1 | as.factor(experiment_ID) + 
                        as.factor(same_infant), 
                      slab = short_cite, 
                      data = d_calc)
summary(looking_mod)
## 
## Multivariate Meta-Analysis Model (k = 104; method: REML)
## 
##   logLik  Deviance       AIC       BIC      AICc 
## -82.8119  165.6238  177.6238  193.2548  178.5270   
## 
## Variance Components:
## 
## outer factor: as.factor(same_infant)   (nlvls = 54)
## inner factor: as.factor(experiment_ID) (nlvls = 31)
## 
##             estim    sqrt  fixed 
## tau^2      0.1946  0.4411     no 
## rho        0.9162             no 
## 
## Test for Residual Heterogeneity:
## QE(df = 100) = 532.9889, p-val < .0001
## 
## Test of Moderators (coefficients 2:4):
## QM(df = 3) = 2.5481, p-val = 0.4667
## 
## Model Results:
## 
##                                                                      estimate 
## intrcpt                                                                0.8643 
## as.factor(expt_condition)online                                       -0.4513 
## as.factor(response_mode)non-looking                                   -0.2571 
## as.factor(expt_condition)online:as.factor(response_mode)non-looking    0.3580 
##                                                                          se 
## intrcpt                                                              0.2388 
## as.factor(expt_condition)online                                      0.3116 
## as.factor(response_mode)non-looking                                  0.2618 
## as.factor(expt_condition)online:as.factor(response_mode)non-looking  0.3452 
##                                                                         zval 
## intrcpt                                                               3.6197 
## as.factor(expt_condition)online                                      -1.4484 
## as.factor(response_mode)non-looking                                  -0.9820 
## as.factor(expt_condition)online:as.factor(response_mode)non-looking   1.0372 
##                                                                        pval 
## intrcpt                                                              0.0003 
## as.factor(expt_condition)online                                      0.1475 
## as.factor(response_mode)non-looking                                  0.3261 
## as.factor(expt_condition)online:as.factor(response_mode)non-looking  0.2996 
##                                                                        ci.lb 
## intrcpt                                                               0.3963 
## as.factor(expt_condition)online                                      -1.0619 
## as.factor(response_mode)non-looking                                  -0.7702 
## as.factor(expt_condition)online:as.factor(response_mode)non-looking  -0.3185 
##                                                                       ci.ub 
## intrcpt                                                              1.3323 
## as.factor(expt_condition)online                                      0.1594 
## as.factor(response_mode)non-looking                                  0.2560 
## as.factor(expt_condition)online:as.factor(response_mode)non-looking  1.0346 
##  
## intrcpt                                                              *** 
## as.factor(expt_condition)online 
## as.factor(response_mode)non-looking 
## as.factor(expt_condition)online:as.factor(response_mode)non-looking 
## 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Sanity check of these estimated with unweighted means. Obviously incorrect, but if this looks different, we should debug.

d_calc |>
  group_by(expt_condition, response_mode) |>
  summarise(d = mean(d_calc), 
            n = n())
## `summarise()` has grouped output by 'expt_condition'. You can override using
## the `.groups` argument.
## # A tibble: 4 × 4
## # Groups:   expt_condition [2]
##   expt_condition response_mode     d     n
##   <chr>          <chr>         <dbl> <int>
## 1 in-person      looking       0.694     8
## 2 in-person      non-looking   0.632    42
## 3 online         looking       0.464     9
## 4 online         non-looking   0.452    45

Age as moderator

To test age, we again look at the interaction, this time of age and expt_condition. Need to center age.

d_calc$age_centered_mo <- d_calc$mean_age_1 - mean(d_calc$mean_age_1)
age_mod <- rma.mv(yi = d_calc, 
                        V = d_var_calc, 
                        mods = ~ as.factor(expt_condition) 
                        * age_centered_mo,
                        random = ~1 | as.factor(experiment_ID) + 
                          as.factor(same_infant), 
                        slab = short_cite, 
                        data = d_calc)
summary(age_mod)
## 
## Multivariate Meta-Analysis Model (k = 104; method: REML)
## 
##   logLik  Deviance       AIC       BIC      AICc 
## -82.7792  165.5585  177.5585  193.1895  178.4617   
## 
## Variance Components:
## 
## outer factor: as.factor(same_infant)   (nlvls = 54)
## inner factor: as.factor(experiment_ID) (nlvls = 31)
## 
##             estim    sqrt  fixed 
## tau^2      0.1940  0.4405     no 
## rho        0.9157             no 
## 
## Test for Residual Heterogeneity:
## QE(df = 100) = 511.2531, p-val < .0001
## 
## Test of Moderators (coefficients 2:4):
## QM(df = 3) = 2.4673, p-val = 0.4812
## 
## Model Results:
## 
##                                                  estimate      se     zval 
## intrcpt                                            0.6346  0.0990   6.4084 
## as.factor(expt_condition)online                   -0.1377  0.1354  -1.0171 
## age_centered_mo                                   -0.0052  0.0052  -0.9966 
## as.factor(expt_condition)online:age_centered_mo    0.0067  0.0071   0.9333 
##                                                    pval    ci.lb   ci.ub 
## intrcpt                                          <.0001   0.4405  0.8287  *** 
## as.factor(expt_condition)online                  0.3091  -0.4031  0.1277      
## age_centered_mo                                  0.3190  -0.0154  0.0050      
## as.factor(expt_condition)online:age_centered_mo  0.3507  -0.0073  0.0206      
## 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Exploratory meta-analysis

We will also code the following fields for exploratory analysis: Attrition rate: the proportion of children participated in the study but not included in the final analysis. Number of trials: the number trials included in the experiment. Matching response mode: whether the in-person experiment and the online method has the matching response mode.

As an exploratory step we will consider the inclusion of publication-bias diagnostics, but we do not focus on these methods here because 1) our primary hypothesis of interest is not about individual effects but about the contrast between effects, and 2) we are not certain whether the bias would be to publish pairs of effects that are different or similar.