1 The Replication Debate and Developmental Science

The core Manybabies team

The modal study in infant cognition research:

The setup for Onishi & Baillargeon (2005)

The setup for the Baillargeon lab (mid-90s)

Xu & Spelke (2000) method

Xu & Spelke (2000) discrimination (8 vs. 16)

Xu & Spelke (2000) no discrimination (8 vs. 12)

2 The Manybabies project

2.1 What is it for? Burning developmental psychology to the ground?

  • No.
  • We can’t replicate the Reproducibility project in infant research. And in fact, there’s reason to think that doing so would not be a productive use of time and money.

A p curve of infant cognition findings, by Christina Bergmann and MetaLab

  • It is better to try and understand variability – in our measurement tools, in our labs, and between our participants.
  • To do that, the only plausible route is to focus on a deep investigation of a single topic (with further investigations in the future).

2.2 Improving standards through collaboration on measurement and analysis

  • Increasing participant pool size and diversity
  • Understanding the relationship between different infant paradigms
  • Ensuring common standards and best practices across labs
  • How should we exclude participants?
  • How should we pre-process data?
  • How should we vary paradigms for infants of different ages?

Infant-directed speech (IDS) meta-analysis

3 Making our science more cumulative through meta-analysis

3.1 A case study of infant rule learning

## 
## Multivariate Meta-Analysis Model (k = 56; method: REML)
## 
## Variance Components: 
## 
##             estim    sqrt  nlvls  fixed     factor
## sigma^2.1  0.0000  0.0000     10     no        lab
## sigma^2.2  0.1137  0.3372     17     no  lab/study
## 
## Test for Heterogeneity: 
## Q(df = 55) = 239.3490, p-val < .0001
## 
## Model Results:
## 
## estimate       se     zval     pval    ci.lb    ci.ub          
##   0.3164   0.0884   3.5785   0.0003   0.1431   0.4897      *** 
## 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

3.1.1 Basic random effects regression

We plot forest and funnel plots for this first random effects regression.
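
As a check on how the pooled estimate is summarized in the model output, the z value, p value, and 95% confidence interval can be recomputed from the estimate and its standard error alone. The sketch below is in Python rather than the R used for the analysis and assumes the usual Wald (normal-approximation) test. The helper `wald_summary` is a hypothetical name, and the inputs are the rounded values printed in the output, so the results agree with it only to about three decimals.

```python
import math

def wald_summary(estimate, se, crit=1.959964):
    """Return (z, two-sided p, ci_lower, ci_upper) for a Wald test.

    `crit` is the 97.5th percentile of the standard normal distribution.
    """
    z = estimate / se
    # Two-sided tail probability of a standard normal: P(|Z| > |z|)
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p, estimate - crit * se, estimate + crit * se

# Rounded pooled estimate and standard error from the model output
z, p, ci_lb, ci_ub = wald_summary(0.3164, 0.0884)
print(f"z = {z:.3f}, p = {p:.4f}, 95% CI = [{ci_lb:.3f}, {ci_ub:.3f}]")
```

The interval excludes zero, which is what the `***` flag in the output reflects.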

3.1.2 Moderated random effects regression

## 
## Multivariate Meta-Analysis Model (k = 56; method: ML)
## 
## Variance Components: 
## 
##             estim    sqrt  nlvls  fixed     factor
## sigma^2.1  0.0022  0.0472     10     no        lab
## sigma^2.2  0.1125  0.3354     17     no  lab/study
## 
## Test for Residual Heterogeneity: 
## QE(df = 53) = 210.8762, p-val < .0001
## 
## Test of Moderators (coefficient(s) 2,3): 
## QM(df = 2) = 24.9332, p-val < .0001
## 
## Model Results:
## 
##                 estimate      se     zval    pval    ci.lb    ci.ub     
## intrcpt           0.0696  0.1032   0.6752  0.4996  -0.1325   0.2718     
## scale(age)       -0.1536  0.0565  -2.7216  0.0065  -0.2643  -0.0430   **
## modalityspeech    0.4938  0.1071   4.6092  <.0001   0.2838   0.7037  ***
## 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
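
To read the moderator coefficients, it can help to turn them into predicted effect sizes. The sketch below, in Python rather than the R used for the analysis, plugs the printed estimates into the regression equation. It assumes that the reference level of `modality` is the non-speech condition (the output does not name it) and that `scale(age)` is age in standard-deviation units. `predicted_effect` is a hypothetical helper, not part of the original analysis.

```python
# Coefficients copied from the moderated model output (rounded as printed)
COEF = {"intrcpt": 0.0696, "scale(age)": -0.1536, "modalityspeech": 0.4938}

def predicted_effect(age_z, speech):
    """Predicted effect size for a study at `age_z` SDs from the mean age.

    `speech` is True for speech stimuli; False is the assumed reference level.
    """
    return (COEF["intrcpt"]
            + COEF["scale(age)"] * age_z
            + COEF["modalityspeech"] * (1 if speech else 0))

speech_mean_age = predicted_effect(0.0, speech=True)
other_mean_age = predicted_effect(0.0, speech=False)
speech_older = predicted_effect(1.0, speech=True)

print(f"speech, mean age:      {speech_mean_age:.4f}")
print(f"non-speech, mean age:  {other_mean_age:.4f}")
print(f"speech, +1 SD of age:  {speech_older:.4f}")
```

Under these assumptions the predicted effect for speech stimuli at the mean age is the sum of the intercept and the modality coefficient, and it shrinks as age increases.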

Dogs stimuli

## 
## Multivariate Meta-Analysis Model (k = 56; method: ML)
## 
## Variance Components: 
## 
##             estim    sqrt  nlvls  fixed     factor
## sigma^2.1  0.0000  0.0000     10     no        lab
## sigma^2.2  0.1148  0.3388     17     no  lab/study
## 
## Test for Residual Heterogeneity: 
## QE(df = 52) = 202.9058, p-val < .0001
## 
## Test of Moderators (coefficient(s) 2,3,4): 
## QM(df = 3) = 30.1833, p-val < .0001
## 
## Model Results:
## 
##                       estimate      se     zval    pval    ci.lb    ci.ub
## intrcpt                 0.2885  0.1404   2.0546  0.0399   0.0133   0.5637  *
## scale(age)             -0.1366  0.0570  -2.3960  0.0166  -0.2483  -0.0249  *
## modalityspeech          0.2357  0.1550   1.5202  0.1285  -0.0682   0.5395
## semanticsmeaningless   -0.3595  0.1563  -2.2997  0.0215  -0.6658  -0.0531  *
## 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

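Finally, we test whether the model that includes semantics fits better than the model without it. It does: the likelihood ratio test favors the full model (LRT = 5.28, p = 0.0215), and AIC, BIC, and AICc are all lower for the full model.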

##         df      AIC      BIC     AICc   logLik    LRT   pval       QE
## Full     6 116.9516 129.1037 118.6659 -52.4758               202.9058
## Reduced  5 120.2364 130.3632 121.4364 -55.1182 5.2848 0.0215 210.8762
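
The columns of this comparison table can be reproduced from the two log-likelihoods, the parameter counts, and k = 56. The Python sketch below (not the R code behind the output) uses the standard definitions of AIC, BIC, and AICc, and a likelihood ratio test with one degree of freedom. The function name `fit_stats` is hypothetical, and the inputs are the rounded values printed in the table.

```python
import math

K = 56  # number of effect sizes, from the model output

def fit_stats(loglik, n_params, k=K):
    """Return (AIC, BIC, AICc) from a log-likelihood and a parameter count."""
    aic = 2 * n_params - 2 * loglik
    bic = n_params * math.log(k) - 2 * loglik
    aicc = aic + 2 * n_params * (n_params + 1) / (k - n_params - 1)
    return aic, bic, aicc

loglik_full, loglik_reduced = -52.4758, -55.1182
full = fit_stats(loglik_full, 6)
reduced = fit_stats(loglik_reduced, 5)

# Likelihood ratio statistic; the models differ by one parameter (semantics)
lrt = 2 * (loglik_full - loglik_reduced)
# Survival function of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2))
p_value = math.erfc(math.sqrt(lrt / 2))

print("Full    AIC/BIC/AICc:", [round(v, 4) for v in full])
print("Reduced AIC/BIC/AICc:", [round(v, 4) for v in reduced])
print(f"LRT = {lrt:.4f}, p = {p_value:.4f}")
```

Both models were fit with ML rather than REML, which is what makes the likelihood ratio comparison of different fixed effects meaningful.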

3.1.3 p curve analysis

We can also fit a p curve: we plot the density of all p values below 0.05 and then run the Fisher-style test suggested by Simonsohn et al. (2014).
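
A minimal sketch of a Fisher-style right-skew test follows, written in Python rather than the R used for the analysis. It assumes the common formulation in which each significant p value is rescaled to a "pp value" (p / 0.05, uniform under the null of no effect) and the pp values are combined with Fisher's method. The function names and the p values in the usage example are illustrative only and are not taken from the rule-learning dataset.

```python
import math

def chi2_sf_even_df(x, half_df):
    """Survival function of a chi-square with df = 2 * half_df.

    For even df there is a closed form:
    exp(-x/2) * sum_{i=0}^{half_df-1} (x/2)^i / i!
    """
    half = x / 2
    term, total = 1.0, 1.0
    for i in range(1, half_df):
        term *= half / i
        total += term
    return math.exp(-half) * total

def p_curve_right_skew_test(p_values, alpha=0.05):
    """Return (chi-square statistic, df, p) for Fisher's method on pp values."""
    significant = [p for p in p_values if 0 < p < alpha]
    if not significant:
        raise ValueError("no p values below alpha")
    pp_values = [p / alpha for p in significant]
    stat = -2 * sum(math.log(pp) for pp in pp_values)
    return stat, 2 * len(significant), chi2_sf_even_df(stat, len(significant))

# Illustrative p values only; 0.2 is dropped because it is not below 0.05
stat, df, p_skew = p_curve_right_skew_test([0.001, 0.004, 0.012, 0.03, 0.2])
print(f"chi2({df}) = {stat:.3f}, p = {p_skew:.4f}")
```

A small p from this test indicates that the significant p values pile up near zero (right skew), which is the pattern expected when the studies measure a true effect.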

## Warning: Removed 21 rows containing non-finite values (stat_bin).
## Warning: Removed 2 rows containing missing values (geom_path).

## Warning: Removed 21 rows containing non-finite values (stat_bin).
## Warning: Removed 4 rows containing missing values (geom_path).


3.2 Can attending to cumulativity help?

Rules results

Rules Edinburgh Princeton Comparison
