Reproducible Research in Statistics

Jessica Minnier – Knight BSR

June 16, 2016

What is Reproducible Research?

Reproducible = Replicable + Transparent

Research results are replicable if there is sufficient information available for independent researchers to make the same findings using the same procedures.

In computational sciences this means: the data and code used to make a finding are available and they are sufficient for an independent researcher to recreate the finding.

In practice, research needs to be easy for independent researchers to reproduce.

King (1995), Ball and Medeiros (2012), from Gandrud (2013)
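As a minimal sketch of what "sufficient to recreate the finding" can look like in code, the Python example below fixes the random seed of a small bootstrap so that rerunning the same code on the same data gives the identical estimate. The data values, the function name bootstrap_mean, and the seed are hypothetical and chosen only for illustration; they do not come from any of the cited works.

```python
import random
import statistics


def bootstrap_mean(data, n_resamples=1000, seed=2016):
    """Return the mean of bootstrap resample means (hypothetical example)."""
    # A local, seeded generator avoids hidden global random state.
    rng = random.Random(seed)
    means = []
    for _ in range(n_resamples):
        # Resample with replacement, same size as the original data.
        resample = [rng.choice(data) for _ in data]
        means.append(statistics.fmean(resample))
    return statistics.fmean(means)


# Made-up measurements, for illustration only.
data = [4.1, 5.3, 3.8, 6.0, 5.5, 4.7]

first_run = bootstrap_mean(data)
second_run = bootstrap_mean(data)

# Same data + same code + same seed should give the same number.
print(first_run == second_run)
```

Without the fixed seed, two runs would generally differ in the last digits, and an independent researcher could not tell a harmless random difference from a real discrepancy.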

Replicability has been a key part of scientific inquiry from perhaps the 1200s. It has even been called the “demarcation between science and non-science.”

Gandrud (2013), “Reproducible Research with R and RStudio,” and references therein, including Roger Bacon’s “Opera quaedam hactenus inedita Vol. 1” from 1267

What are the different kinds of reproducible research?

Enabling reproducibility can be complicated, but by separating out some of the levels and degrees of reproducibility the problem can become more manageable because we can focus our efforts on what best suits our specific scientific domain. Victoria Stodden (2014), a prominent scholar on this topic, has identified some useful distinctions in reproducible research:

Computational reproducibility: when detailed information is provided about code, software, hardware and implementation details.

Empirical reproducibility: when detailed information is provided about non-computational empirical scientific experiments and observations. In practice this is enabled by making data freely available, as well as details of how the data were collected.

Statistical reproducibility: when detailed information is provided about the choice of statistical tests, model parameters, threshold values, etc. This mostly relates to pre-registration of study design to prevent p-value hacking and other manipulations.

ROpenSci Reproducibility Guide
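For computational reproducibility, one small step is to record the software and hardware details alongside the results. The sketch below is one possible way to do this in Python using only the standard library; the function name describe_environment and the choice of fields are assumptions made for illustration, not a format prescribed by the guide.

```python
import json
import platform
import sys


def describe_environment():
    """Collect basic software and hardware details (hypothetical helper)."""
    return {
        "python_version": platform.python_version(),
        "implementation": platform.python_implementation(),
        "operating_system": platform.system(),
        "os_release": platform.release(),
        "machine": platform.machine(),
        "executable": sys.executable,
    }


environment = describe_environment()

# Print as JSON so the record can be saved next to the analysis output.
print(json.dumps(environment, indent=2, sort_keys=True))
```

A record like this does not capture package versions or hardware in full detail, but it gives an independent researcher a starting point for matching the original setup.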

Spectrum of Research

Stodden et al. (2013) place computational reproducibility on a spectrum with five categories that account for many typical research contexts:

ROpenSci Reproducibility Guide

Reproducibility in Statistics

“Reproducibility is important because it is the only thing that an investigator can guarantee about a study.”

a study can be reproducible and still be wrong

“These days, with the complexity of data analysis and the subtlety of many claims (particularly about complex diseases), reproducibility is pretty much the only thing we can hope for. Time will tell whether we are ultimately right or wrong about any claims, but reproducibility is something we can know right now.”

“By using the word reproducible, I mean that the original data (and original computer code) can be analyzed (by an independent investigator) to obtain the same results of the original study. In essence, it is the notion that the data analysis can be successfully repeated. Reproducibility is particularly important in large computational studies where the data analysis can often play an outsized role in supporting the ultimate conclusions.”

– Roger Peng’s 2014 blog post on Simply Statistics, “The Real Reason Reproducible Research is Important”; see also Peng (2011), “Reproducible research in computational science”

Early notions of reproducibility

“Claerbout’s Principle”

An article about computational science in a scientific publication is not the scholarship itself, it is merely advertising of the scholarship. The actual scholarship is the complete software development environment and the complete set of instructions which generate the figures.

It takes some effort to organize your research to be reproducible.

We found that although the effort seems to be directed to helping other people stand up on your shoulders, the principal beneficiary is generally the author herself.

This is because time turns each one of us into another person, and by making effort to communicate with strangers, we help ourselves to communicate with our future selves.

(Jon F. Claerbout is the Cecil Green Professor Emeritus of Geophysics at Stanford University. He was one of the first scientists to emphasize that computational methods threaten the reproducibility of research unless open access is provided to both the data and the software underlying a publication.)

Current Issues and Discussion

How to Make More Published Research True

In “How to Make More Published Research True” (PLOS Medicine), J. P. Ioannidis (2014) follows up on J. Ioannidis (2005), “Why most published research findings are false.”

He suggests reproducibility as one key component of the remedy:

“To make more published research true, practices that have improved credibility and efficiency in specific fields may be transplanted to others which would benefit from them—possibilities include

Availability of code in peer-reviewed journals

Stodden, Guo, and Ma (2013) “Toward Reproducible Computational Research: An Empirical Analysis of Data and Code Policy Adoption by Journals”

Reproducible research and Biostatistics (the journal)

Authors can choose to meet a subset of these criteria if they wish:

  1. Data: The analytic data from which the principal results were derived are made available on the journal’s Web site. The authors are responsible for ensuring that necessary permissions are obtained before the data are distributed.
  2. Code: Any computer code, software, or other computer instructions that were used to compute published results are provided. For software that is widely available from central repositories (e.g. CRAN, Statlib), a reference to where they can be obtained will suffice.
  3. Reproducible: An article is designated as reproducible if the Associate Editor of Reproducibility succeeds in executing the code on the data provided and produces results matching those that the authors claim are reproducible. In reproducing these results, reasonable bounds for numerical tolerance will be considered.

Peng (2009) “Reproducible research and Biostatistics”
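The third criterion, matching results within "reasonable bounds for numerical tolerance," can be sketched as a simple comparison. The Python example below is an illustration only: the function name results_match, the tolerance values, and the numbers are all assumptions, and the journal does not specify how its check is carried out.

```python
import math


def results_match(published, reproduced, rel_tol=1e-6, abs_tol=1e-9):
    """Return True if every published value is reproduced within tolerance."""
    # The same set of quantities must be present in both result sets.
    if published.keys() != reproduced.keys():
        return False
    return all(
        math.isclose(published[name], reproduced[name],
                     rel_tol=rel_tol, abs_tol=abs_tol)
        for name in published
    )


# Made-up numbers, for illustration only.
published = {"hazard_ratio": 1.2345, "p_value": 0.0312}
reproduced = {"hazard_ratio": 1.2345000004, "p_value": 0.0312}

print(results_match(published, reproduced))
```

Comparing with a tolerance rather than exact equality allows for small floating-point differences across machines and software versions, while still flagging results that differ in substance.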

NIH requirements (beginning Jan 2016)

“Enhancing Reproducibility through Rigor and Transparency”

  1. Scientific Premise
    • “describe the general strengths and weaknesses of the prior research being cited by the investigator as crucial to support the application.”
    • experimental design/power of prior studies used for hypothesis generation, weaknesses include different populations/species, unblinded, not adjusting for confounders
  2. Rigorous Experimental Design
  3. Consideration of Sex and Other Relevant Biological Variables
    • “sex is a biological variable that is frequently ignored in animal study designs and analyses”
  4. Authentication of Key Biological and/or Chemical Resources
  5. Implementation

NIH “Rigor and Reproducibility” Policy

Note: Most of this concerns the science itself: the design of experiments and the chemical and biological methods. There is essentially no language describing reproducibility of analyses, or data management for data or results generated by the grant.

Journals unite with NIH to encourage reproducibility

NIH Principles and Guidelines for Reporting Preclinical Research

Journals should aim to facilitate the interpretation and repetition of experiments as they have been conducted in the published study.

Checklist: authors required to report

from NIH Guidelines & Landis et al. (2012) “A call for transparent reporting to optimize the predictive value of preclinical research”. Nature 490, 187–191.

Nature series on “Challenges in Irreproducible Research”

Nature has a website containing editorials, features, news, and articles on various topics related to reproducible research: Nature special: Challenges in Irreproducible Research

Including
