Overview

  • What is reproducible research?
  • Why do we care?
  • Why do reproducibility questions arise?
  • The cost of reproducibility
  • Reproducibility and statistics
  • Current status of reproducibility
  • What can we do?

What is reproducible research?

Reproducibility and scientific progress

  • Science is the systematic enterprise of gathering knowledge about the universe and organizing and condensing that knowledge into testable laws and theories.

  • The success and credibility of science are anchored in the willingness of scientists to expose their ideas and results to independent testing and replication by other scientists.

http://www.aps.org/policy/statements/99_6.cfm

What is reproducible research?

  • Reproducibility
  • Replicability
  • Repeatability
  • Reliability
  • Robustness
  • Generalizability

 

  • Transparency    
  • Open Science    
  • TRUTH

What is reproducible research?

Reproducible research is the ultimate standard for strengthening scientific evidence by independent:

  • Investigators
  • Data
  • Analytical methods
  • Laboratories
  • Instruments

The beginning of reproducible research

Galileo Galilei

Why do we care?

More data = more chance for errors

  • High-throughput biology generates volumes of data

  • Data-generating technologies are increasingly used to make clinical recommendations and treatment decisions

  • A problem may be overlooked, get published, and end up in clinical trials

Clinical trials based on flawed and fraudulent data

  • Described drug response “gene signatures” in NCI60 cell lines
  • Demonstrated these “signatures” correspond to patient-specific signatures and can be used to predict patient response to the drugs

Biostatisticians spot errors

“Off-by-one” error

Published                          Replicated

...
[3,] 1881_at       1882_g_at
[4,] 31321_at      31322_at
[5,] 31725_s_at    31726_at
[6,] 32307_r_at    32308_r_at
...
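
A minimal sketch of how a one-position index shift can produce this kind of mismatch. The probe list is a toy example built from the IDs on the slide. The assumption that the two columns are neighbours in one ordered annotation, the positions in `significant`, and the helper `select_probes` are illustrative only; this is not a reconstruction of the original analysis.

```python
# Toy annotation: probe IDs in array order (IDs taken from the slide).
PROBES = [
    "1881_at", "1882_g_at",
    "31321_at", "31322_at",
    "31725_s_at", "31726_at",
    "32307_r_at", "32308_r_at",
]


def select_probes(indices, offset=0):
    """Return probe IDs at the given 0-based positions, shifted by `offset`."""
    return [PROBES[i + offset] for i in indices]


# Hypothetical positions of the probes picked by the analysis.
significant = [1, 3, 5, 7]

replicated = select_probes(significant)             # intended lookup
published = select_probes(significant, offset=-1)   # every index shifted by one

for pub, rep in zip(published, replicated):
    print(f"{pub:<12} {rep}")
```

Every ID in the shifted list is a valid probe, so nothing fails loudly; the error only shows up when someone re-derives the list from the raw data.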

Summary of the Duke case

  • A total of 162 co-authors
  • 40 papers
  • Two-thirds are partially or completely retracted

IOM guidelines on translational omics

  • Data/metadata used to develop test should be made publicly available

  • The computer code and fully specified computational procedures used for development of the candidate omics-based test should be made sustainably available

  • "… the computer code … will encompass all of the steps of computational analysis, including all data preprocessing steps … All aspects of the analysis need to be transparently reported"

 

Why do reproducibility issues arise?

PubMed stats: Reproducible Research

PubMed stats: Retraction

The cost of reproducibility

Irreproducibility estimates range from ~51% to 89%

Cost of irreproducibility

Why do reproducibility questions arise?

Patterns in the noise

  • Humans are good at recognizing patterns

Human beings do not have very many natural defenses. We are not all that fast, and we are not all that strong. We do not have claws or fangs or body armor. We cannot spit venom. We cannot camouflage ourselves. And we cannot fly. Instead, we survive by means of our wits. Our minds are quick. We are wired to detect patterns and respond to opportunities and threats without much hesitation.

  • Nate Silver “The Signal and the Noise: Why So Many Predictions Fail–but Some Don't” 2015

https://www.amazon.com/Signal-Noise-Many-Predictions-Fail-but/dp/0143125087

Patterns in the noise

Irreproducibility in high-throughput biology

  • Our intuition about patterns deteriorates quickly as the dimensionality of the data increases
  • We rely on computation to uncover patterns
  • P-values, the ‘gold standard’ of statistical validity, are not as reliable as many researchers assume.

Effect size: the chance of being wrong

The chance of being wrong

  1. Remember that the more hypotheses are tested, and the less selection goes into choosing them, the more likely it is that you are looking at noise

  2. Bigger samples are better

  3. Small effects are to be distrusted

  4. Multiple sources and types of evidence are desirable

  5. Evaluate literatures not individual papers

  6. Trust empirical papers which test other people's theories more than empirical papers which test the author's theory

  7. As an editor or referee, don't reject papers that fail to reject the null
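
Point 1 can be illustrated with a small simulation. This is a sketch under a simplifying assumption: every null hypothesis is true and the tests are independent, so each p-value is drawn uniformly from [0, 1]. The function names and the 0.05 threshold are choices made for this example.

```python
import random


def count_false_positives(n_tests, alpha=0.05, seed=1):
    """Count p-values below `alpha` when every null hypothesis is true."""
    rng = random.Random(seed)
    # Under a true null, a well-calibrated p-value is uniform on [0, 1].
    p_values = [rng.random() for _ in range(n_tests)]
    return sum(p < alpha for p in p_values)


def prob_at_least_one(n_tests, alpha=0.05):
    """Probability of one or more false positives among independent tests."""
    return 1 - (1 - alpha) ** n_tests


for n_tests in (1, 20, 1000, 20000):
    hits = count_false_positives(n_tests)
    print(
        f"{n_tests:>6} tests: {hits} 'significant' by chance, "
        f"P(at least one) = {prob_at_least_one(n_tests):.3f}"
    )
```

With no real effect anywhere, the number of "significant" results still grows in proportion to the number of tests, which is why untargeted screens need more stringent thresholds.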

The choice of being right

  • Large-scale collaborative research

  • Adoption of replication culture

  • Registration (of studies, protocols, analysis codes, datasets, raw data, and results)

  • Sharing (of data, protocols, materials, software)

  • More appropriate statistical methods

  • Standardization of definitions and analyses

  • More stringent thresholds for claiming "breakthroughs"

Understanding the p-value

  1. P-values can indicate how incompatible the data are with a specified statistical model

  2. P-values do not measure the probability that the studied hypothesis is true, or the probability that the data were produced by random chance alone

  3. Scientific conclusions and business or policy decisions should not be based only on whether a p-value passes a specific threshold

  4. Proper inference requires full reporting and transparency

  5. A p-value, or statistical significance, does not measure the size of an effect or the importance of a result

  6. By itself, a p-value does not provide a good measure of evidence regarding a model or hypothesis
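
Statement 5 can be checked with simple arithmetic. The sketch below assumes the simplest textbook setting, a two-sided z-test for a mean difference with known standard deviation; the function name and the numbers are illustrative.

```python
import math


def two_sided_p(mean_diff, sd, n):
    """Two-sided z-test p-value for a mean difference with known `sd`."""
    z = mean_diff / (sd / math.sqrt(n))
    # 2 * (1 - Phi(|z|)) written with the complementary error function.
    return math.erfc(abs(z) / math.sqrt(2))


# A negligible effect (1% of one standard deviation) in a huge sample.
tiny_effect = two_sided_p(mean_diff=0.01, sd=1.0, n=1_000_000)

# A large effect (half a standard deviation) in a small sample.
large_effect = two_sided_p(mean_diff=0.5, sd=1.0, n=10)

print(f"tiny effect, n = 1,000,000: p = {tiny_effect:.2e}")
print(f"large effect, n = 10:       p = {large_effect:.3f}")
```

The negligible effect gets the far smaller p-value, purely because of sample size, so the p-value should be reported together with the effect size and its uncertainty.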

P-value warning: consult a statistician before the experiment

Current status of reproducibility

Focus on preclinical research

NIH focus on openness

NSF stance on openness

Reproducibility initiatives

Reproducibility guidelines

  • ARRIVE – Animal Research Reporting of In Vivo Experiments
  • CONSORT – Consolidated Standards of Reporting Trials
  • SPIRIT – Standard Protocol Items: Recommendations for Interventional Trials
  • STROBE – Strengthening the Reporting of Observational Studies in Epidemiology
  • (STARD) TRIPOD – Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis
  • REMARK – REporting recommendations for tumour MARKer prognostic studies

http://www.equator-network.org/ - over 300 reporting guidelines

What can we do?

Tools to enhance reproducibility

Flavors of reproducibility

  • Empirical reproducibility

  • Computational reproducibility

  • Statistical reproducibility

Steps in reproducible research

The most important is the mindset, when starting, that the end product will be reproducible.

– Keith Baggerly

  • Experimental design
  • Data generation
  • Data analysis
  • Results interpretation
  • Dissemination of results

Common approach

write report around results

Problems

  • With point-and-click, there’s no way to record/save the steps that generated the (copy/pasted) results

  • Data files are kept separately from the analysis code, and from reports

  • After modifications of one of the files, it becomes unclear which version corresponds exactly to the reported results

  • Every time something changes, you have to regenerate the figures/results/reports by hand – very time consuming

Data organization

in spreadsheets

  • Explicitly import text data as text, numeric as numeric, etc.

  • One thing in a cell (avoid comments, color coding, etc.)

  • Choose descriptive variable names. Use "_" instead of spaces. Be consistent

  • Save the data as CSV, comma-separated values

  • Avoid calculations

http://kbroman.org/dataorg/
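
A short sketch of the first and third points: read every column as text unless it is explicitly declared numeric, and normalize column names. The sample data, the column names, and the helpers `clean_name` and `read_table` are made up for this example. Gene symbols such as MARCH1 and SEPT2 are used because spreadsheet programs are known to convert them to dates when types are guessed.

```python
import csv
import io

# Hypothetical CSV export; in practice this would be a file on disk.
RAW = """Sample ID,Gene Name,Expression Level
S1,MARCH1,7.25
S2,SEPT2,6.10
"""


def clean_name(name):
    """Lower-case a column name and replace runs of spaces with '_'."""
    return "_".join(name.strip().lower().split())


def read_table(text, numeric_columns):
    """Read CSV text, keeping columns as text unless listed as numeric."""
    reader = csv.reader(io.StringIO(text))
    header = [clean_name(column) for column in next(reader)]
    rows = []
    for values in reader:
        row = dict(zip(header, values))   # everything starts as text
        for column in numeric_columns:
            row[column] = float(row[column])
        rows.append(row)
    return rows


rows = read_table(RAW, numeric_columns=["expression_level"])
for row in rows:
    print(row)
```

Declaring the types in code, rather than letting a tool guess them, keeps identifiers intact and documents what each column is supposed to contain.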

Better approach

write report that generates results

  • The report is automated via code

  • Data is attached to the well-documented code

  • History of any changes should be preserved

  • The final report should be self-sufficient and reproducible with a single command
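
As a minimal sketch of a report that generates its own results, the script below computes the numbers and writes them into the text in one step, so nothing is copied by hand. The data, the group names, and the function `build_report` are hypothetical.

```python
import csv
import io
import statistics

# Hypothetical data; a real project would read a versioned data file.
DATA = """group,value
control,4.0
control,5.0
control,6.0
treated,7.0
treated,8.0
treated,9.0
"""


def build_report(csv_text):
    """Compute per-group summaries and return the report as plain text."""
    groups = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        groups.setdefault(row["group"], []).append(float(row["value"]))
    lines = ["Summary by group", ""]
    for name in sorted(groups):
        values = groups[name]
        mean = statistics.mean(values)
        lines.append(f"{name}: n={len(values)}, mean={mean:.2f}")
    return "\n".join(lines)


report = build_report(DATA)
print(report)
```

When the data change, rerunning the script is the only step needed to bring every number in the report up to date.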

The aims of computational reproducibility

Know your Unix

  • Unix is a family of operating systems and environments that exploits the power of linguistic abstractions to perform tasks

  • Unix users spend a lot of time at the command line

  • In Unix, a command is worth a thousand mouse clicks

Makefiles

reproducibility on the command line

  • Make is a tool which controls the workflow of generating target/result files from the dependencies/source files

  • Automates/documents a workflow

  • Intelligently handles the dependencies among data files, code

  • Accounts for the updates in data, code
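
The core rule behind this behaviour can be sketched in a few lines: a target is rebuilt when it is missing or older than any of its dependencies. The code below is a simplified Python illustration of that rule, not Make itself; the file names and the functions `needs_rebuild`, `build`, and `demo` are assumptions made for the example.

```python
import os
import tempfile


def needs_rebuild(target, dependencies):
    """Return True if `target` is missing or older than any dependency."""
    if not os.path.exists(target):
        return True
    target_time = os.path.getmtime(target)
    return any(os.path.getmtime(dep) > target_time for dep in dependencies)


def build(target, dependencies, recipe):
    """Run `recipe` only when the target is out of date."""
    if needs_rebuild(target, dependencies):
        recipe()
        return "rebuilt"
    return "up to date"


def demo():
    """Build a result file three times and report what happened each time."""
    with tempfile.TemporaryDirectory() as tmp:
        data = os.path.join(tmp, "data.csv")
        result = os.path.join(tmp, "result.txt")
        with open(data, "w") as handle:
            handle.write("x\n1\n2\n")

        def recipe():
            with open(data) as src, open(result, "w") as out:
                out.write(f"{len(src.readlines()) - 1} rows\n")

        statuses = [build(result, [data], recipe)]      # target missing
        statuses.append(build(result, [data], recipe))  # nothing changed
        # Simulate an edit by making the data file newer than the result.
        newer = os.path.getmtime(result) + 10
        os.utime(data, (newer, newer))
        statuses.append(build(result, [data], recipe))  # dependency updated
        return statuses


print(demo())
```

Make applies the same check across a whole graph of targets, so only the steps affected by a change are rerun.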

Automate everything

  • R – free/open source programming language

  • Runs on Windows, Mac, and Linux

  • Extensible with a very large collection of actively developed packages

  • Excellent graphics & report-creating capabilities

R is reimagined with RStudio

Self-documenting code

  • A report containing a stream of text and code chunks

  • Each code chunk loads data, computes results, shows figures

  • Each text chunk explains how the code chunks work

  • The resulting report is human- and machine-readable

 

 

 

 

  • Donald E. Knuth "Literate Programming"

http://comjnl.oxfordjournals.org/content/27/2/97.short
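
The idea of a report made of text and code chunks can be sketched with a toy "weave" step. This is an illustration only, not how any particular tool is implemented: it borrows the `<<>>=` and `@` chunk delimiters used by Sweave, runs each chunk, and places the printed output under the code. The function `weave` and the sample document are hypothetical.

```python
import contextlib
import io

# A tiny literate document: prose plus one code chunk.
SOURCE = """\
Mean of three measurements.
<<>>=
values = [2.0, 4.0, 9.0]
print(sum(values) / len(values))
@
The mean above is recomputed every time the report is built.
"""


def weave(source):
    """Run each code chunk and place its printed output after the chunk."""
    namespace = {}   # shared by all chunks, like one interpreter session
    output = []
    chunk = None     # None while reading prose, a list while inside a chunk
    for line in source.splitlines():
        if chunk is None and line.strip() == "<<>>=":
            chunk = []
        elif chunk is not None and line.strip() == "@":
            buffer = io.StringIO()
            with contextlib.redirect_stdout(buffer):
                exec("\n".join(chunk), namespace)
            output.extend("    " + code for code in chunk)
            output.extend("    ## " + res for res in buffer.getvalue().splitlines())
            chunk = None
        elif chunk is not None:
            chunk.append(line)
        else:
            output.append(line)
    return "\n".join(output)


report = weave(SOURCE)
print(report)
```

Because the numbers in the output come from executing the chunk, the text and the results cannot drift apart.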

Evolution of literate programming

  • HTML - HyperText Markup Language, used to create web pages. Developed in 1993

  • LaTeX – a typesetting system for production of technical/scientific documentation, PDF output. First released in the 1980s; the current version, LaTeX2e, dates from 1994

  • Sweave – a tool that allows embedding of the R code in LaTeX documents, PDF output. Developed in 2002

  • reStructuredText - a markup syntax that plays well with Python and Sphinx documentation. Developed in 2002

  • Markdown – a lightweight markup language for plain text formatting syntax. Easily converted to HTML. Developed in 2004

R Markdown basics

Literate programming with knitr

  • knitr – an R package for dynamic report generation from documents written in R Markdown

  • Supports RMarkdown, LaTeX, MathJax. PDF, HTML, DOCX output

  • Developed in 2012 by Yihui Xie

 

http://yihui.name/knitr/

Literate programming with knitr

  • Code chunks are delimited by ```; inline code is allowed

  • Graphics: code-generated figures and external images

  • Tables: code-generated and in R Markdown format

  • Caching long code chunks

  • Code chunks and results output fully customizable

Literate programming with knitr

  • Mix markdown with code

Literate programming with knitr

  • "Knit" a report with one command

Literate programming in other languages

Keeping history of changes

Version control – what you did and when

Git and GitHub – version control system

  • Each project stored in its own repository

  • Keep history of changes – track what you did

  • Ability to go back if something breaks

  • Branch out, go creative, then merge or revert the changes

  • Collaborate through merging changes from multiple people

Version control – what you did and when

  • Git is a command line tool
  • GitHub.com is a web-based storage for your project repositories

 

  • git add – start tracking a file, or stage its changes for the next commit
  • git commit – record a snapshot of the staged changes
  • git push/pull – send/get changes to/from GitHub
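
The local part of this workflow can be sketched by driving Git from a script. The example assumes `git` is installed and on the PATH (it returns `None` otherwise), works in a temporary directory, and uses a made-up file name, author identity, and commit message. `git push` and `git pull` are left out because they need a remote repository.

```python
import os
import shutil
import subprocess
import tempfile


def git(args, cwd):
    """Run one git command in `cwd` and return its standard output."""
    # Identity is passed per command so the sketch ignores global settings.
    base = [
        "git",
        "-c", "user.name=Example Author",
        "-c", "user.email=author@example.org",
        "-c", "commit.gpgsign=false",
    ]
    done = subprocess.run(
        base + args, cwd=cwd, check=True, capture_output=True, text=True
    )
    return done.stdout


def demo():
    """Create a repository, commit one file, and return the one-line log."""
    if shutil.which("git") is None:
        return None  # git is not available on this machine
    with tempfile.TemporaryDirectory() as repo:
        git(["init"], repo)
        with open(os.path.join(repo, "analysis.py"), "w") as handle:
            handle.write("print('hello')\n")
        git(["add", "analysis.py"], repo)                   # stage the file
        git(["commit", "-m", "Add analysis script"], repo)  # take a snapshot
        return git(["log", "--oneline"], repo)


print(demo())
```

Each commit adds one line to the log, which is the record of what was done and when.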

Reproducibility for other languages

IPython/Jupyter Notebooks

  • Combine text, equations, code, and graphics
  • Markdown support
  • Support for multiple languages (>40)

 

https://jupyter.org/

Reproducibility for the whole project

Docker – a container (an envelope) for the whole project environment, similar to a lightweight virtual machine

  • OS-independent, portable application images
  • Preserves all application dependencies
  • Easy to distribute

 

https://www.docker.com/

Question your project

  • Are the tables and figures reproducible from the code and data?

  • Does the code actually do what you think it does?

  • In addition to what was done, is it clear why it was done? (e.g., how parameter settings were chosen?)

  • Is your code scalable to accommodate more data/methods?

Six degrees of reproducibility

  1. The results cannot be reproduced

  2. The results cannot seem to be reproduced

  3. Reproducibility requires extreme effort

  4. Reproducibility requires considerable effort

  5. Easy reproducibility, but requires some proprietary software packages (MATLAB, SAS, etc.)

  6. The results can be easily reproduced by an independent researcher with at most 15 min of user effort, requiring only standard, freely available tools (C compiler, R, Python, etc.)

Reproducibility 101

  • Begin with the final product in mind

  • Use literate programming (self-documenting code)

  • Keep history of changes via code versioning and sharing

  • Get basic statistics right

  • Set stringent cutoffs, correct p-values for multiple testing

  • Be critical, consider batch effects, visualize, do sanity checks, use random controls, cross-validation

  • Follow reporting guidelines
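
For the multiple-testing item, two common corrections can be sketched as follows. The slides do not name a method; Bonferroni and Benjamini-Hochberg are chosen here as standard examples, and the p-values are made up.

```python
def bonferroni(p_values):
    """Bonferroni-adjusted p-values: multiply by the number of tests."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


raw = [0.01, 0.04, 0.03, 0.005]  # hypothetical raw p-values
print("Bonferroni:        ", [round(p, 3) for p in bonferroni(raw)])
print("Benjamini-Hochberg:", [round(p, 3) for p in benjamini_hochberg(raw)])
```

Bonferroni controls the chance of any false positive and is the stricter of the two; Benjamini-Hochberg controls the expected fraction of false discoveries and is the usual choice for large screens.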

Learn more

Nature “Statistics for Biologists”

Statistics Notes in the British Medical Journal

Reproducible research made simple

Practical reproducibility

Acknowledgements

Thank you