This document contains step-by-step instructions for students taking the reproducibility class assignment.
You should already have completed the pre-assignment survey. If not, please do so now before you read this document any further.
Before you begin the assignment, we strongly recommend that you read this article which reports a study of analytic reproducibility in a sample of articles published in the psychology journal Cognition.
To give you a general overview of what you are aiming for, here is an example of a completed reproducibility report from an empirical project we ran last year. Note that a few details might be different, but generally this is what we expect your reproducibility report to look like (when the .Rmd is knit and rendered to HTML).
To complete the assignment you will need to submit a link to a Github repository containing the following:

* the ‘pilot’ version of your assigned reproducibility check (i.e., the version that you completed independently)
* the ‘copilot’ version of your assigned reproducibility check (i.e., the version that you worked on with your copilot)
Note that you should not contact the authors of the original article as part of this assignment - the goal of this exercise is to see how often we can independently reproduce outcomes reported in published psychology papers.
We will also send you a short survey after you complete the assignment which will help us to assess how effective it is as a teaching tool. We’d be very grateful if you can share your feedback with us in these surveys.
Any R code that you may need to run will appear below in grey boxes.
If you encounter any problems, please do not hesitate to e-mail me (tom.hardwicke[@]charite.de).
Good luck!
The principal aim of a reproducibility check is to recover a pre-defined subset of target outcomes reported in the original article by repeating the original analysis on the original data. To do this you should use the provided data files, information about the original analysis provided in the original article, and any other additional documentation (e.g., codebook or analysis scripts). It is not our goal to attempt or suggest alternative analyses that we may think are superior - that is outside the scope of this assignment.
In order to reduce the likelihood of making errors ourselves, we will employ a co-piloting system in which every reproducibility check involves the input of at least two members of the class. The ‘pilot’ will make the first attempt to reproduce the target outcomes. The ‘co-pilot’ will then check the analysis of the pilot and discuss it with them. In the copiloting stage, you work together to try and reproduce the target outcomes (if you have not already done so) and improve the quality and reproducibility of your own analysis code. For example, the copilot may notice that the existing code has not been well annotated, and you can work together to improve that. You must submit two versions of the reproducibility report: the one generated at the pilot stage and the one generated at the copiloting stage.
The class has been divided into four groups and each group has been assigned one article. To find out which article you have been assigned, please check this spreadsheet. You will find a link to a Github repository which has all the raw materials you need to conduct your reproducibility check.
The Github repository for your article should contain: a pdf of the original article, a targetOutcomes.md file specifying which outcomes you should try to reproduce, and a data folder containing the original data file or files.
You should begin by forking the repository so you have your own version to work on. When you submit your assignment, you just need to enter the link to your forked version in the column “Student forked repository” in this spreadsheet.
Your goal is to try and recover all of the target outcomes (outlined in the file ‘targetOutcomes.md’) by repeating the analyses as described in the original article. You do not have to read all of the original article, but make sure you are generally familiar with the topic and study design. Also check the whole methods section and, if relevant, supplementary materials, for information about the analysis that was employed. It is common for authors to mention exclusion criteria early in a results section or even in the method section and this may not have been included in the targetOutcomes.md file.
If you are struggling to figure out what the original authors did because of ambiguous or absent information then do not engage in excessively lengthy guesswork. It is ok to try a few different things if it is not too time-consuming - for example, you might try a Student’s t-test and a Welch t-test if you suspect the authors have just neglected to report which one of these tests they used. If you encounter unresolvable issues, then in the conclusion section of your report, write down exactly what the issue is and what information you think you would need to resolve it.
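As a minimal sketch of the t-test example above (the data frame ‘d’ and the columns ‘score’ and ‘group’ are hypothetical placeholders for your actual data):

t.test(score ~ group, data = d, var.equal = TRUE)  # Student's t-test (assumes equal variances)
t.test(score ~ group, data = d)                     # Welch's t-test (R's default)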
Below is a practical, step-by-step guide for running your reproducibility check.
You must run your reproducibility check in R. You can download the latest version from here.
You will need to use the free R Studio software which you can download here.
We will be using several of R Studio’s built-in features, such as R Markdown.
We will be using Github for version control and collaboration. You can find all of the reproducibility check repositories (repos) here.
We also highly recommend using the free Github Desktop software which you can download here.
You can find plenty of guides for using Github online. Here is a good place to start.
We will not be doing anything super fancy with Github so do not panic if you are not familiar with this tool. If you run into trouble, just ask for help.
If you are an advanced user, feel free to use your own git workflow. However, the rest of this guide assumes you are using the Github Desktop software.
We have put together a simple R package (‘ReproReports’) that contains a custom R Markdown template and a couple of custom functions for you to use when preparing your reproducibility report. To install the ReproReports package, you will first need to install another package called devtools:
install.packages("devtools")
Next, run the following command to install ReproReports directly from our project Github page:
devtools::install_github("TomHardwicke/ReproReports")
To load the package:
library("ReproReports")
To check that this has installed correctly, click on ‘file’, ‘new file’, and then ‘R markdown…’ in R Studio. Select ‘from template’ and you should see an option “Reproducibility Report”. If not, try restarting R Studio. If you still can’t see the template then the installation has gone wrong somewhere and you may need to ask for help. Otherwise, you’re good to go!
Once you have found your allocated article repository via this spreadsheet, you need to fork the repository. You can do this by opening the repo on the Github website and clicking on the ‘fork’ button in the top-right. It may take a few minutes for the repo to be forked over to your account. When it has finished you need to clone the repository to your personal computer. The quickest way to do this is to click the green ‘clone or download’ button and then click ‘Open in Desktop’ (assuming you have installed Github Desktop). Make sure you are in the forked repo and not the original master branch! The files will now be downloaded to your computer and you should see the repo in the Github Desktop software.
Open up R Studio. Click ‘file’, ‘new project’, ‘existing directory’ and then browse to wherever you cloned the repo to on your computer. Now click ‘create project’. If the R Studio project is set up correctly then you should see the files from your repo listed in the files section.
Github is a service built around a piece of version control software (the underlying system is simply called ‘git’). Basically you can use the software to regularly create ‘snapshots’ of your files so you have a detailed history of all of the changes you’ve made over the course of your project. If used regularly it can be an extremely helpful tool for reproducibility, error control, and collaboration.
It is good practice to ‘commit’ your changes fairly regularly. Each time you commit changes a new snapshot is saved, and you can ‘roll back’ to an earlier snapshot if you realise later on that you’ve made a mistake you need to revert.
To commit and sync after you’ve made some changes, open up Github Desktop and select your repo. Where it says ‘summary’ and ‘description’ you can enter some information about this commit so you can work out what you did later on. For example, if you’ve just created the .Rmd document for your reproducibility report, you might call the commit ‘created report’ and in the description put something like ‘created .Rmd file for reproducibility report’ (just a brief summary is often sufficient). Now click on ‘commit to master’. Note that at this point you have just committed to your repo on your computer - the changes are just saved locally. It’s a good idea to now click on ‘sync’ in the top right, which will back up your changes on the Github website.
Ok you’re almost ready to start with the actual reproducibility check!
If you are the pilot (i.e., you are running the reproducibility check independently for the first time) you will start by opening up a new R Markdown file. If you are at the co-piloting stage, the easiest approach is to duplicate the pilot’s report (call the new file ‘copilotReport.Rmd’) and make changes to that.
R Markdown enables a coding/analysis approach called ‘literate programming’. This is the idea that we interleave actual code with plain language commentary explaining what we are doing in sufficient detail such that someone who does not understand the code itself can still figure out what we have done (including our future selves). You should aim to provide detailed commentary throughout your report.
If you are unfamiliar with R Markdown, there’s plenty of information available here.
You may also find this ‘cheatsheet’ useful.
To run code that you have entered in ‘code chunks’ just click the green arrow.
I have put together a custom R Markdown template so our reproducibility reports are in a fairly standardised format. To open the template, click on ‘file’, ‘new file’, and then ‘R markdown…’ in R Studio. Select ‘from template’ and you should see an option “Reproducibility Report” if the ReproReports package installed correctly (see above). Click on OK.
The R Markdown file that opens will begin with a ‘yaml’ header between two sets of three dashed lines. Leave this section as it is.
Below that you’ll see a code block referring to various details about this reproducibility check e.g., articleID, reportType, pilotNames etc.
Please enter the relevant details. For example, enter the article ID which might be something like “1-1-2015”. Note that we are going to try and get a reasonable estimate of how long these reports take us, so keep your eye on the clock whenever you work on the report. It doesn’t have to be spot on - we don’t want to disrupt people’s workflows by having them time everything with a stopwatch. Just figure it out approximately. If multiple people are working on a report then you should keep adding the time spent to the relevant counter each time you submit a pull request.
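To illustrate, the start of this details chunk might look something like the sketch below (the values are made up, and the template may include further fields, such as the co-pilot names and time-spent counters, which you should fill in as labelled):

articleID <- "1-1-2015"  # the ID of your assigned article
reportType <- "pilot"    # 'pilot' or 'copilot'
pilotNames <- "Jane Doe" # hypothetical pilot name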
When co-piloting, you must change the report type in copilotReport.Rmd from ‘pilot’ to ‘copilot’.
Throughout the template I included some guidance text in square brackets that you should either replace or delete before submitting your final report. Anything not in square brackets should remain in your report.
Save the R Markdown file with the name “pilotReport.Rmd” or “copilotReport.Rmd” depending on which stage you are at.
Before we get into the details of the R Markdown template, let’s go and have a look at what is available in the repo. You should have a pdf of the article, a targetOutcomes.md file (.md stands for ‘markdown’), and a data folder containing a data file or files.
The targetOutcomes.md file can be opened in any text editor, or you can view it in the repo on Github. It outlines exactly which values in the paper you are to try and reproduce.
Please note you will likely need more information than is included in the targetOutcomes.md file in order to run your reproducibility check. For example, there may be essential pre-processing steps that are detailed in the article, but are not included in the targetOutcomes.md file.
You should read relevant parts of the article and develop a good understanding of the methods employed by the original authors. Make sure you download any supplementary information files to see if they contain additional important details.
Please note: you must not directly edit the original data files. This cannot be emphasised enough! The original data file must remain as it was when you forked the repo. This is so that others who work on the project can reproduce everything you have done from scratch. If you need to make manual edits to a data file (and this should be avoided unless absolutely necessary), you should save an additional file (see below for details). If you accidentally make changes to the original data file, then you can roll back these changes using Github (this is why it is important to do regular commits!).
You will need to fill in the Methods summary and Target outcomes section. You need to write the methods summary from scratch, but you can copy and paste the target outcomes from the targetOutcomes.md file. The methods summary only needs to be brief but capture all the important details that relate to the target outcomes.
The remainder of the report is divided into 5 key stages outlined below.
Load any necessary R packages. Some useful ones are already listed and you can add any additional ones that you need. It’s helpful to add a comment saying what the package is for.
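For example, assuming you use the tidyverse and knitr (swap in whichever packages your analysis actually needs):

library(tidyverse) # for data munging (dplyr, tidyr) and plotting (ggplot2)
library(knitr)     # for rendering tables in the knitted report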
Load data from the file or files in the data folder. You may need different functions for different types of file.
This cheatsheet may be helpful.
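As a rough sketch, the reading function depends on the file type (the file names below are hypothetical; use the actual file names in your repo’s data folder):

library(readr)  # for reading csv files
library(readxl) # for reading Excel files
library(haven)  # for reading SPSS (.sav) files

d <- read_csv("data/data.csv")       # e.g., a csv file
# d <- read_excel("data/data.xlsx")  # e.g., an Excel file
# d <- read_sav("data/data.sav")     # e.g., an SPSS file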
Mung/wrangle (organise) the data into the ‘Tidy’ format.
This cheatsheet may be helpful.
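As a hypothetical illustration of tidying (the data frame and column names here are made up), a ‘wide’ data frame with one column per condition can be reshaped so that each row is a single observation:

library(tidyr)

d_wide <- data.frame(subject = 1:3,
                     condition_A = c(10, 12, 11),
                     condition_B = c(14, 13, 15))

d_tidy <- pivot_longer(d_wide,
                       cols = c(condition_A, condition_B),
                       names_to = "condition",
                       values_to = "score")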
To the greatest possible extent you should try and conduct data munging operations programmatically in R. In some cases, you may need to make manual adjustments to the data file in, for example, Excel. If you have to do this, you should detail the steps you have taken in your R Markdown report, and save an additional data file with the name “data_manualClean”. DO NOT EDIT THE ORIGINAL DATA FILE.
This section is further sub-divided into pre-processing, descriptive statistics, and inferential statistics. Work systematically through the target outcomes, attempting to reproduce each reported outcome with the analyses described in the original article (and any supporting documents). Make sure you write down exactly which target outcome you are trying to reproduce (ideally quoting verbatim from the original article) before the analysis code so it is easy to compare to the output.
For every single value in the target outcomes you must run the reproCheck() function to explicitly compare the values from the published article and the values from your analysis. This includes values like degrees of freedom or sample sizes.
The reproCheck() function is part of the ReproReports package built specially for this project. As input, it takes the “reportedValue” (i.e., a target value from the original article) and the “obtainedValue” (i.e., the value obtained in your analysis), compares the two and calculates the percentage error between them. The function automatically works out if an error has occurred and classifies them as ‘minor’ (less than 10% difference) or ‘major’ (greater than 10% difference). We need to use this for every single value because we need a complete record of every value we have checked, even if it is a match.
You also need to specify the “valueType”, to indicate, for example, whether you are checking a mean, standard deviation, t value, etc. There is a list of pre-specified values which you can find by running ?reproCheck. If none of these options seem to fit then just use “other”. If you are checking a p value and enter ‘p’ as the valueType, the function will automatically check to see if there is a Decision Error (see the pre-registered protocol if you are not sure what this means).
There is another parameter called “eyeballCheck” which sounds weird. Normally you won’t need this and by default it is set to NA so you can ignore it. However, there are occasions where the original report does not contain an exact value and instead reports a relationship to a threshold, for example p < .05 or t < 1. This is when we need to “eyeball” the comparison instead, i.e., manually check that the obtainedValue falls in the correct interval indicated by the reported value. If you need to do this, then you should enter eyeballCheck = TRUE if the values seem to match OK. If the value does not fall within the correct interval, then you should run a regular reproCheck() using the threshold as the reported value. For example, if the authors report p < .01, and you obtain p = .26, you should run reproCheck(reportedValue = '.01', obtainedValue = .26, valueType = 'p'), which will record a Decision Error in this case. This can sometimes get a bit complicated so please check with Tom to discuss any issues or edge cases.
The reproCheck() function outputs two things. Firstly, a short sentence is printed telling you the outcome of the comparison - specifically the values compared, the percentage error, and whether they match or whether there was an error. Secondly, the function stores its output in something called the “reportObject”. A blank reportObject is created at the start of your report - it is already included in the template and looks like this:
# Prepare report object. This will be updated automatically by the reproCheck function each time values are compared
reportObject <- data.frame(dummyRow = TRUE, reportedValue = NA, obtainedValue = NA, valueType = NA, percentageError = NA, comparisonOutcome = NA, eyeballCheck = NA)
Each time you run the reproCheck() function, you must assign the output to the reportObject so it can be updated (example below). By the end of the report, all of the comparisons are stored and written out as a .csv file.
Note that when you enter the obtainedValue, you should use the R variable containing the value where possible, rather than writing out the value manually. This helps to avoid typos. The function will also automatically round the obtainedValue to the same number of decimal places as the reportedValue.
When you enter the reportedValue, it must be entered as a character e.g., ‘21’. Don’t worry if you accidentally enter it as a number, 21; the function will tell you off and ask you to do it properly. It is important not to get the obtainedValue and reportedValue mixed up so I suggest you write out the argument name in full, as shown below.
An example where we check a mean and there’s a major error:
condition_mean <- mean(c(1,2,3,4))
reportObject <- reproCheck(reportedValue = '3.45', obtainedValue = condition_mean, valueType = 'mean')
An example where we check a standard deviation and it’s a match:
this_sd <- 15.63
reportObject <- reproCheck(reportedValue = '15.63', obtainedValue = this_sd, valueType = 'sd')
An example where we check a t value and there is only a minor error (because the percentage error is below 10%):
this_t <- 1.2
reportObject <- reproCheck(reportedValue = '1.3', obtainedValue = this_t, valueType = 't')
Here is an example where there is a decision error for the p-value:
a_p_value <- 0.048
reportObject <- reproCheck(reportedValue = '.054', obtainedValue = a_p_value, valueType = 'p')
An example where the p-value is reported as “p <.05” so we have to do an eyeball check. In this case we can see that the description <.05 is accurate, so we say eyeballCheck = TRUE.
a_significant_p_value <- .012
reportObject <- reproCheck(reportedValue = '<.05', obtainedValue = a_significant_p_value, valueType = 'p', eyeballCheck = TRUE)
Another example where the p-value is reported as “p <.05” so we have to do an eyeball check. But in this case we can see that the description <.05 is NOT accurate, so we say eyeballCheck = FALSE.
a_not_significant_p_value <- .24
reportObject <- reproCheck(reportedValue = '<.05', obtainedValue = a_not_significant_p_value, valueType = 'p', eyeballCheck = FALSE)
Note that there is a special, fourth type of error which does not involve comparing numerical values. The INSUFFICIENT INFORMATION ERROR applies to situations where the data analysis procedure reported in the original article (and any supporting documentation) is so unclear or incomplete that you cannot conduct your reproducibility check (or some aspect of it). Note that if the provided information is ambiguous and you are unsure what the original analysis entailed, you should not attempt to engage in lengthy guesswork about what the original authors did.
There is no R function for these situations. You should simply type INSUFFICIENT INFORMATION ERROR in block capitals and then underneath provide commentary in as much detail as possible about what the issue is. In the conclusion part of the report you should tally up the number of these errors and update the Insufficient_Information_Errors variable accordingly.
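For example, if you encountered two such situations across the whole report, the tally at the end might look like this (a minimal sketch; the variable itself is already defined in the template):

Insufficient_Information_Errors <- 2 # total number of INSUFFICIENT INFORMATION ERRORs encountered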
There are a few aspects to reporting your conclusions. Firstly, you should provide a verbal summary of the report. Identify and describe any issues you encountered in as much detail as possible.
Secondly, fill in the code chunk in the conclusion section with relevant information. You can wait to do this until the very end of the reproducibility check (i.e., after author assistance). If author assistance was provided, change the Author_Assistance variable to TRUE. If the reproducibility check was a success then you can ignore the remaining variables. However, if you encountered at least one Major Error or Decision Error, then the reproducibility check was a failure and you should add the relevant details. Firstly, add information about the potential cause of the reproducibility issues you encountered to the locus_ variables. Then specify (TRUE or FALSE) whether you think the original conclusions may be seriously affected by the reproducibility issues you encountered.
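As a rough, hypothetical illustration of this chunk (the exact variable names for the locus_ items are defined in the template itself):

Author_Assistance <- FALSE # change to TRUE if the original authors provided assistance
# if the check failed (at least one Major Error or Decision Error), also fill in the
# locus_ variables and state (TRUE/FALSE) whether the original conclusions may be
# seriously affected, as described above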
Because there is quite a bit of subjective judgement involved here, feel free to discuss the issues with other members of the team.
The remaining code chunks automatically collate information from across the report and output two .csv files.
The final step in preparing your report is to ‘knit’ it. This produces a nice looking html document. You can find the knit button towards the top of the window next to a blue ball of string. When you click ‘knit’, R Studio will show you the html version of your report. Some of the formatting might look a little strange. In which case you should click on ‘open in browser’. Things should look ok now.
If you decide you need to make some changes that’s fine. Just remember to knit your report again right before you submit it so that the html file is up-to-date.
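As an aside (not required), you can also knit from the R console rather than via the knit button, assuming the rmarkdown package is installed:

rmarkdown::render("pilotReport.Rmd") # should produce pilotReport.html alongside the .Rmd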
To submit your report, you should issue a pull request. This means you are requesting that the author of the original master repo (Tom) merges the changes you have made in your fork with the master. To issue the pull request, open up the Github Desktop software and select your repo. Make sure you have committed and synced all recent changes first. Now click on the ‘pull request’ button in the top right. In the ‘description’ box, write something like ‘Pilot reproducibility check is complete’. Then click ‘send pull request’.
That’s it for now. If you are the pilot, then a co-pilot will be assigned to double check your report. If there are reproducibility issues that need resolving then we may need to contact the original authors and Tom will be in touch about that.
Here are a few additional tips for producing a top-notch reproducible report.
Describe exactly what you are doing throughout in plain language interleaved with code chunks. Try to avoid jargon and acronyms where possible (unless they are clearly defined).
It can be really useful to use quotations from the original article or associated files to illustrate exactly what the original authors say they did and what they found. To write a quotation in markdown, just use the ‘>’ symbol. For example:
“> This is a quote from the article”
will produce:
This is a quote from the article.
When quoting, make sure you note the source e.g.,
This is a quote from the article. (from Jones et al. p.18).
There are instructions for including images in the R markdown documentation here: http://rmarkdown.rstudio.com/authoring_basics.html
You could, for example, include a screenshot of a figure/table from the original article and compare/contrast it with your own findings.
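For example, you could display a screenshot saved in your repo from within a code chunk using knitr::include_graphics() (the file path below is hypothetical):

knitr::include_graphics("images/original_figure2.png") # screenshot of a figure from the original article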