Reproducible Research in R Resources

This document contains resources and learning materials to help you develop your skills in reproducible research in R.

Tip

RStudio, the company behind the RStudio IDE, has recently rebranded as Posit.

Workshops

In-Person Workshops

Data Services is running a number of virtual workshops this semester. Of particular interest will be:

  • February 23rd, 1-4 PM Data Cleaning in R
    Do you work with other people’s data? Are there times when you need to clean or reorganize these data to work for you? Join JHU Data Services for this workshop to learn how to clean data efficiently in R. To take this workshop, you will need either some basic knowledge of R or to have previously attended our Introduction to R for Absolute Beginners workshop. We will give an overview of the concepts of data cleaning and go into detail on the tidyr package. In the first part of the workshop, the instructor will walk you through the basic data cleaning steps with tidyr. In the second part, you will apply these skills to clean a dataset downloaded from Kaggle. You will have plenty of opportunities to do hands-on activities on your laptop and work on datasets provided by the instructors.

  • March 7th, 1-4 PM Data Visualization in R with ggplot2
    Learn how to create advanced visualizations in R with the ggplot2 package. We will learn the Grammar of Graphics, the design philosophy that underpins ggplot2’s layered graphics. In this workshop, we will become familiar with the ggplot2 syntax, learn how to use it to develop more complex plots, and create statistical data visualizations.

Beyond JHU:

  • Upcoming Carpentries Workshops: No R-focused online workshops are currently scheduled, but it is worth keeping an eye on the Carpentries workshops in case something pops up.

Asynchronous Workshops

Asynchronous workshops are those you can complete on your own time.

Learning Materials

Computational Notebooks - R Markdown and Quarto

There are two frameworks for literate programming in R: R Markdown and Quarto. R Markdown is the older, more established framework for wrapping your code and output into an expository report format. Quarto is the next-generation version of R Markdown. I’ll touch on some resources for both, but I recommend Quarto: it is a little easier to use and provides native support for Python.

Quarto

Why use Quarto instead of R Markdown?

Much of the advanced functionality in R Markdown (such as the bookdown book publishing format) has to be installed as separate packages. In Quarto, most of this functionality is built in.
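As a rough illustration of that point, a Quarto book is set up through a project configuration file rather than an add-on package. The sketch below shows a minimal `_quarto.yml`; the title, chapter file names, and theme are placeholders chosen for illustration, not taken from this document.

```yaml
# _quarto.yml: minimal book project
# (title, chapter file names, and theme are hypothetical placeholders)
project:
  type: book

book:
  title: "My Analysis Notes"
  chapters:
    - index.qmd          # a book project expects an index file as its first chapter
    - cleaning.qmd
    - visualization.qmd

format:
  html:
    theme: cosmo
```

With a file like this in the project folder, running `quarto render` from that folder should build the whole book.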

This document was created in Quarto and published to RPubs, all from within RStudio.

Tip

Start with the Quarto Tutorial - a tutorial introducing the use of Quarto in RStudio with R.

R Markdown

Learn how to use R Markdown with this excellent tutorial from RStudio.

General Reproducible Research in R

Riffomonas Project

Riffomonas is a project by Pat Schloss, a microbiologist at the University of Michigan. The project teaches fundamental reproducible data analysis skills in R through a series of fantastic YouTube videos.

RStudio Education Training

RStudio has a number of excellent introductions to working with R in RStudio. Check out the RStudio Education Beginners Tutorials.

In particular, I cannot recommend R for Data Science highly enough. It is a free, web-based textbook, and it is poorly named: it actually introduces the tidyverse, a collection of R packages designed for intuitive data cleaning, analysis, and visualization in R.

Resources

Cheatsheets

Quarto Markdown Syntax - Markdown syntax for Quarto (which is basically the same as R Markdown, with some additions).

R Markdown Cheat Sheet - a nice PDF cheatsheet of all the common R Markdown commands and arguments.

All cheatsheets available in R - This list includes every cheatsheet available from RStudio.