DATA 621 Blog 3: R Markdown - Organizing Your Code Chunks

Preparing the final project paper, with the help of my teammates and Google, I have discovered a couple of nifty tricks with R Markdown that make compiling the paper easier. They appeared as overkill at first. After all, it’s just one paper, so we might as well just format everything manually. But at the end of the project, they actually simplified things greatly over multiple drafts.

As you know the paper should not include excessive figures or tables, but secondary material can be provided in appendices. Same for R code - goes at the end of the document. Well, you can still write your R Markdown document in logical order, but display and evaluate code as needed.

First, using knitr you can set default options for all your code chunks. All future code will use these options meaning that the code will be evaluated, but nothing will be displayed.

Naming your code chunks is also a good practice. It is not critical, but makes navigation and referencing code easier. You add a name to the chunk like this: {r Chunk Name}.

knitr::opts_chunk$set(error = FALSE,     # suppress errors
                      message = FALSE,   # suppress messages
                      warning = FALSE,   # suppress warnings
                      results = 'hide',  # suppress code output
                      echo = FALSE,      # suppress code
                      fig.show = 'hide', # suppress plots
                      cache = TRUE)      # enable caching

Now let us say that you are discussing your linear model dealing with car’s speed as a predictor of stopping distance. And you are describing diagnostic plots, but do not want to include them in the paper at this point. Well, you can still put the code in this spot. With default options, nothing will be rendered in the final paper. Here I am overwriting default options, so that you can see the actual code.

library(ggfortify)
autoplot(lm(cars$dist ~ cars$speed))

Now at the end of the paper you are ready to add some plots. You can use R Markdown options to reference and display the output of other code chunks. Simply add ref.label to the code chunk options. Here I am using the following options: {r Figures, ref.label="Model Diagnostics", fig.show="as.is"}. Figures is the name of this chunk, Model Diagnostics is the name of the previous chunk that actually has the code (displayed above), and fig.show overwrites default option to display any plots. Otherwise, the code chunk is empty.

If you have multiple plots, you can reference all of them using the following syntax: {r Figures, ref.label=c("Plot 1", "Plot 2", "Plot 3"), fig.show="as.is"}.

Now how about adding all of your code to the appendix. My regular workflow is to maintain an R script with all code and then paste it into the R Markdown document, into a code chunk that is displayed, but not evaluated. However, another approach is to develop the code and descriptive text together in logical sequence and then have R Markdown gather all code at the end. Simply use the following options in an empty code chunk: {r Code, ref.label=knitr::all_labels(), echo=TRUE}. Code is the name of this chunk, echo=TRUE ensures that the code will be display, and the critical part is ref.label=knitr:all_labels(). This part gathers code from all code chunks in the document.

knitr::opts_chunk$set(echo = TRUE)
knitr::opts_chunk$set(error = FALSE,     # suppress errors
                      message = FALSE,   # suppress messages
                      warning = FALSE,   # suppress warnings
                      results = 'hide',  # suppress code output
                      echo = FALSE,      # suppress code
                      fig.show = 'hide', # suppress plots
                      cache = TRUE)      # enable caching
library(ggfortify)
autoplot(lm(cars$dist ~ cars$speed))

Very simple. Very effective. Highly recommended.

One final note, ggfortify R package provides a very simple way to display diagnostic plots using ggplot2 (so nicely formatted). At its simplest just passing your model as an argument to the autoplot function will return key diagnostic plots.

DATA 621 Blog 3: R Markdown - Organizing Your Code Chunks

Ilya Kats