13 December, 2022

Topics

  • Concept of reproducible documents
  • R Markdown (and Markdown)
  • Running code inline and via code chunks
  • Graphics and tables
  • Output formats
  • Fancier things
  • Resources

Goal: to be comfortable making basic reproducible documents.

Assumptions

  • You’re familiar with the basic mechanics of R

This is intended to be a hands-on workshop, so also:

  • You have R (and probably RStudio) installed
  • You have the rmarkdown package installed

Note: this presentation doesn’t (yet) cover the newer Quarto publishing system, but will provide a good foundation for learning it.

Reproducible documents

What are reproducible documents?

“A reproducible document is one where any code used by the original researcher to compute an output (a graph, equation, table or statistical result) is embedded within the narrative describing the work.”

From the eLife Reproducible Document Stack

Reproducible documents

Here we are in Word:

We have no idea where these numbers came from.

Reproducible documents

Copy-and-paste into word processing documents sucks, and it’s easy to make mistakes (not updating numbers, copying the wrong number, etc.).

Reproducible documents

What if I want to generate 100 formatted reports for 100 different datasets?

Reproducible documents

Why do we want to use reproducible documents?

  • Transparency
  • Reproducibility
  • Reduce mistakes
  • SAVE TIME AND WORK

xkcd

R Markdown

R Markdown is an easy-to-write plain text format for creating dynamic documents and reports.

It’s designed to let us mix text and code to produce a Markdown document, which is then transformed into a final output format such as HTML, Word, or PDF.

R Markdown: under the hood
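
The two steps can be tried by hand. This is a minimal sketch, assuming the knitr package is installed; the temporary file and its contents are made up for illustration. knitr runs the code and writes plain Markdown, and pandoc then converts that Markdown into the final format.

```r
library(knitr)

# Write a tiny R Markdown file to a temporary location (hypothetical content).
rmd <- tempfile(fileext = ".Rmd")
md <- sub("\\.Rmd$", ".md", rmd)
fence <- strrep("`", 3)  # three backticks, i.e. a chunk delimiter
writeLines(c(
  "# Demo",
  "",
  paste0(fence, "{r}"),
  "summary(cars$speed)",
  fence
), rmd)

# Step 1: knitr evaluates the chunk and writes a plain Markdown file.
knit(rmd, output = md, quiet = TRUE)
md_lines <- readLines(md)
print(md_lines)

# Step 2 (needs pandoc): rmarkdown::render(rmd) performs step 1 and then
# hands the Markdown to pandoc to build the final document.
```

Opening the intermediate .md file shows the code and its printed results as ordinary Markdown text, which is all pandoc ever sees.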

Let’s dive in

What do we have? A text document (the R Markdown file).

Let’s dive in

What happened?

Text: Markdown

# A big header
## A smaller one
### Smaller still
A [link](https://daringfireball.net/projects/markdown/).
A sentence with _italics_ in it. **Boldface.**

A link. A sentence with italics in it. Boldface.

https://xkcd.com/1285/

Code: inline

R code in Markdown documents can be inline or in code chunks.

Inline code is wrapped in backticks and starts with the letter “r”, followed by a space and an R expression. It is typically used for something short:

  • These slides were made with `r R.version.string`.

  • These slides were made with R version 4.1.3 (2022-03-10).

  • Two plus two equals `r 2+2`.

  • Two plus two equals 4.
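
To see the substitution happen outside of a full document, here is a small sketch that assumes the knitr package is installed. It passes a line of text straight to knit(); the sentence about iris is an extra example, not from the slides.

```r
library(knitr)

# knit() accepts document text via the `text` argument and returns
# the knitted Markdown as a character string.
src <- "Two plus two equals `r 2+2`. The iris data has `r nrow(iris)` rows."
out <- knit(text = src, quiet = TRUE)
cat(out, "\n")
```

Each inline expression is replaced by its value, so the numbers in the text always match the data.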

You can also easily insert equations like \(\sum_{n=1}^{10} n^2\).

Code: chunks

The other mechanism to include R code in your document is via code chunks. These are evaluated by knitr and the results inserted into your document.

Let’s try this!

Two important notes:

  • You can give chunks individual names, which makes it easier to navigate around longer documents.
  • RStudio has nice controls for running individual chunks.
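
As a rough illustration of what a named chunk looks like and what knitr does with it, the sketch below builds a three-line chunk as text and knits it. It assumes knitr is installed; the chunk name iris-dim is made up.

```r
library(knitr)

fence <- strrep("`", 3)  # three backticks, i.e. a chunk delimiter
src <- c(
  "The dimensions of the iris data:",
  "",
  paste0(fence, "{r iris-dim}"),  # "iris-dim" is a hypothetical chunk name
  "dim(iris)",
  fence
)

# The result contains the code followed by its printed output.
out <- knit(text = src, quiet = TRUE)
cat(out, "\n")
```

In a real .Rmd file you would type the chunk directly; building it as text here is only a way to run the example from the console.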

Code: chunks

Side note: code chunks don’t have to be in R!

ls *.Rmd
## Advanced_RMarkdown.Rmd
## Getting-data-into-R.Rmd
## Introduction_to_RMarkdown.Rmd
## R-data-manipulation.Rmd
## ggplot2-basics1.Rmd
## ggplot2-basics2.Rmd
## ggplot2-intermediate.Rmd

Chunk options

We can control the behavior of code chunks by using chunk options (settings). Some common ones include:

Option   Default  Effect
eval     TRUE     Whether to evaluate the chunk’s code
echo     TRUE     Whether to display the chunk’s code
include  TRUE     Whether to include the chunk’s code and output in the document
message  TRUE     Whether to display messages the chunk produces
cache    FALSE    Whether to cache results
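
The effect of two of these options can be checked from the console. This is a sketch under the assumption that knitr is installed; knit_chunk() is a throwaway helper written for this example, not a knitr function.

```r
library(knitr)

fence <- strrep("`", 3)  # three backticks, i.e. a chunk delimiter

# Hypothetical helper: knit a one-line chunk with the given options
# and return the resulting Markdown as a string.
knit_chunk <- function(options, code = "nrow(mtcars)") {
  src <- c(paste0(fence, "{r, ", options, "}"), code, fence)
  knit(text = src, quiet = TRUE)
}

hidden_code <- knit_chunk("echo=FALSE")  # result shown, code hidden
not_run     <- knit_chunk("eval=FALSE")  # code shown, nothing evaluated
cat(hidden_code, "\n\n", not_run, "\n")
```

Options are written after the chunk name in the chunk header, separated by commas.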

Chunk options

Try inserting a new chunk with the code dim(iris).

What happens if you set eval=FALSE? What about echo=FALSE?

Why?

Graphics

Say we have a figure we want to include:

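library(dplyr)     # provides %>% and filter()
library(ggplot2)   # provides ggplot() and the plotting layers
library(gapminder) # provides the gapminder dataset

# Life expectancy vs. GDP per capita for one year, on a log x-axis
p <- gapminder %>%
  filter(year == 1977) %>%
  ggplot(aes(gdpPercap, lifeExp, size = pop, color = continent)) +
  geom_point() +
  scale_x_log10() +
  ggtitle("gapminder 1977")

print(p)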

Graphics (behind the scenes)

Graphics

Let’s add a graph to your document.

Note there are chunk options fig.cap, fig.width, and fig.height you can change.
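
Figure options can be set per chunk or once for the whole document. The sketch below assumes knitr is installed; the sizes and the chunk name in the comment are arbitrary choices for illustration.

```r
library(knitr)

# Set document-wide defaults, typically in a setup chunk near the top.
# Widths and heights are in inches.
opts_chunk$set(fig.width = 6, fig.height = 4)

# An individual chunk can still override them in its header, e.g.
#   {r gapminder-plot, fig.cap = "gapminder 1977", fig.width = 8}

# Check the current default.
opts_chunk$get("fig.width")
```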

Tables

Tables can be made by hand (see the Markdown help page), but usually we have a data frame that we want to display.

The simplest method uses knitr’s built-in kable() function:

Sepal.Length Sepal.Width Petal.Length Petal.Width Species
5.1 3.5 1.4 0.2 setosa
4.9 3.0 1.4 0.2 setosa
4.7 3.2 1.3 0.2 setosa
4.6 3.1 1.5 0.2 setosa
5.0 3.6 1.4 0.2 setosa
5.4 3.9 1.7 0.4 setosa
4.6 3.4 1.4 0.3 setosa
5.0 3.4 1.5 0.2 setosa
4.4 2.9 1.4 0.2 setosa
4.9 3.1 1.5 0.1 setosa
5.4 3.7 1.5 0.2 setosa
4.8 3.4 1.6 0.2 setosa
4.8 3.0 1.4 0.1 setosa
4.3 3.0 1.1 0.1 setosa
5.8 4.0 1.2 0.2 setosa
5.7 4.4 1.5 0.4 setosa
5.4 3.9 1.3 0.4 setosa
5.1 3.5 1.4 0.3 setosa
5.7 3.8 1.7 0.3 setosa
5.1 3.8 1.5 0.3 setosa
5.4 3.4 1.7 0.2 setosa
5.1 3.7 1.5 0.4 setosa
4.6 3.6 1.0 0.2 setosa
5.1 3.3 1.7 0.5 setosa
4.8 3.4 1.9 0.2 setosa
5.0 3.0 1.6 0.2 setosa
5.0 3.4 1.6 0.4 setosa
5.2 3.5 1.5 0.2 setosa
5.2 3.4 1.4 0.2 setosa
4.7 3.2 1.6 0.2 setosa
4.8 3.1 1.6 0.2 setosa
5.4 3.4 1.5 0.4 setosa
5.2 4.1 1.5 0.1 setosa
5.5 4.2 1.4 0.2 setosa
4.9 3.1 1.5 0.2 setosa
5.0 3.2 1.2 0.2 setosa
5.5 3.5 1.3 0.2 setosa
4.9 3.6 1.4 0.1 setosa
4.4 3.0 1.3 0.2 setosa
5.1 3.4 1.5 0.2 setosa
5.0 3.5 1.3 0.3 setosa
4.5 2.3 1.3 0.3 setosa
4.4 3.2 1.3 0.2 setosa
5.0 3.5 1.6 0.6 setosa
5.1 3.8 1.9 0.4 setosa
4.8 3.0 1.4 0.3 setosa
5.1 3.8 1.6 0.2 setosa
4.6 3.2 1.4 0.2 setosa
5.3 3.7 1.5 0.2 setosa
5.0 3.3 1.4 0.2 setosa
7.0 3.2 4.7 1.4 versicolor
6.4 3.2 4.5 1.5 versicolor
6.9 3.1 4.9 1.5 versicolor
5.5 2.3 4.0 1.3 versicolor
6.5 2.8 4.6 1.5 versicolor
5.7 2.8 4.5 1.3 versicolor
6.3 3.3 4.7 1.6 versicolor
4.9 2.4 3.3 1.0 versicolor
6.6 2.9 4.6 1.3 versicolor
5.2 2.7 3.9 1.4 versicolor
5.0 2.0 3.5 1.0 versicolor
5.9 3.0 4.2 1.5 versicolor
6.0 2.2 4.0 1.0 versicolor
6.1 2.9 4.7 1.4 versicolor
5.6 2.9 3.6 1.3 versicolor
6.7 3.1 4.4 1.4 versicolor
5.6 3.0 4.5 1.5 versicolor
5.8 2.7 4.1 1.0 versicolor
6.2 2.2 4.5 1.5 versicolor
5.6 2.5 3.9 1.1 versicolor
5.9 3.2 4.8 1.8 versicolor
6.1 2.8 4.0 1.3 versicolor
6.3 2.5 4.9 1.5 versicolor
6.1 2.8 4.7 1.2 versicolor
6.4 2.9 4.3 1.3 versicolor
6.6 3.0 4.4 1.4 versicolor
6.8 2.8 4.8 1.4 versicolor
6.7 3.0 5.0 1.7 versicolor
6.0 2.9 4.5 1.5 versicolor
5.7 2.6 3.5 1.0 versicolor
5.5 2.4 3.8 1.1 versicolor
5.5 2.4 3.7 1.0 versicolor
5.8 2.7 3.9 1.2 versicolor
6.0 2.7 5.1 1.6 versicolor
5.4 3.0 4.5 1.5 versicolor
6.0 3.4 4.5 1.6 versicolor
6.7 3.1 4.7 1.5 versicolor
6.3 2.3 4.4 1.3 versicolor
5.6 3.0 4.1 1.3 versicolor
5.5 2.5 4.0 1.3 versicolor
5.5 2.6 4.4 1.2 versicolor
6.1 3.0 4.6 1.4 versicolor
5.8 2.6 4.0 1.2 versicolor
5.0 2.3 3.3 1.0 versicolor
5.6 2.7 4.2 1.3 versicolor
5.7 3.0 4.2 1.2 versicolor
5.7 2.9 4.2 1.3 versicolor
6.2 2.9 4.3 1.3 versicolor
5.1 2.5 3.0 1.1 versicolor
5.7 2.8 4.1 1.3 versicolor
6.3 3.3 6.0 2.5 virginica
5.8 2.7 5.1 1.9 virginica
7.1 3.0 5.9 2.1 virginica
6.3 2.9 5.6 1.8 virginica
6.5 3.0 5.8 2.2 virginica
7.6 3.0 6.6 2.1 virginica
4.9 2.5 4.5 1.7 virginica
7.3 2.9 6.3 1.8 virginica
6.7 2.5 5.8 1.8 virginica
7.2 3.6 6.1 2.5 virginica
6.5 3.2 5.1 2.0 virginica
6.4 2.7 5.3 1.9 virginica
6.8 3.0 5.5 2.1 virginica
5.7 2.5 5.0 2.0 virginica
5.8 2.8 5.1 2.4 virginica
6.4 3.2 5.3 2.3 virginica
6.5 3.0 5.5 1.8 virginica
7.7 3.8 6.7 2.2 virginica
7.7 2.6 6.9 2.3 virginica
6.0 2.2 5.0 1.5 virginica
6.9 3.2 5.7 2.3 virginica
5.6 2.8 4.9 2.0 virginica
7.7 2.8 6.7 2.0 virginica
6.3 2.7 4.9 1.8 virginica
6.7 3.3 5.7 2.1 virginica
7.2 3.2 6.0 1.8 virginica
6.2 2.8 4.8 1.8 virginica
6.1 3.0 4.9 1.8 virginica
6.4 2.8 5.6 2.1 virginica
7.2 3.0 5.8 1.6 virginica
7.4 2.8 6.1 1.9 virginica
7.9 3.8 6.4 2.0 virginica
6.4 2.8 5.6 2.2 virginica
6.3 2.8 5.1 1.5 virginica
6.1 2.6 5.6 1.4 virginica
7.7 3.0 6.1 2.3 virginica
6.3 3.4 5.6 2.4 virginica
6.4 3.1 5.5 1.8 virginica
6.0 3.0 4.8 1.8 virginica
6.9 3.1 5.4 2.1 virginica
6.7 3.1 5.6 2.4 virginica
6.9 3.1 5.1 2.3 virginica
5.8 2.7 5.1 1.9 virginica
6.8 3.2 5.9 2.3 virginica
6.7 3.3 5.7 2.5 virginica
6.7 3.0 5.2 2.3 virginica
6.3 2.5 5.0 1.9 virginica
6.5 3.0 5.2 2.0 virginica
6.2 3.4 5.4 2.3 virginica
5.9 3.0 5.1 1.8 virginica
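
A table like the one above comes from a single kable() call. This sketch assumes knitr is installed; it uses head() to keep the output short, and the caption text is made up. Passing the whole data frame, as in kable(iris), prints all 150 rows.

```r
library(knitr)

# head() keeps only the first rows; the caption is a made-up example.
tbl <- kable(head(iris, 5), caption = "First five rows of iris")
print(tbl)
```

For long data frames it is usually kinder to the reader to show a subset or a summary rather than every row.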

Output formats

We haven’t really talked about pandoc, the software that transforms Markdown into an output format of our choice.

Use the Knit menu button, or the output: line in the YAML header, to change the output format to e.g. “word_document” or “pdf_document”.

Some formats require extra software to be installed; PDF output, for example, needs a LaTeX installation.
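
The output format can also be chosen from code rather than the Knit button. This is a minimal sketch, assuming the rmarkdown package and pandoc are available (RStudio bundles pandoc); the temporary file and its contents are hypothetical.

```r
library(rmarkdown)

# A tiny document whose YAML header asks for HTML output (made-up content).
src <- tempfile(fileext = ".Rmd")
writeLines(c(
  "---",
  "title: \"Demo\"",
  "output: html_document",
  "---",
  "",
  "Two plus two equals `r 2+2`."
), src)

# The output_format argument overrides the YAML header, here producing .docx.
out <- NULL
if (pandoc_available()) {
  out <- render(src, output_format = "word_document", quiet = TRUE)
  print(out)  # path of the generated file
}
```

The same source file can therefore produce several formats without being edited.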

Session info

For reproducibility, it is good practice to call sessionInfo() at the end of your document, so readers can see which R and package versions produced it. Like this:

sessionInfo()
## R version 4.1.3 (2022-03-10)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Big Sur/Monterey 10.16
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] gapminder_0.3.0 ggplot2_3.4.0   dplyr_1.0.9    
## 
## loaded via a namespace (and not attached):
##  [1] highr_0.9        pillar_1.8.1     bslib_0.3.1      compiler_4.1.3  
##  [5] jquerylib_0.1.4  tools_4.1.3      digest_0.6.29    jsonlite_1.8.0  
##  [9] evaluate_0.15    lifecycle_1.0.3  tibble_3.1.8     gtable_0.3.0    
## [13] pkgconfig_2.0.3  rlang_1.0.6      cli_3.4.1        rstudioapi_0.13 
## [17] yaml_2.3.5       xfun_0.30        fastmap_1.1.0    withr_2.5.0     
## [21] stringr_1.4.0    knitr_1.38       generics_0.1.3   vctrs_0.5.1     
## [25] sass_0.4.1       grid_4.1.3       tidyselect_1.1.2 glue_1.6.2      
## [29] R6_2.5.1         fansi_1.0.3      rmarkdown_2.18   farver_2.1.0    
## [33] purrr_0.3.4      magrittr_2.0.3   scales_1.2.1     htmltools_0.5.2 
## [37] colorspace_2.0-3 labeling_0.4.2   utf8_1.2.2       stringi_1.7.6   
## [41] munsell_0.5.0

Fancier things to whet your appetite

This has been the tip of the iceberg.

Fancier things: interactive graphs

Fancier things: interactive tables


This takes one or two lines of code in R Markdown. Example based on this post.
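
The slides do not say which package produced the table. One common choice is the DT package, so the sketch below assumes DT is installed and is not necessarily what the original example used.

```r
library(DT)

# datatable() wraps a data frame in an interactive HTML widget
# with sorting, searching, and paging.
widget <- datatable(iris)

# In a code chunk of an HTML document, printing `widget`
# (or calling datatable(iris) directly) embeds the table.
class(widget)
```

Interactive widgets like this only work in HTML output, not in Word or PDF.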

Fancier things: interactive maps

Fancier things: citation management

In a subsequent paper [@Bond-Lamberty2009-py], we used the same model 
outputs to examine the _hydrological_ implications of these wildfire 
regime shifts [@Nolan2014-us].

In a subsequent paper (Bond-Lamberty et al. 2009), we used the same model outputs to examine the hydrological implications of these wildfire regime shifts (Nolan et al. 2014).

References

Bond-Lamberty, Ben, Scott D Peckham, Stith T Gower, and Brent E Ewers. 2009. “Effects of Fire on Regional Evapotranspiration in the Central Canadian Boreal Forest.” Glob. Chang. Biol. 15 (5): 1242–54.

Nolan, Rachael H, Patrick N J Lane, Richard G Benyon, Ross A Bradstock, and Patrick J Mitchell. 2014. “Changes in Evapotranspiration Following Wildfire in Resprouting Eucalypt Forests.” Ecohydrol. 6 (January). Wiley Online Library.

Resources

The End

Thanks for attending this introductory workshop on R Markdown documents! We hope it was useful.

This presentation was made using R Markdown version 2.18 running under R version 4.1.3 (2022-03-10).

These slides are available at https://rpubs.com/bpbond/983108.

They were written in R Markdown! The code is here.