Advanced R Markdown

Topics and goals

<--- Topics are to the left

This workshop assumes (but ask questions as needed!) that you:

are familiar with the basic data structures and control structures of R: data.frame, for, function calls, parameters, etc.
are comfortable with the idea, and basic mechanics, of R Markdown documents: how to make them, chunks and chunk options, inline code
have heard of terms like HTML

Goal: exposure to a variety of more advanced R Markdown techniques and tricks.

Note that this is NOT intended to be a comprehensive survey of the possibilities with R Markdown.

Under the hood

Disclaimer: I’m not an expert, and this quickly gets really complex.

rmarkdown is an R package for converting R Markdown documents into a variety of output formats
Its render() function processes R Markdown input, creating a Markdown (*.md) file
This uses knitr, an R package for dynamic report generation with R.
This is then transformed into HTML by pandoc
R Markdown files have a YAML header giving configuration options that can apply to many stages of this pipeline
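
The first stage of this pipeline can be tried on its own. What follows is a minimal sketch, not a description of how render() is implemented internally: it knits a tiny made-up document held in a character vector (the contents of `rmd` are purely illustrative) and prints the intermediate Markdown that would then be handed to pandoc.

```r
library(knitr)

# Build the chunk fence programmatically so no literal fence appears inside this example
fence <- strrep("`", 3)

# A tiny, hypothetical R Markdown document
rmd <- c(
  "# Demo",
  "",
  paste0(fence, "{r}"),
  "1 + 1",
  fence
)

# knitr stage: run the chunks and return plain Markdown as a single string
md <- knit(text = rmd, quiet = TRUE)
cat(md)
```

The printed Markdown contains both the chunk's source and its result; turning that into HTML is pandoc's job.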

R Markdown workflow

Original graphic from The R Markdown Cookbook

Areas we’ll be discussing

Original graphic from The R Markdown Cookbook

HTML goodies

TOC and code folding

Here’s the YAML header for this presentation:

---
title: "Advanced R Markdown"
author: "BBL"
date: "`r format(Sys.time(), '%d %B %Y')`"
output:
  html_document:
    toc: true
    toc_float: true
    code_folding: hide
---

Things to notice:

The date field has inline R code to dynamically insert the current date
The html_document setting for output: has three sub-settings:
- toc: true generates a table of contents (based on # and ## lines)
- toc_float: true makes it ‘floating’
- code_folding: hide turns on code folding with a default of hidden code
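
To see what the inline date expression produces, the same format string can be run at the console. This small sketch uses a fixed, made-up date alongside Sys.time() so the first result is repeatable; the month name depends on your locale.

```r
# Same format string as in the YAML header: day, full month name, four-digit year
fixed_date <- as.Date("2020-06-01")  # illustrative date, not taken from the presentation
formatted <- format(fixed_date, "%d %B %Y")
print(formatted)

# The header itself formats the current time instead
print(format(Sys.time(), "%d %B %Y"))
```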

ggplot(diamonds, aes(carat, fill = cut)) +
  geom_density(position = "stack")

Tabs

You can use tabs to organize your content:

## Tabs {.tabset}

### Tab 1 name
(content)

### Tab 2 name
(content)

Cars

plot(cars$speed, cars$dist)

Iris

pairs(iris)

Volcano

image(volcano)

Printing data frames

For HTML output only, you can add the df_print: paged parameter to your YAML header to have printed data frames rendered as pageable HTML tables.

output:
  html_document:
    df_print: paged

mtcars

Equations

Equations are (mostly) straightforward and based on LaTeX mathematical typesetting:

R Markdown	Final document
$x^{n}$	$x^{n}$
$\frac{a}{b}$	$\frac{a}{b}$
$\sum_{n=1}^{10} n^2$	$\sum_{n=1}^{10} n^2$
$\sigma \Sigma$	$\sigma \Sigma$

A handy summary is here. Extremely usefully, the RStudio editor provides an equation preview feature.

Static image files

These are inserted with a bit of HTML, e.g. for the image above:

<img src="images-rmarkdown/editor-eq-preview.png" width = "75%">

There are lots of options that can be applied here, including size, whether the image floats, its justification, etc. See the img tag documentation.

Themes

This quickly gets confusing (to me anyway).

Bootswatch themes

These are built into rmarkdown, so they are easy to use; the themes come from the Bootswatch theme library. Just insert lines into your YAML header:

output:
  html_document:
    theme: sandstone
    highlight: tango

rmdformats

When the rmdformats package is installed, it allows us to create R Markdown documents using very different themes.

output:
  rmdformats::readthedown:
    highlight: kate

There’s also the prettydoc package.

Custom CSS

You can use a custom Cascading Style Sheet (CSS) file. You’re on your own here :)

knitr tricks

combine_words

The first 10 letters are `r knitr::combine_words(LETTERS[1:10])`.

The first 10 letters are A, B, C, D, E, F, G, H, I, and J.
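
combine_words() has a few arguments that are handy in inline code. A quick sketch with made-up word lists, showing the two-item case, a different conjunction, and wrapping each item in backticks:

```r
library(knitr)

# Two items: no comma, just the conjunction
two <- combine_words(c("apples", "pears"))

# Three items with a different conjunction
three_or <- combine_words(c("apples", "pears", "plums"), and = " or ")

# Wrap each item in backticks, e.g. when listing column or function names
ticked <- combine_words(c("speed", "dist"), before = "`")

print(two)
print(three_or)
print(ticked)
```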

Chunk defaults

Most R Markdown documents (including this one) have a first chunk that, among other things, sets the default chunk options:

knitr::opts_chunk$set(echo = TRUE)
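
Several defaults can be set in the same call. The values below are only illustrative choices, not the settings used for this presentation; any individual chunk can still override them in its own header.

```r
library(knitr)

# Set several defaults at once
opts_chunk$set(
  echo = TRUE,       # show the code
  message = FALSE,   # hide messages such as package start-up notes
  warning = FALSE,   # hide warnings
  fig.width = 7,     # default figure width, in inches
  fig.height = 5     # default figure height, in inches
)

# Inspect some of the current defaults
str(opts_chunk$get(c("echo", "message", "fig.width")))
```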

Computable chunk options

Chunk options can take non-constant values; in fact, they can take values from arbitrary R expressions:

```{r}
# Define a global figure width value
my_fig_width <- 7
```

```{r, fig.width = my_fig_width}
plot(cars)
```

An example of R code in a chunk option setting:

```{r}
width_small <- 4
width_large <- 7
small_figs <- TRUE
```

```{r, fig.width = if(small_figs) width_small else width_large}
plot(cars)
```

Here’s a chunk that only executes when a particular package is available:

```{r, eval = require("ggplot2")}
ggplot2::ggplot(cars, aes(speed, dist)) + geom_point()
```

More information here.

Child documents

R Markdown documents may be split, with a primary document incorporating others via a child document mechanism.
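
The mechanism is the child chunk option: an otherwise empty chunk in the primary document names the file to pull in. Here is a self-contained sketch that can be run at the console; the file contents and the temporary file are made up for illustration, and in a real project the child would just be another .Rmd file next to the primary one.

```r
library(knitr)

# Build the chunk fence programmatically so no literal fence appears inside this example
fence <- strrep("`", 3)

# Write a hypothetical child document to a temporary file
child_path <- tempfile(fileext = ".Rmd")
writeLines(c("## Methods", "", "This text lives in the child document."), child_path)
child_path <- normalizePath(child_path, winslash = "/")

# The primary document pulls the child in via an otherwise empty chunk
main <- c(
  "# Report",
  "",
  paste0(fence, "{r, child = '", child_path, "'}"),
  fence
)

# Knit the primary document; the child's content appears in the output
md <- knit(text = main, quiet = TRUE)
cat(md)
```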

Caching

Don’t forget about the cache=TRUE chunk option. Critical for keeping the build time of longer, complex documents under control.

Line breaks

Two trailing spaces are used to force a line break:

This line does not have two spaces at the end. The following line.

This line has two spaces at the end.
The following line.

(This is actually part of the Markdown spec.)

Programmatic reports

What if I want to run the same analysis, and/or generate the same report, for different datasets or conditions?

This offers the possibility of tremendously extending the utility of rmarkdown!

Parameters

R Markdown documents can take parameters. These are specified in the YAML header as a name followed by a default value:

params:
  cut: NULL
  min_price: 0

and can then be accessed by code in the document, via a read-only list called params:

print(params$min_price)

Let’s go make an R Markdown document that takes one or more parameters, for example to produce a report on some part of the diamonds dataset.
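
As a rough sketch of what a chunk in such a document might do, the code below filters diamonds using the two parameters from the header above. The `params` list is defined by hand here only so the code runs outside a document (knitr normally supplies it), and the values are made up.

```r
library(ggplot2)  # provides the diamonds dataset

# Stand-in for the read-only `params` list that knitr creates from the YAML header
params <- list(cut = "Ideal", min_price = 500)

# Start from the full dataset and narrow it down according to the parameters
d <- diamonds
if (!is.null(params$cut)) {
  d <- d[d$cut == params$cut, ]        # a NULL cut (the default) keeps all cuts
}
d <- d[d$price >= params$min_price, ]

nrow(d)
summary(d$price)
```

Treating a NULL cut as "no filter" is one possible design; it lets the same document serve as both an overall report and a per-cut report.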

render

So far so good, but how do we use this capability programmatically?

The rmarkdown::render() function converts an input file to an output format, usually calling knitr::knit() and pandoc along the way.

rmarkdown::render("diamonds-report.Rmd", 
  params = list(cut = "Ideal"),
  output_file = "Ideal.html")

Let’s go make a driver script that generates an output file for each diamond cut in the dataset.
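
One way such a driver script might look is sketched below. It assumes diamonds-report.Rmd (the file name used in the render() call above) sits in the working directory and accepts a cut parameter; the output naming scheme is my own choice, and the loop simply skips rendering if the file is missing.

```r
library(ggplot2)  # provides the diamonds dataset

input <- "diamonds-report.Rmd"   # assumed to exist in the working directory
cuts <- levels(diamonds$cut)     # one report per cut

# Output names: replace spaces so "Very Good" becomes "Very-Good.html"
outputs <- paste0(gsub(" ", "-", cuts), ".html")

for (i in seq_along(cuts)) {
  if (!file.exists(input)) {
    message("Skipping ", cuts[i], ": ", input, " not found")
    next
  }
  rmarkdown::render(
    input,
    params = list(cut = cuts[i]),
    output_file = outputs[i],
    quiet = TRUE
  )
}
```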

Working directory issues

By default, knitr evaluates code chunks with the working directory set to the location of your R Markdown file, regardless of where your interactive R session happens to be.

Don’t mess with it via setwd().

If the first line of your #rstats script is setwd(“C:”), I will come into your lab and SET YOUR COMPUTER ON FIRE. Source

It’s almost always much better to use relative paths. Absolute paths aren’t robust and break reproducibility and transportability.

Note that render has an output_dir parameter.

Finally, check out the here package, which tries to figure out the top level of your current project using some sane heuristics.
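
A small sketch contrasting the two approaches; the data/diamonds.csv path is hypothetical, and the here call only runs if that package is installed.

```r
# Relative path: resolved against the working directory,
# which is the document's own folder when knitting
rel_path <- file.path("data", "diamonds.csv")  # hypothetical file
print(rel_path)

# here::here() builds the path starting from the project root instead
if (requireNamespace("here", quietly = TRUE)) {
  root_path <- here::here("data", "diamonds.csv")
  print(root_path)
}
```

Either way, nothing in the document needs to know where the project lives on a particular machine.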

Neat R packages

plotly

Interactive graphics.

library(plotly)
p <- ggplot(mtcars, aes(hp, mpg, size = cyl, color = disp)) + geom_point()
ggplotly(p)

DT

Handy if you want to sort or filter your table data.

library(DT)
library(gapminder)
datatable(mtcars, rownames = TRUE, filter = "top", 
          options = list(pageLength = 5, scrollX = TRUE))

Example based on this post.

reactable

I haven’t used the reactable package but it can make cool tables, and link those tables to data visualizations:

library(dplyr)
library(sparkline)
library(reactable)

data <- chickwts %>%
  group_by(feed) %>%
  summarise(weight = list(weight)) %>%
  mutate(boxplot = NA, sparkline = NA)

reactable(data, columns = list(
  weight = colDef(cell = function(values) {
    sparkline(values, type = "bar", chartRangeMin = 0, chartRangeMax = max(chickwts$weight))
  }),
  boxplot = colDef(cell = function(value, index) {
    sparkline(data$weight[[index]], type = "box")
  }),
  sparkline = colDef(cell = function(value, index) {
    sparkline(data$weight[[index]])
  })
))

More information here.

leaflet

I really like the simplicity of the leaflet package.

library(leaflet)
leaflet() %>% 
  addTiles() %>%
  setView(-76.9219, 38.9709, zoom = 17) %>%
  addPopups(-76.9219, 38.9709,
            "Here is the <b>Joint Global Change Research Institute</b>")

Citations and references

We might want to include citations. This is surprisingly easy; the source

In a subsequent paper [@Bond-Lamberty2009-py], we used the
same model outputs to examine the _hydrological_ implications
of these wildfire regime shifts [@Nolan2014-us]. 
Nolan et al. [-@Nolan2014-us] found that...

becomes:

In a subsequent paper (Bond-Lamberty et al. 2009), we used the same model outputs to examine the hydrological implications of these wildfire regime shifts (Nolan et al. 2014). Nolan et al. (2014) found that…

References

Bond-Lamberty, Ben, Scott D Peckham, Stith T Gower, and Brent E Ewers. 2009. “Effects of Fire on Regional Evapotranspiration in the Central Canadian Boreal Forest.” Glob. Chang. Biol. 15 (5): 1242–54.

Nolan, Rachael H, Patrick N J Lane, Richard G Benyon, Ross A Bradstock, and Patrick J Mitchell. 2014. “Changes in Evapotranspiration Following Wildfire in Resprouting Eucalypt Forests.” Ecohydrol. 6 (January). Wiley Online Library.

To do this we include a new line in (of course) the YAML header, for example:

---
bibliography: bibliography.json
---

While *.json is preferred, a wide variety of file formats can be accommodated:

Format	File extension
CSL-JSON	.json
MODS	.mods
BibLaTeX	.bib
BibTeX	.bibtex
RIS	.ris
EndNote	.enl
EndNote XML	.xml
ISI	.wos
MEDLINE	.medline
Copac	.copac

More details can be found here.

Bookdown

Larger projects can become difficult to manage in a single R Markdown file (or even one with child files).

The bookdown package (by the same author as rmarkdown) offers several key improvements:

Books and reports can be built from multiple R Markdown files
Documents can easily be exported in a range of formats suitable for publishing, including PDF, e-books and HTML websites
Additional formatting features are added, such as cross-referencing, and numbering of figures, equations, and tables

The last of these is so useful that it’s available in R Markdown as well:

output: bookdown::html_document2

```{r cars-plot, fig.cap = "An amazing plot"}
plot(cars)
```

```{r mtcars-plot, fig.cap = "Another amazing plot"}
plot(mpg ~ hp, mtcars)
```

See Figure \@ref(fig:cars-plot).

See Figure 1.

Theorems, equations, and tables can also be cross-referenced; see the documentation.
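
For tables, a minimal sketch (assuming the bookdown::html_document2 output format shown above, and a hypothetical chunk label mtcars-table) is to give knitr::kable() a caption inside a labelled chunk. The table can then be referenced in the text as \@ref(tab:mtcars-table).

```r
library(knitr)

# Place this inside a chunk labelled, for example, mtcars-table (hypothetical label).
# The caption is what lets bookdown number the table.
tbl <- kable(head(mtcars[, 1:4]), caption = "The first rows of mtcars")
tbl
```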

Resources

Good resources:

The End

Thanks for attending this workshop on Advanced R Markdown! I hope it was useful.

This presentation was made using R Markdown version 2.2 running under R version 3.6.1 (2019-07-05). It is available at https://rpubs.com/bpbond/630335. The code is here.

sessionInfo()

## R version 3.6.1 (2019-07-05)
## Platform: x86_64-apple-darwin15.6.0 (64-bit)
## Running under: macOS Mojave 10.14.6
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] leaflet_2.0.2   reactable_0.2.0 sparkline_2.0   dplyr_0.8.3    
## [5] gapminder_0.3.0 DT_0.13         plotly_4.9.1    ggplot2_3.2.1  
## 
## loaded via a namespace (and not attached):
##  [1] Rcpp_1.0.3        later_1.0.0       pillar_1.4.2     
##  [4] compiler_3.6.1    tools_3.6.1       digest_0.6.23    
##  [7] lifecycle_0.1.0   jsonlite_1.6      evaluate_0.14    
## [10] tibble_2.1.3      gtable_0.3.0      viridisLite_0.3.0
## [13] pkgconfig_2.0.3   rlang_0.4.5       shiny_1.4.0.2    
## [16] crosstalk_1.0.0   yaml_2.2.0        xfun_0.10        
## [19] fastmap_1.0.1     reactR_0.4.2      withr_2.1.2      
## [22] stringr_1.4.0     httr_1.4.1        knitr_1.25       
## [25] vctrs_0.2.2       htmlwidgets_1.5.1 grid_3.6.1       
## [28] tidyselect_0.2.5  glue_1.3.1        data.table_1.12.6
## [31] R6_2.4.1          rmarkdown_2.2     tidyr_1.0.0      
## [34] purrr_0.3.3       magrittr_1.5      promises_1.1.0   
## [37] scales_1.0.0      htmltools_0.4.0   assertthat_0.2.1 
## [40] xtable_1.8-4      mime_0.7          colorspace_1.4-1 
## [43] httpuv_1.5.2      labeling_0.3      stringi_1.4.3    
## [46] lazyeval_0.2.2    munsell_0.5.0     crayon_1.3.4
