This workshop assumes (but ask questions as needed!) you are familiar with basic R: `data.frame`, `for`, function calls, parameters, etc.
Goal: exposure to a variety of more advanced R Markdown techniques and tricks.
Note that this is NOT intended to be a comprehensive survey of the possibilities with R Markdown.
Disclaimer: I’m not an expert, and this quickly gets really complex.
- `rmarkdown` is an R package for converting R Markdown documents into a variety of output formats
- Its `render()` function processes R Markdown input, creating a Markdown (`*.md`) file
- It builds on `knitr`, an R package for dynamic report generation with R.
Here’s the YAML header for this presentation:
---
title: "Advanced R Markdown"
author: "BBL"
date: "`r format(Sys.time(), '%d %B %Y')`"
output:
  html_document:
    toc: true
    toc_float: true
    code_folding: hide
---
Things to notice:
- The `date` field has inline R code to dynamically insert the current date
- The `html_document` setting for `output:` has three sub-settings:
  - `toc: true` generates a table of contents (based on `#` and `##` lines)
  - `toc_float: true` makes it ‘floating’
  - `code_folding: hide` turns on code folding with a default of hidden code
ggplot(diamonds, aes(carat, fill = cut)) +
  geom_density(position = "stack")
You can use tabs to organize your content:
## Tabs {.tabset}
### Tab 1 name
(content)
### Tab 2 name
(content)
plot(cars$speed, cars$dist)
pairs(iris)
image(volcano)
For HTML output only, you can add the `df_print: paged` parameter to your YAML header to have printed data frames rendered as HTML tables.
output:
  html_document:
    df_print: paged
mtcars
Equations are (mostly) straightforward and based on LaTeX mathematical typesetting:
R Markdown | Final document |
---|---|
`$x^{n}$` | \(x^{n}\) |
`$\frac{a}{b}$` | \(\frac{a}{b}\) |
`$\sum_{n=1}^{10} n^2$` | \(\sum_{n=1}^{10} n^2\) |
`$\sigma \Sigma$` | \(\sigma \Sigma\) |
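The examples in the table are all inline equations. As a small sketch, an equation can instead be set on its own centered line by wrapping it in double dollar signs. The sum below reuses the expression from the table; the closed-form value is standard arithmetic added for illustration, not something from the workshop:

```latex
$$
\sum_{n=1}^{10} n^2 = \frac{10 \cdot 11 \cdot 21}{6} = 385
$$
```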
A handy summary is here. Very usefully, the RStudio editor has an equation preview feature.
Images are inserted with a bit of HTML, e.g. for the image above:
<img src="images-rmarkdown/editor-eq-preview.png" width = "75%">
There are lots of options that can be applied here, including size, whether the image floats, its justification, etc. See the `img` tag documentation.
This quickly gets confusing (to me anyway).
Themes and syntax highlighting styles are built into `rmarkdown`, so they are easy to use; themes are from the Bootswatch theme library. Just insert lines into your YAML header:
output:
  html_document:
    theme: sandstone
    highlight: tango
When the rmdformats package is installed, it allows us to create R Markdown documents using very different themes.
output:
  rmdformats::readthedown:
    highlight: kate
There’s also the prettydoc package.
The first 10 letters are `r knitr::combine_words(LETTERS[1:10])`.
The first 10 letters are A, B, C, D, E, F, G, H, I, and J.
Most R Markdown documents (including this one) have a first chunk that, among other things, sets the default chunk options:
knitr::opts_chunk$set(echo = TRUE)
Chunk options can take non-constant values; in fact, they can take values from arbitrary R expressions:
```{r}
# Define a global figure width value
my_fig_width <- 7
```
```{r, fig.width = my_fig_width}
plot(cars)
```
An example of R code in a chunk option setting:
```{r}
width_small <- 4
width_large <- 7
small_figs <- TRUE
```
```{r, fig.width = if(small_figs) width_small else width_large}
plot(cars)
```
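Because a chunk option value is an ordinary R expression, it can be checked at the console before it goes into a chunk header. A minimal sketch, reusing the variable names from the example above (`fig_width` and `fig_width_large` are names made up for this illustration):

```r
# Same variables as in the example chunks above
width_small <- 4
width_large <- 7
small_figs <- TRUE

# This is the expression used as the fig.width value
fig_width <- if (small_figs) width_small else width_large
print(fig_width)

# Flipping the switch selects the other width
small_figs <- FALSE
fig_width_large <- if (small_figs) width_small else width_large
print(fig_width_large)
```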
Here’s a chunk that only executes when a particular package is available:
```{r, eval = require("ggplot2")}
ggplot2::ggplot(cars, aes(speed, dist)) + geom_point()
```
More information here.
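One possible variation on the package-availability chunk above: `require()` attaches the package as a side effect, whereas base R's `requireNamespace()` only checks that it can be loaded. A minimal sketch of that check, where `has_ggplot2` is a name invented for this illustration:

```r
# requireNamespace() returns TRUE or FALSE without attaching the package
has_ggplot2 <- requireNamespace("ggplot2", quietly = TRUE)

# The flag could then be used as a chunk option, e.g. {r, eval = has_ggplot2}
if (has_ggplot2) {
  message("ggplot2 is available; the plotting chunk would run")
} else {
  message("ggplot2 is not installed; the plotting chunk would be skipped")
}
```

With this approach the chunk body needs fully qualified calls such as `ggplot2::aes()` and `ggplot2::geom_point()`, because nothing is attached to the search path.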
R Markdown documents may be split, with a primary document incorporating others via a child document mechanism.
Don’t forget about the `cache=TRUE` chunk option. It is critical for keeping the build time of longer, complex documents under control.
Two trailing spaces are used to force a line break:
This line does not have two spaces at the end. The following line.
This line has two spaces at the end.
The following line.
(This is actually part of the Markdown spec.)
What if I want to run the same analysis, and/or generate the same report, for different datasets or conditions?
This offers the possibility of tremendously extending the utility of `rmarkdown`!
R Markdown documents can take parameters. These are specified in the YAML header as a name followed by a default value:
params:
  cut: NULL
  min_price: 0
and can then be accessed by code in the document, via a read-only list called `params`:
print(params$min_price)
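A minimal sketch of how a report body might use these parameters. Inside a knitted document `params` is created for you; here it is built by hand so the code runs at the console, and `toy_diamonds` is a made-up stand-in for the real `diamonds` data:

```r
# Stand-in for the list that R Markdown builds from the YAML header
params <- list(cut = "Ideal", min_price = 500)

# Tiny hypothetical dataset with the same column names as ggplot2::diamonds
toy_diamonds <- data.frame(
  cut   = c("Ideal", "Ideal", "Fair", "Premium"),
  price = c(400, 900, 1200, 700),
  stringsAsFactors = FALSE
)

# Keep rows at or above the minimum price
selected <- toy_diamonds[toy_diamonds$price >= params$min_price, ]

# A NULL cut (the default in the header) means "use every cut"
if (!is.null(params$cut)) {
  selected <- selected[selected$cut == params$cut, ]
}

print(selected)
```

Treating a `NULL` parameter as "no filter" is one design choice among several; it lets the same document produce both an all-cuts report and a per-cut report.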
Let’s go make an R Markdown document that takes one or more parameters, for example to produce a report on some part of the `diamonds` dataset.
So far so good, but how do we use this capability programmatically?
The `rmarkdown::render()` function converts an input file to an output format, usually calling `knitr::knit()` and pandoc along the way.
rmarkdown::render("diamonds-report.Rmd",
                  params = list(cut = "Ideal"),
                  output_file = "Ideal.html")
Let’s go make a driver script that generates an output file for each diamond cut in the dataset.
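A minimal sketch of such a driver script. It assumes the parameterized report is saved as `diamonds-report.Rmd` in the working directory (the file name used in the `render()` call above); the `reports` output directory and the file-naming scheme are choices made up for this illustration:

```r
# The five cut levels in ggplot2::diamonds, written out to avoid a dependency
cuts <- c("Fair", "Good", "Very Good", "Premium", "Ideal")

# One output file per cut; spaces are replaced, so "Very Good" gives "Very-Good.html"
out_files <- paste0(gsub(" ", "-", cuts), ".html")

report <- "diamonds-report.Rmd"
can_render <- file.exists(report) && requireNamespace("rmarkdown", quietly = TRUE)

for (i in seq_along(cuts)) {
  if (can_render) {
    rmarkdown::render(report,
                      params = list(cut = cuts[i]),
                      output_file = out_files[i],
                      output_dir = "reports")  # hypothetical directory
  } else {
    # Dry run when the report or the rmarkdown package is missing
    message("Would render ", out_files[i], " for cut = ", cuts[i])
  }
}
```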
Because R Markdown files are parsed in a separate R instance, the working directory is the location of your R Markdown file.
Don’t mess with it via `setwd()`.
If the first line of your #rstats script is setwd(“C:”), I will come into your lab and SET YOUR COMPUTER ON FIRE. Source
It’s almost always much better to use relative paths. Absolute paths aren’t robust and break reproducibility and transportability.
Note that `render()` has an `output_dir` parameter.
Finally, check out the here package, which tries to figure out the top level of your current project using some sane heuristics.
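A small sketch of the relative-path approach. The `data` folder and `diamonds.csv` file are hypothetical names, and the `here` call only runs if that package is installed:

```r
# Relative path, built portably; it is resolved against the working directory
rel_path <- file.path("data", "diamonds.csv")
print(rel_path)

# here::here() builds the same path starting from the detected project root
if (requireNamespace("here", quietly = TRUE)) {
  print(here::here("data", "diamonds.csv"))
}
```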
Interactive graphics.
library(plotly)
p <- ggplot(mtcars, aes(hp, mpg, size = cyl, color = disp)) + geom_point()
ggplotly(p)
Handy if you want to sort or filter your table data.
library(DT)
library(gapminder)
datatable(mtcars, rownames = TRUE, filter = "top",
          options = list(pageLength = 5, scrollX = TRUE))
Example based on this post.
I haven’t used the `reactable` package but it can make cool tables, and link those tables to data visualizations:
library(dplyr)
library(sparkline)
library(reactable)
data <- chickwts %>%
  group_by(feed) %>%
  summarise(weight = list(weight)) %>%
  mutate(boxplot = NA, sparkline = NA)
reactable(data, columns = list(
  weight = colDef(cell = function(values) {
    sparkline(values, type = "bar", chartRangeMin = 0, chartRangeMax = max(chickwts$weight))
  }),
  boxplot = colDef(cell = function(value, index) {
    sparkline(data$weight[[index]], type = "box")
  }),
  sparkline = colDef(cell = function(value, index) {
    sparkline(data$weight[[index]])
  })
))
More information here.
I really like the simplicity of the `leaflet` package.
library(leaflet)
leaflet() %>%
  addTiles() %>%
  setView(-76.9219, 38.9709, zoom = 17) %>%
  addPopups(-76.9219, 38.9709,
            "Here is the <b>Joint Global Change Research Institute</b>")
We might want to include citations. This is surprisingly easy; the source
In a subsequent paper [@Bond-Lamberty2009-py], we used the
same model outputs to examine the _hydrological_ implications
of these wildfire regime shifts [@Nolan2014-us].
Nolan et al. [-@Nolan2014-us] found that...
becomes:
In a subsequent paper (Bond-Lamberty et al. 2009), we used the same model outputs to examine the hydrological implications of these wildfire regime shifts (Nolan et al. 2014). Nolan et al. (2014) found that…
References
Bond-Lamberty, Ben, Scott D Peckham, Stith T Gower, and Brent E Ewers. 2009. “Effects of Fire on Regional Evapotranspiration in the Central Canadian Boreal Forest.” Glob. Chang. Biol. 15 (5): 1242–54.
Nolan, Rachael H, Patrick N J Lane, Richard G Benyon, Ross A Bradstock, and Patrick J Mitchell. 2014. “Changes in Evapotranspiration Following Wildfire in Resprouting Eucalypt Forests.” Ecohydrol. 6 (January). Wiley Online Library.
To do this we include a new field in (of course) the YAML header, for example:
---
bibliography: bibliography.json
---
While `*.json` is preferred, a wide variety of file formats can be accommodated:
Format | File extension |
---|---|
CSL-JSON | .json |
MODS | .mods |
BibLaTeX | .bib |
BibTeX | .bibtex |
RIS | .ris |
EndNote | .enl |
EndNote XML | .xml |
ISI | .wos |
MEDLINE | .medline |
Copac | .copac |
More details can be found here.
Larger projects can become difficult to manage in a single R Markdown file (or even one with child files).
The bookdown package (by the same author as `rmarkdown`) offers several key improvements:
The last of these is so useful that it’s available in R Markdown as well:
output: bookdown::html_document2
```{r cars-plot, fig.cap = "An amazing plot"}
plot(cars)
```
```{r mtcars-plot, fig.cap = "Another amazing plot"}
plot(mpg ~ hp, mtcars)
```
See Figure \@ref(fig:cars-plot).
See Figure 1.
Theorems, equations, and tables can also be cross-referenced; see the documentation.
Good resources:
Thanks for attending this workshop on Advanced R Markdown! I hope it was useful.
This presentation was made using R Markdown version 2.2 running under R version 3.6.1 (2019-07-05). It is available at https://rpubs.com/bpbond/630335. The code is here.
sessionInfo()
## R version 3.6.1 (2019-07-05)
## Platform: x86_64-apple-darwin15.6.0 (64-bit)
## Running under: macOS Mojave 10.14.6
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] leaflet_2.0.2 reactable_0.2.0 sparkline_2.0 dplyr_0.8.3
## [5] gapminder_0.3.0 DT_0.13 plotly_4.9.1 ggplot2_3.2.1
##
## loaded via a namespace (and not attached):
## [1] Rcpp_1.0.3 later_1.0.0 pillar_1.4.2
## [4] compiler_3.6.1 tools_3.6.1 digest_0.6.23
## [7] lifecycle_0.1.0 jsonlite_1.6 evaluate_0.14
## [10] tibble_2.1.3 gtable_0.3.0 viridisLite_0.3.0
## [13] pkgconfig_2.0.3 rlang_0.4.5 shiny_1.4.0.2
## [16] crosstalk_1.0.0 yaml_2.2.0 xfun_0.10
## [19] fastmap_1.0.1 reactR_0.4.2 withr_2.1.2
## [22] stringr_1.4.0 httr_1.4.1 knitr_1.25
## [25] vctrs_0.2.2 htmlwidgets_1.5.1 grid_3.6.1
## [28] tidyselect_0.2.5 glue_1.3.1 data.table_1.12.6
## [31] R6_2.4.1 rmarkdown_2.2 tidyr_1.0.0
## [34] purrr_0.3.3 magrittr_1.5 promises_1.1.0
## [37] scales_1.0.0 htmltools_0.4.0 assertthat_0.2.1
## [40] xtable_1.8-4 mime_0.7 colorspace_1.4-1
## [43] httpuv_1.5.2 labeling_0.3 stringi_1.4.3
## [46] lazyeval_0.2.2 munsell_0.5.0 crayon_1.3.4