RMarkdown and Knitr

UC Davis ABG 250 - Mathematical Modeling in Biological Systems

Courtney D. Shelley

What is RMarkdown?

  • RMarkdown is a tool to integrate writing blocks with "live" R code.

  • R code is evaluated as part of the processing of the markdown document.

  • Data, code, and processing can flow in a single document.

  • Reproducible research through literate statistical programming is the goal:

    • An article is a stream of text and code chunks, woven to produce a human-readable document and tangled to produce a machine-readable document.
    • R code is evaluated when the document is processed and the results are inserted into the document, demonstrating functional code and reproducible results.
  • See http://rmarkdown.rstudio.com/
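Weaving and tangling can be tried directly from the R console. The sketch below assumes the knitr package is installed; the temporary file and its contents are made up for illustration. Here `knit()` weaves the source into a Markdown document and `purl()` tangles it into a plain R script.

```r
library(knitr)

# Build a tiny, hypothetical .Rmd source in a temporary file.
fence <- strrep("`", 3)  # three backticks, the chunk delimiter
rmd <- tempfile(fileext = ".Rmd")
writeLines(c(
  "Adding two numbers:",
  "",
  paste0(fence, "{r add}"),
  "x <- 1 + 1",
  "x",
  fence
), rmd)

# Weave: run the code and merge its results into a Markdown document.
md_file <- knit(rmd, output = tempfile(fileext = ".md"), quiet = TRUE)

# Tangle: extract only the R code into a script.
r_file <- purl(rmd, output = tempfile(fileext = ".R"), quiet = TRUE)

cat(readLines(md_file), sep = "\n")  # human-readable: text, code, results
cat(readLines(r_file), sep = "\n")   # machine-readable: code only
```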

Creating an RMarkdown Document

  • If you haven't done so already, install RStudio.

  • Select File \(\rightarrow\) New File \(\rightarrow\) R Markdown from the dropdown menu.

  • Give your document a title and author.

  • You have three output format options:

    • HTML produces a web-readable document, which will open in any browser (hint: great for emailing but don't use this if you plan on printing your document)
    • PDF produces a document that recipients cannot easily edit and that keeps its formatting on any platform.
    • Word documents allow the receiver to edit and comment on your document.
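The output format can also be chosen from the R console instead of the menu. This is a minimal sketch, assuming the rmarkdown package and pandoc are available (RStudio bundles pandoc); the file name and contents are hypothetical. PDF is left out because `pdf_document` additionally needs a LaTeX installation.

```r
library(rmarkdown)

# Hypothetical source file with a minimal header and one line of text.
rmd <- file.path(tempdir(), "formats-demo.Rmd")
writeLines(c(
  "---",
  "title: \"Format demo\"",
  "author: \"Your Name\"",
  "---",
  "",
  "The same source can be rendered to several formats."
), rmd)

# HTML and Word need only pandoc; "pdf_document" would also need LaTeX.
formats <- c("html_document", "word_document")

outputs <- character(0)
if (pandoc_available()) {
  for (fmt in formats) {
    # render() returns the path of the file it wrote.
    outputs <- c(outputs, render(rmd, output_format = fmt, quiet = TRUE))
  }
}
print(basename(outputs))
```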

RMarkdown Document Basics

  • The front-of-document chunk between the --- lines is a YAML header that sets the document's metadata (title, author) and output format. With a bit of research, you can customize this. See http://rmarkdown.rstudio.com/html_document_format.html.

  • Text chunks are separated from code chunks. Text chunks can include mathematical formulas, web links, anything you'd normally include in a text editor like Word.

  • Code chunks are delimited by triple backticks: ```{r} [code chunk] ```. By default, the code is echoed in the output ahead of its results; set the chunk option echo = FALSE to hide the code and show only the results.
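The effect of the echo option can be checked from the console. The sketch below assumes knitr is installed; `knit_chunk()` is a hypothetical helper written for this illustration, not part of knitr. It knits the same one-chunk document twice, once with default options and once with echo = FALSE.

```r
library(knitr)

fence <- strrep("`", 3)  # three backticks, the chunk delimiter

# Hypothetical helper: knit a one-chunk document supplied as text
# and return the resulting Markdown as a character string.
knit_chunk <- function(options) {
  knit(text = c(
    paste0(fence, "{r ", options, "}"),
    "x <- 2 + 3",
    "x",
    fence
  ), quiet = TRUE)
}

shown  <- knit_chunk("demo")                # default: code is echoed
hidden <- knit_chunk("demo, echo = FALSE")  # code hidden, result kept

cat(shown, "\n")
cat(hidden, "\n")
```

Comparing the two printed strings shows that the result of the chunk appears in both, while the source line `x <- 2 + 3` appears only in the first.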

Text Chunk Options

Emphasis:     *italic* or _italic_ 
              **bold** or __bold__

Headers:      # Header 1    
              ## Header 2   
              ### Header 3  

Unordered lists:    - Item 1
                    - Item 2
                    - Item 3

Ordered lists:      1. Item 1
                    2. Item 2
                    3. Item 3
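To see what this markup turns into, the text can be converted to HTML from R. RMarkdown itself hands this job to pandoc; the sketch below instead uses the commonmark package as a lightweight stand-in, which is an assumption of this example and not something the slides use.

```r
library(commonmark)  # assumption: stand-in converter, not part of the slides

md <- c(
  "# Header 1",
  "",
  "Some *italic* and **bold** text.",
  "",
  "- Item 1",
  "- Item 2",
  "",
  "1. Item 1",
  "2. Item 2"
)

# Convert the Markdown text to an HTML string and print it.
html <- markdown_html(paste(md, collapse = "\n"))
cat(html)
```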

A Simple Example

## Classification With Iris Data Set

We will use R.A. Fisher's classic iris data set to generate a classification tree. 
```{r loadData, message = FALSE}
#Load data and required packages
data(iris)
library(caret); library(rattle)
nrow <- nrow(iris); ncol <- ncol(iris)  #inline code
iris[1:4,]
```
The data consists of `r nrow` rows and `r ncol` columns, with no missing values.

Classification Analysis with Iris Data Set

We will use R.A. Fisher's classic iris data set to generate a classification tree.

#Load data and required packages
data(iris)
library(caret); library(rattle)
nrow <- nrow(iris); ncol <- ncol(iris)  # inline code
iris[1:4,]
##   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1          5.1         3.5          1.4         0.2  setosa
## 2          4.9         3.0          1.4         0.2  setosa
## 3          4.7         3.2          1.3         0.2  setosa
## 4          4.6         3.1          1.5         0.2  setosa

The data consists of 150 rows and 5 columns, with no missing values.
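The row and column counts in that sentence come from inline R code, which knitr evaluates and replaces with its value. A minimal sketch of the mechanism, assuming knitr is installed; the variable names `n_rows` and `n_cols` are chosen for this illustration:

```r
library(knitr)

fence <- strrep("`", 3)  # three backticks, the chunk delimiter

# A hidden chunk computes the values; the sentence then uses inline code,
# written as backtick, the letter r, a space, an expression, backtick.
src <- c(
  paste0(fence, "{r dims, echo = FALSE}"),
  "n_rows <- nrow(iris); n_cols <- ncol(iris)",
  fence,
  "",
  "The data consists of `r n_rows` rows and `r n_cols` columns."
)

out <- knit(text = src, quiet = TRUE)
cat(out, "\n")
```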

### Analysis

Analysis consisted of fitting a predictive model of iris species based on 
petal length/width and sepal length/width. 
```{r analysis, message = FALSE}
modFit <- train(Species ~., method = "rpart", data=iris) #Fit model
print(modFit$finalModel)   #Summarize model
```
The final analysis is presented as a decision tree.
```{r tree, fig.width = 5, fig.height = 4}
fancyRpartPlot(modFit$finalModel) #Plot decision tree
```

Analysis

Analysis consisted of fitting a predictive model of iris species based on petal length/width and sepal length/width. The final model was then plotted as a decision tree.

modFit <- train(Species ~., method = "rpart", data=iris) #Fit model
print(modFit$finalModel)   #Summarize model
## n= 150 
## 
## node), split, n, loss, yval, (yprob)
##       * denotes terminal node
## 
## 1) root 150 100 setosa (0.33333 0.33333 0.33333)  
##   2) Petal.Length< 2.45 50   0 setosa (1.00000 0.00000 0.00000) *
##   3) Petal.Length>=2.45 100  50 versicolor (0.00000 0.50000 0.50000)  
##     6) Petal.Width< 1.75 54   5 versicolor (0.00000 0.90741 0.09259) *
##     7) Petal.Width>=1.75 46   1 virginica (0.00000 0.02174 0.97826) *
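The printed tree can be read as two nested rules: a split on Petal.Length at 2.45, then a split on Petal.Width at 1.75. As a sanity check, the sketch below applies those two splits by hand in base R, without caret or rpart, and counts the misclassified flowers. The loss column above (0, 5, and 1 at the three terminal nodes) implies 6 errors in total.

```r
data(iris)

# Apply the two splits from the printed tree by hand.
predicted <- ifelse(
  iris$Petal.Length < 2.45, "setosa",
  ifelse(iris$Petal.Width < 1.75, "versicolor", "virginica")
)

# Cross-tabulate the hand-made predictions against the true species.
confusion <- table(predicted, actual = iris$Species)
print(confusion)

# Count misclassified flowers; the terminal-node losses sum to 0 + 5 + 1.
errors <- sum(predicted != iris$Species)
print(errors)
```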

More Cool Stuff With Markdown

Knitr

To complete your document you will need to knit together the text and the code chunks into (1) a human-readable document, and (2) a machine-readable document.

  1. Save your document by clicking the floppy disk icon or File \(\rightarrow\) Save. Your document will have the file extension .Rmd.

  2. Click Knit HTML to create two output documents:

  • A .html document, which will open in your favorite web browser.
  • A .md document, which contains all the document formatting. Newer versions of RStudio do not save this file by default, but you can retain it through the output options (for example, keep_md: true under html_document in the YAML header).
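The same two files can be produced without the Knit button. This is a sketch under the assumption that rmarkdown and pandoc are available; the file name is hypothetical. Passing `keep_md = TRUE` to `html_document()` asks rmarkdown to keep the intermediate .md file next to the HTML output.

```r
library(rmarkdown)

# Hypothetical source file containing one line with inline R code.
rmd <- file.path(tempdir(), "knit-demo.Rmd")
writeLines(c(
  "---",
  "title: \"Knit demo\"",
  "---",
  "",
  "One plus one is `r 1 + 1`."
), rmd)

# Expected location of the retained Markdown file.
md_path <- sub("\\.Rmd$", ".md", rmd)

if (pandoc_available()) {
  html_path <- render(
    rmd,
    output_format = html_document(keep_md = TRUE),
    quiet = TRUE
  )
  print(file.exists(html_path))  # the .html document
  print(file.exists(md_path))    # the retained .md document
}
```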

RPubs

You can publish an RMarkdown piece to a publicly accessible website using RPubs by clicking Publish on your finished HTML.