Reproducible Research Using RMarkdown

Overview


  • What is reproducible research?
  • What is RMarkdown?
  • Web-based approaches - RPubs

RStudio

What is reproducible research?


The Holy Grail

Full details of any results reported and the methods and data used to obtain them should be made available, so that others following the same methods can obtain identical results.

  • Recently considered in terms of
    • Statistics
    • Econometrics
    • Signal Processing
    • Epidemiology
  • So maybe
    • Geocomputation?
    • Environmental Informatics?

One possible way forward


RMarkdown is open source software for dynamic report generation with R. It embeds R code in a document, so that the code is run and its output included when the document is built as PDF, HTML, or MS Word.

Why it Matters


  • Not just an academic issue
    • Open Data / Open Government
    • Accountability - How did you reach your conclusions or recommendations?

Why it Matters


“We believe that, at the point of publication, enough information should be available to reconstruct the process of analysis. This may be a full description of algorithms and/or software programs where appropriate.”

(p. 104 of the Russell Report)

Rogoff and Reinhart


NCG Example - An Open Geodemographic Classification from the 2011 Irish Census


  • Information relating to the data and clustering method used is freely available
    • Others able to scrutinise the approach
    • Others able to adapt the methodology
    • Openness of variables used
    • Avoid the ‘faux pas’ of using geodemographic classes to predict a variable already used in the classification system

Open Geodemographics



How RMarkdown Works

Some simple RMarkdown content


---
title: "RMarkdown"
author: "Chris Brunsdon"
date: "17 November 2015"
output: html_document
---

## R Markdown

This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see <http://rmarkdown.rstudio.com>.

When you click the **Knit** button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document. You can embed an R code chunk like this:

```{r hello}
print('hello world!')
```

Making an R Markdown file


  • From RStudio select File, New File, R Markdown, then:

    • Enter and save the R Markdown, then hit **Knit**
    • R Markdown files end in .Rmd
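
The same steps can also be scripted rather than clicked through. The sketch below writes a minimal .Rmd file and knits it with `rmarkdown::render()`. The file name `hello.Rmd` and its contents are illustrative assumptions, and rendering is skipped if the rmarkdown package or pandoc is not installed.

```r
# Three backticks, built in code so the chunk fence can be written as a string
fence <- strrep("`", 3)

# Contents of a minimal R Markdown file (made up for illustration)
rmd_lines <- c(
  "---",
  'title: "Hello"',
  "output: html_document",
  "---",
  "",
  "A first chunk:",
  "",
  paste0(fence, "{r hello}"),
  "print('hello world!')",
  fence
)

# Save it with the .Rmd extension in a temporary directory
rmd_path <- file.path(tempdir(), "hello.Rmd")
writeLines(rmd_lines, rmd_path)

# Knit from code instead of the Knit button, if the tools are available
can_render <- requireNamespace("rmarkdown", quietly = TRUE) &&
  rmarkdown::pandoc_available()
if (can_render) {
  html_path <- rmarkdown::render(rmd_path, quiet = TRUE)
  print(html_path)
}
```

Scripting the build this way means the whole report can be regenerated with a single command, which fits the reproducibility theme.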

Useful R Markdown operations


  • # Subject - Highest level header
  • ## Less important subject - Next level down
  • ### Even less important subject - Next level down
  • #### Trivia - Lowest level
  • ![](photo.png) - Include an image in document
  • <http://rmarkdown.rstudio.com> - Include clickable link to URL
  • Also tables, bullet point lists, equations …
  • More here - http://rmarkdown.rstudio.com
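
Tables can also be generated from R objects inside a chunk. A minimal sketch, assuming the knitr package is installed and using the built-in `women` data set purely as a stand-in for real results:

```r
# Built-in example data, used only as a stand-in
tab_data <- head(women, 3)

if (requireNamespace("knitr", quietly = TRUE)) {
  # kable() turns a data frame into a Markdown table
  print(knitr::kable(tab_data, caption = "First rows of the women data set"))
}
```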

More on embedding R - 1


Evaluate but don’t echo in the document:


```{r hello,echo=FALSE}
  print('hello world!')
```

Graphics are directly embedded - just include the plotting code

```{r randomplot,echo=FALSE}
  x <- seq(-1,1,length=21)
  y <- x*x + rnorm(21)/10
  plot(x,y)
```
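
Because the plot above uses random numbers, it changes every time the document is knitted. One common remedy, sketched here with an arbitrary seed value that is not part of the original example, is to call `set.seed()` before drawing the random numbers:

```r
# Fixing the seed makes the random noise identical on every run.
# The value 2015 is arbitrary.
set.seed(2015)
x <- seq(-1, 1, length.out = 21)
y <- x * x + rnorm(21) / 10

# Repeating with the same seed reproduces exactly the same values
set.seed(2015)
y_again <- x * x + rnorm(21) / 10
identical(y, y_again)
```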

More on embedding R - 2


So are plotly commands


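```{r rain_graph,echo=FALSE}
library(plotly)  # provides layout() and the %>% pipe
# `p` is assumed to be a plotly object created in an earlier chunk
p2 <- p %>% layout(title = "Phoenix Park Rainfall",
                   yaxis = list(title = "Monthly Total (mm)", showline = TRUE, tickangle = -90))
p2
```

and ggplot commands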

```{r station_map,echo=FALSE}
library(dplyr); library(ggplot2); library(plotly)
# `rain` is assumed to be a data frame loaded in an earlier chunk
rain_hy2 <- rain %>%
  filter(Year > 1980, Station %in% c('Birr', 'Valentia', 'Belfast', 'Dublin Airport')) %>%
  highlight_key(~Year, "Year")  # link points belonging to the same year
plot_ky2 <- ggplot(rain_hy2, aes(x = Month, y = Rainfall, group = Year)) +
  geom_line(col = 'grey50') + geom_point(col = 'seagreen') +
  facet_wrap(~Station, ncol = 2) + labs(y = 'mm rain')
ggplotly(plot_ky2, tooltip = 'Year') %>%
  highlight(on = "plotly_click", off = "plotly_doubleclick", color = 'navy')
```

  • Results of earlier chunks remain available to later ones (i.e. objects created, etc.)
  • Code in RMarkdown runs independently of the main R console, which helps reproducibility

Slide presentations 1


Modify the header


---
title: "My Life as a Hell's Angel"
author: "Daniel O'Donell"
date: ""
output: 
  ioslides_presentation: 
    widescreen: true
---

Slide presentations 2


Bullet points


  - Level 1
    - Level 2
    - Still level 2
  - Back to level 1 *italic*
  - Still level 1 **bold**

Be a web publisher

RPubs


  • A simple website for publishing HTML documents created from R Markdown
    • You need to sign up to RPubs
  • But it isn’t difficult
  • Hit the Publish option in the top right of the window
  • If you aren’t already on RPubs you get the option to sign up
  • Publishing gives your article its own web page
    • You can edit the markdown and update the page later
  • The ‘blogs’ of these lectures are created using RMarkdown
    • So are the lecture slides

Self test


Create an RPubs blog showing how to create a monthplot of the mean rainfall data for Ireland on a month-by-month basis from January 1970 to December 1979, and briefly describing the patterns in the plot.
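
As a starting point, `monthplot()` from base R draws one sub-series per calendar month. The sketch below uses the built-in `nottem` temperature series as a stand-in, since the rainfall data are not reproduced here; the variable name `decade` is illustrative.

```r
# nottem: monthly temperatures at Nottingham, 1920-1939 (built-in data,
# used here only as a stand-in for the Irish rainfall series)
decade <- window(nottem, start = c(1930, 1), end = c(1939, 12))

# One sub-series per calendar month; the horizontal line in each is the monthly mean
monthplot(decade, ylab = "Temperature (F)")
```

The rainfall exercise follows the same pattern once the data are held in a monthly time series covering 1970 to 1979.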

Conclusion


💡 New ideas

  • In R
    • Blogging
    • R Markdown
    • RPubs
  • In methodology
    • Reproducibility
  • Assignment