Reproducible, Publication-Ready Charts & Figures with R

2013 OEEB Fellows Colloquium

Stephanie Kovalchik, kovalchiksa@nih.gov
Biostatistics Branch
National Cancer Institute

Have You Ever...

  • Copied and pasted data into Word/Excel tables?

  • Entered data manually into a table?

  • Used text boxes to customize plots in Word?

Is This A Common Situation?

Table Convert

What If You Could...

  • Perform an analysis and create a Word table with the results at the same time?

  • Avoid errors resulting from copying & pasting or manual data entry?

  • Make insightful and pretty plots without a single drag-and-click?

Goal of Reproducibility

The term reproducible research refers to the idea that the ultimate product of academic research is the paper along with the full computational environment used to produce the results in the paper such as the code, data, etc. that can be used to reproduce the results and create new work based on the research.


Source: en.wikipedia.org/wiki/Reproducibility#Reproducible_research

Outline

Today, I will introduce two packages in R to make beautiful, reproducible reports:

  • odfWeave

  • ggplot2

Reproducible Reports With odfWeave

Steps For Using odfWeave

odfWeave allows you to integrate text and code in a single document.


How it works:

  1. Write your document in OpenDocument Text (odt) format.

  2. Process the document with the function odfWeave.

  3. The output is an odt file with text, code, and output.

Open Document Format

OpenDocument is a "universal" file format that can be read by:

  • Microsoft Office (proprietary, Windows and Mac OS)

  • OpenOffice (free, open-source, multi-platform)

  • LibreOffice (free, open-source, multi-platform)

  • In fact, almost any word processor

Getting Started With odfWeave

  1. Install R from www.r-project.org.

  2. Install package odfWeave.

  3. Create odt file with text and code chunks.

  4. Load odfWeave and process file with function odfWeave.

Getting Started With odfWeave

install.packages("odfWeave")  # Install Package
library(odfWeave)  # Load Package

Creating Source odt File

  • The source file for odfWeave can mix text and code.

  • Write text as usual.

  • Code chunks are denoted with a starting <<>>= and closing @.

  • Inline code is enclosed in \Sexpr{}.

  • R code in chunks and inline expressions is written as usual.
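As a rough illustration of the rules above, here is what the body of one chunk and one inline expression might contain. The chunk label irisSummary and the echo option are assumptions in the Sweave style, not taken from the slides, and the delimiters are shown as comments so that the block runs as plain R.

```r
# In the odt source, the chunk would start with a line such as:
#   <<irisSummary, echo = TRUE>>=      (label and option are hypothetical)
data(iris)
species_means <- aggregate(Sepal.Length ~ Species, data = iris, FUN = mean)
species_means
# ...and end with a line containing only:
#   @

# Inline code goes inside a sentence of the document, for example:
#   The overall mean sepal length is \Sexpr{round(mean(iris$Sepal.Length), 2)} cm.
# The expression inside \Sexpr{} is ordinary R:
overall_mean <- round(mean(iris$Sepal.Length), 2)
overall_mean
```

Everything between the delimiters is ordinary R, so it can be tested in the console before it is pasted into the document.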

Example: Source odt File

In this example, we will see how to get started with odfWeave.

Iris Output

Processing The Source File

  1. Save the file (for example, source.odt).

  2. In R, call the function odfWeave with the arguments file (source.odt) and dest (e.g., output.odt).

Example: Processing The Source File

If we save our example as iris.odt, the following command will output the result as an odt document named iris_output.odt to the working directory.

odfWeave("iris.odt", "iris_output.odt")
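The same call can be wrapped in a small helper that checks its inputs first. This is only a sketch: the function name weave_report is hypothetical, and it assumes that odfWeave is installed and that the source file sits in the working directory.

```r
# Hypothetical helper: run odfWeave only when the package and the
# source file are both available, and report whether output was written.
weave_report <- function(source_file, output_file) {
  if (!requireNamespace("odfWeave", quietly = TRUE)) {
    message("Package odfWeave is not installed; nothing was processed.")
    return(invisible(FALSE))
  }
  if (!file.exists(source_file)) {
    message("Source file not found: ", source_file)
    return(invisible(FALSE))
  }
  # file = document with text and code chunks, dest = processed document
  odfWeave::odfWeave(file = source_file, dest = output_file)
  invisible(file.exists(output_file))
}

# Example call, mirroring the command above:
# weave_report("iris.odt", "iris_output.odt")
```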

Example Output

Now we can open iris_output.odt in our word processor.

Iris Output

Some Functions For Customized Output

Function       Output Type
odfItemize     Lists
odfTable       Tables
setStyleDefs   Specify document style

Example: More Complex Source File

Iris Output

Output

Iris Output

Visualizing Data With ggplot2

Author

ggplot

The Basics

  • Plots begin by specifying data and aesthetics

ggplot(data, aes(x, y, ymin, ymax))

  • Add layers named geoms using + to specify how to represent the data
  • Add layers named scales using + to modify the cosmetics of the plotted data
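A minimal sketch of this layering, using the built-in iris data. The choice of variables and the axis titles are illustrative assumptions, not taken from the slides.

```r
library(ggplot2)

# 1. Data and aesthetics
p <- ggplot(data = iris, aes(x = Petal.Width, y = Sepal.Length)) +
  # 2. A geom: represent each observation as a point
  geom_point() +
  # 3. Scales: change how the axes are presented
  scale_x_continuous(name = "Petal width (cm)") +
  scale_y_continuous(name = "Sepal length (cm)")

# Printing the object draws the plot
p
```

Because the plot is stored as an object, further layers can be added to p later with +.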

A Few Geoms...

Geom            Description
geom_abline     User-specified line (slope and intercept)
geom_bar        Bar chart
geom_boxplot    Box-and-whisker plot
geom_point      Scatter plot of points
geom_errorbar   Error bars from ymin to ymax
geom_line       Line connecting the observations in order of x
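Geoms can be stacked. The sketch below combines geom_point and geom_errorbar to show a mean with error bars. The summary data frame and the use of one standard error as the bar length are assumptions made for illustration.

```r
library(ggplot2)

# Mean and standard error of sepal length for each species
m <- tapply(iris$Sepal.Length, iris$Species, mean)
s <- tapply(iris$Sepal.Length, iris$Species, function(x) sd(x) / sqrt(length(x)))
species_summary <- data.frame(Species = names(m),
                              mean = as.numeric(m),
                              se = as.numeric(s))

# geom_errorbar reads ymin and ymax; geom_point draws the means themselves
p <- ggplot(species_summary,
            aes(x = Species, y = mean, ymin = mean - se, ymax = mean + se)) +
  geom_point(size = 3) +
  geom_errorbar(width = 0.2)
p
```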

A Few Scales...

Scale                 Description
scale_x_continuous    x-axis representation (continuous)
scale_x_discrete      x-axis representation (discrete)
scale_y_continuous    y-axis representation (continuous)
scale_y_discrete      y-axis representation (discrete)
scale_colour_manual   User-specified colour of points, lines, etc.
scale_shape_manual    User-specified shape of points
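As a sketch of how several scales combine in one plot, the example below sets axis breaks, colours, and point shapes by hand. The specific colours, shape codes, and breaks are arbitrary choices for illustration.

```r
library(ggplot2)

p <- ggplot(iris, aes(x = Petal.Width, y = Sepal.Length,
                      colour = Species, shape = Species)) +
  geom_point() +
  # Tick marks every 0.5 units on the x-axis
  scale_x_continuous(breaks = seq(0, 2.5, by = 0.5)) +
  # One colour per species, matched by name
  scale_colour_manual(values = c(setosa = "steelblue",
                                 versicolor = "darkorange",
                                 virginica = "darkgreen")) +
  # One plotting symbol per species (16 = circle, 17 = triangle, 15 = square)
  scale_shape_manual(values = c(setosa = 16, versicolor = 17, virginica = 15))
p
```

Naming the values by group level keeps the mapping explicit, so it does not depend on the order of the factor levels.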

Simple Plot

ggplot

Output

ggplot

Customizing ggplot2

  1. Change symbol size (size)

  2. Change group colours (scale_color_manual)

  3. Use theme to modify settings for the following plot elements:

    • legend.title
    • legend.key
    • panel.background

Example: Customizing ggplot2

library(ggplot2)       # Plotting functions used below
library(RColorBrewer)  # Package with many color palettes

# Keep three of the darker shades from the nine-colour "Blues" palette
Blues <- brewer.pal(9, "Blues")[c(5, 7, 9)]

ggplot(data = iris, aes(y = Sepal.Length, x = Petal.Width, colour = Species)) + 
    geom_point(size = 2) +                 # symbol size
    scale_colour_manual(values = Blues) +  # one colour per species
    theme(legend.title = element_blank(), 
          legend.key = element_blank(), 
          panel.background = element_rect(fill = brewer.pal(8, "Pastel2")[4]))

Output

ggplot

Faceting & Summarizing

  • We can show groups in different elements of a grid (rows or columns) using facet_grid

  • Almost any kind of summary of the plot's data frame can be added using stat_ functions such as stat_summary and stat_smooth.
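A short sketch of facet_grid together with stat_summary. It uses the built-in mtcars data rather than iris, which is an assumption made here so that each panel holds several groups. The helper mean_se is supplied by ggplot2.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  # One column of panels per transmission type (am = 0 or 1)
  facet_grid(. ~ am) +
  # Raw observations
  geom_point(colour = "grey60") +
  # Mean plus or minus one standard error for each cylinder group
  stat_summary(fun.data = mean_se, geom = "pointrange", colour = "navy")
p
```

Writing facet_grid(am ~ .) instead would place the panels in rows.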

Example: Faceting & Summarizing

ggplot(data = iris, aes(y = Sepal.Length, x = Petal.Width, colour = Species)) + 
    facet_grid(~Species) + 
    geom_point(size = 2) + 
    stat_smooth(method = "lm") + 
    scale_colour_manual(values = Blues) + 
    theme(legend.title = element_blank(), 
          panel.background = element_rect(fill = brewer.pal(8, "Pastel2")[4]))

Output

ggplot

Further Reading