Review: Documentation and testing in R

Andrew J. Bass
06-10-2014

Table of contents

  • How to create R packages
    • roxygen2
    • testthat
  • The edge package
    • Glance at the package
    • Quick start guide
    • Example

Getting started

There are two ways to create an R package:

  • In the R console:
package.skeleton(name = "sample")
  • In RStudio:

    • Click File -> New Project
  • You will notice a few things:

    • Directories: man, R
    • Files: DESCRIPTION, NAMESPACE and Read-and-delete-me

What are these files?

  • man: contains the .Rd files, which document the functions in the package
  • R: contains the .R files that define the functions
  • DESCRIPTION: basic information about the package
  • NAMESPACE: which functions the package exports and imports
  • Read-and-delete-me: steps for completing, building and checking the package
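To see these pieces for yourself, the sketch below builds a throwaway skeleton in a temporary directory and lists what was created. The function addOne and the use of tempdir() are assumptions made for illustration; they are not part of the original example.

```r
# Hypothetical one-line function so the skeleton has something to package
addOne <- function(x) x + 1

# Build the skeleton in a temporary directory
# (force = TRUE overwrites a copy left over from an earlier run)
package.skeleton(name = "sample", list = "addOne",
                 environment = environment(),
                 path = tempdir(), force = TRUE)

# Top-level contents of the new package directory
created <- list.files(file.path(tempdir(), "sample"))
print(created)
```

The listing should include the two directories and three files described above.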

Sample program

Let's create a simple function that finds all the prime numbers up to x:

getPrimed <- function(x) {
  allIntegers <- 2:x
  i <- 1
  while (allIntegers[i] <= ceiling(sqrt(x))) {
    bool <- allIntegers %% allIntegers[i] != 0 
    bool[i] <- TRUE
    allIntegers <- allIntegers[bool]
    i <- i + 1
  }
  return(allIntegers)
}
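To make the loop body concrete, here is a single pass of the filtering step written out on its own, using x = 20 as an illustrative input. The variable name keep is ours and stands in for bool.

```r
allIntegers <- 2:20

# First pass (i = 1): flag everything that is not a multiple of 2 ...
keep <- allIntegers %% allIntegers[1] != 0
# ... but keep the divisor itself, which the line above marked FALSE
keep[1] <- TRUE

allIntegers <- allIntegers[keep]
print(allIntegers)
# 2 3 5 7 9 11 13 15 17 19
```

The next pass (divisor 3) removes 9 and 15, and the loop stops once the next candidate exceeds ceiling(sqrt(20)) = 5.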

Creating the package

package.skeleton(list=c("getPrimed"), name = "getPrimed")
  • Here comes the challenging part: Documentation and testing!
  • Two packages make life easy:
    • roxygen2
    • testthat

Writing documentation: roxygen2

  • Help files are accessed by
?getPrimed
help(getPrimed)
  • The old way of writing documentation, editing .Rd files by hand, can be suffocating! Instead, we will focus on the R package roxygen2, written by Hadley Wickham.
  • It automatically creates the .Rd and NAMESPACE files!
install.packages("roxygen2")

Writing documentation: roxygen2

  • Roxygen comments start with #' and are placed directly before a function
  • Tags have the form @tagname and specify the sections of the help file
#' @title
#' @details
#' @author
#' @examples
tmp <- function(x) {
  print(x)
}

Documenting getPrimed

#' @description The function getPrimed determines all the prime numbers up to a specific value
#'
#' @details x must be an integer > 1
#'
#' @title getPrimed: The Prime Function
#' @param x a single integer greater than 1
#' @return integer vector of the prime numbers up to x
#' @examples
#' a <- 250
#' primes <- getPrimed(a)
#' @export 
getPrimed <- function(x){
   ...
}

Inheriting parameters in documentation

  • If the same parameters appear in several functions, documenting them again for each function gets tedious. @inheritParams copies the parameter descriptions from another function instead.
#' @description prime numbers in both x and y
#'
#' @param y second parameter
#' @inheritParams getPrimed
#' @export
getPrimed2 <- function(x, y) {
  gpx <- getPrimed(x)
  gpy <- getPrimed(y)
  # reuse the results computed above instead of calling getPrimed() again
  gpx[gpx %in% gpy]
}

Documenting similar functions

  • You can document multiple functions in the same help file with @rdname.
#' @rdname getPrimed
getPrimed <- function(x) {
  ...
}

#' @rdname getPrimed
getPrimed2 <- function(x, y) {
  ...
}

Demo roxygen2
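
A minimal sketch of the step the demo walks through, assuming the package source lives in a directory named getPrimed under the current working directory (that path is a placeholder). roxygen2::roxygenise() reads the #' comments and writes the corresponding files under man/ along with NAMESPACE. Note that roxygen2 typically declines to overwrite files it did not generate itself, so the placeholder files written by package.skeleton may need to be deleted first.

```r
pkg_dir <- "getPrimed"  # placeholder path to the package source

# Only run when the directory and roxygen2 are both available
if (dir.exists(pkg_dir) && requireNamespace("roxygen2", quietly = TRUE)) {
  roxygen2::roxygenise(pkg_dir)
}
```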

Testing the package

library(getPrimed)
getPrimed(-10)
Error in while (allIntegers[i] <= ceiling(sqrt(x))) { : 
  missing value where TRUE/FALSE needed
  • Testing the functions in a package is essential but can be time-consuming! The testthat package allows for automated testing.

  • Hadley Wickham: “I started automating my tests because I discovered I was spending too much time recreating bugs that I had previously fixed.”

testthat package

Testing in the package has a hierarchical structure:

  • Expectations: testing the result of a function
  • Tests: grouping expectations to test a function
  • Contexts: grouping multiple tests that cover related functionality
install.packages("testthat")

testthat package: expectations

Expectations test whether a value is what you expect!

  • The main function is expect_that(a, built_in(b)), which reads as “I expect that a will ____ b”
  • There are 11 built-in functions, including:
    • equals(b): a equals b, up to a numerical tolerance
    • is_identical_to(b): a is exactly b
    • throws_error(): a throws an error
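
The difference between equals() and is_identical_to() mirrors the difference between base R's all.equal() and identical(), which the two expectations are built around. A small base-R sketch, with values chosen only for illustration:

```r
a <- sqrt(2)^2   # 2 up to floating-point rounding error
b <- 2

approx_equal <- isTRUE(all.equal(a, b))  # comparison with a numerical tolerance
exactly_same <- identical(a, b)          # exact comparison, no tolerance

print(c(approx_equal = approx_equal, exactly_same = exactly_same))
# approx_equal is TRUE, exactly_same is FALSE
```

In testthat terms, expect_that(a, equals(b)) would pass here while expect_that(a, is_identical_to(b)) would fail.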

Expectations: examples

library(testthat)
b <- getPrimed(5)
expect_that(c(2, 3, 5), equals(b))
expect_that(getPrimed(13,12), throws_error())

expect_that(getPrimed(0), throws_error())
Error: getPrimed(0) code did not generate an error

expect_that(getPrimed(-10), throws_error())
Warning message:
In sqrt(x) : NaNs produced

expect_that(getPrimed(10.2), gives_warning())
Error: getPrimed(10.2) no warnings given

testthat package: tests

Each test should cover a single piece of functionality

test_that("getPrimed handles positive numerics > 1", {
  b <- getPrimed(5)

  expect_that(c(2, 3, 5), equals(b))
  expect_that(getPrimed(10.2), gives_warning())
  expect_that(getPrimed(1), throws_error())
  ### Can include plenty more expectations! ###
})

testthat package: context

Context: Group tests into blocks that have related functionality

context("getPrimed: testing input variable x")
test_that("getPrimed handles positive integers > 1", {
  b <- getPrimed(5)
  expect_that(c(2, 3, 5), equals(b))
  expect_that(getPrimed(10.2), gives_warning())
})
test_that("getPrimed handles integers <= 1", {
  expect_that(getPrimed(0), throws_error())
  expect_that(getPrimed(-10), throws_error())
})
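
With the function as written earlier, several of these expectations fail: getPrimed(0) returns without an error and getPrimed(10.2) gives no warning. One possible revision that would satisfy them is sketched below. The name getPrimedChecked, the error and warning messages, and the choice to round a non-integer x down are all assumptions for illustration, not the author's fix.

```r
getPrimedChecked <- function(x) {
  # Reject anything that is not a single number >= 2
  if (!is.numeric(x) || length(x) != 1 || is.na(x) || x < 2) {
    stop("x must be a single number >= 2")
  }
  # Warn on non-integer input, then round down (an illustrative choice)
  if (x != floor(x)) {
    warning("x is not an integer; using floor(x)")
    x <- floor(x)
  }
  candidates <- 2:x
  i <- 1
  # The length check stops the loop from indexing past the end (e.g. x = 2)
  while (i <= length(candidates) && candidates[i] <= sqrt(x)) {
    keep <- candidates %% candidates[i] != 0
    keep[i] <- TRUE
    candidates <- candidates[keep]
    i <- i + 1
  }
  candidates
}

print(getPrimedChecked(20))
# 2 3 5 7 11 13 17 19
```

Validating up front keeps the sieve loop itself unchanged apart from the extra bounds check.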

testthat package: where do I put the files?

  • You need a tests/ directory that contains
    • a testthat.R file that loads the testthat package and runs test_check
library(testthat)
test_check("getPrimed")
    • a testthat/ directory
      • All the unit tests go in this directory, in files whose names start with "test".
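
The layout can be sketched by creating it in a temporary directory. The test file name test-getPrimed.R and the single test inside it are placeholders; what matters is that testthat picks up files in tests/testthat/ whose names start with test.

```r
pkg <- file.path(tempdir(), "getPrimed")
dir.create(file.path(pkg, "tests", "testthat"),
           recursive = TRUE, showWarnings = FALSE)

# tests/testthat.R: entry point that runs the whole test suite
writeLines(c("library(testthat)",
             "test_check(\"getPrimed\")"),
           file.path(pkg, "tests", "testthat.R"))

# tests/testthat/test-getPrimed.R: one file of unit tests (placeholder name)
writeLines(c("test_that(\"getPrimed finds the primes up to 5\", {",
             "  expect_that(getPrimed(5), equals(c(2, 3, 5)))",
             "})"),
           file.path(pkg, "tests", "testthat", "test-getPrimed.R"))

# Paths are reported relative to tests/
test_files <- list.files(file.path(pkg, "tests"), recursive = TRUE)
print(test_files)
```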

Demo testthat

The edge Package: Overview

  • Package for significance analysis of DNA microarray experiments
  • Can handle static or time course experiments
  • In the future it should be able to work for RNA-seq data
  • Main selling point: The Optimal Discovery Procedure
  • Written in S4 (a stricter OO system than S3)

edge package: Key functions

  • edgeSet creates an edgeSet object from an ExpressionSet object
edgeSet
  • edgeFit fits a linear model to each gene
edgeFit

edge Package: Key functions 2

  • Given a full and a null model, lrt performs a generalized likelihood ratio test
lrt
  • odp runs the Optimal Discovery Procedure, an approach for optimally performing many hypothesis tests in a high-dimensional study.
odp

edge package: Quick Start Guide

  • Given an ExpressionSet object called expSet:
library(edge)
library(splines)  # provides ns() for the natural spline basis
mNull <- ~sex
mFull <- ~sex + ns(age, df = 3, intercept = FALSE)
edgeObj <- edgeSet(expSet, full.model = mFull, null.model = mNull)

out.odp <- odp(edgeObj)
out.lrt <- lrt(edgeObj)
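
To see what the two model formulas encode, the sketch below builds their design matrices with base R on a small made-up covariate table. The data frame pheno and its values are invented purely for illustration; in practice the covariates come from the ExpressionSet. The full model adds three natural-spline columns for age on top of the null model's intercept and sex terms.

```r
library(splines)  # provides ns()

# Made-up covariates for 10 hypothetical samples
pheno <- data.frame(
  sex = factor(rep(c("F", "M"), times = 5)),
  age = seq(20, 65, by = 5)
)

mNull <- ~sex
mFull <- ~sex + ns(age, df = 3, intercept = FALSE)

X_null <- model.matrix(mNull, data = pheno)
X_full <- model.matrix(mFull, data = pheno)

print(dim(X_null))  # 10 rows, 2 columns: intercept + sex
print(dim(X_full))  # 10 rows, 5 columns: intercept + sex + 3 spline terms
```

The extra columns in the full model are what the tests assess: whether age, modelled as a smooth curve, explains expression beyond sex alone.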

edge package: Examples

  • Three demos: kidney, endotoxin, gibson