Test workflows

Stefan Fleck

2017-06-03

Introduction

Automated testing is a powerful tool to ensure the correct functioning of software. This vignette assumes that you are already familiar with the concept and benefits of automated tests. If you are not, I recommend the chapter on testing in Wickham (2015) for a quick overview, or the classic Clean Code by Martin (2008) for a more extensive discussion of the subject (its code examples are in Java, but many of the principles apply universally).

This document proposes a structure for organizing code and data for tests that go beyond simple self-contained unit tests, e.g. long-running tests or tests that require access to external resources that might not always be available. Much of the advice given in this vignette does not apply to multi-purpose packages aimed at a wide release, and might even be counter-productive there.

A Taxonomy of Tests

A lot has been written on the subject of software testing, but much of the literature is aimed at developers who have to deal with large and complex software systems. This section presents a categorization of tests that I have found useful in the context of developing R packages. It is not important what you call the different categories of tests, but it does make sense to keep tests that are based on different principles and that have different aims (such as unit and functional tests) separate.

Unit Tests

Concept:

Test a small unit of code, e.g. a single function, from the programmer's perspective.

Benefits:

Unit tests catch bugs early, document the expected behaviour of individual functions, and make refactoring safer.

Runtime:

All your unit tests together should ideally run in under one minute (Wickham 2015), as unit tests should be executed regularly during programming sessions.
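For example, you can run the whole suite from the R console with devtools:

# Runs the tests in tests/testthat/ (but not in its subdirectories,
# which is where the longer-running tests proposed below will live)
devtools::test()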

Use cases:

Whenever you add or modify a function, add or update a unit test for it.

Example:

test_that("unique returns unique values of a character vector ", {
  tdat <- c("A", "A", "A")
  
  expect_identical(unique(tdat), "A")
})

External Examples:

Functional Tests

Concept:

Test a slice of functionality from the user's perspective. As opposed to unit testing, functional testing usually follows a black-box principle: check whether given inputs produce the expected outputs, without knowledge of what happens in between.

Benefits:

Functional tests verify that the package as a whole produces correct results, regardless of how the implementation changes internally.

Runtime: Depends on your package. If you can design fast-running and completely self-contained functional tests for your package, you should absolutely do so; just put them together with your unit tests and run them regularly during your programming sessions. If you require longer-running functional tests, put them in their own directory (tests/testthat/functional_tests), so that they do not deter you from regularly running your unit tests via devtools::test() (Ctrl+Shift+T in RStudio). This can become necessary if, for example, you need to write large amounts of data to a database or test the performance of a simulation.
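Such a subdirectory is not picked up by devtools::test(), but you can run it on demand; a minimal sketch using testthat:

# Run only the long-running functional tests
testthat::test_dir('tests/testthat/functional_tests')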

Use cases:

Testing complete workflows that chain several functions together, e.g. reading, processing, and summarizing data.

Example:

test_that("t.test returns a p value < 0.95 for significantly different sample means", {
  # Karl wants to use a t test to see if the weight of carps 
  # significantly differ from the weight of trouts (which it does). 
  carps  <- rnorm(100, mean = 7)  
  trouts <- rnorm(100, mean = 2) 
  
  expect_true(t.test(carps, trouts, alternative = 'two.sided')$p.value < 0.95)
})

External Examples:

Acceptance Tests

Concept:

Test whether a system fulfills the project requirements. If all acceptance tests pass, the program is finished. If you think of unit tests as programmer-level and functional tests as user-level, acceptance tests would be boss-level (i.e. they test the goals set by your client/employer).

Benefits:

Acceptance tests make project goals explicit and provide an unambiguous criterion for when the work is done.

Runtime:

Acceptance tests only need to be run occasionally and can therefore be long-running or rely on external resources (but use the latter with care).

Use cases:

See the example below.

Example:

Your boss tells you to re-implement an analysis workflow, currently written in SAS, in R. You formulate that goal as an automated acceptance test by comparing the output of your R workflow against the results produced by the legacy code. If both workflows produce the same result given the same input data, your test passes and your goal is fulfilled.

test_that("SAS workflow for analysing business data was successfully ported to R.", {
  expected_results <- testthis::load_test('expected_results.rda')  
  test_input_data <- fetch_input_data(projectdb, loggin_credentials)
  aux_dat         <- fetch_aux_data_from_web()
  
  tdat            <- prepare(test_input_data)
  new_workflow_result <- run_complicated_analysis(tdat, aux_dat)

  expect_identical(new_workflow_results, expected_results)
})

Manual Tests

Concept:

Tests that require human supervision because they cannot be formulated well programmatically. They do not fit into the hierarchy above but are rather parallel to it: you can, for example, have manual acceptance tests (but you should never, ever have manual unit tests). You should only have a handful of manual tests per project, and only if they absolutely cannot be avoided. Think of them as deal-with-the-devil-level tests.

Benefits:

Sometimes you will have to write code that produces pre-formatted Excel workbooks or complex plots that have to look nice. Manual tests provide a framework for organizing such outputs.

Runtime:

Manual tests are run only on occasion, so they can be long-running.

Example:

test_that("plot is pretty", {
  tres <- produce_pretty_plot(iris)
  ggsave('pretty.png', tres)
  
  # Look at pretty plot
})

Organizing the tests/testthat directory

A Proposed Folder Structure

While organizing unit tests is fairly straightforward and well supported by testthat, there is less advice on how to structure longer-running and more “messy” tests. The following layout has worked well for me:

tests
  +-- testthat
  |  +-- test_data
  |  +-- test_data_raw
  |  +-- test_out
  |  +-- functional_tests
  |  +-- acceptance_tests
  |  +-- manual_tests

If you use them, test_data and test_out need to be added to your .gitignore and .Rbuildignore files.
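If you want to automate this setup, the following sketch creates the proposed directories with base R and registers the ignores, assuming you use the usethis package:

# Create the proposed directory structure (safe to re-run)
dirs <- file.path('tests', 'testthat',
                  c('test_data', 'test_data_raw', 'test_out',
                    'functional_tests', 'acceptance_tests', 'manual_tests'))
for (d in dirs) dir.create(d, recursive = TRUE, showWarnings = FALSE)

# Tell git and R CMD build to ignore the generated data directories
usethis::use_git_ignore(c('tests/testthat/test_data', 'tests/testthat/test_out'))
usethis::use_build_ignore(c('tests/testthat/test_data', 'tests/testthat/test_out'))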

Tools Provided by testthis

The testthis package provides convenience functions for dealing with tests that are organized in the manner outlined above:

You can easily set up functions to interact with a custom test folder structure via the test_subdir() function:

test_long <- function() test_subdir('long_tests')
test_long()
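Analogous helpers can be defined for the directories proposed above (the function names are just suggestions):

test_functional <- function() test_subdir('functional_tests')
test_acceptance <- function() test_subdir('acceptance_tests')
test_manual     <- function() test_subdir('manual_tests')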

References

Martin, Robert C. 2008. Clean Code: A Handbook of Agile Software Craftsmanship. 1st ed. Upper Saddle River, NJ, USA: Prentice Hall PTR.

Percival, Harry J.W. 2014. Test-Driven Development with Python. Sebastopol, CA: O’Reilly Media. https://www.obeythetestinggoat.com/pages/book.html.

Wickham, Hadley. 2015. R Packages. 1st ed. O’Reilly Media, Inc. http://r-pkgs.had.co.nz/.