Test workflows

Stefan Fleck

2017-06-03

Introduction

Automated testing is a powerful tool to ensure the correct functioning of software. This vignette assumes that you are already familiar with the concept and benefits of automated tests. If you are not, I recommend the chapter on testing in Wickham (2015) for a quick overview, or the classic Clean Code by Martin (2008) for a more extensive discussion of the subject (its code examples are in Java, but many of the principles apply universally).

This document proposes a structure for organizing code and data for tests that go beyond simple self-contained unit tests, e.g. long-running tests or tests that require access to external resources that might not always be available. Much of the advice given in this vignette does not apply to multi-purpose packages aimed at a wide release, and might even be counter-productive there.

A Taxonomy of Tests

A lot has been written on the subject of software testing, but much of the literature is aimed at developers who have to deal with large and complex software systems. This section presents a categorization of tests that I have found useful in the context of developing R packages. It is not important what you call the different categories of tests, but it does make sense to keep tests that are based on different principles and that have different aims (such as unit and functional tests) separate.

Unit Tests

Concept:

Test a small unit of code, e.g. a single function, from the programmer's perspective.

Benefits:

Unit tests catch bugs early, document the expected behaviour of individual functions, and make refactoring safer.

Runtime:

All your unit tests together should ideally run in under one minute (Wickham 2015), as unit tests should be executed regularly during programming sessions.
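For example, you can run the whole suite from the R console with devtools:

# Runs the tests in tests/testthat/ (but not in its subdirectories,
# which is where the longer-running tests proposed below will live)
devtools::test()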

Use cases:

Whenever you add or modify a function, add or update a unit test for it.

Example:

test_that("unique returns unique values of a character vector ", {
  tdat <- c("A", "A", "A")
  
  expect_identical(unique(tdat), "A")
})

External Examples:

Functional Tests

Concept:

Test a slice of functionality from the user's perspective. As opposed to unit testing, functional testing usually follows a black-box principle: check whether given inputs produce the expected outputs, without knowledge of what happens in between.

Benefits:

Functional tests verify that the package as a whole produces correct results, regardless of how the implementation changes internally.

Runtime: Depends on your package. If you can design fast-running and completely self-contained functional tests for your package, you should absolutely do so; just put them together with your unit tests and run them regularly during your programming sessions. If you require longer-running functional tests, put them in their own directory (tests/testthat/functional_tests), so that they do not deter you from regularly running your unit tests via devtools::test() (Ctrl+Shift+T in RStudio). This can become necessary if, for example, you need to write large amounts of data to a database or test the performance of a simulation.
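Such a subdirectory is not picked up by devtools::test(), but you can run it on demand; a minimal sketch using testthat:

# Run only the long-running functional tests
testthat::test_dir('tests/testthat/functional_tests')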

Use cases:

Testing complete workflows that chain several functions together, e.g. reading, processing, and summarizing data.

Example:

test_that("t.test returns a p value < 0.95 for significantly different sample means", {
  # Karl wants to use a t test to see if the weight of carps 
  # significantly differ from the weight of trouts (which it does). 
  carps  <- rnorm(100, mean = 7)  
  trouts <- rnorm(100, mean = 2) 
  
  expect_true(t.test(carps, trouts, alternative = 'two.sided')$p.value < 0.95)
})

External Examples:

Acceptance Tests

Concept:

Test whether a system fulfills the project requirements. If all acceptance tests pass, the program is finished. If you think of unit tests as programmer-level and functional tests as user-level, acceptance tests would be boss-level (i.e. they test the goals set by your client/employer).

Benefits:

Acceptance tests make project goals explicit and provide an unambiguous criterion for when the work is done.

Runtime:

Acceptance tests only need to be run occasionally and can therefore be long-running or rely on external resources (but use the latter with care).

Use cases:

See the example below.

Example:

Your boss tells you to re-implement an analysis workflow, currently written in SAS, in R. You formulate that goal as an automated acceptance test by comparing the output of your R workflow against the results produced by the legacy code. If both workflows produce the same result given the same input data, your test passes and your goal is fulfilled.

test_that("SAS workflow for analysing business data was successfully ported to R.", {
  expected_results <- testthis::load_test('expected_results.rda')  
  test_input_data <- fetch_input_data(projectdb, loggin_credentials)
  aux_dat         <- fetch_aux_data_from_web()
  
  tdat            <- prepare(test_input_data)
  new_workflow_result <- run_complicated_analysis(tdat, aux_dat)

  expect_identical(new_workflow_results, expected_results)
})

Manual Tests

Concept:

Tests that require human supervision because they cannot be formulated well programmatically. They do not fit into the hierarchy above but are rather parallel to it: you can, for example, have manual acceptance tests (but you should never, ever have manual unit tests). You should only have a handful of manual tests per project, and only if they absolutely cannot be avoided. Think of them as deal-with-the-devil-level tests.

Benefits:

Sometimes you will have to write code that produces pre-formatted Excel workbooks or complex plots that have to look nice. Manual tests provide a framework for organizing such outputs.

Runtime:

Manual tests are run only on occasion, so they can be long-running.

Example:

test_that("plot is pretty", {
  tres <- produce_pretty_plot(iris)
  ggsave('pretty.png', tres)
  
  # Look at pretty plot
})

Organizing the tests/testthat directory

A Proposed Folder Structure

While organizing unit tests is fairly straightforward and well supported by testthat, there is less advice on how to structure longer-running and more “messy” tests. The following layout has worked well for me:

tests
  +-- testthat
  |  +-- test_data
  |  +-- test_data_raw
  |  +-- test_out
  |  +-- functional_tests
  |  +-- acceptance_tests
  |  +-- manual_tests

If you use them, test_data and test_out need to be added to your .gitignore and .Rbuildignore files.
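If you want to automate this setup, the following sketch creates the proposed directories with base R and registers the ignores, assuming you use the usethis package:

# Create the proposed directory structure (safe to re-run)
dirs <- file.path('tests', 'testthat',
                  c('test_data', 'test_data_raw', 'test_out',
                    'functional_tests', 'acceptance_tests', 'manual_tests'))
for (d in dirs) dir.create(d, recursive = TRUE, showWarnings = FALSE)

# Tell git and R CMD build to ignore the generated data directories
usethis::use_git_ignore(c('tests/testthat/test_data', 'tests/testthat/test_out'))
usethis::use_build_ignore(c('tests/testthat/test_data', 'tests/testthat/test_out'))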

Tools Provided by testthis

The testthis package provides convenience functions for dealing with tests that are organized in the manner outlined above:

You can easily set up functions to interact with a custom test folder structure via the test_subdir() function:

test_long <- function() test_subdir('long_tests')
test_long()
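Analogous helpers can be defined for the directories proposed above (the function names are just suggestions):

test_functional <- function() test_subdir('functional_tests')
test_acceptance <- function() test_subdir('acceptance_tests')
test_manual     <- function() test_subdir('manual_tests')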

References

Martin, Robert C. 2008. Clean Code: A Handbook of Agile Software Craftsmanship. 1st ed. Upper Saddle River, NJ, USA: Prentice Hall PTR.

Percival, Harry J.W. 2014. Test-Driven Development with Python. Sebastopol, CA: O’Reilly Media. https://www.obeythetestinggoat.com/pages/book.html.

Wickham, Hadley. 2015. R Packages. 1st ed. O’Reilly Media, Inc. http://r-pkgs.had.co.nz/.