Automated testing is a powerful tool for ensuring that software functions correctly. This vignette assumes that you are already familiar with the concept and benefits of automated tests. If you are not, I recommend the chapter on testing in Wickham (2015) for a quick overview, or the classic Clean Code by Martin (2008) for a more extensive discussion of the subject (code examples are in Java, but many of the principles apply universally).
This document proposes a structure for organizing code and data for tests that go beyond simple self-contained unit tests, e.g. long-running tests or tests that require access to external resources that might not always be available. Much of the advice given in this vignette does not apply to multi-purpose packages aimed at a wide release, and might even be counter-productive there.
A lot has been written on the subject of software testing, but much of the literature is aimed at developers who have to deal with large and complex software systems. This section presents a system for tests that I found useful in the context of developing R packages. It is not important what you call the different categories of tests, but it does make sense to keep tests based on different principles and with different aims (such as unit and functional tests) separated.
Concept:
Test a small unit of code, e.g. a single function, from the programmer’s perspective.
Benefits:
Runtime:
All your unit tests together should ideally run in under one minute (Wickham 2015), as unit tests should be executed regularly during programming sessions.
Usecases:
Example:
test_that("unique returns unique values of a character vector", {
  tdat <- c("A", "A", "A")
  expect_identical(unique(tdat), "A")
})
External Examples:
Concept:
Test a slice of functionality from the user's perspective. As opposed to unit testing, functional testing usually follows a black-box principle: check whether the expected outputs are produced for given inputs, without knowledge of what happens in between.
Benefits:
Runtime: Depends on your package. If you can design fast-running and completely self-contained functional tests for your package, you should absolutely do so, put them together with your unit tests, and run them regularly during your programming sessions. If you require longer-running functional tests, put them in their own directory (tests/testthat/functional_tests), so that they don’t deter you from regularly running your unit tests via devtools::test() (Ctrl+Shift+T in RStudio). This can become necessary if, for example, you need to write large amounts of data to a database or test the performance of a simulation.
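The separation described above can be sketched with testthat alone. The example below is only an illustration and not part of testthis: it creates a throwaway directory that stands in for tests/testthat/functional_tests, writes one made-up test file into it, and runs only that directory with testthat::test_dir(). The file name and the test itself are assumptions chosen for the example.

```r
library(testthat)

# Stand-in for tests/testthat/functional_tests (hypothetical location)
functional_dir <- file.path(
  tempfile("pkg_"), "tests", "testthat", "functional_tests"
)
dir.create(functional_dir, recursive = TRUE)

# A made-up functional test; test file names must start with "test"
writeLines(
  c(
    'test_that("sorting a numeric vector returns ascending values", {',
    '  expect_identical(sort(c(3, 1, 2)), c(1, 2, 3))',
    '})'
  ),
  file.path(functional_dir, "test_sorting.R")
)

# Run only the tests in this directory, leaving the unit tests untouched
results <- test_dir(
  functional_dir,
  reporter = "silent",
  stop_on_failure = FALSE
)

# One row per test, with counts of failures and a flag for errors
summary_df <- as.data.frame(results)
summary_df[, c("file", "test", "failed")]
```

Because the long-running tests live in their own directory, devtools::test() stays fast, and the slower tests run only when you ask for them explicitly.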
Usecases
Example
test_that("t.test returns a p value < 0.05 for significantly different sample means", {
  # Karl wants to use a t test to see if the weight of carps
  # differs significantly from the weight of trouts (which it does).
  carps <- rnorm(100, mean = 7)
  trouts <- rnorm(100, mean = 2)
  expect_true(t.test(carps, trouts, alternative = 'two.sided')$p.value < 0.05)
})
External Examples:
Concept:
Test if a system fulfills the project requirements. When all acceptance tests pass, the program is finished. If you think of unit tests as programmer-level, and functional tests as user-level, acceptance tests would be boss-level (i.e. they test the goals set by your client/employer).
Benefits:
Runtime:
Acceptance tests only need to be run occasionally and can be long-running or rely on external resources (but use with care).
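One way to handle external resources with care is to skip a test, rather than fail it, when the resource cannot be reached. The sketch below uses testthat::skip_if_not(). The helper resource_available() and the file-based check are hypothetical stand-ins for whatever availability check fits your project (a database ping, an HTTP request, etc.).

```r
library(testthat)

# Hypothetical helper: TRUE only if the external resource is reachable.
# Here a file path stands in for a database or web service.
resource_available <- function(path) {
  file.exists(path)
}

test_that("acceptance test is skipped when the resource is missing", {
  resource <- file.path(tempdir(), "does_not_exist.csv")

  # Skip the rest of the test (instead of failing) if the check is FALSE
  skip_if_not(resource_available(resource), "external resource not available")

  dat <- read.csv(resource)
  expect_gt(nrow(dat), 0)
})
```

A skipped test shows up as skipped in the test report, so a missing resource is visible without being mistaken for a regression.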
Usecases
See Example
Example:
Your boss tells you to re-implement an analysis workflow from SAS code to R. You formulate that goal as automated acceptance tests by comparing the output of your R workflow against the results produced by the legacy code. If both workflows produce the same result given the same input data, your test passes and your goal is fulfilled.
test_that("SAS workflow for analysing business data was successfully ported to R.", {
  expected_results <- testthis::load_test('expected_results.rda')
  test_input_data <- fetch_input_data(projectdb, login_credentials)
  aux_dat <- fetch_aux_data_from_web()
  tdat <- prepare(test_input_data)
  new_workflow_results <- run_complicated_analysis(tdat, aux_dat)
  expect_identical(new_workflow_results, expected_results)
})
Concept:
Tests that require human supervision, because they cannot be formulated well programmatically. They do not fit in the hierarchy above but are rather parallel to it. You can, for example, have manual acceptance tests (but you should never ever have manual unit tests). You should have only a handful of manual tests for a project, and only if they absolutely cannot be avoided. Think of them as deal-with-the-devil-level tests.
Benefits:
Sometimes you will have to write code that produces pre-formatted Excel workbooks or complex plots that have to look nice. Manual tests provide a framework for organizing such outputs.
Runtime:
Run only on occasion, so can be long-running.
Example
test_that("plot is pretty", {
  tres <- produce_pretty_plot(iris)
  ggsave('pretty.png', tres)
  # Look at pretty plot
})
While organizing unit tests is fairly straightforward and well supported by testthat, there is less advice on how to structure longer-running and more “messy” tests.
tests
+-- testthat
| +-- test_data
| +-- test_data_raw
| +-- test_out
| +-- functional_tests
| +-- acceptance_tests
| +-- manual_tests
testthat: unit tests, run via devtools::test() (Ctrl+Shift+T in RStudio).
test_data: test data in the .rds format; see ?readRDS. Unit tests should not use data from this directory but rather be self-contained, but for functional or acceptance tests it might be useful to compare the results of an analysis workflow against a reference data set.
test_data_raw: see the data_raw directory proposed in Wickham (2015).
If you use them, test_data and test_out need to be added to your .gitignore and .Rbuildignore files.
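The layout above can be created with a few lines of base R. This is only a sketch: pkg_root is a temporary directory standing in for your package root, the ignore patterns are one plausible way to write the entries mentioned above, and reference.rds is a made-up file name.

```r
# Temporary directory standing in for the package root (assumption)
pkg_root <- tempfile("mypkg_")
test_root <- file.path(pkg_root, "tests", "testthat")

sub_dirs <- c(
  "test_data", "test_data_raw", "test_out",
  "functional_tests", "acceptance_tests", "manual_tests"
)

# Create tests/testthat and all proposed subdirectories
for (d in file.path(test_root, sub_dirs)) {
  dir.create(d, recursive = TRUE)
}

# .gitignore takes glob patterns, .Rbuildignore takes regular expressions
ignored <- c("test_data", "test_out")
write(
  paste0("tests/testthat/", ignored, "/"),
  file = file.path(pkg_root, ".gitignore"),
  append = TRUE
)
write(
  paste0("^tests/testthat/", ignored, "$"),
  file = file.path(pkg_root, ".Rbuildignore"),
  append = TRUE
)

# Store a reference data set in test_data using the .rds format
reference <- data.frame(id = 1:3, value = c(2.5, 3.5, 4.5))
reference_path <- file.path(test_root, "test_data", "reference.rds")
saveRDS(reference, reference_path)

# Read it back, as a functional or acceptance test would
restored <- readRDS(reference_path)
```

Using append = TRUE keeps any entries that already exist in the ignore files of a real package.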
Testthis provides convenience functions to deal with tests that are organized in the manner outlined above:
test_acceptance(), test_functional(), test_manual() and test_subdir() for executing tests in subdirectories of tests/testthat/. Think of them as drop-ins for devtools::test() for tests that you want to keep separate from your unit tests (for example because they are long-running).
save_test() and load_test() for saving and loading data from test_data in the .rds format.
You can easily set up functions to interact with a custom test folder structure via the test_subdir() function.
test_long <- function() {
  test_subdir('long_tests')
}
test_long()
Difference between unit and functional tests explained by analogy: www.softwaretestingtricks.com
Martin, Robert C. 2008. Clean Code: A Handbook of Agile Software Craftsmanship. 1st ed. Upper Saddle River, NJ, USA: Prentice Hall PTR.
Percival, Harry J.W. 2014. Test-Driven Development with Python. Sebastopol, CA: O’Reilly Media. https://www.obeythetestinggoat.com/pages/book.html.
Wickham, Hadley. 2015. R Packages. 1st ed. O’Reilly Media, Inc. http://r-pkgs.had.co.nz/.