May 28, 2019

Slides: https://github.com/annahprice/RAP-champions

What we will cover today

Background

  • The Information Services Division (ISD) of the National Health Service (NHS) Scotland produces around 200 official and national statistics publications each year.
  • Traditional publication output is a static PDF document with accompanying Excel tables.
  • Production uses proprietary software and is time-consuming, involving extensive manual formatting and checking.

What is a RAP project?

  • No (or few) manual steps = data and output produced using code
  • High quality and auditable = version control
  • Sustainable = peer review
  • "Bells and whistles" = functions, documenting/testing these functions, package management and computing environment
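
As an illustration of the "functions, documenting/testing these functions" point, here is a minimal sketch in base R. The function name crude_rate, its arguments, and the example figures are hypothetical and not taken from any ISD publication; the roxygen-style comments show one common way of documenting a function, and the final line is a simple base R test rather than a full unit-testing setup.

```r
#' Calculate a crude rate per 1,000 population (hypothetical helper)
#'
#' @param events Numeric vector of event counts.
#' @param population Numeric vector of population denominators (must be > 0).
#' @return Numeric vector of rates per 1,000, rounded to one decimal place.
crude_rate <- function(events, population) {
  # Fail early on inputs that would silently produce a wrong figure
  stopifnot(is.numeric(events), is.numeric(population))
  stopifnot(length(events) == length(population))
  stopifnot(all(population > 0))

  round(events / population * 1000, 1)
}

# A minimal test with made-up numbers: 5/1000 and 12/4000 per 1,000
stopifnot(isTRUE(all.equal(crude_rate(c(5, 12), c(1000, 4000)), c(5, 3))))
```

In a real project the test would more likely live in its own file and be run automatically, but the idea is the same: the calculation is written once, documented, and checked by code rather than by eye.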

Levels of RAP/automation

Level  Description
1      Ad hoc R code
2      R project
3      R project under version control (VC)
4a     R project under VC and peer reviewed (wrangling)
4b     Replicable report in Rmarkdown (publication)
5      Near RAP (as above plus data quality assurance and package management)
6      Full RAP (as above plus unit testing and documentation)
7      R package
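
The "data quality assurance" step at level 5 could take many forms. The sketch below shows one possible shape in base R, assuming a data frame input. The function check_extract, the column names, and the example data are all hypothetical and only for illustration.

```r
# Hypothetical quality-assurance check for an input extract.
# Returns a character vector describing any problems found
# (an empty vector means no issues were detected).
check_extract <- function(data, required_cols) {
  issues <- character(0)

  # 1. Are all expected columns present?
  missing_cols <- setdiff(required_cols, names(data))
  if (length(missing_cols) > 0) {
    issues <- c(issues, paste("Missing column:", missing_cols))
  }

  # 2. Do the required columns that are present contain missing values?
  for (col in intersect(required_cols, names(data))) {
    n_missing <- sum(is.na(data[[col]]))
    if (n_missing > 0) {
      issues <- c(issues, paste0(col, ": ", n_missing, " missing value(s)"))
    }
  }

  # 3. Are there fully duplicated rows?
  if (anyDuplicated(data) > 0) {
    issues <- c(issues, "Duplicate rows found")
  }

  issues
}

# Made-up example data with deliberate problems
extract <- data.frame(
  board = c("A", "B", "B"),
  admissions = c(120, NA, NA)
)
qa_issues <- check_extract(extract, c("board", "admissions", "year"))
print(qa_issues)
```

Returning the issues as a vector, rather than stopping at the first one, lets the analyst see every problem in one run and include the list in a log or report.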

Challenges

  • Senior management buy-in
  • Culture change (peer review and working in the open)
  • New skills for analysts to learn (R, Git, etc.)
  • Required development time
  • Range of data sources and/or unstable production process
  • IT…

IT Infrastructure

  • R desktop and server versions
  • RStudio Server Pro
  • Package management
  • Git
  • Git repository hosting service (GitHub/Gitea)

And what's next…? Travis? Docker?

Questions/Discussion

Do others have experience of similar challenges in their department/organisation?

What makes a RAP project?

What level of RAP are others working to?