May 21, 2019

Background

  • The Information Services Division (ISD) of National Health Service (NHS) Scotland produces around 200 publications each year, including both official and national statistics.
  • The traditional publication format is a static PDF summary and report with accompanying Excel tables.
  • Production is time-consuming and involves extensive manual formatting and checking, and the resulting output is not useful to our customers.
  • The new publication process is largely automated and targets a wider range of customers.

Transformation Process

  • We organised a "publication incubator" where we rapidly developed a new publication format.
  • Each person in the team was assigned to a role in Data Wrangling, Data Visualisation or Project Management.
  • We also ran focus groups with a wide range of stakeholders to gather feedback and ideas.
  • Following the incubator we produced a prototype and conducted testing with colleagues and stakeholders to resolve issues before release.

Data Wrangling

  • The Data Wranglers extract the data and get it into the appropriate format for visualisation and publication.

  • This involved:
    • translating existing SPSS syntax into R code,

    • handling disclosure/confidentiality issues,

    • producing data files that complied with Open Data restrictions.

Data Wrangling: SPSS to R

  • Benefits:
    • Automated data production for the publication.

    • Easier to examine and validate the data in R.

    • Easier to reshape the data in R.

    • R is more flexible and much quicker.

    • The code is now reproducible and requires minimal manual input.

Data Wrangling: SPSS to R

  • Challenges:
    • Not every command works the same way in R as in SPSS.

    • SPSS is more stable.

    • Package management can be difficult.

    • Less 'official' support for R when something goes wrong.

    • Server capacity.

Data Wrangling: Disclosure issues

  • Customer feedback told us that they required data at a more detailed level than we had produced previously.
  • Mental health is a sensitive subject, so delivering this presented difficulties around protecting patient confidentiality.
  • We considered and tested solutions, and collaborated with colleagues who are experts in statistical disclosure control.
  • We ruled out suppression almost immediately, since secondary suppression is very difficult to automate.
  • Our method allows the release of data at a more detailed level without risk to patients.
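
As an illustration of why suppression is hard to automate, the sketch below (in Python rather than the team's R, with made-up counts and an assumed threshold of 5) applies primary suppression only. When a published total accompanies a single suppressed cell, the hidden value can be recovered by subtraction, which is why secondary suppression of further cells is needed. This is not the method the team adopted; the slides do not describe that method.

```python
# Hypothetical counts for one row of a table (not real data).
THRESHOLD = 5  # assumed rule: counts below this value are suppressed


def primary_suppress(counts, threshold=THRESHOLD):
    """Replace every count below the threshold with None (suppressed)."""
    return {k: (v if v >= threshold else None) for k, v in counts.items()}


def recover_single_suppressed(published, total):
    """If exactly one cell is suppressed, recover it from the published total.

    Returns (cell_name, recovered_value), or None when recovery by simple
    subtraction is not possible.
    """
    hidden = [k for k, v in published.items() if v is None]
    if len(hidden) != 1:
        return None
    visible_sum = sum(v for v in published.values() if v is not None)
    return hidden[0], total - visible_sum


counts = {"area_a": 120, "area_b": 3, "area_c": 47}
total = sum(counts.values())          # the total is published alongside the cells
published = primary_suppress(counts)  # area_b is hidden
recovered = recover_single_suppressed(published, total)
print(published)
print(recovered)
```

Protecting the hidden cell would mean suppressing at least one more cell in the same row, and the same check then has to hold across every row, column and linked table, which is what makes the secondary step difficult to automate.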

Data Wrangling: Open Data

  • Data is available through the explorer or the Open Data platform.
  • The Open Data platform required the data in a completely different format from the visualisations.
  • This format is much less user-friendly for the typical customer.
  • It is aimed at customers who are very confident in dealing with data and want to perform their own analysis.
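
The slides do not say what the two formats look like. One common difference is a wide layout (one column per measure) versus a long layout (one row per observation). The sketch below, in Python with hypothetical field names and made-up values, shows that kind of reshaping. It assumes a long layout is what an open data file needs, which may not match the actual platform.

```python
def wide_to_long(rows, id_field, variable_name="measure", value_name="value"):
    """Turn one-column-per-measure rows into one-row-per-observation rows."""
    long_rows = []
    for row in rows:
        for field, value in row.items():
            if field == id_field:
                continue  # the identifier is repeated on every output row
            long_rows.append(
                {id_field: row[id_field], variable_name: field, value_name: value}
            )
    return long_rows


# Hypothetical wide-format table (values are invented for illustration).
wide = [
    {"year": "2016/17", "discharges": 10, "patients": 8},
    {"year": "2017/18", "discharges": 12, "patients": 9},
]

long = wide_to_long(wide, id_field="year")
for record in long:
    print(record)
```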

Outcomes

  • The publication process and output are now largely automated.

  • This has reduced the chance of manual errors.

  • We have reduced the time to produce the publication from 3-4 weeks to 1 week.

  • The publication we produce now appeals to a wider range of customers.

  • We have developed as a team and can now support others who want to change the way they publish.

Data Visualisation: Background

  • Assigned the role of Data Visualiser (March 2018)

  • No previous coding experience

  • Developed a prototype Data Explorer (March 2018 - July 2018) using Shiny

  • Took prototype to users for testing.

Data Visualisation: Testing Results

  • What people liked:
    • People "understood the purpose of the Explorer"
    • Its "core functionalities are highly usable" (lowest rating was 6/7)
    • It is "quite enjoyable to use, even as a technophobe".

  • What could be improved:
    • "The language used could be clearer"
    • Need for "further guidance on how to use the functions on each page"
    • Various suggestions on how to make the Explorer more accessible.

Data Visualisation: Testing Results

  • User feedback from testing led to several changes:
    • Clearer, more concise glossary
    • Enlarged font size
    • Removed unnecessary buttons from the plotly toolbar
    • Clearer instructions on how to use the plotly toolbar
    • Highlighted essential action buttons in blue
    • Hyperlinks in the "Introduction" tab linking to the other tabs
    • Inserted icons as visual cues to help users navigate and use the Explorer more easily.

Data Explorer Demo

Data Explorer: Benefits

  • It looks much better and is more interactive than our previous publication format.
  • We have saved time in both the running of the code and the production of the output:
    • If the basefiles are in the correct format, updating the Explorer should take seconds
    • No more manual formatting of tables and graphs in Excel. For example, the graph titles update with the filter selections and the y-axis range is recalculated.
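
The idea of titles and axis ranges that follow the filter selections can be sketched as below. This is a Python illustration under stated assumptions, not the Explorer's R Shiny code: the function names, field names, data values and the 10% headroom on the y-axis are all hypothetical.

```python
def filter_rows(rows, area):
    """Keep only the rows for the selected area."""
    return [r for r in rows if r["area"] == area]


def plot_title(measure, area, rows):
    """Build the graph title from the current filter selections."""
    years = sorted(r["year"] for r in rows)
    return f"{measure} in {area}, {years[0]} to {years[-1]}"


def y_axis_range(rows, padding=0.1):
    """Recalculate the y-axis as (0, max value plus some headroom)."""
    if not rows:
        return (0, 1)  # fallback range when the filter returns nothing
    upper = max(r["value"] for r in rows) * (1 + padding)
    return (0, upper)


# Invented data for illustration only.
data = [
    {"area": "Area A", "year": "2015/16", "value": 80},
    {"area": "Area A", "year": "2016/17", "value": 100},
    {"area": "Area B", "year": "2015/16", "value": 900},
]

selected = filter_rows(data, "Area A")
title = plot_title("Discharges", "Area A", selected)
axis = y_axis_range(selected)
print(title)
print(axis)
```

Because the title and range are derived from the filtered rows each time, changing the selection needs no manual reformatting.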

Data Explorer: GitHub

  • Link to my R Shiny code on GitHub.

  • GitHub is a great tool for version control, and it allowed us to access and review each other's code throughout the process.

  • Since we made our code publicly available, it has been used by others as a template, e.g., the ISD Drugs & Alcohol team and the Ministry of Justice.

Future Plans

  • We have expanded the original Explorer to include:
    • activity in non-psychiatric specialties
    • readmissions
    • the relative index of inequality as a trend over time
    • council area-level data.

  • This will be released in autumn.

  • We hope to transform our other publications in due course.

  • Demo of the new features.

Thank You