Overview: REACH 2.0 data & analysis

Scaling

Empowering others (Voldemorts and Dumbledores)

Builder-User “contract”

Package Maintenance & Responsibility

User Testing

Validation and Processes

Curriculum

Version Control with git(hub) (½ day)

Reading

  1. install this (3min) and set up your account
  2. why git? (don’t worry about understanding the technical/code bits!) (20min)
  3. what is a VCS? (10min)
  4. most basic git commands (5min)
  5. interactive intro to git (15min)
  6. Using Git(hub) with RStudio and packages

Exercise

  1. Set up a github account, install Git, and link both with RStudio
  2. Set up a public repository with a dummy file. Make at least 3 commits and 1 merge using at least 2 different branches.
  3. Send link to data unit for review

Software design (1 ½ days)

Reading

  1. Basics of functional programming
    1. writing custom functions
    2. pure functions
    3. cohesion and coupling
    4. important for tidyverse / dplyr users: tidyeval & creating functions around NSE
  2. Test Driven Development
    1. why Test driven development
    2. test driven development in R with testthat
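
To make the "pure functions" item above concrete, here is a minimal sketch in base R. The function names and the exchange-rate scenario are made up for illustration and are not part of the reading material.

```r
# Impure: the result depends on a global variable, and the function
# changes state outside itself with `<<-`.
exchange_rate <- 1500
n_calls <- 0

to_usd_impure <- function(amount_local) {
  n_calls <<- n_calls + 1        # side effect: modifies a global counter
  amount_local / exchange_rate   # hidden input: reads a global variable
}

# Pure: everything the function needs is passed in as an argument, and
# the only thing it does is return a value.
to_usd <- function(amount_local, exchange_rate) {
  amount_local / exchange_rate
}

to_usd(c(3000, 4500), exchange_rate = 1500)  # returns c(2, 3)
```

The pure version gives the same output for the same inputs no matter what else is in the environment, which is what makes it easy to test and to reuse inside a package.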

Exercise

  1. Write unit tests for a single (not yet existing) function (e.g. SMEB calculation)
  2. Write a pure function that passes the tests
  3. Push to github and send link to Data Unit for code review & Q&A
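
As a rough illustration of steps 1 and 2, the sketch below uses testthat (which must be installed). The function `basket_cost()` and its behaviour are hypothetical: it is a simplified stand-in and does not implement the actual SMEB methodology. In a test-driven workflow the `test_that()` blocks would be written first; the function appears at the top here only so that the script runs from top to bottom.

```r
library(testthat)

# Hypothetical function under test: total cost of a basket of items.
basket_cost <- function(prices, quantities) {
  if (!is.numeric(prices) || !is.numeric(quantities)) {
    stop("`prices` and `quantities` must be numeric")
  }
  if (length(prices) != length(quantities)) {
    stop("`prices` and `quantities` must have the same length")
  }
  sum(prices * quantities)
}

# Each test states one expectation about the function's behaviour.
test_that("basket_cost sums price times quantity over all items", {
  expect_equal(basket_cost(c(2, 5), c(10, 1)), 25)
})

test_that("basket_cost rejects inputs of different lengths", {
  expect_error(basket_cost(c(2, 5), 10), "same length")
})

test_that("basket_cost returns 0 for an empty basket", {
  expect_equal(basket_cost(numeric(0), numeric(0)), 0)
})
```

Writing the tests first forces a decision on edge cases (mismatched lengths, empty input) before any implementation exists.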

Writing Packages in R (2 days)

(this generally follows Hadley Wickham’s book on creating R packages)

Getting started

Exercise

  1. Set up an R package project and push to github. Send link to data unit for review
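
One possible way to do this from the R console is with the usethis package. This is only a sketch under assumptions: the path and package name are placeholders, usethis must be installed, and `use_github()` assumes a GitHub personal access token has already been configured.

```r
# Placeholder location and name for the new package; adjust to your setup.
pkg_path <- file.path("~", "mypackage")

# Guarded so that nothing is created when the script is run non-interactively.
if (interactive()) {
  # Create the package skeleton (DESCRIPTION, NAMESPACE, R/) without
  # opening a new RStudio session.
  usethis::create_package(pkg_path, open = FALSE)

  # Make the new package the active usethis project for the calls below.
  usethis::proj_set(pkg_path)

  # Initialise a local git repository; usethis asks before committing.
  usethis::use_git()

  # Create a matching GitHub repository and push to it.
  usethis::use_github()
}
```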

Package components

Exercise

  1. Adjust your pure function according to the reading
  2. Add the function to the package project you created earlier according to the reading
  3. Add your unit tests to the package according to the reading
  4. push to github for data unit review

Documentation

Exercise

  1. Add roxygen comments to all package functions
    1. Each function's documentation should have at least:
      1. Concise title and description
      2. @param for all function parameters. Be precise on the format of the input (what data structure and data type does the function expect). Clarify the default behaviour.
      3. @details concise description of the methodology; default behaviour; any unexpected behaviour; limitations/things to watch out for
      4. @return what does the function return, and in what format? (roxygen2 renders this as the "Value" section of the help page)
      5. @examples at least one self contained example
      6. @export for all user facing functions
  2. Add at least one vignette with a basic example on a fake data frame. It should contain at least:
    1. Summary
      1. what problem does the package solve
      2. When should your package be used? When should it not be used / limitations?
    2. Quickstart
      1. the minimum the user needs to know to use the package to solve the problem. Give a working example
    3. Details
      1. Additional features: Expand from the quickstart example all other functionality that wasn’t covered in the quickstart guide
      2. Methodology: Any details about the package behaviour that aren’t obvious
  3. Push everything to github for the data unit to test. The package should now be fully functional and transparently documented.
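
The roxygen requirements listed in step 1 could look roughly like the sketch below. The function `basket_cost()` is a made-up example, not an existing package function; the point is the shape of the comment block, with one tag per requirement.

```r
#' Total cost of a basket of items
#'
#' Multiplies each item's unit price by its quantity and sums the results.
#'
#' @param prices Numeric vector of unit prices, one element per item.
#' @param quantities Numeric vector of quantities, same length and order as
#'   `prices`.
#' @param na.rm Logical scalar. If `TRUE`, items with a missing price or
#'   quantity are ignored. Defaults to `FALSE`, in which case any missing
#'   value makes the result `NA`.
#'
#' @details Inputs are not recycled: vectors of different lengths raise an
#'   error. An empty basket has a cost of 0.
#'
#' @return A single numeric value: the total cost of the basket.
#'
#' @examples
#' basket_cost(prices = c(2, 5), quantities = c(10, 1))
#' basket_cost(c(2, NA), c(10, 1), na.rm = TRUE)
#'
#' @export
basket_cost <- function(prices, quantities, na.rm = FALSE) {
  stopifnot(
    is.numeric(prices),
    is.numeric(quantities),
    length(prices) == length(quantities)
  )
  sum(prices * quantities, na.rm = na.rm)
}
```

Running `devtools::document()` inside the package project turns these comments into the help page shown by `?basket_cost`.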

Certification Requirements

  1. A simple R package
    1. Write a package that solves a single, simple task that you or someone in your team does on a regular basis. It is ok if the package contains only a single user facing function.
  2. Scope Requirements
    1. The package is up on github and works out of the box
    2. All functions are well documented with roxygen2 function manuals (I can use ?packagename::functionname and get all the information I need to use your package's functions).
    3. A clear manual is available via browseVignettes("packagename")
    4. All functions are pure (exceptions are functions that are explicitly meant to read or write files; functions are still pure if they use other functions, as long as those functions are pure)
    5. The package functions work with any data (they are independent of variable names etc.)
    6. The code is readable and easy to understand
    7. Best practices outlined in the curriculum are followed
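
To illustrate scope requirement 5 ("work with any data"), here is a minimal base R sketch. All function, data frame and column names are invented for the example.

```r
# Tied to one dataset: the column name is hard-coded inside the function.
mean_income_hardcoded <- function(data) {
  mean(data$hh_income, na.rm = TRUE)
}

# Works with any data frame: the caller says which column to use.
column_mean <- function(data, column) {
  stopifnot(is.data.frame(data), is.character(column), length(column) == 1)
  if (!column %in% names(data)) {
    stop("Column '", column, "' not found in `data`")
  }
  mean(data[[column]], na.rm = TRUE)
}

# Two fake surveys that use different variable names.
survey_a <- data.frame(hh_income = c(100, 200, NA))
survey_b <- data.frame(monthly_earnings = c(50, 150))

column_mean(survey_a, "hh_income")         # 150
column_mean(survey_b, "monthly_earnings")  # 100
```

The second function also stays pure: it reads nothing but its arguments and returns a value, so it satisfies requirement 4 as well.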