1. Adopting a coding style for your team

Examples:

2. Organizing code into functions

DRY: “Don’t Repeat Yourself”

## Get form schema for multiple forms...
abcd_schema <- activityinfo::getFormSchema(formId = "abcd")
efgh_schema <- activityinfo::getFormSchema(formId = "efgh")
ijkl_schema <- activityinfo::getFormSchema(formId = "ijkl")
mnop_schema <- activityinfo::getFormSchema(formId = "mnop")
qrst_schema <- activityinfo::getFormSchema(formId = "qrst")
## etc...

## DRY: Don't Repeat Yourself! >>> Use iteration!
form_id_list <- list("abcd", "efgh", "ijkl", "mnop", "qrst")

## use `lapply` and other base R 'apply' functions
all_form_schemas <- lapply(form_id_list, function(r) as.data.frame(activityinfo::getFormSchema(r)))

## or use purrr::map() and other {purrr} functions
all_form_schemas <- purrr::map(
  form_id_list,
  ~ activityinfo::getFormSchema(formId = .x)
)
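Since this section is about organizing code into functions, the iteration above can itself be wrapped in a small function. The following is only a sketch: `get_schema_stub()` is a made-up stand-in for `activityinfo::getFormSchema()` so that the code runs without an API connection, and `get_all_schemas()` is a hypothetical helper name.

```r
## Made-up stand-in for activityinfo::getFormSchema(), so the sketch runs offline
get_schema_stub <- function(formId) {
  list(id = formId, label = paste("Form", formId))
}

## Hypothetical helper: fetch every schema and name each result after its form ID
get_all_schemas <- function(form_ids, fetch = get_schema_stub) {
  schemas <- lapply(form_ids, fetch)
  stats::setNames(schemas, form_ids)
}

form_ids <- c("abcd", "efgh", "ijkl")
all_form_schemas <- get_all_schemas(form_ids)

## Results can now be looked up by form ID
names(all_form_schemas)
all_form_schemas[["efgh"]]$label
```

Passing the fetching function in as an argument keeps the helper easy to test; in real use you would presumably pass `activityinfo::getFormSchema` instead of the stub.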

3. Organizing functions into packages

## Create a new package directory
usethis::create_package("MyNewPackage")

## Initialise a local git repository
usethis::use_git()
## THEN, optionally, create a matching GitHub repository and push to it
usethis::use_github()

## Create README file
usethis::use_readme_rmd()

## Create R script to house your function code
usethis::use_r("viz_functions")

## Create test infrastructure with {testthat}
usethis::use_testthat()

## Create a vignette (a name is required; "my-vignette" is just a placeholder)
usethis::use_vignette("my-vignette")

## Add {ggplot2} or any other package as a dependency
usethis::use_package("ggplot2")

## Create roxygen documentation
devtools::document()

## Build package
devtools::build()

## Test package
devtools::test()

## AND MANY MANY MORE!

Example Package for some ETL process

  • Individual component functions:

    1. Get data: ex. getData() (and other smaller functions)
    2. Clean data: ex. cleanData() (and other smaller functions)
    3. Transform data: ex. transformData() (and other smaller functions)
    4. Visualize data: ex. visualizeData() (and other smaller functions)
    5. Save to a folder or append data to a db: ex. saveData() and/or appendData() (and other smaller functions)
  • etc.

How to run package code in a script?

  • One large package function (ex. runAnalysis()): basically executes everything?
  • Have the individual component functions run one after another in an R script?
  • etc…
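The two options above can be compared with a toy sketch. The function names come from the list of component functions, but their bodies and the example data (`team`, `goals`) are invented purely for illustration. The visualize and save steps are left out to keep it short.

```r
## Toy stand-ins for the component functions; in a real package these
## would live in files under R/ and do actual work
getData <- function() {
  data.frame(team = c("a", "b", "b", NA), goals = c(1, 2, 3, 4))
}

cleanData <- function(data) {
  ## drop rows with a missing team
  data[!is.na(data$team), , drop = FALSE]
}

transformData <- function(data) {
  ## total goals per team
  stats::aggregate(goals ~ team, data = data, FUN = sum)
}

## Option 1: one wrapper function that executes every step
runAnalysis <- function() {
  raw <- getData()
  clean <- cleanData(raw)
  transformData(clean)
}

result_wrapper <- runAnalysis()

## Option 2: call the components one after another in a script
raw <- getData()
clean <- cleanData(raw)
result_script <- transformData(clean)

## Both routes should give the same result
identical(result_wrapper, result_script)
```

The wrapper is convenient for scheduled runs, while the step-by-step script makes it easier to inspect intermediate objects. Which one fits better likely depends on how often the pipeline needs debugging.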

4. Documenting code

#' @title FUNCTION_TITLE
#' @description FUNCTION_DESCRIPTION
#' @param data PARAM_DESCRIPTION
#' @return OUTPUT_DESCRIPTION
#' @details DETAILS

## etc...

Then you can run:

devtools::document()

This converts the roxygen comments above your functions into .Rd files in the man/ directory, which is where documentation files for R packages are kept. Those files are what you see when you open a help page:

?tidyr::spread
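To make the roxygen template above a bit more concrete, here is a minimal sketch of a documented function. `clean_team_names()` is a hypothetical example, not a function from any existing package. The `@examples` and `@export` tags are standard roxygen2 tags that go beyond the template.

```r
#' @title Clean team names
#' @description Trims surrounding whitespace and lower-cases a character
#'   vector of team names.
#' @param x A character vector of team names.
#' @return A character vector the same length as `x`.
#' @details Non-character input is rejected with an error.
#' @examples
#' clean_team_names(c(" Arsenal", "CHELSEA "))
#' @export
clean_team_names <- function(x) {
  stopifnot(is.character(x))
  tolower(trimws(x))
}
```

Because roxygen blocks are ordinary comments, the function still works when the file is sourced directly; the documentation is only generated once `devtools::document()` is run inside a package.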

Testing

  • Start with common-sense ones: Is my function's output the correct type? Does my function error when I do x/y/z?

  • Code can fail in ways that you didn’t think possible so at the start of a new package, you won’t really have comprehensive tests. Future bugs and issues from you/people using the package are an opportunity to add new tests to make sure that the problem doesn’t happen again!

  • Best way to learn is probably to read how other packages have written tests, especially if the package(s) are similar to what you’re working on.

  • Use {covr} together with the Codecov service to discover how much of the code in your package functions the tests actually cover. Don't get too fixated on the coverage percentage, though; focus on the quality of the tests first and foremost.
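As a rough illustration of the "common sense" tests described above, here is a sketch using {testthat}. The function under test, `clean_team_names()`, is a hypothetical example defined inline so the block runs on its own. In a package, these tests would normally sit in a file under tests/testthat/.

```r
library(testthat)

## Hypothetical function under test
clean_team_names <- function(x) {
  stopifnot(is.character(x))
  tolower(trimws(x))
}

## Is my function output the correct type?
test_that("output is a character vector of the same length", {
  out <- clean_team_names(c(" Arsenal", "CHELSEA "))
  expect_type(out, "character")
  expect_length(out, 2)
})

## Does my function error when given the wrong input?
test_that("non-character input throws an error", {
  expect_error(clean_team_names(42))
})

## The kind of test you might add after a (hypothetical) bug report
test_that("empty input returns an empty character vector", {
  expect_equal(clean_team_names(character(0)), character(0))
})
```

Inside a package, `devtools::test()` runs every such file at once.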

Creating a {pkgdown} website

  • README file becomes your front page

  • Package documentation exists as individual web pages

  • Vignettes, news, and other things are included too

  • Bonus: Use HTML and CSS to spice up the appearance!

  • Examples: tvthemes, ggshakeR

5. Using version control

Git: version control system.

GitHub: A popular internet hosting service for Git repositories. It provides a nice user interface to store files and to enable collaboration with other users.

GitHub Projects: Kanban/Trello-style board that you can use to organize your tasks as ‘issues’ in individual repositories

GitHub Issues:

  • Title: Start with a verb describing the main action that the issue is supposed to fix, then a short description.
    • Create..., Fix..., Simplify..., etc.
  • Description: Use the first comment box once you’ve created the issue to describe in a bit more detail. Specify the function(s) you want to work on (you might have already mentioned it in the title), possible steps you want to take, some brainstorming thoughts. For bug issues sometimes you may not have too much to say here… yet.
  • Assignments: On the right side-bar of every issue are various buttons that you can use to tag and organize your issue.
    • Assign a person via Assignees
    • Label your issue as bug, enhancement, etc. (you can customize these to fit your project)
    • Project/Milestone: If this is part of a larger release ‘Project’ or ‘Milestone’ you can add the issue to those from here.

You can also create issue templates to streamline this and/or make it easier to gather actionable information from users.

Branch

  • Create a new branch based on the issue: I like to name it in a similar vein to the issue. Start with an action verb, then a short description (this can be difficult at times, especially as I prefer to keep it to fewer than 5 whole words…), but this time I also append the GitHub issue number at the end.
    • Example: create-passnetwork-function-#56.
  • GitHub commit message keywords: use closes, fixes, etc. followed by the issue number, or simply reference the issue number, so that the changes you are making to the code are tracked and linked to a specific issue in a repository.
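The naming convention above is easy to apply by hand, but a small sketch in R shows the pattern explicitly. Both helpers, `make_branch_name()` and `make_commit_message()`, are hypothetical and not part of any package. The issue title and commit summary are invented to match the example branch name.

```r
## Hypothetical helper: build a branch name from an issue title and number
make_branch_name <- function(issue_title, issue_number) {
  slug <- tolower(trimws(issue_title))
  slug <- gsub("[^a-z0-9]+", "-", slug)  # runs of non-alphanumerics become "-"
  slug <- gsub("^-+|-+$", "", slug)      # drop leading/trailing hyphens
  paste0(slug, "-#", issue_number)
}

## Hypothetical helper: commit message using the "closes" keyword,
## which links the commit to the issue on GitHub
make_commit_message <- function(summary, issue_number) {
  paste0(summary, " (closes #", issue_number, ")")
}

branch <- make_branch_name("Create passnetwork function", 56)
branch

make_commit_message("Add passnetwork function", 56)
```

With the inputs shown, the branch name matches the example given above, and the commit message carries the issue number so GitHub can link the two.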

Pull Request

Merge the changes you made on a separate branch to the main branch via a Pull Request!

You can create a checklist template that shows up whenever a PR is created on GitHub so that the user/developer is reminded of what needs to be done before the code can be fully merged.

6. Examples from the field

AV Organization:

Contact Info 📫💬

Hope you found this helpful. If you need some help with organizing your R code base,

Contact Ryo Nakagawara: