This article provides an easy to follow checklist for creating an R package using the {devtools}/{usethis} framework as explained in the book R Packages (2e) by Hadley Wickham and Jennifer Bryan. The target audience for this article is for both new and experienced R developers. For new R developers, creating an R package may seem overwhelming and complicated at first but, this article aims to remove the mystery from the process by providing a tutorial. Experienced R developers who need a quick refresher on how to set-up an R package or add certain features to an existing R package (such as a {pkgdown} website) can use this article as a quick reference guide or cheat sheet.
This article will walk you through how to build a professional R package with the following features:
Version control using git and GitHub:
Automated GitHub actions for R CMD check with README badge
Automated documentation and code styling with pull requests
Testing:
Unit testing using {testthat}
Automated GitHub actions for code testing coverage using CodeCov and {covr} with README badge
Professional looking package website hosted on GitHub Pages and built using {pkgdown}:
Internal and external data files
Documentation:
Function and data documentation using {roxygen2}
README using a .Rmd file
NEWS.md
Articles and vignettes
DESCRIPTION file set-up with license
Code styling using {styler}
The checklist consists of 32 steps in total which are divided in to 10 sections, A through J. Where possible, I provide a rough time estimate to complete each section:
The first four sections A. Initial Set-up, B. Version Control Using git and GitHub, C. Package Documentation, and D. GitHub Actions, which collectively encompass steps 1 through 11, are considered the minimum required to establish a proper R package framework and take roughly 30 minutes to complete.
Section E. Functions and Function Documentation includes steps 12, 13, and 14 and involves writing the actual R code for your package; how long this takes depends on the package.
Note: once you have completed sections A through D (steps 1 through 11) and as your work on section E (steps 12 through 14) you should start using the standard development work flow described in the Post Set-up Workflow section of this article.
All sections after E. Functions and Function Documentation are optional and can be completed as needed or as time allows, in any order. These sections include:
F. Code Styling (step 15). Time commitment: about 5 minutes.
G. Additional Package Documentation (step 16). Time commitment: depends on how many package vignettes and articles you want.
H. Data Files (steps 17 through 21). Time commitment: depends on how many data files you need.
I. Testing (steps 22 through 28). Time commitment: depends on the amount of code in your package and what level of code test coverage you consider acceptable.
J. Website and Logo (steps 29 through 32). Time commitment: about 30 minutes.
Each step in this article also provides references with links.
GitHub and CodeCov accounts: throughout the article certain platforms are chosen to perform certain actions: GitHub for repository hosting, GitHub Actions for CI/CD pipelines, GitHub Pages for the {pkgdown} website, and CodeCov for unit testing coverage. There are alternatives to these platforms however, the {devtools}, {usethis}, {pkgdown}, {covr} packages used to build these features are conducive to using these popular platforms therefore, you must have GitHub and CodeCov accounts.
R Studio Installation Settings: this article assumes your R Studio installation was previously configured according to the {usethis} set-up guide. This guide ensures that {devtools} and {usethis} are loaded every time you start R Studio, it provides default DESCRIPTION file settings, and connects your GitHub account to R Studio.
available::available("[PROPOSED PACKAGE NAME]")
Reference: https://r-pkgs.org/workflow101.html#use-the-available-package
dir.create("[PATH TO PACKAGE]")
setwd("[PATH TO PACKAGE]")
create_package()
Reference: https://r-pkgs.org/whole-game.html#create_package
use_git()
Next, make your first commit using the terminal using the git commands add and commit.
Reference: https://r-pkgs.org/whole-game.html#use_git
# if you did step 5, ignore the first message about uncommitted changes
use_github()
Update these fields:
Title
Authors (this should be auto-populated if you followed the {usethis} set-up guide)
Description
Chose a license:
# the MIT license is one of the most permissive but there are other options
use_mit_license()
References:
DESCRIPTION file: https://r-pkgs.org/description.html
Licensing: https://r-pkgs.org/license.html
use_readme_rmd()
After populating the basic information in the README.Rmd file, build the .md version of the file using:
build_readme()
Reference: https://r-pkgs.org/whole-game.html#use_readme_rmd
use_news_md()
Edit the file as needed. It will be used as a change log for package version releases.
This creates a dummy file to establish package level documentation:
use_package_doc()
Every time you push the package to the remote repository, R CMD check will be run and system compatibility will be checked for a variety of OS’s.
use_github_action("check-standard")
This will also add the R CMD check badge to the README.Rmd file so the README.md file should be re-built:
build_readme()
Reference: https://r-pkgs.org/software-development-practices.html#r-cmd-check-via-gha
use_github_action("pr-commands")
Reference: https://usethis.r-lib.org/reference/github_actions.html#use-github-action-pr-commands-
Steps 12 through 14 outline how to build the core of your package: the code base. Instead of building the entire code base and then creating your tests, I recommend you that you iteratively build the code base and test it as you go. See section I Testing for instructions on how to set-up your testing suite.
Your package code must exist in .R files in the R/
directory. Create each .R file using:
use_r("[NEW R SCRIPT NAME]")
Reference: https://r-pkgs.org/whole-game.html#use_r
As functions from other packages are used throughout your code they
must always be called using the
[PACKAGE NAME]::[FUNCTION]() method. Each unique package
that is called in your code must be added as an import in the
DESCRIPTION file. This can be done using:
use_package("[IMPORTED PACKAGE NAME]")
It’s also a good idea to import {magritts}’s pipe operator:
use_pipe(export = TRUE)
References:
For each function in the R/ directory, create {roxygen2} comments using the references below, then build all the package documentation files by running:
document()
References:
To style the entire package run:
use_tidy_style()
Note: you should always commit changes immediately before and after running the above code to separate style changes from code changes in the git history.
References:
Use articles and vignettes to provide package instructions and example workflows beyond what is in README.Rmd. Vignettes will be included with your package on install. Articles will not be included with your package on install but will be available on the {pkgdown} website. These can be created using the functions:
use_vignette("[NEW VIGNETTE NAME]")
use_article("[NEW ARTICLE NAME]")
The first time each of these functions is called the directories
vignettes and vignettes/articles are created,
respectively.
Reference: https://r-pkgs.org/vignettes.html
There are five locations for storing/creating R package data files. Choosing a directory to use for a file depends on how the file is to be used and whether the user should have access to the file after they install your package.
data-raw/: this directory can store raw data files
and is used for creating data in any of the other data directories
below. Nothing in data-raw/ will be available to end users
after installation and load_all() does not load any data
files in this directory into the developer’s environment. See step
17.
data/: this directory stores data files, in a native
R format (ie .rda), that the user will have access to after package
installation after calling [PACKAGE NAME]::[DATA OBJECT NAME] or by
calling library([PACKAGE NAME]) and then referring to the
DATA OBJECT NAME. Each data object should have its own .rda file and
must be documented. As the package developer, can load these data files
into your environment using load_all(). See steps 18 and
19.
R/sysdata.rda: this is a file that stores all R data
objects for internal package use. These data objects will not be
available to the user after installation. As the package developer, can
load these data files into your environment using
load_all(). See step 20.
inst/extdata: this directory stores data files that
should be available to the user after installation but which are not in
a native R format (ie .csv). These files cannot be called directly as R
objects by the end user and load_all() will not make these
available in the developer’s environment. The package developer or end
user can get the path to a data file in this directory by calling
system.file("extdata/[DATA FILE NAME]", package = "[PACKAGE NAME]")
and then load that file into the global environment by calling the
appropriate read function (ie read.csv()). See step
21.
tests/testthat/fixtures: this directory stores data
files in any format that are used for unit testing. By default, these
files can be accessed by the user after installation. These files cannot
be called directly as R objects by the end user and
load_all() will not make these available in the developer’s
environment. The package developer can get the path to a data file in
this directory by calling
testthat::test_path("[DATA FILE NAME]") and then load that
file into the global environment by calling the appropriate read
function (ie read.csv()). An end users would need to
specify the full path to the file in their R library directory and then
call the appropriate read function. These types of data files will not
be addressed in this section, instead see the section I Testing,
specifically step 25.
The function below will create an R script for creating data files.
The first time it is called, it will also create the
data-raw/ directory.
use_data_raw("[NEW DATA FILE NAME]")
It’s good practice to have one R script in data-raw/ for
each data file in the package. At the end of each script, you will save
the data object to one of the other four data locations.
This step will create a file in the data/ directory
which will be available to a user after package installation. Data
stored in this manner must be in a native R data format (ie
.rda). For each script in data-raw/ that is
supposed to create an exported data file, call the function below at the
end to save the file in the correct location:
use_data("[NEW DATA FILE NAME]", overwrite = TRUE)
Reference: https://r-pkgs.org/data.html#sec-data-data
Every file in data/ must have package documentation
created using {roxygen2}. The documentation for all of the files in
data/ should reside in the file R/data.R file.
Create the file using:
use_r("data.R")
Then populate it per the reference below.
Reference: https://r-pkgs.org/data.html#sec-documenting-data
Internal package data will not be shared with the end users of the
package. All internal data is stored in the R object
R/sysdata.rda. You can create a script to create this
object in data-raw like this:
use_data_raw("sysdata")
At the end of the script call:
use_data(..., overwrite = TRUE, internal = TRUE)
Any non-R data files that you want the user to have access to after
installation should be stored in inst/extdata. They can be
created under data-raw but at the end of the script should
not call use_data(). Instead save them using something
like:
dir <- system.file("extdata", package = "[PACKAGE NAME]")
write.csv(data, file = file.path(dir, "[FILE NAME].csv"))
use_testthat()
Reference: https://r-pkgs.org/testing-basics.html#initial-setup
Create a test script for each file in R/ that you want
to test using:
use_test("[NAME OF .R FILE IN R/]")
Populate the file according to the {testthat} convention using the references below.
References:
You can run individual test functions inside a test file by
highlighting and running that section of code. Make sure to always run
load_all() first and after making any change to the file
being tested.
You can run all tests for a file in R using:
test_file("[NAME OF .R FILE IN R/]")
Test the entire package by running:
test()
References:
You can include test helper functions by creating the file
tests/testthat/helpers.R.
You can also include data files to assist with testing (called test
fixtures) by creating the directory tests/testthat/fixtures
and storing them there. These can be native R files or not. It’s a good
idea to create these data files in the data-raw/ directory.
Also see section H. Data Files.
References:
use_coverage(type = "codecov")
The code above will also add a CodeCov badge to the README so the README must be re-built:
build_readme()
Reference: https://usethis.r-lib.org/reference/use_coverage.html
To check test coverage for the entire package run one of the two lines of code below, depending the output type you want to see. Note, before running the functions below, you must restart R.
# this will produce a report in R Studio's Viewer window
covr::report()
# alternatively, this will print the results to the console
covr::package_coverage()
To check test coverage for an individual file, run the following line of code which will produce a report in the R Studio Viewer window that will highlight which lines of code are tested and which are not tested.
devtools::test_coverage_active_file()
References:
{covr} for unit testing automation
https://r-pkgs.org/testing-design.html#sec-testing-design-coverage
Sign-in to CodeCov, select the correct repo, and following the steps for GitHub Actions set-up. This will involve copying a token from CodeCov into a GitHub repo secret. Note, the GitHub repo must be public for this to work.
Finally, add test coverage as a GitHub Actions workflow:
use_github_action("test-coverage")
References:
GitHub Pages will be used to host the website. Every time you push to the remote, the website will be re-built. Run the line of code below once to set up:
use_pkgdown_github_pages()
References:
The step above set-up a GitHub Pages workflow to re-build your public site with every push but, it’s often helpful to check potential changes to your website on your local machine prior to publishing them. Running the line of code below will allow you to preview your site locally.
pkgdown::build_site()
After running the line of code above once, you will be able to
locally preview individual sections of the site, instead of the whole
site, by calling one of the pkgdown::build_*()
functions.
References:
By default, the reference page of the site will list all package functions in alphabetical order. To change this page and group packages into sections, see the reference below.
Reference: https://pkgdown.r-lib.org/reference/build_reference.html?q=subtitle#reference-index
Your package logo will be displayed in your README file and on the {pkgdown} website and really makes your package stand-out and look professional.
Design your logo using hexmake and download the .png file to a directory outside of your package. Move your logo into your package and configure it using:
use_logo("[PATH TO LOGO]")
The code ran in the previous chunk will add html code to your clipboard. Paste that into README.Rmd where the title is and then rebuild the README:
build_readme()
Next, to update your {pkgdown} site with your logo, follow the normal workflow steps to push your changes to the remote repo.
Reference: https://r-pkgs.org/website.html#logo
Once you have completed sections A through D and begin working on section E, it’s time to start using the iterative workflow for package development below. Following this process is good practice to ensure a quality R package that is tested and documented.
R/
using:# update global environment with recent package changes
load_all()
# test the file being worked on by running sections of the test file or by running:
test_active_file()
test()
document()
# when changes to the README are made, also run
build_readme()
check()
References:
The primary reference used is the book R Packages (2e) by Hadley Wickham and Jennifer Bryan. Other references include:
{usethis} for package development tools
{devtools} and the {devtools} cheatsheet for package development tools
{testthat} for unit tests
{covr} for unit testing automation
{roxygen2} for documentation
{pkgdown} for building package websites
{styler} for code styling