Why should you document your research with an R package?

A research project is typically composed of three types of items: Writing, Data, and Analysis scripts (WDA). Most people keep multiple versions of each of these, possibly in many locations on their computer. This can leave the links between your WDA fragile or even broken (e.g., have you ever tried replicating a specific test statistic from a previous research project?). When the links between your WDA are weak or broken, it is almost impossible to share your data with fellow scientists, or for your future self to understand what the heck you did when you worked on that project.

An R package is one way to keep a strong link between your WDA. Storing your writing, data, and analysis in an R package allows you to store everything in one file. If you put the file online, then you (or anyone else) can access your WDA from RStudio with one line of code.

In addition to storing your WDA, R packages allow you to write vignettes (aka tutorials) that provide guidelines for others (and your future self!) to understand and use your data accurately.

Install / Update R and RStudio

Before you do anything else, make sure you’ve got the latest versions of R and RStudio installed!
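As a quick check, you can ask R which version is running. This is only a sketch: the minimum version below is a hypothetical threshold picked for illustration, not a requirement stated in this tutorial, and RStudio.Version() is only available when the code runs inside RStudio.

```r
# Print the version of R that is currently running
print(R.version.string)

# getRversion() returns a version object that can be compared to a string.
# The threshold is a hypothetical example, not a requirement of this tutorial.
minimum_r <- "3.3.0"
r_is_recent <- getRversion() >= minimum_r
print(r_is_recent)

# RStudio.Version() is only defined inside an RStudio session, so guard the call
if (exists("RStudio.Version")) {
  print(RStudio.Version()$version)
}
```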

Look at a package!

Let’s start by looking at a research R package in action. I created an R package called phillips2017cognition that contains the WDA of a fictional future study.

To install the package, run the following code. When you do, you should see some red text, ending with the line DONE:

install.packages(pkgs = "https://dl.dropboxusercontent.com/u/7618380/phillips2017cognition_0.1.0.tar.gz", 
                 repos = NULL, # Tells R not to try to get the package from CRAN
                 type = "source"  # Type of package is source
                 )

Exploring the phillips2017cognition package

Let’s look at the contents of the package. As with any package, start by loading it:

library("phillips2017cognition")

Now you can access the package. You can view the main help page as follows:

help(package = "phillips2017cognition")

On this page, you can see two main links: one to the DESCRIPTION file, and another to User guides and package vignettes. Feel free to click around on these to learn what they tell you.
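If you prefer the console to clicking around, the sketch below lists the same kind of information. It assumes the phillips2017cognition package from the install step is available, and falls back to a message when it is not.

```r
pkg <- "phillips2017cognition"

if (requireNamespace(pkg, quietly = TRUE)) {
  # Datasets shipped with the package
  print(utils::data(package = pkg)$results[, c("Item", "Title"), drop = FALSE])

  # Vignettes (tutorials) included with the package
  print(utils::vignette(package = pkg)$results[, c("Item", "Title"), drop = FALSE])
} else {
  message("Package '", pkg, "' is not installed; run the install step first.")
}
```

To open the vignettes in your browser instead, browseVignettes(package = "phillips2017cognition") should do the same job.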

Overview and Getting Started

The purpose of this document is to demonstrate the basics of creating an R package for documenting scientific research. In this document, I will take you through the following six basic steps of creating a package.

6 Steps to creating an R package

  1. Start a new Project / Package in RStudio
  2. Update the DESCRIPTION file
  3. Save and document data
  4. Write Vignettes
  5. Write and document functions
  6. Document and build your package!

Install these packages!

Once you’ve installed R and RStudio, make sure you have the latest versions of the following packages installed by running the following:

install.packages("knitr")
install.packages("rmarkdown")
install.packages("devtools")
install.packages("rmdformats")
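If you only want to install the packages you are missing, something like the following sketch may help. The helper name find_missing is made up for this example, and the install call is guarded by interactive() so the script does nothing surprising when run non-interactively.

```r
# Hypothetical helper: return the packages from `pkgs` that cannot be loaded
find_missing <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

needed <- c("knitr", "rmarkdown", "devtools", "rmdformats")
missing_pkgs <- find_missing(needed)

# Only install what is missing, and only in an interactive session
if (interactive() && length(missing_pkgs) > 0) {
  install.packages(missing_pkgs)
}
```

Note that this skips packages that are installed but out of date. To refresh those, update.packages(oldPkgs = needed) is one option.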

Step 1: Start a new package / project

Now you’re ready to get started on your package. You’ll start by opening a new project in RStudio – this project will essentially be your new package. To create your new project (aka package), do the following steps:

  1. Create a new project in RStudio (File – New Project – New Directory – R Package)
    • Give your project a name (I’ll call mine phillips2017cognition) and associate it with a directory on your computer.
    • Open the project.
  2. RStudio will have created a new folder on your computer with the project name. Navigate to that folder and add any of the following folders that do not already exist:
    • /data (This is where you will store all of your .RData files.)
    • /R (This folder only contains .R files. This is where you will store all of your documentation files, function files, and miscellaneous R code.)
    • /inst (This folder contains any miscellaneous files you want to include in your package, e.g., pdfs or images.)
    • /man (This folder contains compiled documentation files generated by Roxygen, e.g., when you run devtools::document(). You should never edit files here manually.)

An R package is simply a directory with specific subfolders and a DESCRIPTION file. Your package folder (mine is called phillips2017cognition) should contain at least the DESCRIPTION file plus the /data, /R, /inst, and /man folders.
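If you would rather create the folders from R than by hand, here is a minimal sketch. It builds the skeleton inside a temporary directory (an assumption made so that nothing on your computer is overwritten) and uses an empty placeholder in place of the DESCRIPTION file that RStudio normally writes for you.

```r
# Build the skeleton in a temporary directory so no real files are touched
pkg_dir <- file.path(tempdir(), "phillips2017cognition")
subfolders <- c("data", "R", "inst", "man")

for (folder in subfolders) {
  dir.create(file.path(pkg_dir, folder), recursive = TRUE, showWarnings = FALSE)
}

# RStudio normally creates DESCRIPTION; an empty placeholder stands in here
file.create(file.path(pkg_dir, "DESCRIPTION"))

# List the top-level contents of the package folder
contents <- sort(list.files(pkg_dir))
print(contents)
```

For a real package, replace pkg_dir with the path to the project folder that RStudio created.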

Step 2: The DESCRIPTION file


Next you’ll update the DESCRIPTION file for your package. Every R package must have a DESCRIPTION file that contains basic information about your package (e.g., title, author, description). You should update these by hand. Here are some of the main fields:

  • Package: The name of your package. Don’t change this.
  • Title: A short (one sentence) description of your package.
  • Description: A longer (1 paragraph) description of your package and what it does.
  • Imports: The names of any other packages (separated by commas) that your package requires to work. For example, if your package includes functions from the BayesFactor package, you should include this here. If your package is stored on GitHub or CRAN, R will automatically install these packages on the user’s computer when they install your package.

There are additional fields you can add, such as URL (to list websites related to your package).
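To see what these fields look like in practice, you can read the DESCRIPTION file of any installed package. The sketch below uses the stats package purely as a stand-in, since it ships with every R installation.

```r
# Locate the DESCRIPTION file of an installed package ("stats" ships with R)
desc_path <- system.file("DESCRIPTION", package = "stats")

# read.dcf() parses the "Field: value" format used by DESCRIPTION files;
# any requested field that is absent comes back as NA
fields <- c("Package", "Title", "Description", "Imports")
desc <- read.dcf(desc_path, fields = fields)

print(desc[1, ])
```

Once your own package is installed, swapping "stats" for its name should show the values you entered.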

Here is a simple DESCRIPTION file: