This document provides a quick summary of package management in R.
Please do not use the word library when you mean package. In R, a library is a collection of packages. The confusion probably comes from the command library(somepackage), but this should be read as “Load the following package from my library”.
When updating R, it is quite annoying that you have to reinstall all your favorite packages. The solution to this problem is to always install the packages to the same directory. To do this, we have to set an environment variable, which R uses to find the directory to install your packages.
Basically, you have to set the environment variable R_LIBS to the directory you want, for example "c:\RLIBRARY". To set an environment variable in Windows, see this link. For Mac users, I don’t know for sure, but this link may help.
After setting the environment variable, make sure you restart RStudio. Then check that it works:
.libPaths()
## [1] "c:/RLIBRARY"
## [2] "C:/Program Files/R/R-3.3.2/library"
If all went well, the first path shown is "c:/RLIBRARY" (the one that will be used from now on), followed by your usual R library directory. Next time you update R, all packages will still be read from and installed to this directory.
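If the custom directory does not show up, it can help to check whether R actually sees the environment variable. The following is a minimal sketch using only base R; the variable names `r_libs` and `install_dir` are made up for this illustration.

```r
# Value of R_LIBS as seen by the current R session.
# An empty string means the variable is not set (or R was not restarted).
r_libs <- Sys.getenv("R_LIBS")
print(r_libs)

# The first entry of .libPaths() is the default install location
# used by install.packages().
install_dir <- .libPaths()[1]
print(install_dir)
```

If `r_libs` is empty, the variable was probably not set for the session that started R, so restarting R or RStudio is the first thing to try.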
A common question is: how do we find the right package for our problem? There is no easy answer, as there are now ca. 10k packages on CRAN. The site r-pkg.org provides a nicer interface to the long list of packages, with some more information. To find R packages for a particular topic, browse the Task Views on CRAN.
This should all be old news to you, but just to be sure: the first time you want to use a package, you must install it (which downloads it to your computer). From then on you only need to load it. Thus,
# Install a package
install.packages("gplots")
# Load a package
library(gplots)
Alternatively you can use require, which returns the success of the loading:
# Or load with require,
r <- require(gplots)
## Loading required package: gplots
##
## Attaching package: 'gplots'
## The following object is masked from 'package:stats':
##
## lowess
# Check if success
r
## [1] TRUE
# Use this to write a message of some kind
if (r) message("Loaded gplots package successfully!")
## Loaded gplots package successfully!
Instead of loading a package and then using functions from it, we can use the :: operator to call a particular function from that package directly. This also means you don’t have to call library first, though in rare cases this won’t work.
Hmisc::capitalize("remko")
## [1] "Remko"
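The same operator works with the packages that ship with R, which makes for an example that runs without installing anything. The sketch below also shows how :: can pick a specific version of a function when another package masks it, as happened with lowess when gplots was attached. The variable names are made up for this illustration.

```r
# Call functions by their fully qualified name; no library() call is needed.
first_letters <- utils::head(letters, 3)
middle <- stats::median(c(1, 5, 9))
print(first_letters)
print(middle)

# Explicitly use the lowess() from stats, even if another attached
# package (such as gplots) masks it.
smoothed <- stats::lowess(1:10, (1:10)^2)
str(smoothed)
```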
Pro-tip: the operator ::: accesses functions in packages even when they are normally not visible. These are functions used internally by the package and not intended for users, so it is not a good idea to use ::: routinely!
Now suppose you wrote a script with lots of library statements in it and sent it to a collaborator who doesn’t have those packages installed. Repeated use of install.packages is then necessary for the script to work, which is annoying and slow. A nice new solution to this problem is the pacman package.
# The following command either loads a package, or first installs it and then loads it.
library(pacman)
p_load(geometry)
# The following example code is perfect for the start of your scripts, making it
# much more reproducible! Obviously we do need the pacman package first.
if (!require("pacman")) install.packages("pacman")
pacman::p_load(gplots, plantecophys, rgl, geometry)
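If you would rather not depend on pacman, a similar effect can be approximated in base R. The following is a rough sketch, not how pacman itself works; `missing_packages` is a hypothetical helper written for this illustration.

```r
# Hypothetical helper (not part of base R or pacman): return the subset of
# 'pkgs' whose namespace cannot be loaded, i.e. packages that are not installed.
missing_packages <- function(pkgs) {
  available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!available]
}

# Packages that ship with R are always found.
print(missing_packages(c("stats", "utils")))

# Typical use at the top of a script (commented out so nothing is installed
# by accident when trying this example):
# to_install <- missing_packages(c("gplots", "geometry"))
# if (length(to_install) > 0) install.packages(to_install)
```

Unlike p_load, this only installs what is missing; you still call library yourself for the packages you want attached.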
In some rare cases you may want to check whether a certain package is available on a machine without actually doing anything about it. For that, use .packages like so:
# Check whether 'somepack' is in the long list returned by .packages()
"somepack" %in% .packages(all.available=TRUE)
## [1] FALSE
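Another way to do the same check, sketched here as an alternative, is system.file, which returns an empty string when the package is not found in any library. The name `has_package` is a hypothetical helper introduced for this example.

```r
# Hypothetical helper: TRUE if 'pkg' is installed in one of the libraries
# listed by .libPaths(), without loading or attaching it.
has_package <- function(pkg) {
  nzchar(system.file(package = pkg))
}

print(has_package("stats"))     # stats ships with R
print(has_package("somepack"))  # result depends on what is installed
```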
Finally, don’t forget to update your packages; I recommend doing so every month or so. The drama of working with a different package version than your collaborator (or yourself on a different computer) is hard to overstate.
update.packages(ask=FALSE)
For some reason this command fails from time to time with an error that Rcpp is not available, in which case restart RStudio (or just R via the menu Session/Restart R) and run:
install.packages("Rcpp")
update.packages(ask=FALSE)
(The reason for this is complicated but has to do with dependence on C++ code in many packages, for which the Rcpp package is always loaded).
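To find out whether you and a collaborator are actually on the same package version, you can ask R directly. This is a small sketch using base R functions; it uses the stats package so that it runs anywhere, and the version threshold is an arbitrary example value.

```r
# Installed version of a package, as a 'package_version' object.
ver <- packageVersion("stats")
print(ver)

# Versions can be compared against a string; "3.0.0" is an arbitrary example.
is_recent <- ver >= "3.0.0"
print(is_recent)

# sessionInfo() lists the R version and the versions of all attached packages,
# which is handy to send along with a script.
print(sessionInfo())
```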