Rcpp Attributes

JJ Allaire
July 1st, 2013

Greater Boston useR Group

History and Motivation

Origins of S

Languages in R and CRAN

Native Code: Problems and Opportunities

Can We Have the Best of Both Worlds?

Origins of S

On May 5th, 1976, a group of users from the statistics research group at Bell Labs discussed creating a new system for statistical computing. Some of the requirements:

  • Interoperability with their decade-plus investment in Fortran code and libraries
  • Readily understood and used by statisticians
  • Interactive execution for iterative data exploration
  • Extensibility for research into statistical methods

Languages in Core R

Base R Languages

Language   Percent
C          52%
Fortran    26%
R          22%

librestats (8/27/11)

Languages in CRAN Packages

CRAN Languages

Language   Percent
R          48%
C/C++      46%
Fortran    5%
Perl       1%

librestats (8/29/11)

R Has the Best of Both Worlds

Native Code

  • Access to proven libraries and algorithms
  • Extremely high performance

Interpreted Code

  • Accessible high-level language
  • Interactive workflow for data analysis
  • Supports rapid prototyping, research, and experimentation

Do R Users Have the Best of Both Worlds?

  • When we directly use native code exposed by base R and packages, our code runs fast
  • If only that were enough!
    • We write tons of custom R code, and often it's too slow
    • Many C/C++ libraries are available, but aren't conveniently wrapped by an R package
  • What can be done? Should R users be writing more code in C/C++?
  • Perhaps, but…

There Be Dragons!

What's Difficult About C++?

  • Memory Management
  • Build Systems (Makefiles)
  • OO Programming
  • Language Complexity
  • Library Complexity
  • Non-Interactive Execution

R .Call API: No Picnic Either!

#include <R.h>
#include <Rinternals.h>

/* Return a reversed copy of a numeric (double) vector. */
SEXP reverse(SEXP x) {
  SEXP res;
  int i, r;
  int n = Rf_length(x);
  /* Protect the new vector from the garbage collector. */
  PROTECT(res = Rf_allocVector(REALSXP, n));
  for (i = n, r = 0; i > 0; i--, r++)
    REAL(res)[r] = REAL(x)[i-1];
  UNPROTECT(1);
  return res;
}

Traditional Division of Labor

R System/Library Developers

  • Expert at both C/C++ and the R .Call API
  • Create packages that wrap high-performance code

R Users

  • Write all code in R
  • Hope that a high performance library exists for their particular application

We can do better for R users!

What Do We Need?

R users need a system that:

  • Allows us to write C++ code that looks and acts like R code
  • Eliminates memory management concerns
  • Requires no build system
  • Has a fully interactive execution model

Rcpp and Rcpp Attributes

The Rcpp Package

Why “Attributes”?

Interactive C++ with sourceCpp

Some Examples

Step 1: Rcpp

The Rcpp Package

  • Created by Dirk Eddelbuettel & Romain François (with contributions from Doug Bates, John Chambers, and JJ Allaire)
  • Includes broad support for vectors, matrices, functions, environments, and more.
  • High level “sugar” API provides vectorized C++ equivalents for many R functions
  • Full support for STL algorithms
  • Currently used by over 120 packages on CRAN
  • More at: http://www.rcpp.org

R to C++ Object Mappings

// Matrix of 4 rows & 5 columns
NumericMatrix xx(4, 5);

// Fill with value
int xsize = xx.nrow() * xx.ncol();
for (int i = 0; i < xsize; i++) {
  xx[i] = 7;
}

// Assign to single element
xx(0,1) = 4;
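
The linear index xx[i] and the (row, column) accessor xx(0,1) address the same storage because R matrices are stored column by column. A minimal sketch of that mapping in standard C++, without Rcpp; the ColMajorMatrix type and the fillAndAssign function are hypothetical names used only for illustration:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a column-major numeric matrix; not an Rcpp type.
struct ColMajorMatrix {
    std::size_t nrow;
    std::size_t ncol;
    std::vector<double> data;  // columns stored one after another

    ColMajorMatrix(std::size_t r, std::size_t c)
        : nrow(r), ncol(c), data(r * c, 0.0) {}

    // Linear access, like xx[i] in the slide.
    double& operator[](std::size_t i) { return data[i]; }

    // (row, column) access, like xx(0, 1): the element sits at
    // offset row + column * nrow in the flat storage.
    double& operator()(std::size_t row, std::size_t col) {
        return data[row + col * nrow];
    }
};

// Mirrors the slide: fill a 4 x 5 matrix with 7, then set element (0, 1) to 4.
inline ColMajorMatrix fillAndAssign() {
    ColMajorMatrix xx(4, 5);
    for (std::size_t i = 0; i < xx.nrow * xx.ncol; ++i) {
        xx[i] = 7;
    }
    xx(0, 1) = 4;
    return xx;
}
```

With 4 rows, element (0, 1) lands at linear offset 0 + 1 * 4 = 4, so the fill loop and the single-element assignment touch the same underlying array.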

Sugar for Vectorized Expressions

NumericVector x = NumericVector::create(
                    -2.0, -1.0, 1.0, 2.0);

NumericVector xAbs = abs( x );
NumericVector xCeiling = ceiling( x );
NumericVector xFloor = floor( x );

bool b = all( x < 3.0 ).is_true();

Calling R Functions, Error Handling

// lookup rnorm function
Environment stats("package:stats");
Function rnorm = stats["rnorm"];

// call rnorm function
rnorm(100, 
      _["mean"] = 10.2, 
      _["sd"] = 3.2);

// raise an R error
if (failed)
  stop("an unexpected error occurred!");

Step 2: Rcpp Attributes

Exporting a C++ Function to R

#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
double piCpp(int N) {
  NumericVector x = runif(N);
  NumericVector y = runif(N);
  NumericVector d = sqrt(x*x + y*y);
  return 4.0 * sum(d < 1.0) / N;
}

> sourceCpp("pi.cpp")
> piCpp(1000)
[1] 3.048
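
For readers less familiar with sugar expressions, the same Monte Carlo estimate can be sketched as an explicit loop in standard C++, without Rcpp. The fraction of uniform points in the unit square that fall inside the quarter circle approximates pi/4. The name piEstimate and its seed parameter are assumptions added for illustration and reproducibility; they are not part of the talk.

```cpp
#include <cmath>
#include <random>

// Estimate pi by sampling n points uniformly in the unit square and counting
// how many fall inside the quarter circle of radius 1.
// The seed parameter is an illustrative addition for reproducibility.
inline double piEstimate(int n, unsigned int seed = 42) {
    std::mt19937 gen(seed);
    std::uniform_real_distribution<double> unif(0.0, 1.0);
    int inside = 0;
    for (int i = 0; i < n; ++i) {
        double x = unif(gen);
        double y = unif(gen);
        if (std::sqrt(x * x + y * y) < 1.0) {
            ++inside;
        }
    }
    return 4.0 * inside / n;
}
```

The Rcpp version expresses the loop body as whole-vector operations (runif, sqrt, sum), which is why it reads so much like R.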

Why "Attributes"?

Attributes are annotations added to source code to express intent or automatically generate code (i.e. compiler extensions). They are part of the C++11 standard.

[[omp::parallel]]
void someFunction() {}

In R we use compilers that don't (yet) support C++11, so attributes are added as specially formatted comments:

// [[Rcpp::export]]

Interactive C++ with sourceCpp

  • sourceCpp is a function that sources C++ files just as the source function sources R files
  • Functions within the source file decorated with the [[Rcpp::export]] attribute are made available to R
  • Look Ma!
    • No build system
    • No memory management
    • No boilerplate R <-> C++ data marshaling code
    • Mix C++ and R in the same source file
    • Fully interactive workflow

Some Examples

Example: Computing a Running Sum

Running Sum: Rewritten in C++

library(rbenchmark)  # provides benchmark()

set.seed(42)
y <- rnorm(100000)

benchmark(runSumCpp(y, 4500), 
          runSumR(y, 4500),
          order = "relative")[,1:4]
                     reps    time     rel
1 runSumCpp(y, 4500)  100   0.082    1.00
2   runSumR(y, 4500)  100  33.717  411.18

Example: Transforming a Matrix

Source:

http://gallery.rcpp.org/articles/transforming-a-matrix/

sourceCpp("matrix.cpp")
m <- matrix(c(1,2,3, 11,12,13), 
            nrow = 2, ncol=3)
matrixSqrt(m)
         [,1]     [,2]     [,3]
[1,] 1.000000 1.732051 3.464102
[2,] 1.414214 3.316625 3.605551
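
The element-wise transformation shown in this output can be sketched with std::transform in standard C++. This is an illustrative stand-in, not the gallery article's source: sqrtElements is a hypothetical name, and a flat std::vector in column-major order stands in for an Rcpp NumericMatrix.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Return a copy of the matrix storage with sqrt applied to every element.
// The storage is treated as a flat column-major array, as R stores matrices.
inline std::vector<double> sqrtElements(const std::vector<double>& m) {
    std::vector<double> out(m.size());
    std::transform(m.begin(), m.end(), out.begin(),
                   [](double v) { return std::sqrt(v); });
    return out;
}
```

Because the operation is applied to each element independently, the matrix dimensions do not matter to the algorithm; only the flat storage is traversed.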

Example: Creating an xts Object

Source:

http://gallery.rcpp.org/articles/creating-xts-from-c++/

library(xts)
sourceCpp("xts.cpp")
foo <- createXts(1, 4) 
foo
           [,1]
1970-01-02    1
1970-01-03    2
1970-01-04    3
1970-01-05    4

Attributes for Build Configuration

Package Dependencies

  • You can use the [[Rcpp::depends]] attribute to declare a dependency on an R package.

  • The inclusion of the [[Rcpp::depends]] attribute causes sourceCpp to configure the build environment to compile and link against the package

  • Dependencies are discovered both by scanning for package include directories and by invoking inline plugins if they are available for a package.

http://gallery.rcpp.org/articles/fast-linear-model-with-armadillo/

Example: bigmemory

Plugins

  • [[Rcpp::plugins]] attribute loads inline plugins
  • Capable of custom build configuration and/or importing arbitrary external libraries
  • Essentially a shareable “recipe” for build configuration
  • Examples:
    • Enabling C++11 compiler features
    • Adding OpenMP switches to compiler and linker
    • Dynamically probing the system for a library

http://gallery.rcpp.org/articles/simple-lambda-func-c++11/
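
As a configuration sketch, both build attributes are written as specially formatted comments at the top of a source file. The fragment assumes the RcppArmadillo package is installed and that the cpp11 plugin is available; traceOf is a hypothetical function name used only for illustration. It is meant to be compiled through sourceCpp rather than as a standalone program.

```cpp
// Build against the RcppArmadillo package (headers and link flags).
// [[Rcpp::depends(RcppArmadillo)]]

// Load the inline plugin that enables C++11 compiler features.
// [[Rcpp::plugins(cpp11)]]

#include <RcppArmadillo.h>

// Hypothetical example function: trace of a square matrix.
// [[Rcpp::export]]
double traceOf(const arma::mat& m) {
  return arma::trace(m);
}
```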

Package Development

Using Rcpp Attributes in Packages

Code built with sourceCpp can be seamlessly moved into packages for sharing with a wider audience via the compileAttributes() function:

Rcpp::compileAttributes()

Scans all C++ files in the src directory of the package looking for [[Rcpp::export]] attributes and writes the following files with R and C++ bindings:

  • R/RcppExports.R
  • src/RcppExports.cpp

Building the Package

  • Source file identical to what we'd use with sourceCpp

  • Need to call compileAttributes right before building the package

  • Tools that automatically call compileAttributes:

    • devtools
    • RStudio
  • Also carries roxygen2 comments (//' lines) over from the C++ source into the generated R file

Some Additional Resources

Modern C++

Two talks worth watching to get a flavor for what's happening in modern C++:

  • Not Your Father's C++
    Herb Sutter (Chair, ISO C++ Standards Committee)
    http://goo.gl/N9YbS

  • Clang: Defending C++ from Murphy's Million Monkeys
    Chandler Carruth (Google)
    http://goo.gl/ixNGP

Learning Rcpp

Rcpp Gallery

http://gallery.rcpp.org

Contains a wealth of articles (all with full source code) demonstrating the use of Rcpp.

Questions?

Thank You!

Slides for this talk are at:

http://rpubs.com/jjallaire/rcpp-attributes-boston-useR