This is largely based on Dirk Eddelbuettel's blog post at http://dirk.eddelbuettel.com/blog/2011/04/12/. Dirk compares with Rcpp rather than with .C(), and he makes some additional points about how the parser handles curly braces. The timings below were run on a MacBook Pro (Retina, Mid 2012) with a 2.6 GHz Intel Core i7 and 8 GB of 1600 MHz DDR3, using OS X 10.9 (13A598), Apple LLVM version 5.0 (clang-500.2.78), and R-3.1.0 (2013-10-17 r64066).
First define three functions that run the same iteration, replacing x by 1/(1 + x), written in three slightly different ways.
f <- function(n, x = 1) for (i in 1:n) x = 1/(1 + x)
g <- function(n, x = 1) for (i in 1:n) x = (1/(1 + x))
h <- function(n, x = 1) for (i in 1:n) x = (1 + x)^(-1)
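Because a `for` loop returns NULL invisibly, f, g and h do not hand back the final x; that is fine for timing but makes them hard to check. As a rough sketch, a variant that returns the value (here called `f_value`, a hypothetical name that is not part of the benchmark) shows that the iteration settles on the positive root of x^2 + x - 1 = 0, which is (sqrt(5) - 1)/2.

```r
# Hypothetical helper: same loop as f, but returns the final x.
f_value <- function(n, x = 1) {
  for (i in seq_len(n)) x <- 1 / (1 + x)
  x
}

# Fixed point of x = 1/(1 + x), i.e. the positive root of x^2 + x - 1 = 0.
golden <- (sqrt(5) - 1) / 2

# After a few dozen iterations the two should agree to machine precision.
all.equal(f_value(50), golden)
```

The same check could be applied to the g and h variants by adding a final `x` to their bodies.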
Load the rbenchmark package.
library(rbenchmark)
Set the loop length.
N <- 1e+06
Run the benchmark.
benchmark(f(N, 1), g(N, 1), h(N, 1), columns = c("test", "replications", "elapsed",
    "relative"), order = "relative", replications = 10)
##      test replications elapsed relative
## 1 f(N, 1)           10   3.276    1.000
## 2 g(N, 1)           10   4.013    1.225
## 3 h(N, 1)           10   5.399    1.648
Load the compiler package.
library(compiler)
Make compiled versions of the functions.
lf <- cmpfun(f)
lg <- cmpfun(g)
lh <- cmpfun(h)
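A note for anyone rerunning this on a more recent R: since R 3.4.0 the JIT compiler is enabled by default, so plain closures are byte compiled automatically after a couple of calls and the gap between f and lf may largely disappear. A minimal sketch of how the interpreted timings might be reproduced, assuming that is the goal, is to switch the JIT off around the benchmark.

```r
library(compiler)

# Turn the JIT off; enableJIT() returns the previous level so it can be restored.
old_level <- enableJIT(0)

f <- function(n, x = 1) for (i in 1:n) x = 1/(1 + x)
lf <- cmpfun(f)

# Printing lf should show a "<bytecode: ...>" line, which f should lack
# while the JIT is off.
print(lf)

# ... run the benchmark here ...

# Restore whatever JIT level was in effect before.
enableJIT(old_level)
```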
And now run the benchmark again, with both compiled and non-compiled functions.
benchmark(f(N, 1), g(N, 1), h(N, 1), lf(N, 1), lg(N, 1), lh(N, 1), columns = c("test",
    "replications", "elapsed", "relative"), order = "relative", replications = 10)
##       test replications elapsed relative
## 5 lg(N, 1)           10   1.252    1.000
## 4 lf(N, 1)           10   1.322    1.056
## 6 lh(N, 1)           10   1.961    1.566
## 1  f(N, 1)           10   3.261    2.605
## 2  g(N, 1)           10   3.972    3.173
## 3  h(N, 1)           10   5.366    4.286
We see that byte compiling speeds up the computation by a factor of roughly 2.5 to 3 in this case, depending on how the expression is written.
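The per-function speedups can be read off the table by dividing each interpreted time by its compiled counterpart. The small sketch below does that arithmetic on the elapsed times reported above; the vector names are only illustrative.

```r
# Elapsed times (seconds, 10 replications) copied from the table above.
interpreted <- c(f = 3.261, g = 3.972, h = 5.366)
compiled    <- c(f = 1.322, g = 1.252, h = 1.961)

# Speedup from byte compiling, per function.
speedup <- round(interpreted / compiled, 2)
speedup
##    f    g    h
## 2.47 3.17 2.74
```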
We can also rewrite the code in C, in the file fC.c,
void fC (int *n, double *x) {
    for (int i = 0; i < *n; i++)
        *x = 1.0 / (1.0 + *x);
}
and compile it to a shared library fC.so, using
R CMD SHLIB fC.c
We need a little R wrapper to get things going.
fRC <- function(n, x = 1) {
    # Return the updated x visibly instead of assigning it to a local variable.
    .C("fC", n = as.integer(n), x = as.double(x))$x
}
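Before timing the C version it may be worth confirming that it computes the same value as the R loop. The following is a sketch only: it assumes a working C toolchain (as R CMD SHLIB requires), builds the library in a temporary directory rather than the current one, and uses a hypothetical pure-R reference `f_value` that returns the final x.

```r
# Build fC.c in a scratch directory so no existing files are overwritten.
old_dir <- setwd(tempdir())
writeLines(c(
  "void fC (int *n, double *x) {",
  "    for (int i = 0; i < *n; i++)",
  "        *x = 1.0 / (1.0 + *x);",
  "}"
), "fC.c")

# Assumption: a C compiler is installed and usable by R CMD SHLIB.
system2(file.path(R.home("bin"), "R"), c("CMD", "SHLIB", "fC.c"))
dyn.load(paste0("fC", .Platform$dynlib.ext))

# Wrapper around the C routine, as in the text.
fRC <- function(n, x = 1) {
  .C("fC", n = as.integer(n), x = as.double(x))$x
}

# Hypothetical pure-R reference that returns the final x.
f_value <- function(n, x = 1) {
  for (i in seq_len(n)) x <- 1 / (1 + x)
  x
}

# Compare the two implementations up to floating point tolerance.
agree <- all.equal(fRC(1000, 1), f_value(1000, 1))
setwd(old_dir)
agree
```

Note that .C() copies its arguments, so the x passed in is not modified in place; the updated value comes back in the returned list.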
After loading the shared library, the new benchmark results are
dyn.load("fC.so")
benchmark(f(N, 1), lf(N, 1), fRC(N, 1), columns = c("test", "replications",
    "elapsed", "relative"), order = "relative", replications = 10)
##        test replications elapsed relative
## 3 fRC(N, 1)           10   0.070     1.00
## 2  lf(N, 1)           10   1.276    18.23
## 1   f(N, 1)           10   3.247    46.39
So calling compiled C code through .C() is about 46 times faster than the interpreted R function, and about 18 times faster than the byte-compiled version.