do.call

do.call() has rather surprising behaviour. You might expect that do.call("f", list(x)) would be equivalent to f(x), but that’s far from the truth. The following example shows what various calls to do.call() actually generate:

df <- data.frame(x = runif(1), y = runif(1))

f <- function(x) {
  sys.call()
}
# Worst: inlines f and df
do.call(f, list(df))
#> (function (x) 
#> {
#>     sys.call()
#> })(list(x = 0.904893477912992, y = 0.77339426218532))

# Only inlines df
do.call("f", list(df))
#> f(list(x = 0.904893477912992, y = 0.77339426218532))

# Only inlines f
do.call(f, list(quote(df)))
#> (function (x) 
#> {
#>     sys.call()
#> })(df)

# Best: inlines neither. Equivalent to f(df)
do.call("f", list(quote(df)))
#> f(df)
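
The difference comes down to what ends up inside the call object: a value or a symbol. The following is a minimal sketch in base R that inspects the pieces of the call; it is not a description of how do.call() is implemented internally, and show_call, inlined, quoted and by_hand are names made up for illustration.

```r
# Hypothetical helper: returns the call that invoked it, like f() above.
show_call <- function(x) sys.call()

df <- data.frame(x = 1, y = 2)

# Passing the value: the data frame itself becomes part of the call.
inlined <- do.call("show_call", list(df))
is.data.frame(inlined[[2]])

# Passing a quoted symbol: the call only holds the name `df`.
quoted <- do.call("show_call", list(quote(df)))
is.symbol(quoted[[2]])

# Building the call by hand from two symbols and evaluating it
# yields the same call as the quoted form.
by_hand <- eval(as.call(list(as.name("show_call"), as.name("df"))))
identical(deparse(by_hand), deparse(quoted))
```

Checking the second element of the call with is.data.frame() or is.symbol() is a quick way to tell whether a given do.call() invocation has inlined its argument.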

Performance

Inlining has some obvious performance implications: the larger the call, the slower match.call() becomes:

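library(ggplot2)        # provides the diamonds dataset
library(microbenchmark) # provides microbenchmark()

object.size(diamonds)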
#> 3456256 bytes
g <- function(x) {
  match.call()
}

microbenchmark(
  do.call(g, list(diamonds)),
  do.call("g", list(diamonds)),
  do.call(g, list(quote(diamonds))),
  do.call("g", list(quote(diamonds)))
)
#> Unit: microseconds
#>                                 expr   min    lq median   uq   max neval
#>           do.call(g, list(diamonds)) 336.3 439.8 1544.2 1808 22521   100
#>         do.call("g", list(diamonds)) 330.5 534.0 1597.0 1826 23173   100
#>    do.call(g, list(quote(diamonds)))   3.6   5.1    7.2   11    35   100
#>  do.call("g", list(quote(diamonds)))   4.0   5.2    6.5   12    21   100
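
One way to see why the first two rows are so much slower is to compare the size of the call objects that match.call() returns. This is a rough sketch rather than a precise measurement: big is a made-up stand-in for diamonds so that the example needs only base R, and object.size() gives an estimate of memory use.

```r
# Stand-in for a large data set (an assumption; the text uses diamonds).
big <- data.frame(x = runif(1e5), y = runif(1e5))

g <- function(x) match.call()

# Inlined: the call carries the whole data frame.
call_inlined <- do.call(g, list(big))
# Quoted: the call carries only the symbol `big`.
call_quoted <- do.call("g", list(quote(big)))

# Estimated memory occupied by each call object.
size_inlined <- object.size(call_inlined)
size_quoted <- object.size(call_quoted)
size_inlined > size_quoted
```

The inlined call is roughly as large as the data it contains, so any code that copies or inspects the call has correspondingly more work to do.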

This is a somewhat artificial benchmark, because match.call() is not that common. It’s more useful to inspect the performance of a data-modifying function:

microbenchmark(
  do.call(head, list(diamonds)),
  do.call("head", list(diamonds)),
  do.call(head, list(quote(diamonds))),
  do.call("head", list(quote(diamonds)))
)
#> Unit: microseconds
#>                                    expr min   lq median   uq   max neval
#>           do.call(head, list(diamonds)) 572 1285   2052 2431 24562   100
#>         do.call("head", list(diamonds)) 567  971   1946 2340 23109   100
#>    do.call(head, list(quote(diamonds))) 211  228    246  310  1840   100
#>  do.call("head", list(quote(diamonds))) 217  227    246  308  1833   100

However, something strange happens if we wrap head() in another function:

head2 <- function(x) {
  head(x)
}
microbenchmark(
  do.call(head, list(diamonds)),
  do.call(head2, list(diamonds))
)
#> Unit: microseconds
#>                            expr min   lq median   uq   max neval
#>   do.call(head, list(diamonds)) 566 1071   2132 2383 24285   100
#>  do.call(head2, list(diamonds)) 213  225    237  308  1304   100
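
One thing that can be checked directly is which call the wrapped function sees. In the sketch below, inner and outer_fn are hypothetical stand-ins for head() and head2(); inner simply reports its own call.

```r
# Hypothetical stand-ins for head() and head2().
inner <- function(x) sys.call()
outer_fn <- function(x) inner(x)

df <- data.frame(x = 1, y = 2)

# df is inlined into the call to outer_fn(), but the call that inner()
# sees is the one written in outer_fn's body, which refers to the symbol x.
seen <- do.call(outer_fn, list(df))
deparse(seen)
is.symbol(seen[[2]])
```

So the wrapper shields the inner function from the inlined data: whatever head() does with its call, inside head2() that call is just head(x). That does not by itself explain why the direct call is slow.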

I suspect that this is because head() is an S3 generic, which is likely to do some manipulation of the call before calling the method. If we call a function that isn’t an S3 generic, some of the inlined calls are actually slightly faster, probably because inlining avoids variable lookup:

microbenchmark(
  do.call(nrow, list(diamonds)),
  do.call("nrow", list(diamonds)),
  do.call(nrow, list(quote(diamonds))),
  do.call("nrow", list(quote(diamonds)))
)
#> Unit: microseconds
#>                                    expr min  lq median  uq max neval
#>           do.call(nrow, list(diamonds)) 5.4 5.7    6.0 6.5  13   100
#>         do.call("nrow", list(diamonds)) 5.5 5.8    6.1 6.4  52   100
#>    do.call(nrow, list(quote(diamonds))) 5.4 5.8    6.1 6.5  45   100
#>  do.call("nrow", list(quote(diamonds))) 5.5 5.9    6.1 6.5  16   100

The biggest impact of using do.call() is when an error occurs. It takes substantially longer for control to return to the console when a large object is inlined:

h <- function(x) {
  stop("!")
}

system.time(do.call(h, list(diamonds)))
#> Timing stopped at: 0.687 0.006 0.693
system.time(do.call("h", list(quote(diamonds))))
#> Timing stopped at: 0.001 0 0.002 

(The previous chunk is not evaluated by knitr because I can’t reproduce the problem in a non-interactive setting. Instead I executed the code by hand and copied and pasted the output.)
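
Although the timing itself is hard to reproduce non-interactively, the call attached to the error can be inspected in any session. The sketch below catches the error instead of letting it reach the console; big is again a made-up stand-in for diamonds. That the size of this call contributes to the delay is an assumption on my part, not something the timings above establish.

```r
# Stand-in for a large data set (an assumption; the text uses diamonds).
big <- data.frame(x = runif(1e5), y = runif(1e5))

h <- function(x) stop("!")

# Capture the error conditions rather than printing them.
err_inlined <- tryCatch(do.call(h, list(big)), error = function(e) e)
err_quoted <- tryCatch(do.call("h", list(quote(big))), error = function(e) e)

# The inlined version stores the whole data frame inside the error's call.
is.data.frame(conditionCall(err_inlined)[[2]])

# The quoted version stores only a short call.
deparse(conditionCall(err_quoted))
```

If the console has to convert that call to text in order to report the error, a call containing the full data set would plausibly take much longer to process than h(big).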