do.call

do.call() has rather surprising behaviour. You might expect that do.call("f", list(x)) would be equivalent to f(x), but that’s far from the truth. The following example uses sys.call() to show the call that each form of do.call() actually generates:

df <- data.frame(x = runif(1), y = runif(1))

f <- function(x) {
  sys.call()
}
# Worst: inlines f and df
do.call(f, list(df))
#> (function (x) 
#> {
#>     sys.call()
#> })(list(x = 0.263280282728374, y = 0.70501884073019))

# Only inlines df
do.call("f", list(df))
#> f(list(x = 0.263280282728374, y = 0.70501884073019))

# Only inlines f
do.call(f, list(quote(df)))
#> (function (x) 
#> {
#>     sys.call()
#> })(df)

# Best: inlines neither. Equivalent to f(df)
do.call("f", list(quote(df)))
#> f(df)
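
The generated calls can also be inspected programmatically rather than by reading the printed output. The following is a minimal sketch in base R; the names inlined and quoted are illustrative only, and the small data frame stands in for any object:

```r
f <- function(x) sys.call()
df <- data.frame(x = 1, y = 2)

# Inline both the function and the data
inlined <- do.call(f, list(df))
# Inline neither
quoted <- do.call("f", list(quote(df)))

# In the inlined call, the first element is the function object itself
# and the second element is the whole data frame
is.function(inlined[[1]])
is.data.frame(inlined[[2]])

# In the quoted call, both elements are just symbols
is.name(quoted[[1]])
is.name(quoted[[2]])
deparse(quoted)
```

Checking the elements this way makes it easy to tell which form a given do.call() produced without printing a potentially very large call.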

Performance

This has some obvious performance implications: when the call is larger, capturing it with sys.call() or match.call() should be slower:

library(ggplot2)         # provides the diamonds dataset
library(microbenchmark)

object.size(diamonds)
#> 3456256 bytes

microbenchmark(
  do.call(f, list(diamonds)),
  do.call("f", list(diamonds)),
  do.call(f, list(quote(diamonds))),
  do.call("f", list(quote(diamonds)))
)
#> Unit: microseconds
#>                                 expr  min   lq median   uq   max neval
#>           do.call(f, list(diamonds)) 1.76 1.90   2.09 2.60 46.48   100
#>         do.call("f", list(diamonds)) 1.79 2.02   2.24 2.71 11.72   100
#>    do.call(f, list(quote(diamonds))) 1.76 1.92   2.18 2.45 28.63   100
#>  do.call("f", list(quote(diamonds))) 1.80 2.09   2.27 2.67  9.79   100

This is a bit of an artificial benchmark, because match.call() is not that common, and the timings above barely differ between the four forms. It’s more useful to inspect the performance of a data-modifying function:

microbenchmark(
  do.call(head, list(diamonds)),
  do.call("head", list(diamonds)),
  do.call(head, list(quote(diamonds))),
  do.call("head", list(quote(diamonds)))
)
#> Unit: microseconds
#>                                    expr min   lq median   uq   max neval
#>           do.call(head, list(diamonds)) 559 1393   2077 2349 25044   100
#>         do.call("head", list(diamonds)) 618 1596   2074 2320 25191   100
#>    do.call(head, list(quote(diamonds))) 217  243    297  329  1137   100
#>  do.call("head", list(quote(diamonds))) 215  234    252  319   474   100

However, something strange happens if we wrap head() in another function:

head2 <- function(x) {
  head(x)
}
microbenchmark(
  do.call(head, list(diamonds)),
  do.call(head2, list(diamonds))
)
#> Unit: microseconds
#>                            expr min   lq median   uq   max neval
#>   do.call(head, list(diamonds)) 595 1146   2018 2383 24638   100
#>  do.call(head2, list(diamonds)) 216  234    291  318   391   100

I suspect that this is because head() is an S3 generic, which is likely to do some manipulation of the call before calling the method. If we call a function that isn’t an S3 generic, some of the inlined calls are actually slightly faster, probably because inlining avoids variable lookup.

microbenchmark(
  do.call(nrow, list(diamonds)),
  do.call("nrow", list(diamonds)),
  do.call(nrow, list(quote(diamonds))),
  do.call("nrow", list(quote(diamonds)))
)
#> Unit: microseconds
#>                                    expr  min   lq median   uq  max neval
#>           do.call(nrow, list(diamonds)) 5.29 5.74   5.97 6.34 51.5   100
#>         do.call("nrow", list(diamonds)) 5.56 5.81   6.07 6.39 15.0   100
#>    do.call(nrow, list(quote(diamonds))) 5.53 5.78   5.98 6.32 16.4   100
#>  do.call("nrow", list(quote(diamonds))) 5.58 5.93   6.13 6.42 51.2   100
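
Two of the observations above can be checked directly. The sketch below is an assumption-laden illustration, not a definitive test: looks_like_s3_generic() is a hypothetical helper that only looks for UseMethod in a closure's body (it will miss internal generics), and show_call() and wrapper() are stand-ins for head() and head2():

```r
# Rough heuristic (an assumption of this sketch): a closure whose body
# mentions UseMethod is treated as an S3 generic
looks_like_s3_generic <- function(fn) {
  any(grepl("UseMethod", deparse(body(fn)), fixed = TRUE))
}

looks_like_s3_generic(head)
looks_like_s3_generic(nrow)

# A wrapper passes its argument on by name, so the inner function
# sees a small call even when the outer call was inlined
show_call <- function(x) sys.call()
wrapper <- function(x) show_call(x)
deparse(do.call(wrapper, list(mtcars)))
```

With the heuristic, head is flagged as a generic and nrow is not. The inner call seen through the wrapper is show_call(x), which contains only a symbol; this may be part of why head2() behaves like the quoted forms, although the timings alone do not prove it.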

The biggest impact of using do.call() is when an error occurs. It takes substantially longer for control to return to the console when a large object is inlined:

h <- function(x) {
  stop("!")
}

system.time(do.call(h, list(diamonds)))
#> Timing stopped at: 0.687 0.006 0.693
system.time(do.call("h", list(quote(diamonds))))
#> Timing stopped at: 0.001 0 0.002 

(The previous chunk is not evaluated by knitr because I can’t reproduce the problem in a non-interactive setting. Instead I executed the code by hand and copied and pasted the output.)
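
When the object to pass is not already bound to a name, quote() alone is not enough. One possible workaround is to bind the arguments in a temporary environment and pass symbols that refer to them. The helper below, do_call_by_name(), is hypothetical (it is not part of base R or of the text above) and is only a sketch; it treats every element of args as a value, so it does not evaluate quoted expressions the way do.call() would:

```r
# Hypothetical helper: call the function named `fname` without inlining
# either the function or its arguments into the call
do_call_by_name <- function(fname, args, env = parent.frame()) {
  stopifnot(is.character(fname), length(fname) == 1)

  # Bind each argument to a generated name in a child environment
  arg_env <- new.env(parent = env)
  syms <- vector("list", length(args))
  for (i in seq_along(args)) {
    nm <- paste0(".arg", i)
    assign(nm, args[[i]], envir = arg_env)
    syms[[i]] <- as.name(nm)
  }
  names(syms) <- names(args)

  # The call now contains only symbols, which are looked up in arg_env
  do.call(fname, syms, envir = arg_env)
}

show_call <- function(x) sys.call()
deparse(do_call_by_name("show_call", list(x = mtcars)))
do_call_by_name("sum", list(1, 2, 3))
```

The captured call refers to the generated name .arg1 rather than containing the data frame, so it stays small however large the argument is.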