do.call

do.call() has rather surprising behaviour. You might expect that do.call("f", list(x)) would be equivalent to f(x), but that’s far from the truth. The following example shows what various calls to do.call() actually generate:
df <- data.frame(x = runif(1), y = runif(1))
f <- function(x) {
  sys.call()
}
# Worst: inlines f and df
do.call(f, list(df))
#> (function (x)
#> {
#> sys.call()
#> })(list(x = 0.904893477912992, y = 0.77339426218532))
# Only inlines df
do.call("f", list(df))
#> f(list(x = 0.904893477912992, y = 0.77339426218532))
# Only inlines f
do.call(f, list(quote(df)))
#> (function (x)
#> {
#> sys.call()
#> })(df)
# Best: inlines neither. Equivalent to f(df)
do.call("f", list(quote(df)))
#> f(df)
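To see concretely what "inlining" means, the sketch below inspects the argument stored in each captured call. The one-row data frame and the names inlined and quoted are illustrative only. When the data frame itself is passed, its value is embedded in the call; when it is wrapped in quote(), the call holds just the symbol df.

```r
df <- data.frame(x = 1, y = 2)
f <- function(x) {
  sys.call()
}

# Passing the value: the call's argument is the data frame itself
inlined <- do.call("f", list(df))
# Passing a quoted name: the call's argument is only the symbol df
quoted <- do.call("f", list(quote(df)))

is.list(inlined[[2]])
#> [1] TRUE
is.symbol(quoted[[2]])
#> [1] TRUE
deparse(quoted)
#> [1] "f(df)"
```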
This has some obvious performance implications: an inlined object becomes part of the call, and the larger the call, the slower match.call() is:
library(ggplot2)        # provides the diamonds dataset
library(microbenchmark)
object.size(diamonds)
#> 3456256 bytes
g <- function(x) {
  match.call()
}
microbenchmark(
  do.call(g, list(diamonds)),
  do.call("g", list(diamonds)),
  do.call(g, list(quote(diamonds))),
  do.call("g", list(quote(diamonds)))
)
#> Unit: microseconds
#>                                 expr   min    lq median   uq   max neval
#>           do.call(g, list(diamonds)) 336.3 439.8 1544.2 1808 22521   100
#>         do.call("g", list(diamonds)) 330.5 534.0 1597.0 1826 23173   100
#>    do.call(g, list(quote(diamonds)))   3.6   5.1    7.2   11    35   100
#>  do.call("g", list(quote(diamonds)))   4.0   5.2    6.5   12    21   100
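The size difference between the two kinds of call can also be measured directly. The following is a minimal sketch that uses a generated data frame, big, as a stand-in for diamonds (an assumption made so that it runs without extra packages). It compares the memory footprint of the call returned by match.call() in the inlined and quoted cases.

```r
# Hypothetical stand-in for diamonds so the sketch needs no extra packages
big <- data.frame(x = runif(1e5), y = runif(1e5))

g <- function(x) {
  match.call()
}

# The inlined call carries the whole data frame; the quoted call only a symbol
call_inlined <- do.call("g", list(big))
call_quoted <- do.call("g", list(quote(big)))

object.size(call_inlined)
object.size(call_quoted)
```

The inlined call should be roughly as large as big itself, while the quoted call stays tiny regardless of how big the data is.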
The match.call() benchmark is somewhat artificial, because match.call() is not that common. It’s more useful to inspect the performance of a data-modifying function:
microbenchmark(
  do.call(head, list(diamonds)),
  do.call("head", list(diamonds)),
  do.call(head, list(quote(diamonds))),
  do.call("head", list(quote(diamonds)))
)
#> Unit: microseconds
#>                                    expr min   lq median   uq   max neval
#>           do.call(head, list(diamonds)) 572 1285   2052 2431 24562   100
#>         do.call("head", list(diamonds)) 567  971   1946 2340 23109   100
#>    do.call(head, list(quote(diamonds))) 211  228    246  310  1840   100
#>  do.call("head", list(quote(diamonds))) 217  227    246  308  1833   100
However, something strange happens if we wrap head() in another function:
head2 <- function(x) {
  head(x)
}
microbenchmark(
  do.call(head, list(diamonds)),
  do.call(head2, list(diamonds))
)
#> Unit: microseconds
#>                            expr min   lq median   uq   max neval
#>   do.call(head, list(diamonds)) 566 1071   2132 2383 24285   100
#>  do.call(head2, list(diamonds)) 213  225    237  308  1304   100
I suspect that this is because head() is an S3 generic, which is likely to do some manipulation of the call before calling the method. If we call a function that isn’t an S3 generic, some of the inlined calls are actually slightly faster, probably because inlining avoids variable lookup:
microbenchmark(
  do.call(nrow, list(diamonds)),
  do.call("nrow", list(diamonds)),
  do.call(nrow, list(quote(diamonds))),
  do.call("nrow", list(quote(diamonds)))
)
#> Unit: microseconds
#>                                    expr min  lq median  uq max neval
#>           do.call(nrow, list(diamonds)) 5.4 5.7    6.0 6.5  13   100
#>         do.call("nrow", list(diamonds)) 5.5 5.8    6.1 6.4  52   100
#>    do.call(nrow, list(quote(diamonds))) 5.4 5.8    6.1 6.5  45   100
#>  do.call("nrow", list(quote(diamonds))) 5.5 5.9    6.1 6.5  16   100
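The explanation above separates S3 generics from ordinary functions. As a rough way to check which kind a function is, the hypothetical helper calls_use_method() below looks for a call to UseMethod() in the function body. It is only a heuristic sketch, and it says nothing about why dispatch interacts badly with inlined arguments.

```r
# Heuristic, illustrative helper (not an official API): a simple S3 generic
# has a body that calls UseMethod()
calls_use_method <- function(fn) {
  "UseMethod" %in% all.names(body(fn))
}

calls_use_method(utils::head)
#> [1] TRUE
calls_use_method(nrow)
#> [1] FALSE
```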
The biggest impact of using do.call() is when an error occurs. It takes substantially longer for control to return to the console when a large object is inlined:
h <- function(x) {
  stop("!")
}
system.time(do.call(h, list(diamonds)))
#> Timing stopped at: 0.687 0.006 0.693
system.time(do.call("h", list(quote(diamonds))))
#> Timing stopped at: 0.001 0 0.002
(The previous chunk is not evaluated by knitr because I can’t reproduce the problem in a non-interactive setting. Instead I executed the code by hand and copied and pasted the output.)
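The text does not say where the extra time goes. One plausible contributor, offered here as an assumption rather than a confirmed diagnosis, is that the error condition records the failing call, and with an inlined argument that call contains the whole object. The sketch below captures the recorded call with tryCatch() and conditionCall(), using a generated data frame big as a stand-in for diamonds, and compares the two cases.

```r
# Hypothetical stand-in for diamonds so the sketch needs no extra packages
big <- data.frame(x = runif(1e5), y = runif(1e5))

h <- function(x) {
  stop("!")
}

# Capture the call stored in the error condition instead of letting it propagate
call_inlined <- tryCatch(
  do.call(h, list(big)),
  error = function(e) conditionCall(e)
)
call_quoted <- tryCatch(
  do.call("h", list(quote(big))),
  error = function(e) conditionCall(e)
)

object.size(call_inlined)
object.size(call_quoted)
deparse(call_quoted)
```

In the quoted case the recorded call is just h(big), whereas the inlined call carries both the function definition and the full data frame.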