library(dplyr)
library(purrr)
library(ggplot2)
library(ggthemes)
library(microbenchmark)
Carl Boettiger (@cboettig) posted the following on Twitter the other day:
Better #rstats pattern for do.call(bind_rows, lapply(list_of_stuff, function_returning_data.frame)) ? @drob @JennyBryan
— Carl Boettiger (@cboettig) December 29, 2015
What follows compares three ways of applying a function over a list or vector and ultimately getting a data.frame/tbl_df back.
We’ll add a fourth, ldmap, which attempts to replicate ldply using “modern” components:
ldmap <- function(lst, fun) { bind_rows(map(lst, fun)) }
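To make the helper concrete, here is a minimal sketch of ldmap applied to a three-element vector. It repeats the definition so it runs on its own, and it uses base data.frame() instead of data_frame() purely to keep the example independent of which dplyr version is installed (that swap is my assumption, not part of the benchmark).

```r
library(dplyr)
library(purrr)

# Same helper as above: map a function over a list, then row-bind the results.
ldmap <- function(lst, fun) { bind_rows(map(lst, fun)) }

# Each call returns a one-row data frame; ldmap stacks them into one.
first_three <- ldmap(LETTERS[1:3], function(x) {
  data.frame(ltr = x, stringsAsFactors = FALSE)
})

first_three$ltr    # "A" "B" "C"
nrow(first_three)  # 3
```

The function passed in only has to return something bind_rows() can stack, so the same pattern should carry over to functions that return several rows per element.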
Rock’em Sock’em Binders
microbenchmark(
  # ldply is exported from plyr, so :: is enough; every contender builds the same ltr column
  using_ldply=plyr::ldply(LETTERS, function(x) { data_frame(ltr=x) }),
  using_ldmap=ldmap(LETTERS, function(x) { data_frame(ltr=x) }),
  using_map=bind_rows(map(LETTERS, function(x) { data_frame(ltr=x) })),
  using_lapply=bind_rows(lapply(LETTERS, function(x) { data_frame(ltr=x) })),
  times=2500
) -> mb
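A timing comparison only means something if the contenders produce the same result, so a quick sanity check is worth a few lines. This is a sketch rather than part of the original benchmark: make_row is a hypothetical helper name, it uses base data.frame(), and it compares only the map and lapply variants so that it needs nothing beyond dplyr and purrr.

```r
library(dplyr)
library(purrr)

# Hypothetical helper (not in the original post): one-row data frame per letter.
make_row <- function(x) data.frame(ltr = x, stringsAsFactors = FALSE)

via_map    <- bind_rows(map(LETTERS, make_row))
via_lapply <- bind_rows(lapply(LETTERS, make_row))

# Both routes should yield the same 26 x 1 data frame.
same_result <- isTRUE(all.equal(via_map, via_lapply))
same_result
dim(via_map)
```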
Who will win?
autoplot(mb) +
  scale_y_log10() +
  theme_tufte(base_family="Helvetica") +
  labs(x=NULL, y="Time (ms) - log scale",
       title="Battle of the data frame binders (2,500 runs)")