Explicit promises: A new approach to non-standard evalution

This document outlines a new approach to non-standard evaluation that I plan to use in dplyr (and in tidyr, and the relevant parts of ggvis). There are three ideas:

The basic idea is to define an R class that allows us to make promises, capturing both the expression and environment associated with a promise.

Promises

We start with a new S3 class “promise” with two components, expr and env.

promise <- function(expr, env = parent.frame()) {
  structure(list(expr = expr, env = env), class = "promise")
}

is.promise <- function(x) inherits(x, "promise")

print.promise <- function(x, ...) {
  cat("<promise>\n")
  cat("  code: ", format(x$expr), "\n", sep = "")
  cat("  env: ", environmentName(x$env), "\n", sep = "")
}

promise(quote(a + x), globalenv())
## <promise>
##   code: a + x
##   env: R_GlobalEnv

Creating promises by hand is tedious - usually we’ll want to create them from a function argument, which is an internally a promise, but we can’t access the needed parts from R. To address that problem we add a little C++:

#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
List make_promise_sexp(SEXP prom) {
  if (TYPEOF(prom) != PROMSXP) {
    stop("Not a promise");
  }

  // recurse until we find the real promise, not a promise of a promise
  while(true) {
    SEXP code = PRCODE(prom);
    if(TYPEOF(code) != PROMSXP) break;
    prom = code;
  }

  List promise = List::create(
    _["expr"] = PRCODE(prom),
    _["env"] = PRENV(prom)
  );
  promise.attr("class") = "promise";

  return promise;
}

// [[Rcpp::export]]
RObject make_promise_name_env(Symbol name, Environment env) {
  SEXP prom = Rf_findVar(name, env);
  return make_promise_sexp(prom);
}

We make this easy to use from R with make_promise():

make_promise <- function(x, env = parent.frame()) {
  if (identical(env, globalenv())) {
    # For interactive
    make_promise_name_env(quote(x), environment())
  } else {
    make_promise_name_env(substitute(x), env)  
  }
}

make_promise(a + b / c)
## <promise>
##   code: a + b/c
##   env: R_GlobalEnv
f <- function(x) {
  make_promise(x)
}
f(a + b / c)
## <promise>
##   code: a + b/c
##   env: R_GlobalEnv

Working with promises

We need two tools to work with promises:

  1. A way to evaluate them, optionally in the context of a data set

    eval_promise <- function(x, data = NULL) {
      stopifnot(is.promise(x))
    
      if (!is.null(data)) {
        eval(x$expr, data, x$env)
      } else {
        eval(x$expr, x$env)
      }
    }
  2. A way to make them from flexible inputs. This is not 100% reliable because since in some cases we have to guess the environment. Avoiding problems caused by that will require user education

    as.promise <- function(x, env) UseMethod("as.promise")
    as.promise.promise <- function(x, env) x
    as.promise.formula <- function(x, env) promise(x[[2]], environment(x))
    as.promise.character <- function(x, env) promise(parse(text = x)[[1]], env)
    as.promise.call <- function(x, env) promise(x, env)
    as.promise.name <- function(x, env) promise(x, env)

Putting it all together

We can use promises to implement a better version of subset():

subset <- function(data, cond) {
  cond <- make_promise(cond)
  subset_(data, cond)
}

subset_ <- function(data, cond, env = parent.frame()) {
  UseMethod("subset_")
}

subset_.data.frame <- function(data, cond, env = parent.frame()) {
  cond <- as.promise(cond, env)
  r <- eval_promise(cond, data)
  r <- r & !is.na(r)
  data[r, , drop = FALSE]
}

We only do NSE in subset(). This makes it easier to reason about what’s going on.

If you want to do standard evaluation, subset_() has a very flexible specification thanks to as.promise().

subset_(mtcars, quote(mpg > 31))
##                 mpg cyl disp hp drat    wt  qsec vs am gear carb
## Fiat 128       32.4   4 78.7 66 4.08 2.200 19.47  1  1    4    1
## Toyota Corolla 33.9   4 71.1 65 4.22 1.835 19.90  1  1    4    1
subset_(mtcars, "mpg > 31")
##                 mpg cyl disp hp drat    wt  qsec vs am gear carb
## Fiat 128       32.4   4 78.7 66 4.08 2.200 19.47  1  1    4    1
## Toyota Corolla 33.9   4 71.1 65 4.22 1.835 19.90  1  1    4    1
subset_(mtcars, ~ mpg > 31)
##                 mpg cyl disp hp drat    wt  qsec vs am gear carb
## Fiat 128       32.4   4 78.7 66 4.08 2.200 19.47  1  1    4    1
## Toyota Corolla 33.9   4 71.1 65 4.22 1.835 19.90  1  1    4    1
subset_(mtcars, make_promise(mpg > 31))
##                 mpg cyl disp hp drat    wt  qsec vs am gear carb
## Fiat 128       32.4   4 78.7 66 4.08 2.200 19.47  1  1    4    1
## Toyota Corolla 33.9   4 71.1 65 4.22 1.835 19.90  1  1    4    1
f <- function(x) ~mpg > x
subset_(mtcars, f(31))
##                 mpg cyl disp hp drat    wt  qsec vs am gear carb
## Fiat 128       32.4   4 78.7 66 4.08 2.200 19.47  1  1    4    1
## Toyota Corolla 33.9   4 71.1 65 4.22 1.835 19.90  1  1    4    1
g <- function(x) bquote(mpg > .(x))
subset_(mtcars, g(31))
##                 mpg cyl disp hp drat    wt  qsec vs am gear carb
## Fiat 128       32.4   4 78.7 66 4.08 2.200 19.47  1  1    4    1
## Toyota Corolla 33.9   4 71.1 65 4.22 1.835 19.90  1  1    4    1