Function operators

Learning objectives:

  • Define function operator
  • Explore some existing function operators
  • Make our own function operator

Introduction

  • A function operator (FO) is a higher-order function that takes one or more functions as input and returns a new function as output.

  • Function operators are a special case of function factories, since they return functions.

  • They are often used to wrap an existing function to provide additional capability, similar to Python’s decorators.

Purpose: To adapt, augment, or combine the behavior of existing functions without modifying their source code.

Examples: FOs for Modifying Input and Output

These operators change how a function interacts with its arguments or what it returns.

Input Conversion: base::Vectorize()

The Vectorize() Function Operator (FO) is a convenience function that turns a scalar function (one that expects single values) into a vectorized one (one that can operate element-wise on vectors).

Note: Vectorize() does this by calling the original function once per element through mapply(); it does not improve performance.
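
As a small illustration of the point above, the sketch below vectorizes a scalar-only function. The helper name clamp_to_zero is made up for this example; only base::Vectorize() comes from the text.

```r
# A scalar-only function: `if` needs a single TRUE/FALSE condition,
# so this fails when `x` has more than one element.
clamp_to_zero <- function(x) {
  if (x < 0) 0 else x
}

# Vectorize() returns a new function that applies the original
# to each element in turn (via mapply()).
clamp_vec <- Vectorize(clamp_to_zero)

clamped <- clamp_vec(c(-2, 0, 3))
clamped
```

For a case this simple, pmax(x, 0) would be the truly vectorized (and faster) choice; Vectorize() is a convenience, not an optimization.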

Input Conversion: splat()

  • The splat() function operator (FO) converts a function that expects multiple named arguments into a function that expects a single list of arguments.

  • This is useful with functions like lapply() or purrr::map(), which iterate over a single vector or list.
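
A minimal sketch of the idea, assuming a hand-rolled version built on do.call(). The name my_splat and the sample data are hypothetical and chosen for illustration; this is not the implementation of any package's splat().

```r
# Hypothetical hand-rolled splat: turn f(a, b, ...) into g(list(a, b, ...)).
my_splat <- function(f) {
  force(f)
  function(args) {
    do.call(f, args)
  }
}

# Each element is a full set of arguments for mean().
arg_sets <- list(
  list(x = c(1, 2, NA), na.rm = TRUE),
  list(x = c(10, 20, 30), trim = 0)
)

# lapply() iterates over one list, so each argument set is passed as a unit.
means <- lapply(arg_sets, my_splat(mean))
means
```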

Output Augmentation: Timing and Capturing

| FO | Purpose | Technique |
|---|---|---|
| time_it() | Returns a function that executes the original function and returns a vector of timing information. | Wraps the function call in system.time(). |
| capture_it() | Returns a function that executes the original function and returns its printed output as a character vector. | Wraps the function call in capture.output(). |
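
The two operators in the table can be sketched in a few lines each. These definitions are one plausible reading of the "Technique" column, not necessarily the exact code the notes had in mind.

```r
# Wrap f so that calling the result returns timing information.
time_it <- function(f) {
  force(f)
  function(...) {
    system.time(f(...))
  }
}

# Wrap f so that calling the result returns what f printed.
capture_it <- function(f) {
  force(f)
  function(...) {
    capture.output(f(...))
  }
}

timed_sleep <- time_it(Sys.sleep)
timing <- timed_sleep(0.1)
timing[["elapsed"]]        # roughly the requested sleep time

captured_print <- capture_it(print)
printed <- captured_print(1:3)
printed                    # the text print() would have shown
```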

FOs for Handling Errors

These function operators (FOs), typically from the purrr package, modify a function to handle errors gracefully, allowing iteration (e.g., with map()) to continue even if some calls fail.

Existing function operators

Two examples of function operators are purrr::safely() and memoise::memoise(). They live in the purrr and memoise packages:

library(purrr)
library(memoise)

purrr::safely

Capturing Errors: turns errors into data!

safely(f) returns a new function that will never throw an error. Instead, it always returns a list with two components: result and error.
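
A short example of the behaviour just described, using log() as the wrapped function. The variable names are illustrative only.

```r
library(purrr)

safe_log <- safely(log)

# Successful call: the value is in `result`, and `error` is NULL.
ok <- safe_log(10)
ok$result
ok$error

# Failing call: log("a") would normally stop with an error.
# Here `result` is NULL and `error` holds the condition object.
bad <- safe_log("a")
bad$result
conditionMessage(bad$error)
```

Because the wrapper always returns the same two-element shape, results from map(inputs, safe_log) can be split into successes and failures afterwards.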

purrr::possibly()

possibly(f, otherwise) returns a new function that returns a specified default value (otherwise) if the original function fails.

Function Operators (FOs) for Combining Functions (Composition)

Function composition links functions so the output of one becomes the input of the next.

  • Function Composition Operator: %o%
  • The infix operator %o% is not provided by purrr; it is a custom definition (for example, a thin wrapper around purrr::compose()) that creates a new function by combining others.

Mathematical Notation: \((f \circ g)(x) = f(g(x))\)

R Equivalent: f %o% g returns a function that computes \(f(g(x))\).
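
One way to define such an operator by hand is sketched below. The helper name compose2 is hypothetical. Note also that base R already uses %o% for outer products, so this definition masks that meaning for the rest of the session.

```r
# Hypothetical two-function composer: returns f(g(...)).
compose2 <- function(f, g) {
  force(f)
  force(g)
  function(...) {
    f(g(...))
  }
}

# Custom infix form. This masks base R's `%o%` (outer product).
`%o%` <- compose2

# abs() runs first, then sqrt(), matching (f o g)(x) = f(g(x)).
sqrt_abs <- sqrt %o% abs
sqrt_abs(-16)
```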

Functional Algebra: and()

FOs can combine functions that return a logical value into a single function, often used for data filtering.

Key Advantage: This avoids scattering logical conditions throughout your code, centralizing filtering logic.
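
A minimal sketch of an and() operator, assuming each predicate returns a single TRUE or FALSE. The definition and the sample data are illustrative, not taken from a specific package.

```r
# Combine two predicates into one that is TRUE only when both are.
and <- function(f1, f2) {
  force(f1)
  force(f2)
  function(...) {
    f1(...) && f2(...)
  }
}

# Keep only elements that are numeric AND of length one.
is_scalar_number <- and(is.numeric, function(x) length(x) == 1)

items <- list(1, "a", 1:3, 2.5)
kept <- Filter(is_scalar_number, items)
kept
```

The filtering rule now has a single name, is_scalar_number, that can be reused wherever the same condition applies.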

purrr comes with three other function operators in a similar vein:

  • possibly(): returns a default value when there’s an error. It provides no way to tell if an error occurred or not, so it’s best reserved for cases when there’s some obvious sentinel value (like NA).

  • quietly(): turns printed output, messages, and warnings from side effects into output, messages, and warnings components of the returned list.

  • auto_browse() (sometimes written auto_browser()): automatically executes browser() inside the function when there’s an error.

possibly()

The possibly() function operator comes from the purrr package.

Example and Explanation

  • The possibly() function takes two arguments:

    .f: The function to be adapted.

    otherwise: The value to be returned if an error occurs.

Detailed Example: Calculating Logarithms

Consider the log() function, which throws an error if given a non-numeric argument such as "a". (A negative number does not raise an error; it returns NaN with a warning.) We can use possibly() to make it return NA (Not Available) in case of an error.

The main use of possibly() is within functional programming tools like lapply or purrr::map when processing a list or vector that may contain problematic elements. This ensures the iteration completes, and you can review the failures later.
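
The logarithm example can be sketched as follows. The input list is made up for illustration, and NA_real_ is used as the fallback so that map_dbl() can return a plain numeric vector.

```r
library(purrr)

# Wrap log() so that any error is replaced by a missing value.
log_or_na <- possibly(log, otherwise = NA_real_)

# The second element is not numeric, so log() alone would stop here.
inputs <- list(10, "a", 100)

logs <- map_dbl(inputs, log_or_na)
logs

# Positions that failed can be recovered afterwards.
failed <- which(is.na(logs))
failed
```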

quietly()

The quietly() function operator from the purrr package is used to create a modified function that captures and silences all messages, warnings, and printed output, returning them as part of the result.

quietly() Example and Explanation

The quietly() function takes a single argument:

.f: The function to be adapted.

The function returned by quietly(f) returns a list of four elements whenever \(f\) completes. Errors are not captured, so if \(f\) fails the error still propagates. The four elements are:

result: The standard return value of \(f\).

output: Any text printed to the console (captured from stdout)

warnings: Any warnings generated by \(f\).

messages: Any messages generated by \(f\).

Detailed Example: Capturing and Silencing

Consider a function that prints a message, issues a warning, and returns a value.

Reviewing the Captured Output: The side effects are now stored as components of the returned list:

Let’s test the negative input, which triggers the warning:
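
A sketch of the walkthrough above. The function noisy_sqrt is hypothetical, written to print text, emit a message, and warn on negative input as the description requires.

```r
library(purrr)

# Hypothetical noisy function used only for this illustration.
noisy_sqrt <- function(x) {
  cat("Computing a square root\n")      # printed output
  message("Input received: ", x)        # message
  if (x < 0) {
    warning("Negative input; using abs(x)")
  }
  sqrt(abs(x))
}

quiet_sqrt <- quietly(noisy_sqrt)

# Positive input: nothing reaches the console, everything is stored.
res_pos <- quiet_sqrt(16)
res_pos$result
res_pos$output
res_pos$messages
res_pos$warnings    # empty: no warning was raised

# Negative input: the warning is captured instead of being shown.
res_neg <- quiet_sqrt(-9)
res_neg$result
res_neg$warnings
```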

quietly() is essential when you need to run functions that are noisy (print a lot of text, messages, or warnings) within loops or reports, but you still need to log or inspect those side effects later without cluttering the console output. It allows you to run functions silently while retaining the information about what happened during execution.

auto_browse()

The auto_browse() function operator from the purrr package (sometimes written auto_browser(); similar ideas exist in other packages such as rlang) is designed to insert an automatic call to the R debugger, browser(), into a function. It is primarily used for debugging a function only when it fails (i.e., when it throws an error).

The auto_browse() function takes a single argument:

.f: The function to be adapted.

The function returned by auto_browse(.f) is a new function that behaves identically to .f unless an error occurs. If an error is signalled, the new function pauses execution at the point of the error and drops you into an interactive debugging session (browser()).

rlang::with_options(): Temporarily changes R’s global options just for the duration of the enclosed code block.

error = recover: This sets the global error option.

The default is error = NULL (which prints the error and stops).

Setting it to a function such as base R’s recover tells R: “If an error occurs, pause execution and drop me into the debugger (browser()) at the point of failure.”

Debugger Entry: When divide_by_positive(10, -5) is called:

The stop() line is hit.

Instead of terminating, R enters the debugger immediately before the environment is destroyed.

You can then inspect the variables (denominator = -5) to diagnose the cause of the failure.
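
The walkthrough refers to a divide_by_positive() function whose definition is not shown. A plausible reconstruction is sketched below; the body and the error message are assumptions. The debugger steps are left as comments because they only make sense in an interactive session.

```r
# Assumed definition: fails when the denominator is not positive.
divide_by_positive <- function(numerator, denominator) {
  if (denominator <= 0) {
    stop("denominator must be positive")
  }
  numerator / denominator
}

# Non-interactive check that the failing call really raises an error.
outcome <- tryCatch(
  divide_by_positive(10, -5),
  error = function(e) conditionMessage(e)
)
outcome

# In an interactive session, one of these could be tried instead:
#   options(error = recover)   # base R: choose a frame to inspect on error
#   divide_by_positive(10, -5)
#   options(error = NULL)      # restore the default afterwards
#
#   debug_divide <- purrr::auto_browse(divide_by_positive)
#   debug_divide(10, -5)       # opens browser() where the error occurred
```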

memoise::memoise

Function Operator (FO) for Optimization: Memoisation

Memoisation is an optimization technique where the results of expensive function calls are cached, so that subsequent calls with the same arguments return the stored result immediately, avoiding re-computation.
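
To make the caching idea concrete, here is a deliberately simple hand-rolled sketch. It is not how memoise::memoise() is implemented; the names memoise_sketch, slow_square, and the deparse-based cache key are assumptions chosen for brevity.

```r
# Hypothetical minimal memoiser: caches results keyed by the arguments.
memoise_sketch <- function(f) {
  force(f)
  cache <- new.env(parent = emptyenv())
  function(...) {
    # Build a text key from the arguments (fine for small, simple inputs).
    key <- paste(deparse(list(...)), collapse = "")
    if (!exists(key, envir = cache, inherits = FALSE)) {
      assign(key, f(...), envir = cache)
    }
    get(key, envir = cache, inherits = FALSE)
  }
}

# Count how often the underlying function actually runs.
calls <- 0
slow_square <- function(x) {
  calls <<- calls + 1
  x^2
}

fast_square <- memoise_sketch(slow_square)
fast_square(4)   # computed
fast_square(4)   # served from the cache
calls            # the underlying function ran only once
```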

Caching computations: avoid repeated computations!

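# A deliberately slow and impure function: the result depends on runif().
slow_function <- function(x) {
  Sys.sleep(1)
  x * 10 * runif(1)
}
# Each call takes about a second and returns a different value.
system.time(print(slow_function(1)))
system.time(print(slow_function(1)))
# The memoised version caches results by argument.
fast_function <- memoise::memoise(slow_function)
system.time(print(fast_function(1)))  # first call: slow, result is cached
system.time(print(fast_function(1)))  # second call: fast, same cached value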

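Applicability: Memoisation is ideal for slow, pure functions (same input -> same output, no side effects).

Be careful about memoising impure functions! For example, slow_function() above calls runif(), so its memoised version keeps returning the first random draw instead of a fresh one.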

Exercise

How does safely() work? The source code looks like this:

The real work is done in capture_error(), which is defined in the package namespace. We can access it with the ::: operator, as in purrr:::capture_error. (We could also grab it from the function’s environment.)
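
Since the source listing is not reproduced here, the sketch below shows the general technique with base R's tryCatch(). It is a simplified approximation written for this exercise, not purrr's actual source; the names safely_sketch and capture_error_sketch are hypothetical.

```r
# Evaluate `code` lazily inside tryCatch() so that errors become data.
capture_error_sketch <- function(code, otherwise = NULL) {
  tryCatch(
    list(result = code, error = NULL),
    error = function(e) {
      list(result = otherwise, error = e)
    }
  )
}

# The function operator: wrap f so that every call goes through the helper.
safely_sketch <- function(f, otherwise = NULL) {
  force(f)
  function(...) {
    capture_error_sketch(f(...), otherwise)
  }
}

safe_log2 <- safely_sketch(log)
good <- safe_log2(1)
fail <- safe_log2("a")
good$result
conditionMessage(fail$error)
```

The key step is that f(...) is passed as an unevaluated argument, so the error is raised only once tryCatch() is already in place to catch it.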

Case study: make your own function operator

urls <- c(
  "adv-r" = "https://adv-r.hadley.nz",
  "r4ds" = "http://r4ds.had.co.nz/"
  # and many many more
)
# Build one file path per URL, e.g. <tempdir>/adv-r.html
path <- file.path(tempdir(), paste0(names(urls), ".html"))

walk2(urls, path, download.file, quiet = TRUE)

Here we make a function operator that adds a small delay before reading each page:

delay_by <- function(f, amount) {
  force(f)
  force(amount)

  function(...) {
    Sys.sleep(amount)
    f(...)
  }
}
system.time(runif(100))

system.time(delay_by(runif, 0.1)(100))

And another to print a dot after every nth invocation:
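
The definition is not shown here, so the following is a sketch consistent with how dot_every() is called later (a function f and a count n). The counter variable i is an implementation choice for this sketch.

```r
# Wrap f so that a dot is printed on every nth call.
dot_every <- function(f, n) {
  force(f)
  force(n)

  i <- 0                      # call counter kept in the enclosing environment
  function(...) {
    i <<- i + 1
    if (i %% n == 0) cat(".")
    f(...)
  }
}

# Four calls with n = 2 print two dots in total.
counted_identity <- dot_every(identity, 2)
dots <- capture.output(for (k in 1:4) counted_identity(k))
dots
```

The counter lives in the closure created by dot_every(), so each wrapped function keeps its own independent count.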

Can now use both of these function operators to express our desired result:

walk2(
  urls, path,
  download.file %>% dot_every(10) %>% delay_by(0.1),
  quiet = TRUE
)

Exercise

  1. Should you memoise download.file()? Why or why not?