Beyond Async

ShinyConf 2024

Joe Cheng

2024-04-18

Let’s talk about slow code

Ideally, all the code you put into your Shiny app should be fast and responsive.

If something is slow, first try making it fast.

From my rstudio::conf 2019 talk, “Shiny in Production: Principles, Practices, and Tools”:

Don’t guess: measure with a profiler.
Compute before the app even launches, if you can.
Use fast file formats: instead of CSV, try feather or parquet.
Use caching if appropriate (renderCachedPlot, bindCache).
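
As a rough illustration of the first tip, here is a minimal sketch using Python's built-in cProfile module (R users would typically reach for profvis instead). The functions `slow_step`, `fast_step`, and `handle_request` are hypothetical stand-ins for app code, not part of any Shiny API.

```python
import cProfile
import io
import pstats
import time


def slow_step():
    # Hypothetical bottleneck: stands in for a slow query or computation
    time.sleep(0.2)


def fast_step():
    # Hypothetical cheap step
    return sum(range(1000))


def handle_request():
    fast_step()
    slow_step()


def profile_report(func):
    """Run func under cProfile and return the stats as text, slowest first."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(10)
    return buffer.getvalue()


report = profile_report(handle_request)
print(report)
```

Sorting by cumulative time puts `handle_request` and `slow_step` near the top of the report, which is the kind of evidence to look for before optimizing anything.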

But sometimes, things are just going to be slow…

Calling a slow API
Training a large model
Compiling a large, dynamic report

Demo 1: Slow API

Credit: This demo builds on examples by Veerle van Leemput at @hypebright/async_shiny

Long-running operations are a problem!

They block other users from connecting or interacting with the app. (inter-session concurrency)
They block the user from doing other things in the same app while they’re waiting. (intra-session concurrency)
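
To make the blocking problem concrete, here is a toy model in plain asyncio. It is not Shiny code: the "sessions" are just coroutines sharing one event loop, and every name is made up for illustration. The same slow step is run once as a blocking call and once as a cooperative one.

```python
import asyncio
import time


async def session(name, work, log):
    """Simulate one user's session doing some work on the shared event loop."""
    log.append(f"{name} start")
    await work()
    log.append(f"{name} done")


async def blocking_work():
    # time.sleep() holds the whole thread: nothing else can run meanwhile
    time.sleep(0.1)


async def cooperative_work():
    # asyncio.sleep() yields control, so other sessions can make progress
    await asyncio.sleep(0.1)


async def quick_work():
    await asyncio.sleep(0)


async def run(slow_work):
    log = []
    await asyncio.gather(
        session("A", slow_work, log),
        session("B", quick_work, log),
    )
    return log


blocked = asyncio.run(run(blocking_work))
unblocked = asyncio.run(run(cooperative_work))
print(blocked)    # B cannot even start until A is done
print(unblocked)  # B starts and finishes while A is still waiting
```

In the blocking run, session B is stuck behind session A; in the cooperative run, B completes while A is still waiting.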

Introducing ExtendedTask

A new feature for both Shiny for R (1.8.1) and Python (0.7.0)
Allows you to run long-running tasks for a user while preserving both inter- and intra-session concurrency.
No crazy steep learning curve for app authors—unlike previous approaches (but we’ll get to that).

Demo 2: Slow API

This time with ExtendedTask

Without ExtendedTask (R)

server <- function(input, output, session) {
  msg <- eventReactive(input$go, {
    # Simulate a long-running task
    Sys.sleep(5)
    paste0("Hello, ", input$name, "!")
  })

  output$message <- renderText({
    msg()
  })
}

With ExtendedTask (R)

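library(shiny)
library(future)
# Assumption: a multisession plan, so future() runs in a background R process
plan(multisession)

server <- function(input, output, session) {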
  # Define the task
  msg_task <- ExtendedTask$new(function(name) {
    future({
      # Simulate a long-running task
      Sys.sleep(5)
      paste0("Hello, ", name, "!")
    })
  })

  observeEvent(input$go, {
    # Start the task
    msg_task$invoke(input$name)
  })

  output$message <- renderText({
    # Use the task's result
    msg_task$result()
  })
}

Without ExtendedTask (Python)

import time

@reactive.calc
@reactive.event(input.go)
def msg():
    # Simulate a long-running task
    time.sleep(5)
    return f"Hello, {input.name()}!"

@render.text
def message():
    return msg()

With ExtendedTask (Python)

import asyncio

# Define the task
@reactive.extended_task
async def msg_task(name):
    # Simulate a long-running task
    await asyncio.sleep(5)
    return f"Hello, {name}!"

@reactive.effect
@reactive.event(input.go)
def _():
    # Start the task
    msg_task.invoke(input.name())

@render.text
def message():
    # Use the task's result
    return msg_task.result()

Getting started

For Shiny for R:

https://shiny.posit.co/r/articles/improve/nonblocking/

For Shiny for Python:

https://shiny.posit.co/py/docs/nonblocking.html

How did we get here?

In 2017-2018, we introduced async programming to R, and then to Shiny.

A very technically challenging and conceptually elegant feature.

Use {future} (by Henrik Bengtsson) to perform long-running operations in a background R process.
Use {promises} to handle the results of these operations, back in the original R process, in a principled way.
Rewrite Shiny’s internals to support async programming.

ExtendedTask vs. Shiny Async

Shiny Async, while necessary, was never a truly satisfying solution.

Incredibly steep learning curve.
Async operations “infect” everything downstream of them; any code that calls an async function must become an async function.
Didn’t solve the problem of intra-session concurrency. People noticed.
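
The “infection” is easy to see in any async language. Here is a small Python sketch with hypothetical function names: a plain function cannot use the async result, so each caller up the chain has to become async too.

```python
import asyncio


async def fetch_value():
    # Stand-in for a slow async operation
    await asyncio.sleep(0.01)
    return 21


def double_sync():
    # A plain function cannot await: calling fetch_value() only creates
    # a coroutine object, not the value
    pending = fetch_value()
    is_coroutine = asyncio.iscoroutine(pending)
    pending.close()  # avoid a "never awaited" warning
    return is_coroutine


async def double_async():
    # To use the result, the caller itself has to become async...
    return 2 * await fetch_value()


async def report_async():
    # ...and so does every caller above it
    return f"result: {await double_async()}"


sync_got_coroutine = double_sync()
report = asyncio.run(report_async())
print(sync_got_coroutine)
print(report)
```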

Cover art of the O'Reilly book 'Mastering Shiny' by Hadley Wickham

The graph is at equilibrium now.

Everything up until now has been a single “tick” (as in “tick of the clock”).

Somewhere in the depths of Shiny is a loop that looks like this:

while (TRUE) {
  changes <- wait_for_input_changes()
  changed_outputs <- recompute_all_affected_things(changes)
  send_outputs(changed_outputs)
}

Each trip through the loop is a “reactive tick”.

Only at the beginning of a reactive tick do we check for input changes.
Only at the end of a reactive tick do we send the outputs to the client.
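
The loop above can be mimicked with a runnable toy in Python. This is only a sketch of the idea, not Shiny's internals: `run_ticks` and `recompute` are hypothetical names, and a queue of dictionaries stands in for input changes arriving from the browser.

```python
from collections import deque


def run_ticks(pending_inputs, recompute):
    """Toy model of the loop above: one iteration is one reactive tick."""
    sent = []
    queue = deque(pending_inputs)
    tick = 0
    while queue:
        tick += 1
        changes = queue.popleft()      # inputs are read only at the start
        outputs = recompute(changes)   # everything affected is recomputed
        sent.append((tick, outputs))   # outputs are sent only at the end
    return sent


def recompute(changes):
    # Hypothetical app logic: one output derived from the "name" input
    return {"message": f"Hello, {changes['name']}!"}


sent = run_ticks([{"name": "Ada"}, {"name": "Grace"}], recompute)
print(sent)
```

Because each tick runs to completion before the next set of changes is read, anything slow inside `recompute` delays every input and output that follows it.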

Long-running tasks bring the reactive graph to a halt

while (TRUE) {
  changes <- wait_for_input_changes()
  changed_outputs <- recompute_all_affected_things(changes)
  send_outputs(changed_outputs)
}

Inputs and outputs are stalled
Other users are blocked

How can we separate the task from the tick?

With Shiny Async, you can run multiple graphs concurrently…

But you can’t run multiple tasks concurrently within a single graph because the shape of the graph is unchanged.

ExtendedTask (vs. Shiny Async)

ET supports both inter- and intra-session concurrency
ET doesn’t require you to learn a strange syntax
ET doesn’t spill its async-ness all over your codebase
ET still relies on {future} to put the task in the background
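
One way to picture “separating the task from the tick” is the toy class below. It is illustrative only and is NOT Shiny's implementation or API: `ToyExtendedTask` and its attributes are invented for this sketch, and error handling is left out. The key point is that `invoke()` only schedules the work and returns right away.

```python
import asyncio


class ToyExtendedTask:
    """Illustrative only: not Shiny's implementation or API."""

    def __init__(self, func):
        self._func = func
        self._task = None
        self.status = "initial"
        self.value = None

    def invoke(self, *args):
        # Schedule the work and return immediately, so the caller can finish
        self.status = "running"
        loop = asyncio.get_running_loop()
        self._task = loop.create_task(self._run(*args))

    async def _run(self, *args):
        self.value = await self._func(*args)
        # Assumption: in a real app, completion is what would kick off
        # a later update of anything that reads the result
        self.status = "success"

    async def wait(self):
        await self._task


async def slow_greeting(name):
    await asyncio.sleep(0.05)
    return f"Hello, {name}!"


async def main():
    task = ToyExtendedTask(slow_greeting)
    task.invoke("Ada")
    status_right_after_invoke = task.status  # invoke() did not wait
    await task.wait()
    return status_right_after_invoke, task.status, task.value


result = asyncio.run(main())
print(result)
```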

To do

Cancellation is only supported in Python for now
Progress reporting is not yet supported in either language
Both of these will require cooperation from {future} or alternatives
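
For a sense of what cooperative cancellation looks like, here is a sketch in plain asyncio. It shows standard-library behavior only, not the ExtendedTask API, and `slow_task` is a hypothetical stand-in for real work.

```python
import asyncio


async def slow_task():
    try:
        await asyncio.sleep(10)
        return "finished"
    except asyncio.CancelledError:
        # Cooperative cancellation: the task gets a chance to clean up,
        # then re-raises so the caller can see it was cancelled
        raise


async def main():
    task = asyncio.create_task(slow_task())
    await asyncio.sleep(0.01)  # let the task start
    task.cancel()              # request cancellation
    try:
        await task
    except asyncio.CancelledError:
        return "cancelled"
    return "completed"


outcome = asyncio.run(main())
print(outcome)
```

Cancellation only takes effect at an `await` point, which is why the background work itself has to cooperate.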

{future} alternatives

future is a great package, but it’s not the only way to run R code in the background.

Pros:

Very convenient, “automagic” API
Popular and well supported for many years

Cons:

High runtime overhead
An intimidating number of options and extensions
Complex implementation

{future} alternatives

{mirai} is a new package by Charlie Gao that, like future, can run code in a background R process.

Pros:

Very low overhead compared to future
API is easy to understand, no “magical” features

Cons:

Still relatively new, not as battle-tested as future
Lack of “automagic” features makes it less convenient to launch tasks

{future} alternatives

{crew} by Will Landau builds on mirai to provide a convenient way to launch and manage multiple (potentially very many) tasks.

crew works with ExtendedTask and has examples in its docs. I’m not yet familiar enough with crew to give it a full evaluation.

{future} alternatives

A {future.mirai} package is currently in beta, and should combine the convenient API of future with the low overhead of mirai.

Summary

Avoid long-running tasks in your Shiny app if you can.
If you can’t, use ExtendedTask to run them in the background.
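ExtendedTask docs for R: https://shiny.posit.co/r/articles/improve/nonblocking/
ExtendedTask docs for Python: https://shiny.posit.co/py/docs/nonblocking.html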

Thank you!