artUI <- function(id) {
ns <- NS(id)
tagList(
checkboxInput(
ns("input1"), # wrap ids in ns()
"Check Here"
),
selectInput(
ns("input2"), # wrap ids in ns()
"Select Object",
choices = c("jar", "vase"),
selected = "jar",
multiple = FALSE
),
plotOutput(ns("plot1"))
)
}
artServer <- function(id) {
moduleServer( # wrap the regular server stuff in moduleServer()
id,
function(input, output, session) { # regular server part
df <- reactive({
# do something fancy
})
output$plot1 <- renderPlot({
ggplot(df(), aes(x = x, y = y)) +
geom_point()
})
}
)
}
rstudio::conf(2022)
Highlights and Notes from the Conference
Conference Takeaways
This document highlights insights from rstudio::conf(2022). There are likely others that I missed, but luckily just about everything is or will be online. For example, check out the workshops or this repo linking to talk materials.
A few things that are immediately important to mention:
- RStudio is changing its name to Posit! This is a move to show their place in data science (and science in general). They do not see themselves as just an R company, but rather as a much broader analytics and communication company open to using the best tools available (which can often be R but can also be python, julia, etc.). This name change will impact a few things, but we’ve been told RStudio (the IDE) will stay RStudio. As such, most of the contact points between us and the tools will be unchanged. They said many times this isn’t a move away from R, but rather a move that makes it clearer that they are inclusive and plan to incorporate more languages into their toolbox.
- Quarto is taking over the world (at least that’s how it felt at the conference). Quarto is the new rmarkdown with expanded features and modern styling. It has so much going for it that it’s pretty obvious that it is the future of analytic/scientific communication. Note that this document was written in Quarto.
- Shiny is expanding. First, there is now a Shiny for python, with similar features but with no plans to keep them “equal.” Second, new tools (like a point-n-click UI designer that writes code for you) are being distributed. In conjunction with all the extensions to shiny, this makes shiny a tool for production-grade applications.
- New tools for machine learning, working with databases, and designing for impact. These are broadly the themes of the rest of the talks (those not related to shiny or Quarto).
  - dbcooper: slick interface for working with databases that feels more natural for individuals working with data frames
  - pool: powerful tool to reduce congestion when accessing databases, particularly useful for shiny applications
  - renv: just use it because it is amazing and will save you heartache
Posit
For more information on the rebranding, visit their website.
Quarto
Quarto is quite impressive. It can be used to produce individual documents, websites, blogs, presentations, and more. It feels more modern and is clearly the tool of the future for tying data, code, and output together in beautiful documents. A few things are noteworthy.
- It is multi-lingual. It is designed to work with R obviously but already handles python, julia, observable, and will certainly add many more.
- Its ability to produce beautiful presentations is helpful when communicating analytical concepts (e.g., code, output, equations).
- If you use rmarkdown, you can change the .rmd extension to .qmd and it will work out of the box. So there is no need to make drastic changes if you want to adopt Quarto in your workflow.
There are already great docs online if you want to learn more!
Note that this document is made with Quarto.
Shiny
Given I spent the first two days in a workshop devoted to shiny, I’ve spent a considerable amount of time diving deeper into its new features, extensions, and future. My work thus far is strictly with R but I’m sure use cases for python are many.
A few useful tools that I picked up in the workshop and in the talks are below.
Modules
- Avoid namespace collisions when using the same widget across different areas of your app
- Allow you to encapsulate distinct app interfaces
- Organize code into logical and easy-to-understand components
- Facilitate collaboration
- Work kind of like regular functions in R
Anatomy of a module:
The moduleServer() encapsulates server-side logic with namespace applied
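To make the namespacing concrete, here is a minimal sketch of what NS() does to an id. The module id "mod1" matches the invocation used in these notes; nothing beyond shiny itself is assumed.

```r
library(shiny)

# NS() returns a function that prefixes ids with the module id
ns <- NS("mod1")

# the id sent to the browser combines the module id and the input id
namespaced_id <- ns("input1")
print(namespaced_id)  # "mod1-input1"
```

Inside moduleServer(), input$input1 refers to this namespaced input, so two instances of the same module (say "mod1" and "mod2") do not collide.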
Invoking modules:
ui <- fluidPage(
fluidRow(
artUI("mod1")
)
)
server <- function(input, output, session) {
artServer("mod1")
}
shinyApp(ui, server)
The UI and server module functions can also take other arguments.
UI function:
- Reasonable inputs: static values, vectors, flags
- Avoid reactive parameters in UI
- Return value for UI is a tagList() of inputs, output placeholders, and other UI elements
Server function:
- Input parameters and return values can be a mix of static and reactive objects
artServer <- function(id, df, title = "Amazing") {
moduleServer(id,
function(input, output, session) {
user_selections <- reactive({
list(input1 = input$input1,
input2 = input$input2)
})
output$plot1 <- renderPlot({
ggplot(df(), aes(x = x, y = y)) +
geom_point() +
ggtitle(title)
})
user_selections
}
)
}
# app server
df <- reactive({
art_data |>
filter(dept == input$dept)
})
artServer("mod1", df)
In the code above, df is a reactive, but we do not use () when we pass it to the function: we pass the reactive itself, not its current value. When we use (“invoke”) df inside the module, we call df() to get the value. Likewise, the module returns user_selections by name (the reactive), not user_selections() (the value).
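The difference between passing a reactive and invoking it can be tried outside an app. This is a minimal sketch, not part of the workshop material: count_rows() and the small data frame are hypothetical, and isolate() is used only so the value can be read from a plain R session.

```r
library(shiny)

# a reactive holding a small, made-up data frame
df <- reactive(data.frame(x = 1:3, y = c(2, 4, 6)))

# hypothetical helper: receives the reactive itself (no parentheses at the call site)
count_rows <- function(data) {
  stopifnot(is.reactive(data))
  # data() is only invoked inside another reactive, where the value is needed
  reactive(nrow(data()))
}

n_rows <- count_rows(df)  # pass df, not df()

# isolate() lets us read a reactive's value outside a running app
n_value <- isolate(n_rows())
print(n_value)
```

Passing df() instead would hand the function a plain data frame, and the module would no longer update when the underlying inputs change.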
Put module scripts in the R folder.
Consider the example below:
art_search_UI <- function(id, dept_choices) {
ns <- NS(id)
tagList(
fluidRow(
column(
width = 4,
textInput(
ns("search_box"),
"Search Query",
placeholder = "enter single word"
)
),
column(
width = 6,
selectInput(
ns("dept"),
"Select Department",
choices = dept_choices,
selectize = FALSE
)
)
),
fluidRow(
column(
width = 4,
actionButton(
ns("search_btn"),
label = "Search",
icon = icon("keyboard")
)
)
)
)
}
art_search_Server <- function(id) {
moduleServer(
id,
function(input, output, session) {
search_results <- reactive({
if (!shiny::isTruthy(input$search_box)) {
shinyWidgets::show_toast(
"Enter a search term",
type = "error",
position = "top"
)
return(NULL)
}
search_test <- search_dept_data(q = input$search_box, departmentId = input$dept)
if (is.null(search_test)) {
message("I got nothing")
shinyWidgets::show_toast(
"I got nothing",
type = "error",
position = "center"
)
return(NULL)
}
search_test
}) %>% bindEvent(input$search_btn, ignoreInit = TRUE)
search_results
}
)
}
bindEvent() delays evaluation until the event it is bound to fires, so the search only runs when input$search_btn is clicked.
shinyWidgets notifications
- show_toast() provides a nice error message pop-up (see code above)
- Then return(NULL) to essentially abort the function
This is a form of defensive programming where you plan for problems and communicate them clearly to the user, instead of surfacing a weird, cryptic error.
bslib
This package allows you to edit elements of the default bootstrap theme used by shiny directly in R.
- Can explore theme options interactively
- Built upon the Sass stylesheet language to extend traditional CSS with modern features
Run the following to play around with the theme:
library(shiny)
library(bslib)
bslib::bs_theme_preview()
When you make changes to the preview, the code needed to use that style will show up in the console.
To see it in the app itself, you can insert the following into the server.
bs_themer()
This allows you to play around with theme elements within your own app.
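Once you have settled on a look, the theme can be fixed in code. A minimal sketch, assuming Bootstrap 5 and the "minty" Bootswatch theme purely for illustration (neither choice comes from the workshop):

```r
library(shiny)
library(bslib)

# assumed choices for illustration: Bootstrap 5 with the "minty" Bootswatch theme
my_theme <- bs_theme(version = 5, bootswatch = "minty")

ui <- fluidPage(
  theme = my_theme,
  titlePanel("Themed app"),
  textInput("name", "What is your name?")
)

server <- function(input, output, session) {
  # bs_themer()  # uncomment while developing to tweak the theme live
}

app <- shinyApp(ui, server)
```

Printing app (or calling runApp(app)) launches it with the theme applied.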
shinytest2
Built on testthat, shinytest2 allows you to automate the testing of your app. This can be a very important addition to your workflow as you will be able to catch bugs far quicker.
# File: simple-app/app.R
library(shiny)
ui <- fluidPage(
textInput("name", "What is your name?"),
actionButton("greet", "Greet"),
textOutput("greeting")
)
server <- function(input, output, session) {
output$greeting <- renderText({
req(input$greet)
paste0("Hello ", isolate(input$name), "!")
})
}
shinyApp(ui, server)
With this simple app created, we can create a test that will call the app, insert inputs, click on “greet”, and produce values.
# File: simple-app/tests/testthat/test-shinytest2.R
library(shinytest2)
test_that("shinytest2 recording: simple-app", {
app <- AppDriver$new(name = "simple-app", height = 407, width = 348)
app$set_inputs(name = "Tyson")
app$click("greet")
app$expect_values()
})
This package has a lot of depth, so check out the docs for more use cases and more in-depth tests.
cicerone
This package allows you to have a walk through of your app when someone first encounters it. This code is an example of using cicerone (put use_cicerone() in the UI as well).
guide <- cicerone::Cicerone$
new(allow_close = TRUE)$
step(
"dept",
"Department",
"Choose from any department"
)$
step(
"choice_table",
"Your Choices",
"Each choice will appear in a table here"
)
Here, dept and choice_table are the ids of elements in the UI.
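To show where the pieces go, here is a minimal sketch of an app wired up to a guide like the one above. The selectInput() choices and the table contents are made up for illustration; the element ids dept and choice_table mirror the steps in the guide.

```r
library(shiny)
library(cicerone)

# the guide: each step points at the id of a UI element
guide <- Cicerone$
  new(allow_close = TRUE)$
  step("dept", "Department", "Choose from any department")$
  step("choice_table", "Your Choices", "Each choice will appear in a table here")

ui <- fluidPage(
  use_cicerone(),  # loads the dependencies the guide needs
  selectInput("dept", "Select Department", choices = c("Paintings", "Sculpture")),
  tableOutput("choice_table")
)

server <- function(input, output, session) {
  # initialise the guide, then start it when the session begins
  guide$init()$start()
  output$choice_table <- renderTable(data.frame(choice = input$dept))
}

app <- shinyApp(ui, server)
```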
Debugging with browser()
If you want to assess how the environment looks at certain parts of the app, you can put browser() in any reactive. When that code runs, the interactive debugger opens and you can look at the current objects (including inputs).
You can also use conditionals to invoke browser() only when certain things are triggered. For example:
if (!is.null(input$timevis_selected)) browser()
httr2
httr2 is a rewrite of httr that provides a pipeable API
- Build a request object to facilitate different pieces of a request workflow
- Ability to perform dry-runs before actually sending the request
- Converts HTTP errors to R errors
An example of pulling from the MET API:
library(dplyr)
library(tidyr)
library(purrr)
library(httr2)
# refer to https://metmuseum.github.io/ for documentation of API endpoints
base_url <- "https://collectionapi.metmuseum.org/public/collection/v1"
# How many artwork pieces have been updated in the museum database since July 1st, 2022?
req <- request(base_url) %>%
req_url_path_append("objects") %>%
# add query parameter metadataDate
req_url_query(metadataDate = "2022-07-01")
req_dry_run(req) # dry run
resp <- req_perform(req) # actually performs the request
resp_status(resp) # 200 is OK
# exports JSON
objects_updated <- resp %>%
resp_body_json()
# Example of taking JSON to tibble
req <- request(base_url) %>%
req_url_path_append("departments")
resp <- req_perform(req)
resp_status(resp)
departments <- resp %>%
resp_body_json() %>%
purrr::pluck("departments") %>%
transpose() %>%
tibble::as_tibble() %>%
tidyr::unnest(cols = c("departmentId", "displayName"))
Debounce
Use debounce() when you want to control how quickly something reacts. For example, we don’t want to run a search each time another letter is entered into a search bar.
query_term <- reactive({
input$object_search
}) %>%
debounce(1000) # makes it wait 1 second
search_res <- reactive({
req(query_term())
# other stuff you want it to do
})
CSS
To use an external CSS file (here custom.css, placed in the app’s www folder), you can link to it using:
tags$head(
tags$link(
rel = "stylesheet",
type = "text/css",
href = "custom.css"
)
)
Tidymodels
From the very start, tidymodels made appearances via the first keynote (Julia Silge and Max Kuhn who have a new book out, available for free here), a whole section of talks devoted to extensions to it, and a book signing with Max and Julia. I highlight some of the key takeaways from my perspective (although there are so many things that could be included). Take a look at the docs if you want to see more.
Censored models
One feature useful for my work is support for censored models. The package is called censored and has some documentation and examples.
The example they provide that is most relevant to my work uses proportional hazards models.
library(tidymodels)
Warning: package 'tidymodels' was built under R version 4.1.2
── Attaching packages ────────────────────────────────────── tidymodels 1.0.0 ──
✔ broom 1.0.0 ✔ recipes 1.0.1
✔ dials 1.0.0 ✔ rsample 1.0.0
✔ dplyr 1.0.9 ✔ tibble 3.1.7
✔ ggplot2 3.3.6 ✔ tidyr 1.2.0
✔ infer 1.0.2 ✔ tune 1.0.0
✔ modeldata 1.0.0 ✔ workflows 1.0.0
✔ parsnip 1.0.0 ✔ workflowsets 1.0.0
✔ purrr 0.3.4 ✔ yardstick 1.0.0
Warning: package 'broom' was built under R version 4.1.2
Warning: package 'dials' was built under R version 4.1.2
Warning: package 'scales' was built under R version 4.1.2
Warning: package 'dplyr' was built under R version 4.1.2
Warning: package 'ggplot2' was built under R version 4.1.2
Warning: package 'infer' was built under R version 4.1.2
Warning: package 'modeldata' was built under R version 4.1.2
Warning: package 'parsnip' was built under R version 4.1.2
Warning: package 'recipes' was built under R version 4.1.2
Warning: package 'rsample' was built under R version 4.1.2
Warning: package 'tibble' was built under R version 4.1.2
Warning: package 'tidyr' was built under R version 4.1.2
Warning: package 'tune' was built under R version 4.1.2
Warning: package 'workflows' was built under R version 4.1.2
Warning: package 'workflowsets' was built under R version 4.1.2
Warning: package 'yardstick' was built under R version 4.1.2
── Conflicts ───────────────────────────────────────── tidymodels_conflicts() ──
✖ purrr::discard() masks scales::discard()
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
✖ recipes::step() masks stats::step()
• Use tidymodels_prefer() to resolve common conflicts.
library(censored)
Warning: package 'censored' was built under R version 4.1.2
Loading required package: survival
Warning: package 'survival' was built under R version 4.1.2
library(survival)
tidymodels_prefer()
data(cancer)
# some adjustment of the data to fit the example
lung <- lung %>% drop_na()
lung_train <- lung[-c(1:5), ]
lung_test <- lung[1:5, ]
Then to actually run the model we will use a few steps:
set.seed(1)
proportional_hazards() %>%
set_engine("survival") %>%
set_mode("censored regression") %>%
fit(Surv(time, status) ~ ., data = lung_train)
parsnip model object
Call:
survival::coxph(formula = Surv(time, status) ~ ., data = data,
model = TRUE, x = TRUE)
coef exp(coef) se(coef) z p
inst -0.0291726 0.9712488 0.0131293 -2.222 0.02629
age 0.0146341 1.0147417 0.0119705 1.223 0.22151
sex -0.5977137 0.5500678 0.2051326 -2.914 0.00357
ph.ecog 0.7507039 2.1184906 0.2536100 2.960 0.00308
ph.karno 0.0137315 1.0138262 0.0132752 1.034 0.30096
pat.karno -0.0082098 0.9918238 0.0082560 -0.994 0.32002
meal.cal -0.0001233 0.9998767 0.0002841 -0.434 0.66435
wt.loss -0.0188464 0.9813301 0.0082051 -2.297 0.02162
Likelihood ratio test=32.61 on 8 df, p=7.224e-05
n= 162, number of events= 116
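The exp(coef) column in this output is the hazard ratio for each covariate. As a rough sketch of where those numbers come from, the same Cox model can be refit directly with survival::coxph(), which is the call the parsnip object reports. Using base na.omit() in place of drop_na() is my substitution so the example needs only the survival package.

```r
library(survival)

# same preparation as above: complete cases, first five rows held out
lung_complete <- na.omit(survival::lung)
lung_train <- lung_complete[-c(1:5), ]

# the Cox model that the parsnip object reports in its Call
cox_fit <- coxph(Surv(time, status) ~ ., data = lung_train)

# exponentiated coefficients are hazard ratios
hazard_ratios <- exp(coef(cox_fit))
print(round(hazard_ratios, 3))
```

A hazard ratio below 1 (for example sex, about 0.55 in the output above) indicates a lower estimated hazard per unit increase in that covariate, holding the others fixed.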
Clustering
I wanted to just highlight that this was available if you are doing cluster analysis. It’s new but looks cool. It has documentation here and is designed to work within the tidymodels framework.
Working with Databases
Two packages seem very useful for ED&A:
- dbcooper: an innovative way to interact with databases that feels more natural for R users.
- pool: a package to automatically pool connections to databases (and automatically disconnect). Particularly useful for integration with shiny.
dbcooper
dbcooper turns a database connection into a collection of functions, handling logic for keeping track of connections and letting you take advantage of autocompletion when exploring a database. This example is from the GitHub page.
library(dbcooper)
dbc_init(con, "con_name")
dbc_init then creates user-friendly accessor functions in your global environment. (You could also pass it an environment in which the functions will be created).
dbc_init adds several functions when it initializes a database source. Each starts with the name you passed as a prefix; in the GitHub example the connection is named "lahman", so each will start with the lahman_ prefix.
- _list: Get a list of tables
- _tbl: Access a table that can be worked with in dbplyr
- _query: Perform a SQL query and work with the result
- _execute: Execute a query (such as a CREATE or DROP)
- _src: Retrieve a dbi_src for the database
For instance, we could start by finding the names of the tables in the Lahman database.
lahman_list()
#> [1] "AllstarFull" "Appearances" "AwardsManagers"
#> [4] "AwardsPlayers" "AwardsShareManagers" "AwardsSharePlayers"
#> [7] "Batting" "BattingPost" "CollegePlaying"
#> [10] "Fielding" "FieldingOF" "FieldingOFsplit"
#> [13] "FieldingPost" "HallOfFame" "HomeGames"
#> [16] "LahmanData" "Managers" "ManagersHalf"
#> [19] "Master" "Parks" "People"
#> [22] "Pitching" "PitchingPost" "Salaries"
#> [25] "Schools" "SeriesPost" "Teams"
#> [28] "TeamsFranchises" "TeamsHalf" "sqlite_stat1"
#> [31] "sqlite_stat4"
We can access one of these tables with lahman_tbl(), then put it through any kind of dplyr operation.
lahman_tbl("Batting")
#> # Source: SQL [?? x 22]
#> # Database: sqlite 3.34.1
#> # [/private/var/folders/wp/6jpw10dj1b13vw5n9bvf1dvc0000gn/T/RtmpuEyzKR/lahman.sqlite]
#> playerID yearID stint teamID lgID G AB R H X2B
#> <chr> <int> <int> <chr> <chr> <int> <int> <int> <int> <int>
#> 1 abercda01 1871 1 TRO NA 1 4 0 0 0
#> 2 addybo01 1871 1 RC1 NA 25 118 30 32 6
#> 3 allisar01 1871 1 CL1 NA 29 137 28 40 4
#> 4 allisdo01 1871 1 WS3 NA 27 133 28 44 10
#> 5 ansonca01 1871 1 RC1 NA 25 120 29 39 11
#> 6 armstbo01 1871 1 FW1 NA 12 49 9 11 2
#> 7 barkeal01 1871 1 RC1 NA 1 4 0 1 0
#> 8 barnero01 1871 1 BS1 NA 31 157 66 63 10
#> 9 barrebi01 1871 1 FW1 NA 1 5 1 1 1
#> 10 barrofr01 1871 1 BS1 NA 18 86 13 13 2
#> # … with more rows, and 12 more variables: X3B <int>, HR <int>,
#> # RBI <int>, SB <int>, CS <int>, BB <int>, SO <int>, IBB <int>,
#> # HBP <int>, SH <int>, SF <int>, GIDP <int>
#> # ℹ Use `print(n = ...)` to see more rows, and `colnames()` to see all variable names
pool
pool manages a pool of database connections (it avoids opening and closing many connections, so a shiny app can scale to many users).
- Can help scale the use of databases with shiny
- dbPool() allows you to do that (it replaces dbConnect())
- Each query goes to the pool first, which then fetches or initializes a connection
- Also handles the disconnects in shiny
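The points above can be sketched with a throwaway SQLite database. The temporary file and the mtcars table are assumptions for illustration only; with a real database you would pass your own driver and credentials to dbPool() just as you would to dbConnect().

```r
library(DBI)
library(pool)

# assumption for illustration: a throwaway SQLite file; swap in your own driver
db_path <- tempfile(fileext = ".sqlite")

# dbPool() takes the same arguments you would give dbConnect()
con_pool <- dbPool(RSQLite::SQLite(), dbname = db_path)

# the pool can be used wherever a DBI connection is expected
dbWriteTable(con_pool, "mtcars", mtcars)
res <- dbGetQuery(con_pool, "SELECT COUNT(*) AS n FROM mtcars")
print(res)

# close the pool once, when the app (or script) shuts down
poolClose(con_pool)
```

In a shiny app, the pool is typically created once outside the server function and closed in an onStop() callback, so all sessions share it.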
Use renv
Create reproducible environments for your R projects
- Next generation of packrat
- Isolated package library from rest of your system
- Transfer projects to different collaborators/platforms
- Reproducible package installation
- Easily create new projects or convert existing projects
Upon initializing a project:
- Creates a project-level .Rprofile to activate the custom package library on start up
- Lockfile renv.lock to describe the state of the project library
- renv/library has the package information (it doesn’t actually store the packages; those live in a central folder)
- renv/activate.R performs activation