Non-Standard Evaluation for Bioconductor

Lucas Schiffer (@_schifferl)

June 24, 2019

Who cares about NSE?

“Are there any guidelines against using the %>% operator while doing software development? The HMP16SData package which is in review now, has completely missed the point of the pipe operator. … Extremely hard to read the code.”

— Nitesh Turaga

“Looks OK from my end. Remain skeptical about the NSE.”

— Michael Lawrence

So what is NSE?

coreTeam <-
    c("Marcel", "Qian", "Kayla", "Michael", "Lori", "Martin", "Valerie",
      "Hervé", "Daniel", "James", "Andrzej", "Nitesh", "Jiefei")
"Marcel" %in% coreTeam
## [1] TRUE
"Lucas" %in% coreTeam
## [1] FALSE

Major concepts of NSE

  1. R code is itself computable
  2. R code is a hierarchical tree
  3. R code can write R code
  4. Evaluation = expressions + environment

Wickham, H. Advanced R (Chapman & Hall/CRC The R Series). (Routledge, 2014).

R code is itself computable

rlang::expr(mean(x, na.rm = TRUE))
## mean(x, na.rm = TRUE)
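
A captured expression is an ordinary R object that can be inspected and changed before it runs. The sketch below uses base R's quote() as a stand-in for rlang::expr(); the vector x and the name result are made up for illustration.

```r
# quote() captures code without running it, much like rlang::expr().
e <- quote(mean(x, na.rm = TRUE))

class(e)   # "call"
e[[1]]     # the function being called, as a symbol: mean
e$na.rm    # the named argument: TRUE

# Because the call is data, it can be edited before it is evaluated.
e$na.rm <- FALSE
x <- c(1, NA, 3)       # illustrative data
result <- eval(e)      # now runs mean(x, na.rm = FALSE), which is NA
```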

R code is a hierarchical tree

lobstr::ast(1 + 2 * 3)
## █─`+` 
## ├─1 
## └─█─`*` 
##   ├─2 
##   └─3
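
The same tree can be walked with base R, since a call behaves like a list whose first element is the function and whose remaining elements are the arguments. A minimal sketch, where count_calls() is a hypothetical helper written for this example and not part of any package:

```r
tree <- quote(1 + 2 * 3)

tree[[1]]   # root of the tree: `+`
tree[[3]]   # right-hand branch, itself a call: 2 * 3

# Hypothetical helper: count the function calls in an expression tree.
# Constants and symbols are leaves; each call adds one plus its arguments' calls.
count_calls <- function(e) {
    if (!is.call(e)) {
        return(0L)
    }
    1L + sum(vapply(as.list(e)[-1], count_calls, integer(1)))
}

n_calls <- count_calls(tree)   # 2: one `+` and one `*`
```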

R code can write R code

xx <-
    rlang::expr(x + x)

yy <-
    rlang::expr(y + y)

rlang::expr(!!xx / !!yy)
## (x + x)/(y + y)
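
For comparison, base R can splice expressions in a similar way: bquote() with .() plays roughly the role that rlang::expr() with !! plays above. A sketch, with the values of x and y chosen only for illustration:

```r
xx <- quote(x + x)
yy <- quote(y + y)

# .() inserts the captured expressions into the new call.
ratio <- bquote(.(xx) / .(yy))

# The generated code is only evaluated once values are supplied.
x <- 2
y <- 4
value <- eval(ratio)   # (2 + 2) / (4 + 4) = 0.5
```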

Evaluation = expressions + environment

eval(rlang::expr(x + y), rlang::env(x = 1, y = 10))
## [1] 11
eval(rlang::expr(x + y), rlang::env(x = 2, y = 43))
## [1] 45
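
The environment does not have to be a real environment: eval() also accepts a list or data frame, and names missing from it are looked up in the calling environment. That fallback is what lets NSE functions mix column names with ordinary variables. A base R sketch with made-up values:

```r
e <- quote(x + y)

# Both names are found in the list.
a <- eval(e, list(x = 1, y = 10))   # 11

# Only x is in the list, so y is taken from the calling environment.
y <- 100
b <- eval(e, list(x = 1))           # 1 + 100 = 101
```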

A Bioconductor example

SE <-
    HMP16SData::V35()
SE[, SE$HMP_BODY_SUBSITE == "Stool"]
## class: SummarizedExperiment 
## dim: 45383 319 
## metadata(2): experimentData phylogeneticTree
## assays(1): 16SrRNA
## rownames(45383): OTU_97.1 OTU_97.10 ... OTU_97.9998 OTU_97.9999
## rowData names(7): CONSENSUS_LINEAGE SUPERKINGDOM ... FAMILY GENUS
## colnames(319): 700013549 700014386 ... 700114717 700114750
## colData names(7): RSID VISITNO ... HMP_BODY_SUBSITE SRS_SAMPLE_ID
library(magrittr)  # provides the %>% operator
HMP16SData::V35() %>%
    subset(select = HMP_BODY_SUBSITE == "Stool")
## class: SummarizedExperiment 
## dim: 45383 319 
## metadata(2): experimentData phylogeneticTree
## assays(1): 16SrRNA
## rownames(45383): OTU_97.1 OTU_97.10 ... OTU_97.9998 OTU_97.9999
## rowData names(7): CONSENSUS_LINEAGE SUPERKINGDOM ... FAMILY GENUS
## colnames(319): 700013549 700014386 ... 700114717 700114750
## colData names(7): RSID VISITNO ... HMP_BODY_SUBSITE SRS_SAMPLE_ID
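
How a subset()-style call can refer to HMP_BODY_SUBSITE without a prefix can be sketched on a plain data frame. This is an illustration of the general technique only, not the SummarizedExperiment method; filter_rows() and the toy samples table are hypothetical.

```r
# Hypothetical helper: keep the rows for which `condition` is TRUE.
filter_rows <- function(data, condition) {
    # Capture the condition as code instead of evaluating it.
    condition <- substitute(condition)
    # Evaluate it with the columns in scope, falling back to the caller.
    keep <- eval(condition, data, parent.frame())
    data[!is.na(keep) & keep, , drop = FALSE]
}

# Toy data, made up for this example.
samples <- data.frame(
    id = c("a", "b", "c"),
    site = c("Stool", "Saliva", "Stool"),
    stringsAsFactors = FALSE
)

stool <- filter_rows(samples, site == "Stool")   # rows "a" and "c"
```

Writing site == "Stool" instead of samples$site == "Stool" is the convenience NSE buys; the cost is that the bare name only resolves inside the function that captures it.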

Key references

R packages: furrr, lazyeval, lobstr, magrittr, plyranges, purrr, rlang

Books: Advanced R, R for Data Science, Tidy Evaluation

These slides

https://rpubs.com/schifferl/NSE4BioC