Non-Standard Evaluation for Bioconductor

June 24, 2019

“Are there any guidelines against using the %>% operator while doing software development? The HMP16SData package which is in review now, has completely missed the point of the pipe operator. … Extremely hard to read the code.”

— Nitesh Turaga

“Looks OK from my end. Remain skeptical about the NSE.”

— Michael Lawrence

So what is NSE?

coreTeam <-
    c("Marcel", "Qian", "Kayla", "Michael", "Lori", "Martin", "Valerie",
      "Hervé", "Daniel", "James", "Andrzej", "Nitesh", "Jiefei")
"Marcel" %in% coreTeam
## [1] TRUE
"Lucas" %in% coreTeam
## [1] FALSE

Major concepts of NSE

1. R code is itself computable
2. R code is a hierarchical tree
3. R code can write R code
4. Evaluation = expressions + environment

Wickham, H. Advanced R (Chapman & Hall/CRC The R Series). (Routledge, 2014).

R code is itself computable

rlang::expr(mean(x, na.rm = TRUE))
## mean(x, na.rm = TRUE)
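A minimal sketch of what "computable" means here: a captured call is an ordinary R object that can be inspected and edited before it is run. Base R's `quote()` is used in place of `rlang::expr()` so the snippet needs no packages; the name `call_obj` and the toy vector `x` are assumptions for illustration.

```r
# Capture the call without evaluating it; x does not need to exist yet.
call_obj <- quote(mean(x, na.rm = TRUE))

class(call_obj)        # "call"
call_obj[[1]]          # the function being called, as a symbol: mean
call_obj[["na.rm"]]    # the named argument: TRUE

# Because the call is data, it can be modified before evaluation.
call_obj[["na.rm"]] <- FALSE
x <- c(1, NA, 3)
result <- eval(call_obj)   # now runs mean(x, na.rm = FALSE), which is NA
```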

R code is a hierarchical tree

lobstr::ast(1 + 2 * 3)
## █─+
## ├─1
## └─█─*
##   ├─2
##   └─3
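The same tree can be walked by hand, since a call behaves like a list whose first element is the function and whose remaining elements are the arguments. The sketch below uses base R only; `count_nodes()` is a hypothetical helper written for this slide, not part of lobstr.

```r
tree <- quote(1 + 2 * 3)

tree[[1]]    # root of the tree: the symbol `+`
tree[[2]]    # left branch: the constant 1
tree[[3]]    # right branch: the nested call 2 * 3

# Hypothetical helper: count the nodes that lobstr::ast() would draw
# (one per call, one per constant or symbol).
count_nodes <- function(e) {
  if (!is.call(e)) {
    return(1)
  }
  args <- as.list(e)[-1]   # drop the function symbol, keep the arguments
  1 + sum(vapply(args, count_nodes, numeric(1)))
}

n_nodes <- count_nodes(tree)   # 5, matching the five lines of the tree above
```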

R code can write R code

xx <-
    rlang::expr(x + x)

yy <-
    rlang::expr(y + y)

rlang::expr(!!xx / !!yy)
## (x + x)/(y + y)
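The same idea can be wrapped in a function that generates code from its inputs. This sketch uses base R's `bquote()`, where `.()` plays a role similar to `!!`; `ratio_of()` and the values passed to `eval()` are assumptions for illustration.

```r
# Hypothetical helper: build the call `numerator / denominator`
# from two already-captured expressions.
ratio_of <- function(numerator, denominator) {
  bquote(.(numerator) / .(denominator))
}

code <- ratio_of(quote(x + x), quote(y + y))

# The generated code is still unevaluated; supply values to run it.
value <- eval(code, list(x = 3, y = 1))   # (3 + 3) / (1 + 1) = 3
```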

Evaluation = expressions + environment

eval(rlang::expr(x + y), rlang::env(x = 1, y = 10))
## [1] 11
eval(rlang::expr(x + y), rlang::env(x = 2, y = 43))
## [1] 45
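The environment does not have to be an environment object. `eval()` also accepts a list or data frame, and names not found there are looked up in the enclosing environment. That lookup order is what lets column names be used as if they were variables. A minimal base R sketch, with `expr_xy`, `df`, and `y` as made-up names:

```r
expr_xy <- quote(x + y)

y  <- 100                    # lives in the calling environment
df <- data.frame(x = 1:3)    # supplies x, but has no column y

# x is found in df first; y falls through to the calling environment.
masked <- eval(expr_xy, df)  # 101 102 103
```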

A Bioconductor example

SE <-
    HMP16SData::V35()
SE[, SE$HMP_BODY_SUBSITE == "Stool"]
## class: SummarizedExperiment
## dim: 45383 319
## assays(1): 16SrRNA
## rownames(45383): OTU_97.1 OTU_97.10 ... OTU_97.9998 OTU_97.9999
## rowData names(7): CONSENSUS_LINEAGE SUPERKINGDOM ... FAMILY GENUS
## colnames(319): 700013549 700014386 ... 700114717 700114750
## colData names(7): RSID VISITNO ... HMP_BODY_SUBSITE SRS_SAMPLE_ID
library(magrittr)  # provides %>%
HMP16SData::V35() %>%
    subset(select = HMP_BODY_SUBSITE == "Stool")
## class: SummarizedExperiment
## dim: 45383 319
## assays(1): 16SrRNA
## rownames(45383): OTU_97.1 OTU_97.10 ... OTU_97.9998 OTU_97.9999
## rowData names(7): CONSENSUS_LINEAGE SUPERKINGDOM ... FAMILY GENUS
## colnames(319): 700013549 700014386 ... 700114717 700114750
## colData names(7): RSID VISITNO ... HMP_BODY_SUBSITE SRS_SAMPLE_ID
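How can `subset()` accept a bare `HMP_BODY_SUBSITE`? One plausible mechanism, sketched here with toy data, combines the ideas above: capture the argument unevaluated, then evaluate it with the column metadata as the data. This is not the SummarizedExperiment implementation; `select_samples()`, `col_data`, and `counts` are hypothetical stand-ins with made-up values.

```r
# Toy stand-ins for colData(SE) and assay(SE).
col_data <- data.frame(
  HMP_BODY_SUBSITE = c("Stool", "Saliva", "Stool"),
  row.names = c("s1", "s2", "s3")
)
counts <- matrix(
  1:6,
  nrow = 2,
  dimnames = list(c("otu1", "otu2"), rownames(col_data))
)

# Hypothetical helper: capture `select` without evaluating it, then
# evaluate it with col_data as the data mask.
select_samples <- function(counts, col_data, select) {
  condition <- substitute(select)
  keep <- eval(condition, col_data, parent.frame())
  counts[, keep, drop = FALSE]
}

stool <- select_samples(counts, col_data, HMP_BODY_SUBSITE == "Stool")
colnames(stool)   # "s1" "s3"
```

The trade-off behind the skepticism quoted earlier is visible here: the call reads cleanly at the console, but `HMP_BODY_SUBSITE` only has meaning inside the function, which makes such code harder to reason about when it is wrapped in other functions.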

Key references

R Packages Books