Intro: Bioconductor analysis packages often depend on a small number of core packages that define core S4 classes (eSet, SummarizedExperiment, etc.). The project recommends that new packages extend these core classes rather than duplicate them. As a result, many existing methods work out of the box on a class defined in a new analysis package. (This is good! We don’t have to re-implement [ subsetting every time.)
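To make the inheritance point concrete, here is a minimal sketch using only the base methods package. The names CoreData, MyAnalysisData and nValues are invented for illustration and are not part of any Bioconductor package; CoreData simply stands in for a core class.

```r
library(methods)

# Toy "core" class, standing in for something like SummarizedExperiment
setClass("CoreData", slots = c(values = "numeric"))

# A generic with a single method, written once for the core class
setGeneric("nValues", function(x) standardGeneric("nValues"))
setMethod("nValues", "CoreData", function(x) length(x@values))

# A class from a hypothetical new analysis package that extends the core class
setClass("MyAnalysisData", contains = "CoreData",
         slots = c(label = "character"))

obj <- new("MyAnalysisData", values = c(1, 2, 3), label = "demo")

# The method written for CoreData is found by inheritance
nValues(obj)
is(obj, "CoreData")
```

No method was written for MyAnalysisData itself, yet nValues() dispatches on it because the class extends CoreData.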
Problem: As far as I can tell, there is no quick and easy way for users to find out which methods are defined for a specific class, restricted to the package that defines the class. By quick and easy, I mean a few keystrokes that are easy to remember and that return a simple character vector. In Bioconductor, these are the methods a user is most likely to be interested in: the important custom accessors and the typical analysis steps.
Example: The DESeq2 package has a main class DESeqDataSet, which contains the data, metadata, and the results that are added through a typical analysis.
suppressPackageStartupMessages(library(DESeq2))
dds <- makeExampleDESeqDataSet()
class(dds)
## [1] "DESeqDataSet"
## attr(,"package")
## [1] "DESeq2"
The DESeqDataSet extends the following classes:
extends("DESeqDataSet")
## [1] "DESeqDataSet" "RangedSummarizedExperiment"
## [3] "SummarizedExperiment0" "Vector"
## [5] "Annotated"
So if I want to know what I can do with this dds thing, I can ask:
methods(class="DESeqDataSet")
## [1] aggregate anyNA <=
## [4] < == >=
## [7] > != append
## [10] as.character as.complex as.data.frame
## [13] as.env as.integer as.list
## [16] as.logical as.numeric as.raw
## [19] assayNames<- assayNames assays<-
## [22] assays assay<- assay
## [25] cbind coef coerce
## [28] coerce<- colData<- colData
## [31] compare Compare countOverlaps
## [34] counts<- counts coverage
## [37] design<- design dimnames<-
## [40] dimnames dim disjointBins
## [43] dispersionFunction<- dispersionFunction dispersions
## [46] dispersions<- distance distanceToNearest
## [49] duplicated elementMetadata<- elementMetadata
## [52] end<- end estimateDispersions
## [55] estimateSizeFactors eval expand
## [58] exptData<- exptData extractROWS
## [61] findOverlaps flank follow
## [64] granges head high2low
## [67] %in% isDisjoint is.unsorted
## [70] length lengths match
## [73] mcols<- mcols metadata<-
## [76] metadata mstack names<-
## [79] names narrow nearest
## [82] normalizationFactors<- normalizationFactors NROW
## [85] order overlapsAny parallelSlotNames
## [88] plotDispEsts plotMA precede
## [91] promoters ranges<- ranges
## [94] rank rbind relist
## [97] rename rep.int replaceROWS
## [100] rep resize restrict
## [103] rev ROWNAMES rowRanges
## [106] rowRanges<- seqinfo<- seqinfo
## [109] seqlevelsInUse seqnames shiftApply
## [112] shift showAsCell show
## [115] sizeFactors sizeFactors<- sort
## [118] split split<- start<-
## [121] start strand<- strand
## [124] subsetByOverlaps subset [<-
## [127] [ [[<- [[
## [130] $<- $ table
## [133] tail tapply trim
## [136] unique updateObject values<-
## [139] values width<- width
## [142] window<- window with
## [145] xtfrm
## see '?methods' for accessing help and source code
But this is a giant list of everything I can possibly do with the object, not just the methods that the package author wrote. Both kinds of information are valuable, but users typically want the smaller set.
Martin Morgan pointed me to:
showMethods(classes="DESeqDataSet", where=getNamespace("DESeq2"))
## Function: counts<- (package BiocGenerics)
## object="DESeqDataSet", value="matrix"
##
## Function: counts (package BiocGenerics)
## object="DESeqDataSet"
##
## Function: design<- (package BiocGenerics)
## object="DESeqDataSet", value="formula"
##
## Function: design (package BiocGenerics)
## object="DESeqDataSet"
##
## Function: dispersionFunction<- (package DESeq2)
## object="DESeqDataSet", value="function"
##
## Function: dispersionFunction (package DESeq2)
## object="DESeqDataSet"
##
## Function: dispersions<- (package DESeq2)
## object="DESeqDataSet", value="numeric"
##
## Function: dispersions (package DESeq2)
## object="DESeqDataSet"
##
## Function: estimateDispersions (package BiocGenerics)
## object="DESeqDataSet"
##
## Function: estimateSizeFactors (package BiocGenerics)
## object="DESeqDataSet"
##
## Function: normalizationFactors<- (package DESeq2)
## object="DESeqDataSet", value="matrix"
##
## Function: normalizationFactors (package DESeq2)
## object="DESeqDataSet"
##
## Function: plotDispEsts (package BiocGenerics)
## object="DESeqDataSet"
##
## Function: plotMA (package BiocGenerics)
## object="DESeqDataSet"
##
## Function: sizeFactors<- (package BiocGenerics)
## object="DESeqDataSet", value="numeric"
##
## Function: sizeFactors (package BiocGenerics)
## object="DESeqDataSet"
This is the right information, but it is a lot to type and, in my opinion, the output is too verbose.
I’ve come up with two messy lines of code, which take a class name as input and return a simple character vector of the methods:
intersect(sapply(strsplit(as.character(methods(class="DESeqDataSet")), ","), `[`, 1), ls(attr(findClass("DESeqDataSet")[[1]],"name")))
## [1] "counts<-" "counts"
## [3] "design<-" "design"
## [5] "dispersionFunction<-" "dispersionFunction"
## [7] "dispersions" "dispersions<-"
## [9] "estimateDispersions" "estimateSizeFactors"
## [11] "normalizationFactors<-" "normalizationFactors"
## [13] "plotDispEsts" "plotMA"
## [15] "show" "sizeFactors"
## [17] "sizeFactors<-"
And another approach:
sub("Function: (.*) \\(package .*\\)","\\1",grep("Function",showMethods(classes="DESeqDataSet", where=findClass("DESeqDataSet")[[1]], printTo=FALSE), value=TRUE))
## [1] "counts<-" "counts"
## [3] "design<-" "design"
## [5] "dimnames" "dispersionFunction<-"
## [7] "dispersionFunction" "dispersions<-"
## [9] "dispersions" "estimateDispersions"
## [11] "estimateSizeFactors" "names"
## [13] "normalizationFactors<-" "normalizationFactors"
## [15] "plotDispEsts" "plotMA"
## [17] "sizeFactors<-" "sizeFactors"
Maybe there is an even better way? I’m considering defining a function for this and putting it into rafalib, where we have some other convenience functions stashed.
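As a rough sketch of what such a function might look like, the showMethods() one-liner above can be wrapped as follows. The name packageMethods is hypothetical (it is not an existing rafalib or Bioconductor function), and the toy class ToySet and generic toyCounts exist only so the example runs without DESeq2 installed.

```r
library(methods)

# Hypothetical helper: names of the generics that have methods for `cls`
# in the environment (package) where the class itself is defined.
packageMethods <- function(cls) {
  where <- findClass(cls)
  if (length(where) == 0L) stop("no definition found for class ", cls)
  # printTo = FALSE returns the text that showMethods() would have printed
  txt <- showMethods(classes = cls, where = where[[1]], printTo = FALSE)
  # Keep the "Function: name (package pkg)" header lines, then extract the name
  hdr <- grep("^Function: ", txt, value = TRUE)
  sub("^Function: (.*) \\(package .*\\)", "\\1", hdr)
}

# Toy demonstration: a class and one method defined in the same environment
setClass("ToySet", slots = c(x = "numeric"))
setGeneric("toyCounts", function(object) standardGeneric("toyCounts"))
setMethod("toyCounts", "ToySet", function(object) object@x)

packageMethods("ToySet")
```

With DESeq2 loaded, packageMethods("DESeqDataSet") should return the same names as the showMethods() one-liner above, since it uses the same call and the same pattern. Parsing printed output is fragile, so this is a convenience sketch rather than a robust implementation.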
If you have feedback, you can reply to me @mikelove