1 API
2 Design overview
3 Generate toy data
4 Setup for creating the MultiAssayExperiment object
5 Create a MultiAssayExperiment class object
6 RangedRaggedAssay class
7 Validity checking of data classes
8 Very next steps
9 Wishlist

This vignette is the current working document for developing the MultiAssayExperiment class and methods. See a built html version.

1 API

See the API wiki by typing:

API()

2 Design overview

Here is an overview of the design:

empty <- MultiAssayExperiment()
empty

## A "MultiAssayExperiment" object of 0 listed
##  experiments with no user-defined names and respective classes. 
##  Containing an "Elist" class object of length 0:  
## To access slots use: 
##  Elist() - to obtain the "Elist" of experiment instances 
##  pData() - for the phenotype "DataFrame" 
##  sampleMap() - for the sample availability "DataFrame" 
##  metadata() - for the metadata object of 'ANY' class 
## See also: subsetByAssay(), subsetByFeature(), subsetBySample()

slotNames(empty)

## [1] "Elist"     "pData"     "sampleMap" "metadata"  "drops"

class(Elist(empty))       #Elist

## [1] "Elist"
## attr(,"package")
## [1] "biocMultiAssay"

class(pData(empty)) #DataFrame

## [1] "DataFrame"
## attr(,"package")
## [1] "S4Vectors"

class(sampleMap(empty))   #DataFrame

## [1] "DataFrame"
## attr(,"package")
## [1] "S4Vectors"

class(metadata(empty))    #NULL (class "ANY")

## [1] "NULL"

methods(class="MultiAssayExperiment")

##  [1] [           assay       colnames    Elist       Elist<-    
##  [6] getHits     isEmpty     length      metadata    names      
## [11] pData       rownames    sampleMap   sampleMap<- show       
## [16] subset     
## see '?methods' for accessing help and source code

methods(class="RangedRaggedAssay")

##   [1] !                   !=                  [                  
##   [4] [[                  [[<-                [<-                
##   [7] %in%                <                   <=                 
##  [10] ==                  >                   >=                 
##  [13] $                   $<-                 aggregate          
##  [16] anyNA               append              as.character       
##  [19] as.complex          as.data.frame       as.env             
##  [22] as.integer          as.list             as.logical         
##  [25] as.numeric          as.raw              assay              
##  [28] by                  c                   classNameForDisplay
##  [31] coerce              colnames            compare            
##  [34] do.call             droplevels          duplicated         
##  [37] elementLengths      elementMetadata     elementMetadata<-  
##  [40] elementType         end                 end<-              
##  [43] endoapply           eval                expand.grid        
##  [46] extractROWS         Filter              getHits            
##  [49] getListElement      head                ifelse             
##  [52] is.na               is.unsorted         isEmpty            
##  [55] lapply              length              lengths            
##  [58] match               mcols               mcols<-            
##  [61] mendoapply          metadata            metadata<-         
##  [64] names               names<-             ncol               
##  [67] nrow                NROW                order              
##  [70] parallelSlotNames   range               rank               
##  [73] Reduce              relist              rename             
##  [76] rep                 rep.int             replaceROWS        
##  [79] revElements         rownames            ROWNAMES           
##  [82] sapply              score               score<-            
##  [85] shiftApply          show                showAsCell         
##  [88] sort                split               start              
##  [91] start<-             strand              strand<-           
##  [94] subset              table               tail               
##  [97] tapply              unique              unlist             
## [100] unsplit             updateObject        values             
## [103] values<-            width               width<-            
## [106] window              with                within             
## [109] xtabs               xtfrm              
## see '?methods' for accessing help and source code

getMethod("colnames", "RangedRaggedAssay")

## Method Definition:
## 
## function (x, do.NULL = TRUE, prefix = "col") 
## {
##     .local <- function (x) 
##     base::names(x)
##     .local(x)
## }
## <environment: namespace:biocMultiAssay>
## 
## Signatures:
##         x                  
## target  "RangedRaggedAssay"
## defined "RangedRaggedAssay"

Subsetting of samples and features is harmonized through some generic functions:

methods("rownames") # features

## [1] rownames,ANY-method                       
## [2] rownames,DataFrame-method                 
## [3] rownames,DataFrameList-method             
## [4] rownames,ExpressionSet-method             
## [5] rownames,MultiAssayExperiment-method      
## [6] rownames,RangedData-method                
## [7] rownames,RangedRaggedAssay-method         
## [8] rownames,RangedSummarizedExperiment-method
## see '?methods' for accessing help and source code

methods("colnames") # samples

##  [1] colnames,ANY-method                         
##  [2] colnames,CompressedSplitDataFrameList-method
##  [3] colnames,DataFrame-method                   
##  [4] colnames,DataFrameList-method               
##  [5] colnames,ExpressionSet-method               
##  [6] colnames,MultiAssayExperiment-method        
##  [7] colnames,RangedData-method                  
##  [8] colnames,RangedRaggedAssay-method           
##  [9] colnames,RangedSelection-method             
## [10] colnames,SDFLWrapperForTransform-method     
## [11] colnames,SimpleSplitDataFrameList-method    
## see '?methods' for accessing help and source code

3 Generate toy data

In this example we have 4 patients, and a bit of metadata on them:

masPheno <- data.frame(sex=c("M", "F", "M", "F"),
                          age=38:41,
                          row.names=c("Jack", "Jill", "Bob", "Barbara"))
masPheno

##         sex age
## Jack      M  38
## Jill      F  39
## Bob       M  40
## Barbara   F  41

We have three matrix-like datasets. The first is expression data, stored as an ExpressionSet:

library(Biobase)
(arraydat <- matrix(seq(101, 108), ncol=4, dimnames=list(c("ENST00000294241", "ENST00000355076"), c("array1", "array2", "array3", "array4"))))

##                 array1 array2 array3 array4
## ENST00000294241    101    103    105    107
## ENST00000355076    102    104    106    108

arraypdat <- as(data.frame(slope53=rnorm(4), row.names=c("array1", "array2", "array3", "array4")), "AnnotatedDataFrame")
exprdat <- ExpressionSet(assayData=arraydat, phenoData=arraypdat)
exprdat

## ExpressionSet (storageMode: lockedEnvironment)
## assayData: 2 features, 4 samples 
##   element names: exprs 
## protocolData: none
## phenoData
##   sampleNames: array1 array2 array3 array4
##   varLabels: slope53
##   varMetadata: labelDescription
## featureData: none
## experimentData: use 'experimentData(object)'
## Annotation:

The following map matches pData sample names to exprdat sample names. Note that the row orders are not initially matched up.

(exprmap <- data.frame(master=rownames(masPheno)[c(1, 2, 4, 3)],
                       assay=c("array1", "array2", "array3", "array4"), stringsAsFactors = FALSE))

##    master  assay
## 1    Jack array1
## 2    Jill array2
## 3 Barbara array3
## 4     Bob array4
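
Because the map rows are in a different order from the phenotype rows, some alignment step is needed before the two can be compared side by side. The following is a minimal base R sketch using match(); it does not use any package function, and the names idx and aligned are illustrative only.

```r
# Toy phenotype table and expression ID map, as defined in this vignette
masPheno <- data.frame(sex = c("M", "F", "M", "F"), age = 38:41,
                       row.names = c("Jack", "Jill", "Bob", "Barbara"))
exprmap <- data.frame(master = rownames(masPheno)[c(1, 2, 4, 3)],
                      assay = c("array1", "array2", "array3", "array4"),
                      stringsAsFactors = FALSE)

# Position of each phenotype row name within the map's master column
idx <- match(rownames(masPheno), exprmap$master)

# Map rows reordered to follow the phenotype row order (illustrative name)
aligned <- exprmap[idx, ]
aligned
```

After this step, aligned$master is in the same order as rownames(masPheno), so aligned$assay gives the array column for each patient in turn.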

Now methylation data. It uses the same type of identifiers as the expression data, but measures a partially overlapping set of features. For fun, let’s store this as a simple matrix. It also contains a replicate for one of the patients.

(methyldat <- matrix(1:10, ncol=5, 
                     dimnames=list(c("ENST00000355076", "ENST00000383706"),
                                   c("methyl1", "methyl2", "methyl3", "methyl4", "methyl5"))))

##                 methyl1 methyl2 methyl3 methyl4 methyl5
## ENST00000355076       1       3       5       7       9
## ENST00000383706       2       4       6       8      10

The following map matches pData sample names to methyldat sample names.

(methylmap <- data.frame(master = c("Jack", "Jack", "Jill", "Barbara", "Bob"),
                        assay = c("methyl1", "methyl2", "methyl3", "methyl4", "methyl5"), stringsAsFactors = FALSE))

##    master   assay
## 1    Jack methyl1
## 2    Jack methyl2
## 3    Jill methyl3
## 4 Barbara methyl4
## 5     Bob methyl5
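
A map like this makes replicates easy to spot: a patient with replicates appears more than once in the master column. The sketch below is plain base R and is only an illustration; replicated and replicateCols are made-up names, not part of the package.

```r
# Methylation ID map, as defined in this vignette
methylmap <- data.frame(master = c("Jack", "Jack", "Jill", "Barbara", "Bob"),
                        assay = paste0("methyl", 1:5),
                        stringsAsFactors = FALSE)

# Patients that occur more than once in the map (illustrative name)
replicated <- unique(methylmap$master[duplicated(methylmap$master)])

# Assay columns that belong to those patients (illustrative name)
replicateCols <- methylmap$assay[methylmap$master %in% replicated]

replicated
replicateCols
```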

Now we have a microRNA platform, which shares no identifiers with the other platforms. It is also missing data for Jill. Just for fun, let’s use the same sample naming convention as we did for arrays.

(microdat <- matrix(201:212, ncol=3, 
                    dimnames=list(c("hsa-miR-21", "hsa-miR-191", "hsa-miR-148a", "hsa-miR148b"), 
                                  c("micro1", "micro2", "micro3"))))

##              micro1 micro2 micro3
## hsa-miR-21      201    205    209
## hsa-miR-191     202    206    210
## hsa-miR-148a    203    207    211
## hsa-miR148b     204    208    212

And the following map matches pData sample names to microdat sample names.

(micromap <- data.frame(master = c("Jack", "Barbara", "Bob"),
                        assay = c("micro1", "micro2", "micro3"), stringsAsFactors = FALSE))

##    master  assay
## 1    Jack micro1
## 2 Barbara micro2
## 3     Bob micro3
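
Patients without data on a platform are simply absent from its map. As a hedged base R illustration (the names patients and missingPatients are assumptions made for this sketch), they can be listed with setdiff():

```r
# Patient identifiers from the phenotype table
patients <- c("Jack", "Jill", "Bob", "Barbara")

# microRNA ID map, as defined in this vignette
micromap <- data.frame(master = c("Jack", "Barbara", "Bob"),
                       assay = c("micro1", "micro2", "micro3"),
                       stringsAsFactors = FALSE)

# Patients with phenotype data but no microRNA sample (illustrative name)
missingPatients <- setdiff(patients, micromap$master)
missingPatients
```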

Let’s include a GRangesList:

suppressPackageStartupMessages(library(GenomicRanges))
gr1 <-
  GRanges(seqnames = "chr3", ranges = IRanges(58000000, 59502360), #completely encompasses ENST00000355076
          strand = "+", score = 5L, GC = 0.45)
gr2 <-
  GRanges(seqnames = c("chr3", "chr3"),
          ranges = IRanges(c(58493000, 3), width=9000), #first is within ENST00000355076
          strand = c("+", "-"), score = 3:4, GC = c(0.3, 0.5))
gr3 <-
  GRanges(seqnames = c("chr1", "chr2"),
          ranges = IRanges(c(1, 4), c(3, 9)),
          strand = c("-", "-"), score = c(6L, 2L), GC = c(0.4, 0.1))
grl <- GRangesList("gr1" = gr1, "gr2" = gr2, "gr3" = gr3)
names(grl) <- c("snparray1", "snparray2", "snparray3")
grl

## GRangesList object of length 3:
## $snparray1 
## GRanges object with 1 range and 2 metadata columns:
##       seqnames               ranges strand |     score        GC
##          <Rle>            <IRanges>  <Rle> | <integer> <numeric>
##   [1]     chr3 [58000000, 59502360]      + |         5      0.45
## 
## $snparray2 
## GRanges object with 2 ranges and 2 metadata columns:
##       seqnames               ranges strand | score  GC
##   [1]     chr3 [58493000, 58501999]      + |     3 0.3
##   [2]     chr3 [       3,     9002]      - |     4 0.5
## 
## $snparray3 
## GRanges object with 2 ranges and 2 metadata columns:
##       seqnames ranges strand | score  GC
##   [1]     chr1 [1, 3]      - |     6 0.4
##   [2]     chr2 [4, 9]      - |     2 0.1
## 
## -------
## seqinfo: 3 sequences from an unspecified genome; no seqlengths

The following data.frame matches pData sample names to the names of the GRangesList elements:

(rangemap <- data.frame(master = c("Jack", "Jill", "Jill"), 
                        assay = c("snparray1", "snparray2", "snparray3"), stringsAsFactors = FALSE))

##   master     assay
## 1   Jack snparray1
## 2   Jill snparray2
## 3   Jill snparray3

Finally, add an assay of the RangedSummarizedExperiment class.

Create a GRanges object to serve as the rowRanges, then build the RangedSummarizedExperiment:

library(SummarizedExperiment)
nrows <- 5; ncols <- 4
counts <- matrix(runif(nrows * ncols, 1, 1e4), nrows)
rowRanges <- GRanges(rep(c("chr1", "chr2"), c(2, nrows - 2)),
                     IRanges(floor(runif(nrows, 1e5, 1e6)), width=100),
                     strand=sample(c("+", "-"), nrows, TRUE),
                     feature_id=sprintf("ID%03d", 1:nrows))
names(rowRanges) <- letters[1:5]
colData <- DataFrame(Treatment=rep(c("ChIP", "Input"), 2),
                     row.names= c("mysnparray1", "mysnparray2", "mysnparray3", "mysnparray4"))
rse <- SummarizedExperiment(assays=SimpleList(counts=counts),
                            rowRanges=rowRanges, colData=colData)

(rangemap2 <- data.frame(master = c("Jack", "Jill", "Bob", "Barbara"), 
                        assay = c("mysnparray1", "mysnparray2", "mysnparray3", "mysnparray4"), stringsAsFactors = FALSE))

##    master       assay
## 1    Jack mysnparray1
## 2    Jill mysnparray2
## 3     Bob mysnparray3
## 4 Barbara mysnparray4

4 Setup for creating the `MultiAssayExperiment` object

Create an ID map for all available experiments. The list must be named, and its names must be identical to the names of the Elist.

listmap <- list(exprmap, methylmap, micromap, rangemap, rangemap2)
names(listmap) <- c("Affy", "Methyl 450k", "Mirna", "CNV gistic", "CNV gistic2")
listmap

## $Affy
##    master  assay
## 1    Jack array1
## 2    Jill array2
## 3 Barbara array3
## 4     Bob array4
## 
## $`Methyl 450k`
##    master   assay
## 1    Jack methyl1
## 2    Jack methyl2
## 3    Jill methyl3
## 4 Barbara methyl4
## 5     Bob methyl5
## 
## $Mirna
##    master  assay
## 1    Jack micro1
## 2 Barbara micro2
## 3     Bob micro3
## 
## $`CNV gistic`
##   master     assay
## 1   Jack snparray1
## 2   Jill snparray2
## 3   Jill snparray3
## 
## $`CNV gistic2`
##    master       assay
## 1    Jack mysnparray1
## 2    Jill mysnparray2
## 3     Bob mysnparray3
## 4 Barbara mysnparray4

ID maps may also be entered as a single DataFrame. Here the list is converted to a DataFrame, which can then be converted back to a conventional list:

dfmap <- biocMultiAssay:::.convertList(listmap)
toListMap(dfmap, "assayname")

## $Affy
## DataFrame with 4 rows and 2 columns
##    master       assay
##     <Rle> <character>
## 1    Jack      array1
## 2    Jill      array2
## 3 Barbara      array3
## 4     Bob      array4
## 
## $`CNV gistic`
## DataFrame with 3 rows and 2 columns
##   master       assay
##    <Rle> <character>
## 1   Jack   snparray1
## 2   Jill   snparray2
## 3   Jill   snparray3
## 
## $`CNV gistic2`
## DataFrame with 4 rows and 2 columns
##    master       assay
##     <Rle> <character>
## 1    Jack mysnparray1
## 2    Jill mysnparray2
## 3     Bob mysnparray3
## 4 Barbara mysnparray4
## 
## $`Methyl 450k`
## DataFrame with 5 rows and 2 columns
##    master       assay
##     <Rle> <character>
## 1    Jack     methyl1
## 2    Jack     methyl2
## 3    Jill     methyl3
## 4 Barbara     methyl4
## 5     Bob     methyl5
## 
## $Mirna
## DataFrame with 3 rows and 2 columns
##    master       assay
##     <Rle> <character>
## 1    Jack      micro1
## 2 Barbara      micro2
## 3     Bob      micro3
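
The round trip between a list of maps and one long table can be imitated in base R. The sketch below is an assumption-laden illustration, not the package's implementation: mkmap, longmap and backToList are hypothetical names, and ordinary data.frame objects stand in for DataFrame.

```r
# Helper to build one ID map (hypothetical, for brevity only)
mkmap <- function(master, assay) {
  data.frame(master = master, assay = assay, stringsAsFactors = FALSE)
}

# The five toy ID maps from this vignette
listmap <- list(
  "Affy"        = mkmap(c("Jack", "Jill", "Barbara", "Bob"), paste0("array", 1:4)),
  "Methyl 450k" = mkmap(c("Jack", "Jack", "Jill", "Barbara", "Bob"), paste0("methyl", 1:5)),
  "Mirna"       = mkmap(c("Jack", "Barbara", "Bob"), paste0("micro", 1:3)),
  "CNV gistic"  = mkmap(c("Jack", "Jill", "Jill"), paste0("snparray", 1:3)),
  "CNV gistic2" = mkmap(c("Jack", "Jill", "Bob", "Barbara"), paste0("mysnparray", 1:4))
)

# List -> long table: tag each map with its assay name, then stack the rows
longmap <- do.call(rbind, Map(function(m, nm) {
  data.frame(m, assayname = nm, stringsAsFactors = FALSE)
}, listmap, names(listmap)))
rownames(longmap) <- NULL

# Long table -> list: split the rows by assay name
# (split() orders the elements by sorted assay name)
backToList <- split(longmap[, c("master", "assay")], longmap$assayname)

nrow(longmap)
names(backToList)
```

The long table has one row per sample across all assays, and splitting returns the assays in sorted name order rather than the original list order.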

Create a named list of experiments, objlist, for the MultiAssayExperiment function:

objlist <- list("Affy" = exprdat, "Methyl 450k" = methyldat, "Mirna" = microdat, "CNV gistic" = grl, "CNV gistic2" = rse)
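
Since the ID map names are required to match the experiment names, a quick check before calling the constructor can catch a mismatch early. This is only a sketch in base R on character vectors; mapNames, exptNames and the misspelled "MiRNA" are assumptions chosen for illustration.

```r
# Names of the ID maps and of the experiments, as used in this vignette
mapNames  <- c("Affy", "Methyl 450k", "Mirna", "CNV gistic", "CNV gistic2")
exptNames <- c("Affy", "Methyl 450k", "Mirna", "CNV gistic", "CNV gistic2")

# TRUE when both sets of names contain exactly the same entries
namesAgree <- setequal(mapNames, exptNames)

# A deliberately misspelled name shows how a mismatch would be reported
badNames  <- sub("Mirna", "MiRNA", exptNames, fixed = TRUE)
unmatched <- setdiff(mapNames, badNames)

namesAgree
unmatched
```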

5 Create a `MultiAssayExperiment` class object

myMultiAssay <- MultiAssayExperiment(objlist, masPheno, dfmap)
myMultiAssay

## A "MultiAssayExperiment" object of 5 listed
##  experiments with user-defined names and respective classes. 
##  Containing an "Elist" class object of length 5: 
##  [1] Affy: "ExpressionSet" - 2 rows, 4 columns 
##  [2] Methyl 450k: "matrix" - 2 rows, 5 columns 
##  [3] Mirna: "matrix" - 4 rows, 3 columns 
##  [4] CNV gistic: "RangedRaggedAssay" - 5 rows, 3 columns 
##  [5] CNV gistic2: "RangedSummarizedExperiment" - 5 rows, 4 columns 
## To access slots use: 
##  Elist() - to obtain the "Elist" of experiment instances 
##  pData() - for the phenotype "DataFrame" 
##  sampleMap() - for the sample availability "DataFrame" 
##  metadata() - for the metadata object of 'ANY' class 
## See also: subsetByAssay(), subsetByFeature(), subsetBySample()

Elist(myMultiAssay)

## "Elist" class object of length 5: 
##  [1] Affy: "ExpressionSet" - 2 rows, 4 columns 
##  [2] Methyl 450k: "matrix" - 2 rows, 5 columns 
##  [3] Mirna: "matrix" - 4 rows, 3 columns 
##  [4] CNV gistic: "RangedRaggedAssay" - 5 rows, 3 columns 
##  [5] CNV gistic2: "RangedSummarizedExperiment" - 5 rows, 4 columns

pData(myMultiAssay)

## DataFrame with 4 rows and 2 columns
##              sex       age
##         <factor> <integer>
## Jack           M        38
## Jill           F        39
## Bob            M        40
## Barbara        F        41

sampleMap(myMultiAssay)

## DataFrame with 19 rows and 3 columns
##      master       assay   assayname
##       <Rle> <character>       <Rle>
## 1      Jack      array1        Affy
## 2      Jill      array2        Affy
## 3   Barbara      array3        Affy
## 4       Bob      array4        Affy
## 5      Jack     methyl1 Methyl 450k
## ...     ...         ...         ...
## 15     Jill   snparray3  CNV gistic
## 16     Jack mysnparray1 CNV gistic2
## 17     Jill mysnparray2 CNV gistic2
## 18      Bob mysnparray3 CNV gistic2
## 19  Barbara mysnparray4 CNV gistic2

metadata(myMultiAssay)

## NULL

6 `RangedRaggedAssay` class

Note that the GRangesList got converted to a RangedRaggedAssay, which has some additional methods:

class(Elist(myMultiAssay)[[4]])

## [1] "RangedRaggedAssay"
## attr(,"package")
## [1] "biocMultiAssay"

rownames(Elist(myMultiAssay)[[4]])

## [1] "1" "1" "2" "1" "2"

colnames(Elist(myMultiAssay)[[4]])

## [1] "snparray1" "snparray2" "snparray3"

assay(Elist(myMultiAssay)[[4]])

## DataFrame with 5 rows and 2 columns
##       score        GC
##   <integer> <numeric>
## 1         5      0.45
## 2         3      0.30
## 3         4      0.50
## 4         6      0.40
## 5         2      0.10
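
The row names printed above ("1" "1" "2" "1" "2") look like within-element indices of the original GRangesList, which has one range in snparray1 and two each in snparray2 and snparray3. That reading is inferred from the output rather than from documentation. Under that assumption, the same labels can be reproduced in base R:

```r
# Number of ranges in each GRangesList element of the toy data
nRanges <- c(snparray1 = 1L, snparray2 = 2L, snparray3 = 2L)

# Within-element index for every range, concatenated across elements
rowLabels <- as.character(unname(sequence(nRanges)))
rowLabels

# Total number of rows matches the "5 rows, 3 columns" summary
length(rowLabels)
```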

6.1 Subsetting by Sample

MultiAssayView_cl <- MultiAssayView(myMultiAssay, 2:1, "colnames")
MultiAssayView_cl

## A "MultiAssayView" class object of length 5: 
## Query: Jill, Jack
##  Viewed by: "colnames"
##  [1] Affy: 2 colnames 
##  [2] Methyl 450k: 3 colnames 
##  [3] Mirna: 1 colname 
##  [4] CNV gistic: 3 colnames 
##  [5] CNV gistic2: 2 colnames
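
The per-assay counts in this view can be cross-checked against the sample map by counting, for each assay, the map rows whose master identifier is in the query. The sketch below uses a plain data.frame copy of the toy sample map; longmap, query and hits are illustrative names and nothing here calls the package.

```r
# Master and assay-name columns of the toy sample map (19 rows)
longmap <- data.frame(
  master = c("Jack", "Jill", "Barbara", "Bob",          # Affy
             "Jack", "Jack", "Jill", "Barbara", "Bob",  # Methyl 450k
             "Jack", "Barbara", "Bob",                  # Mirna
             "Jack", "Jill", "Jill",                    # CNV gistic
             "Jack", "Jill", "Bob", "Barbara"),         # CNV gistic2
  assayname = rep(c("Affy", "Methyl 450k", "Mirna", "CNV gistic", "CNV gistic2"),
                  times = c(4, 5, 3, 3, 4)),
  stringsAsFactors = FALSE
)

# The two patients selected by 2:1 in the call above
query <- c("Jill", "Jack")

# Number of matching samples per assay
hits <- tapply(longmap$master %in% query, longmap$assayname, sum)
hits
```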

6.1.1 Using a `MultiAssayView` class to subset

6.1.2 TODO: Requires a different function for subsetting

subMultiAssay <- subset(myMultiAssay, MultiAssayView_cl)
as.list(Elist(subMultiAssay))
exprs(subset(subMultiAssay, c(TRUE, FALSE, FALSE, TRUE, FALSE), "assays", drop = TRUE)[[1]])
subMultiAssay

This operation is endomorphic: it returns a MultiAssayExperiment object containing an Elist of length 1, a sample map covering only that assay, and pData for only Jack, Barbara, and Bob. The “Mirna” argument is used to index the Elist object using [, so it could also be an integer or logical vector:

subset(myMultiAssay, "Mirna", "assays")

## A "MultiAssayExperiment" object of 1 listed
##  experiment with a user-defined name and respective class. 
##  Containing an "Elist" class object of length 1: 
##  [1] Mirna: "matrix" - 4 rows, 3 columns 
## To access slots use: 
##  Elist() - to obtain the "Elist" of experiment instances 
##  pData() - for the phenotype "DataFrame" 
##  sampleMap() - for the sample availability "DataFrame" 
##  metadata() - for the metadata object of 'ANY' class 
## See also: subsetByAssay(), subsetByFeature(), subsetBySample()
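
Indexing with [ by name, position or logical vector behaves the same way on an ordinary named list, which may help when reasoning about the assay argument. A small base R illustration follows; the list elist here is a stand-in with placeholder contents, not an Elist object.

```r
# Stand-in for a list of experiments (contents are placeholders)
elist <- list("Affy" = "expression",
              "Methyl 450k" = "methylation",
              "Mirna" = "microRNA")

# Three equivalent ways of selecting the third element with `[`
byName     <- elist["Mirna"]
byPosition <- elist[3]
byLogical  <- elist[c(FALSE, FALSE, TRUE)]

# Each result is still a list of length 1 that keeps its name
names(byName)
```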

6.2 Bracket operations

The bracket method for the MultiAssayExperiment returns a specified subset of the data. The three positions within the bracket operator indicate rownames, colnames, and assays, respectively. To subset by a particular assay, one can use this syntax:

myMultiAssay[,,"Mirna"]

## A "MultiAssayExperiment" object of 1 listed
##  experiment with a user-defined name and respective class. 
##  Containing an "Elist" class object of length 1: 
##  [1] Mirna: "matrix" - 4 rows, 3 columns 
## To access slots use: 
##  Elist() - to obtain the "Elist" of experiment instances 
##  pData() - for the phenotype "DataFrame" 
##  sampleMap() - for the sample availability "DataFrame" 
##  metadata() - for the metadata object of 'ANY' class 
## See also: subsetByAssay(), subsetByFeature(), subsetBySample()

6.3 Subsetting by Feature

This operation returns a MultiAssayExperiment object in which any Elist element that does not contain the feature has zero rows.

The following returns an object of class MultiAssayView:

MultiAssayView(myMultiAssay, "ENST00000355076", "rownames")

The following returns a MultiAssayExperiment where Affy and Methyl 450k contain only the ENST00000355076 row. The remaining assays have zero rows and, because the drop argument is TRUE by default, are dropped from the result:

featSubsetted0 <- subset(myMultiAssay, "ENST00000355076", "rownames")
class(featSubsetted0)

## [1] "MultiAssayExperiment"
## attr(,"package")
## [1] "biocMultiAssay"

class(Elist(featSubsetted0))

## [1] "Elist"
## attr(,"package")
## [1] "biocMultiAssay"

Elist(featSubsetted0)

## "Elist" class object of length 2: 
##  [1] Affy: "ExpressionSet" - 1 rows, 4 columns 
##  [2] Methyl 450k: "matrix" - 1 rows, 5 columns

6.3.1 Subset by rownames using the bracket `[` method

myMultiAssay["ENST00000355076",,]

## A "MultiAssayExperiment" object of 2 listed
##  experiments with user-defined names and respective classes. 
##  Containing an "Elist" class object of length 2: 
##  [1] Affy: "ExpressionSet" - 1 rows, 4 columns 
##  [2] Methyl 450k: "matrix" - 1 rows, 5 columns 
## To access slots use: 
##  Elist() - to obtain the "Elist" of experiment instances 
##  pData() - for the phenotype "DataFrame" 
##  sampleMap() - for the sample availability "DataFrame" 
##  metadata() - for the metadata object of 'ANY' class 
## See also: subsetByAssay(), subsetByFeature(), subsetBySample()

In the following, the Affy ExpressionSet keeps both rows but with their order reversed, and Methyl 450k keeps only its first row (ENST00000355076).

featSubsetted <- subset(myMultiAssay, c("ENST00000355076", "ENST00000294241"), "rownames")
exprs(Elist(myMultiAssay)[[1]])
exprs(Elist(featSubsetted)[[1]])
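
The behaviour described here (rows returned in query order, assays without a match dropped) can be mimicked on plain matrices. This is a hedged base R sketch, not the package's code: subsetRows, assays and kept are hypothetical names, and the microRNA matrix is abbreviated to two rows.

```r
# Toy matrices with the same feature names as in this vignette
arraydat <- matrix(101:108, ncol = 4,
                   dimnames = list(c("ENST00000294241", "ENST00000355076"),
                                   paste0("array", 1:4)))
methyldat <- matrix(1:10, ncol = 5,
                    dimnames = list(c("ENST00000355076", "ENST00000383706"),
                                    paste0("methyl", 1:5)))
microdat <- matrix(201:206, ncol = 3,   # abbreviated to two rows
                   dimnames = list(c("hsa-miR-21", "hsa-miR-191"),
                                   paste0("micro", 1:3)))
assays <- list("Affy" = arraydat, "Methyl 450k" = methyldat, "Mirna" = microdat)

# Keep the requested features that are present, in the order requested
subsetRows <- function(x, features) {
  x[intersect(features, rownames(x)), , drop = FALSE]
}

features <- c("ENST00000355076", "ENST00000294241")
subsetted <- lapply(assays, subsetRows, features = features)

# Mimic drop = TRUE: discard assays left with zero rows
kept <- Filter(function(x) nrow(x) > 0, subsetted)

names(kept)
rownames(kept[["Affy"]])
```

Using intersect(features, rownames(x)) rather than rownames(x) %in% features is what makes the result follow the query order.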

6.4 Identify and view assays that contain any of a vector of features

Note that the output of this function could be used as the input for subset.

MultiAssayView(myMultiAssay, c("ENST00000355076", "ENST00000294241"), "rownames")

6.5 Feature extraction by Ranges

See the arguments to IRanges::subsetByOverlaps for flexible types of subsetting. The first two arguments are for subset; the rest are passed on through "...":

rangeSubset <- GRanges(seqnames = c("chr1"), strand = c("-", "+", "-"), ranges = IRanges(start = c(1, 4, 6), width = 3))
subsetted <- subset(myMultiAssay, rangeSubset, "rownames", maxgap = 2L, type = "within")
Elist(subsetted)

## "Elist" class object of length 1: 
##  [1] CNV gistic: "RangedRaggedAssay" - 1 rows, 3 columns
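
To see why only one row survives, the "within" criterion can be approximated on plain data frames: a stored range is kept if it lies entirely inside some query range on the same sequence and strand. The sketch below is a simplification in base R that ignores maxgap and is not how the overlap functions are implemented; subject, query, isWithin and keep are illustrative names.

```r
# The five ranges stored in the toy GRangesList, flattened to a data.frame
subject <- data.frame(
  seqnames = c("chr3", "chr3", "chr3", "chr1", "chr2"),
  start    = c(58000000, 58493000, 3, 1, 4),
  end      = c(59502360, 58501999, 9002, 3, 9),
  strand   = c("+", "+", "-", "-", "-"),
  stringsAsFactors = FALSE
)

# The three query ranges (start 1, 4, 6 with width 3) on chr1
query <- data.frame(seqnames = "chr1",
                    start  = c(1, 4, 6),
                    end    = c(3, 6, 8),
                    strand = c("-", "+", "-"),
                    stringsAsFactors = FALSE)

# TRUE if subject range i lies within any query range (same seqname and strand)
isWithin <- function(i) {
  any(query$seqnames == subject$seqnames[i] &
      query$strand   == subject$strand[i] &
      query$start    <= subject$start[i] &
      query$end      >= subject$end[i])
}

keep <- vapply(seq_len(nrow(subject)), isWithin, logical(1))
subject[keep, ]
```

Only the chr1 range on the minus strand satisfies the condition, which agrees with the single row reported above.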

6.6 Subsetting by square bracket

myMultiAssay[rangeSubset, , ]

## A "MultiAssayExperiment" object of 1 listed
##  experiment with a user-defined name and respective class. 
##  Containing an "Elist" class object of length 1: 
##  [1] CNV gistic: "RangedRaggedAssay" - 1 rows, 3 columns 
## To access slots use: 
##  Elist() - to obtain the "Elist" of experiment instances 
##  pData() - for the phenotype "DataFrame" 
##  sampleMap() - for the sample availability "DataFrame" 
##  metadata() - for the metadata object of 'ANY' class 
## See also: subsetByAssay(), subsetByFeature(), subsetBySample()

myMultiAssay[c("ENST00000355076", "ENST00000294241"), , ]

## A "MultiAssayExperiment" object of 2 listed
##  experiments with user-defined names and respective classes. 
##  Containing an "Elist" class object of length 2: 
##  [1] Affy: "ExpressionSet" - 2 rows, 4 columns 
##  [2] Methyl 450k: "matrix" - 1 rows, 5 columns 
## To access slots use: 
##  Elist() - to obtain the "Elist" of experiment instances 
##  pData() - for the phenotype "DataFrame" 
##  sampleMap() - for the sample availability "DataFrame" 
##  metadata() - for the metadata object of 'ANY' class 
## See also: subsetByAssay(), subsetByFeature(), subsetBySample()

myMultiAssay[, c("Jack", "Jill"), ]

## A "MultiAssayExperiment" object of 5 listed
##  experiments with user-defined names and respective classes. 
##  Containing an "Elist" class object of length 5: 
##  [1] Affy: "ExpressionSet" - 2 rows, 2 columns 
##  [2] Methyl 450k: "matrix" - 2 rows, 3 columns 
##  [3] Mirna: "matrix" - 4 rows, 1 columns 
##  [4] CNV gistic: "RangedRaggedAssay" - 5 rows, 3 columns 
##  [5] CNV gistic2: "RangedSummarizedExperiment" - 5 rows, 2 columns 
## To access slots use: 
##  Elist() - to obtain the "Elist" of experiment instances 
##  pData() - for the phenotype "DataFrame" 
##  sampleMap() - for the sample availability "DataFrame" 
##  metadata() - for the metadata object of 'ANY' class 
## See also: subsetByAssay(), subsetByFeature(), subsetBySample()

myMultiAssay[, , "Mirna"]

## A "MultiAssayExperiment" object of 1 listed
##  experiment with a user-defined name and respective class. 
##  Containing an "Elist" class object of length 1: 
##  [1] Mirna: "matrix" - 4 rows, 3 columns 
## To access slots use: 
##  Elist() - to obtain the "Elist" of experiment instances 
##  pData() - for the phenotype "DataFrame" 
##  sampleMap() - for the sample availability "DataFrame" 
##  metadata() - for the metadata object of 'ANY' class 
## See also: subsetByAssay(), subsetByFeature(), subsetBySample()

6.6.1 Auto-create sampleMap slot from data

exprss1 <- matrix(rnorm(16), ncol = 4,
                 dimnames = list(sprintf("ENST00000%i", sample(288754:290000, 4)),
                                 c("Jack", "Jill", "Bob", "Bobby")))
exprss2 <- matrix(rnorm(16), ncol = 4, 
                 dimnames = list(sprintf("ENST00000%i", sample(288754:290000, 4)),
                                 c("Jack", "Jane", "Bob", "Bobby")))
doubleExp <- list("methyl 2k"  = exprss1, "methyl 3k" = exprss2)
(genMapMA <- MultiAssayExperiment(doubleExp, masPheno))

## Warning in MultiAssayExperiment(doubleExp, masPheno): sampleMap not
## provided, map will be generated

## Warning in .generateMap(pData, Elist): Data from rows:
##  Bobby - methyl 2k
##  Jane - methyl 3k
##  Bobby - methyl 3k
## dropped due to missing phenotype data

## A "MultiAssayExperiment" object of 2 listed
##  experiments with user-defined names and respective classes. 
##  Containing an "Elist" class object of length 2: 
##  [1] methyl 2k: "matrix" - 4 rows, 3 columns 
##  [2] methyl 3k: "matrix" - 4 rows, 2 columns 
## To access slots use: 
##  Elist() - to obtain the "Elist" of experiment instances 
##  pData() - for the phenotype "DataFrame" 
##  sampleMap() - for the sample availability "DataFrame" 
##  metadata() - for the metadata object of 'ANY' class 
## See also: subsetByAssay(), subsetByFeature(), subsetBySample()

For now, fill the map with all observed samples:

sampleMap(MultiAssayExperiment(doubleExp, masPheno))

## DataFrame with 5 rows and 3 columns
##   master       assay assayname
##    <Rle> <character>     <Rle>
## 1   Jack        Jack methyl 2k
## 2   Jill        Jill methyl 2k
## 3    Bob         Bob methyl 2k
## 4   Jack        Jack methyl 3k
## 5    Bob         Bob methyl 3k
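
The generated map shown above can be approximated in base R by keeping, for each assay, only the column names that also appear among the phenotype row names. This is a sketch of the idea under that assumption, not the package's .generateMap code; phenoIds, sampleIds and genmap are illustrative names.

```r
# Row names of the phenotype table
phenoIds <- c("Jack", "Jill", "Bob", "Barbara")

# Column names of the two toy assays
sampleIds <- list("methyl 2k" = c("Jack", "Jill", "Bob", "Bobby"),
                  "methyl 3k" = c("Jack", "Jane", "Bob", "Bobby"))

# One block of map rows per assay, keeping only samples with phenotype data
genmap <- do.call(rbind, lapply(names(sampleIds), function(nm) {
  ids  <- sampleIds[[nm]]
  kept <- ids[ids %in% phenoIds]
  data.frame(master = kept, assay = kept,
             assayname = rep(nm, length(kept)),
             stringsAsFactors = FALSE)
}))
genmap
```

Samples named Bobby and Jane have no phenotype row, so they do not appear in the result, in line with the warning shown above.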

7 Validity checking of data classes

Any data classes in the Elist object must support the following methods:

colnames()
rownames()
assay() #to return experimental data
[

Here is what happens if a class does not support one of these methods:

objlist2 <- objlist
objlist2[[2]] <- data.frame(objlist2[[2]])
invalid.obj <- try(MultiAssayExperiment(objlist2, masPheno, dfmap))
invalid.obj

## [1] "Error in validObject(.Object) : \n  invalid class \"Elist\" object: Element [2] of class 'data.frame' does not have method(s): assay\n"
## attr(,"class")
## [1] "try-error"
## attr(,"condition")
## <simpleError in validObject(.Object): invalid class "Elist" object: Element [2] of class 'data.frame' does not have method(s): assay>
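
A check of this kind can be sketched with the methods package, which ships with R. The example below defines a throwaway generic, toyAssay, purely for illustration; it is not the package's assay generic, and supportsToyAssay is likewise a hypothetical helper. How the package itself performs the check is not shown in this vignette.

```r
library(methods)

# Throwaway generic with a method for matrices only (illustrative)
setGeneric("toyAssay", function(x) standardGeneric("toyAssay"))
setMethod("toyAssay", "matrix", function(x) x)

# Does the first class of the object have a directly defined method?
supportsToyAssay <- function(obj) {
  hasMethod("toyAssay", class(obj)[1])
}

okMatrix    <- supportsToyAssay(matrix(1:4, nrow = 2))
okDataFrame <- supportsToyAssay(data.frame(a = 1))

okMatrix
okDataFrame
```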

8 Very next steps

Figure out how to support a “long-and-skinny” SQL database
“mergeDups” function to merge duplicate samples in any assay
- For matrix-like objects, it is clear how to do this. Default would be simple mean of the columns, but could allow user-specified functions.
- For GRangesList, it’s not obvious how to merge duplicates. Just concatenate?
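
For the matrix-like case, the merging idea can be sketched in a few lines of base R: group assay columns by patient and summarise each group, with the mean as the default. The function name mergeDupsSketch and its arguments are hypothetical; this is one possible reading of the plan above, not an existing function.

```r
# Toy methylation matrix and ID map from this vignette
methyldat <- matrix(1:10, ncol = 5,
                    dimnames = list(c("ENST00000355076", "ENST00000383706"),
                                    paste0("methyl", 1:5)))
methylmap <- data.frame(master = c("Jack", "Jack", "Jill", "Barbara", "Bob"),
                        assay = paste0("methyl", 1:5),
                        stringsAsFactors = FALSE)

# Hypothetical helper: one output column per patient, summarised with FUN
mergeDupsSketch <- function(x, map, FUN = rowMeans) {
  # Group assay columns by patient, keeping patients in order of appearance
  groups <- split(map$assay, factor(map$master, levels = unique(map$master)))
  sapply(groups, function(cols) FUN(x[, cols, drop = FALSE]))
}

merged <- mergeDupsSketch(methyldat, methylmap)
merged
```

Jack's two replicate columns are averaged into one, and the other patients keep their single column unchanged. A user-specified function can be passed through FUN as long as it returns one value per row.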

9 Wishlist

c() function for adding new assays to existing MultiAssayExperiment
- e.g. c(myMultiAssay, neweset)
- require that sample names in the new object match pData sample names
- require that sample names in the new object already exist in pData

MultiAssayExperiment toy example

Marcel Ramos, Levi Waldron

January 25, 2016