After completing this module, the student will:
Bioconductor is an open source project for the analysis of biomedical data. It contains more than 1000 packages for example data, analysis tools, and development tools for building new packages. The bioconductor website http://bioconductor.org/ contains much more information on the project.
From the bioconductor webpage
The broad goals of the Bioconductor project are:
- To provide widespread access to a broad range of powerful statistical and graphical methods for the analysis of genomic data.
- To facilitate the inclusion of biological metadata in the analysis of genomic data, e.g. literature data from PubMed, annotation data from Entrez genes.
- To provide a common software platform that enables the rapid development and deployment of extensible, scalable, and interoperable software.
- To further scientific understanding by producing high-quality documentation and reproducible research.
- To train researchers on computational and statistical methods for the analysis of genomic data.
We will view some exaples inspired by the paper:
Bioconductor has special tools for installing new packages.
source("http://bioconductor.org/biocLite.R")
Once you have loaded the installer, you can use it to install bioconductor packages.
biocLite("GenomicRanges")
A package must be loaded into each session that makes use of the package.
library(GenomicRanges)
## Loading required package: BiocGenerics
## Loading required package: parallel
##
## Attaching package: 'BiocGenerics'
## The following objects are masked from 'package:parallel':
##
## clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
## clusterExport, clusterMap, parApply, parCapply, parLapply,
## parLapplyLB, parRapply, parSapply, parSapplyLB
## The following objects are masked from 'package:stats':
##
## IQR, mad, xtabs
## The following objects are masked from 'package:base':
##
## anyDuplicated, append, as.data.frame, as.vector, cbind,
## colnames, do.call, duplicated, eval, evalq, Filter, Find, get,
## grep, grepl, intersect, is.unsorted, lapply, lengths, Map,
## mapply, match, mget, order, paste, pmax, pmax.int, pmin,
## pmin.int, Position, rank, rbind, Reduce, rownames, sapply,
## setdiff, sort, table, tapply, union, unique, unlist, unsplit
## Loading required package: S4Vectors
## Loading required package: stats4
## Loading required package: IRanges
## Loading required package: GenomeInfoDb
bioconductor contains extensive documentation, usually of a very high quality.
help(package="GenomicRanges")
The package vignettes, in particular, have high level overviews of the package, with examples and data.
vignette("GenomicRangesIntroduction")
As a general principle, bioconductor tries to maintain metadata about your experiment together with the raw data whenever possible. It is very rare to see a plain matrix of data.
In the paper, the SummarizedExperiment data structure is mentioned. We will use the vignette to explore a summarized experiment.
biocLite("airway")
library(airway)
## Loading required package: SummarizedExperiment
## Loading required package: Biobase
## Welcome to Bioconductor
##
## Vignettes contain introductory material; view with
## 'browseVignettes()'. To cite Bioconductor, see
## 'citation("Biobase")', and for packages 'citation("pkgname")'.
data(airway, package="airway")
airway
## class: RangedSummarizedExperiment
## dim: 64102 8
## metadata(1): ''
## assays(1): counts
## rownames(64102): ENSG00000000003 ENSG00000000005 ... LRG_98 LRG_99
## rowRanges metadata column names(0):
## colnames(8): SRR1039508 SRR1039509 ... SRR1039520 SRR1039521
## colData names(9): SampleName cell ... Sample BioSample
Here we extract one assay from the airway object.
head(assays(airway)$counts)
## SRR1039508 SRR1039509 SRR1039512 SRR1039513 SRR1039516
## ENSG00000000003 679 448 873 408 1138
## ENSG00000000005 0 0 0 0 0
## ENSG00000000419 467 515 621 365 587
## ENSG00000000457 260 211 263 164 245
## ENSG00000000460 60 55 40 35 78
## ENSG00000000938 0 0 2 0 1
## SRR1039517 SRR1039520 SRR1039521
## ENSG00000000003 1047 770 572
## ENSG00000000005 0 0 0
## ENSG00000000419 799 417 508
## ENSG00000000457 331 233 229
## ENSG00000000460 63 76 60
## ENSG00000000938 0 0 0
Extensive data about each feature is directly available in the dataset.
head(rowRanges(airway))
## GRangesList object of length 6:
## $ENSG00000000003
## GRanges object with 17 ranges and 2 metadata columns:
## seqnames ranges strand | exon_id exon_name
## <Rle> <IRanges> <Rle> | <integer> <character>
## [1] X [99883667, 99884983] - | 667145 ENSE00001459322
## [2] X [99885756, 99885863] - | 667146 ENSE00000868868
## [3] X [99887482, 99887565] - | 667147 ENSE00000401072
## [4] X [99887538, 99887565] - | 667148 ENSE00001849132
## [5] X [99888402, 99888536] - | 667149 ENSE00003554016
## ... ... ... ... ... ... ...
## [13] X [99890555, 99890743] - | 667156 ENSE00003512331
## [14] X [99891188, 99891686] - | 667158 ENSE00001886883
## [15] X [99891605, 99891803] - | 667159 ENSE00001855382
## [16] X [99891790, 99892101] - | 667160 ENSE00001863395
## [17] X [99894942, 99894988] - | 667161 ENSE00001828996
##
## ...
## <5 more elements>
## -------
## seqinfo: 722 sequences (1 circular) from an unspecified genome
As is information about each sample.
head(colData(airway))
## DataFrame with 6 rows and 9 columns
## SampleName cell dex albut Run avgLength
## <factor> <factor> <factor> <factor> <factor> <integer>
## SRR1039508 GSM1275862 N61311 untrt untrt SRR1039508 126
## SRR1039509 GSM1275863 N61311 trt untrt SRR1039509 126
## SRR1039512 GSM1275866 N052611 untrt untrt SRR1039512 126
## SRR1039513 GSM1275867 N052611 trt untrt SRR1039513 87
## SRR1039516 GSM1275870 N080611 untrt untrt SRR1039516 120
## SRR1039517 GSM1275871 N080611 trt untrt SRR1039517 126
## Experiment Sample BioSample
## <factor> <factor> <factor>
## SRR1039508 SRX384345 SRS508568 SAMN02422669
## SRR1039509 SRX384346 SRS508567 SAMN02422675
## SRR1039512 SRX384349 SRS508571 SAMN02422678
## SRR1039513 SRX384350 SRS508572 SAMN02422670
## SRR1039516 SRX384353 SRS508575 SAMN02422682
## SRR1039517 SRX384354 SRS508576 SAMN02422673
Since we have information on the features, we can query online databases to get more information.
head(rownames(airway))
## [1] "ENSG00000000003" "ENSG00000000005" "ENSG00000000419" "ENSG00000000457"
## [5] "ENSG00000000460" "ENSG00000000938"
rownames(airway)[5]
## [1] "ENSG00000000460"
These identifiers are Ensembl gene IDs. The biomaRt package can get the data related to the gene. Ensembl is a huge online database for dozens of organisms.
library(biomaRt)
mart = useEnsembl('ENSEMBL_MART_ENSEMBL')
dim(listDatasets(mart))
## [1] 69 3
head(listDatasets(mart))
## dataset
## 1 oanatinus_gene_ensembl
## 2 cporcellus_gene_ensembl
## 3 gaculeatus_gene_ensembl
## 4 itridecemlineatus_gene_ensembl
## 5 lafricana_gene_ensembl
## 6 choffmanni_gene_ensembl
## description version
## 1 Ornithorhynchus anatinus genes (OANA5) OANA5
## 2 Cavia porcellus genes (cavPor3) cavPor3
## 3 Gasterosteus aculeatus genes (BROADS1) BROADS1
## 4 Ictidomys tridecemlineatus genes (spetri2) spetri2
## 5 Loxodonta africana genes (loxAfr3) loxAfr3
## 6 Choloepus hoffmanni genes (choHof1) choHof1
mart=useEnsembl(biomart="ensembl", dataset="hsapiens_gene_ensembl")
head(listAttributes(mart))
## name description
## 1 ensembl_gene_id Ensembl Gene ID
## 2 ensembl_transcript_id Ensembl Transcript ID
## 3 ensembl_peptide_id Ensembl Protein ID
## 4 ensembl_exon_id Ensembl Exon ID
## 5 description Description
## 6 chromosome_name Chromosome Name
dim(listAttributes(mart))
## [1] 1319 2
g=getBM(attributes=c("ensembl_gene_id", "ensembl_transcript_id", "coding"),
filters="ensembl_gene_id",
values=rownames(airway)[5],
mart=mart)
g
## ensembl_gene_id
## 1 Sequence unavailable
## 2 ATGTTTTTACCTCATATGAACCACCTGACATTGGAACAGACTTTCTTTTCACAAGTGTTACCAAAGACTGTGAAATTATTCGATGACATGATGTATGAATTAACCAGTCAAGCCAGAGGACTGTCAAGCCAAAATTTGGAAATCCAGACCACTCTAAGGAATATTTTACAAACAATGGTGCAGCTCTTAGGAGCTCTCACAGGATGTGTTCAGCATATCTGTGCCACACAGGAATCCATCATTTTGGAAAATATTCAGAGTCTCCCCTCCTCAGTCCTTCATATAATTAAAAGCACATTTGTGCATTGTAAGAATAGTGAATCTGTGTATTCTGGGTGTTTACACCTAGTTTCAGACCTTCTCCAGGCTCTTTTCAAGGAGGCCTATTCTCTTCAAAAGCAGTTAATGGAACTGCTGGACATGGTTTGCATGGACCCTTTAGTAGATGACAATGATGATATTTTGAATATGGTAATAGTTATTCATTCTTTATTGGATATCTGCTCTGTTATTTCCAGTATGGACCATGCATTTCATGCCAATACTTGGAAGTTTATAATTAAGCAGAGCCTTAAGCACCAGTCCATAATAAAAAGCCAGTTGAAACACAAAGATATAATTACTAGCTTGTGTGAAGACATTCTTTTCTCCTTCCATTCTTGTTTACAGTTAGCTGAGCAGATGACACAGTCAGATGCACAGGATAATGCTGACTACAGATTATTTCAGAAAACACTCAAATTGTGTCGTTTTTTTGCCAACTCCCTTTTGCACTACGCTAAGGAATTTCTTCCTTTCCTCTCTGATTCTTGCTGTACTTTGCACCAACTGTATCTTCAGATACACAGCAAGTTTCCTCCAAGCCTTTATGCTACCAGGATTTCTAAAGCACACCAAGAGGAAATAGCAGGTGCTTTCCTAGTGACACTGGATCCACTTATCAGTCAGCTGCTCACATTTCAGCCTTTCATGCAGGTGGTTTTGGACAGTAAATTAGACCTGCCATGTGAACTGCAGTTTCCACAATGTCTTCTTCTGGTTGTTGTCATGGATAAGCTGCCATCTCAGCCTAAGGAAGTGCAAACCCTGTGGTGCACAGACAGCCAGGTCTCAGAAACGACAACCAGGATATCTCTACTCAAAGCCGTTTTCTACAGTTTTGAGCAGTGTTCTGGTGAACTCTCTCTACCTGTTCATTTACAGGGATTAAAGAGTAAGGGGAAAGCTGAGGTGGCTGTCACCTTGTATCAGCATGTTTGTGTTCATCTGTGTACATTTATTACTTCCTTTCATCCCTCACTGTTTGCTGAACTGGATGCTGCTCTGCTGAATGCTGTACTTAGTGCTAATATGATCACCTCTTTGTTAGCTATGGATGCATGGTGCTTCCTTGCTCGATATGGGACTGCTGAACTGTGTGCACACCATGTCACCATAGTGGCTCATCTGATAAAGTCATGCCCTGGAGAATGTTATCAACTCATCAACCTATCAATACTGTTGAAGCGTCTCTTTTTCTTCATGGCACCACCCCATCAGCTGGAGTTTATCCAGAAATTTTCCCCAAAAGAAGCAGAAAATCTGCCTCTGTGGCAACATATTTCCTTCCAGGCGTTACCTCCTGAGCTTAGGGAACAAACTGTCCATGAGGTCACCACAGTAGGCACTGCAGAATGCAGGAAATGGCTGAGCAGGAGTCGTACTTTGGGAGAACTAGAATCTCTGAACACAGTACTGTCTGCTTTGCTTGCAGTATGTAATTCTGCTGGTGAAGCTTTGGATACAGGAAAACAAACTGCAATTATCGAAGTTGTGAGTCAGCTTTGGGCTTTTTTAAACATTAAACAGGTAGCAGATCAACCTTATGTTCAACAGACATTCAGCCTTTTACTTCCACTGTTGGGATTTTTCATTCAAACTCTAGATCCTAAACTGATACTTCAGGCAGTAACTTTGCAGACCTCGCTACTTAAATTAGAGCTTCCTGACTATGTTCGTTTGGCAATGTTGGATTTTGTATCTTCTTTAGGAAAACTTTTTATACCTGAAGCTATCCAGGACAGAATTCTGCCCAACCTGTCCTGTATGTTTGCCTTACTGCTAGCTGACAGGAGTTGGCTGCTAGAACAACATACCTTGGAGGCGTTTACTCAGTTCGCTGAGGGAACAAATCATGAAGAGATAGTTCCACAGTGTCTCAGTTCTGAAGAAACTAAGAACAAAGTTGTATCCTTTCTGGAGAAGACTGGGTTTGTAGATGAAACTGAAGCTGCCAAAGTGGAACGTGTGAAACAGGAAAAAGGTATTTTCTGGGAACCCTTTGCTAATGTGACTGTAGAAGAAGCAAAGAGGTCATCTTTACAGCCTTATGCAAAAAGAGCTCGTCAGGAGTTCCCCTGGGAAGAAGAGTACAGGTCAGCGCTGCATACAATAGCAGGGGCTTTGGAAGCAACTGAGTCACTACTCCAAAAGGGTCCTGCTCCAGCCTGGCTTTCAATGGAAATGGAGGCGCTCCAAGAAAGGATGGATAAGCTAAAACGTTACATACATACTCTAGGGTGA
## 3 Sequence unavailable
## 4 Sequence unavailable
## 5 ATGCAGGTGGTTTTGGACAGTAAATTAGACCTGCCATGTGAACTGCAGTTTCCACAATGTCTTCTTCTGGTTGTTGTCATGGATAAGCTGCCATCTCAGCCTAAGGAAGTGCAAACCCTGTGGTGCACAGACAGCCAGGTCTCAGAAACGACAACCAGGATATCTCTACTCAAAGCCGTTTTCTACAGTTTTGAGCAGTGTTCTGGTGAACTCTCTCTACCTGTTCATTTACAGGGATTAAAGAGTAAGGGGAAAGCTGAGGTGGCTGTCACCTTGTATCAGCATGTTTGTGTTCATCTGTGTACATTTATTACTTCCTTTCATCCCTCACTGTTTGCTGAACTGGATGCTGCTCTGCTGAATGCTGTACTTAGTGCTAATATGATCACCTCTTTGTTAGCTATGGATGCATGGTGCTTCCTTGCTCGATATGGGACTGCTGAACTGTGTGCACACCATGTCACCATAGTGGCTCATCTGATAAAGTCATGCCCTGGAGAATGTTATCAACTCATCAACCTATCAATACTGTTGAAGCGTCTCTTTTTCTTCATGGCACCACCCCATCAGCTGGAGTTTATCCAGAAATTTTCCCCAAAAGAAGCAGAAAATCTGCCTCTGTGGCAACATATTTCCTTCCAGGCGTTACCTCCTGAGCTTAGGGAACAAACTGTCCATGAGGTCACCACAGTAGGCACTGCAGAATGCAGGAAATGGCTGAGCAGGAGTCGTACTTTGGGAGAACTAGAATCTCTGAACACAGTACTGTCTGCTTTGCTTGCAGTATGTAATTCTGCTGGTGAAGCTTTGGATACAGGAAAACAAACTGCAATTATCGAAGTTGTGAGTCAGCTTTGGGCTTTTTTAAACATTAAACAGGTAGCAGATCAACCTTATGTTCAACAGACATTCAGCCTTTTACTTCCACTGTTGGGATTTTTCATTCAAACTCTAGATCCTAAACTGATACTTCAGGCAGTAACTTTGCAGACCTCGCTACTTAAATTAGAGCTTCCTGACTATGTTCGTTTGGCAATGTTGGATTTTGTATCTTCTTTAGGAAAACTTTTTATACCTGAAGCTATCCAGGACAGAATTCTGCCCAACCTGTCCTGTATGTTTGCCTTACTGCTAGCTGACAGGAGTTGGCTGCTAGAACAACATACCTTGGAGGCGTTTACTCAGTTCGCTGAGGGAACAAATCATGAAGAGATAGTTCCACAGTGTCTCAGTTCTGAAGAAACTAAGAACAAAGTTGTATCCTTTCTGGAGAAGACTGGGTTTGTAGATGAAACTGAAGCTGCCAAAGTGGAACGTGTGAAACAGGAAAAAGGTATTTTCTGGGAACCCTTTGCTAATGTGACTGTAGAAGAAGCAAAGAGGTCATCTTTACAGCCTTATGCAAAAAGAGCTCGTCAGGAGTTCCCCTGGGAAGAAGAGTACAGGTCAGCGCTGCATACAATAGCAGGGGCTTTGGAAGCAACTGAGTCACTACTCCAAAAGGGTCCTGCTCCAGCCTGGCTTTCAATGGAAATGGAGGCGCTCCAAGAAAGGATGGATAAGCTAAAACGTTACATACATACTCTAGGGTGA
## 6 Sequence unavailable
## 7 ATGTTTTTACCTCATATGAACCACCTGACATTGGAACAGACTTTCTTTTCACAAGTGTTACCAAAGACTGTGAAATTATTCGATGACATGATGTATGAATTAACCAGTCAAGCCAGAGGACTGTCAAGCCAAAATTTGGAAATCCAGACCACTCTAAGGAATATTTTACAAACAATGGTGCAGCTCTTAGGAGCTCTCACAGGATGTGTTCAGCATATCTGTGCCACACAGGAATCCATCATTTTGGAAAATATTCAGAGTCTCCCCTCCTCAGTCCTTCATATAATTAAAAGCACATTTGTGCATTGTAAGAATAGTGAATCTGTGTATTCTGGGTGTTTACACCTAGTTTCAGACCTTCTCCAGGCTCTTTTCAAGGAGGCCTATTCTCTTCAAAAGCAGTTAATGGAACTGCTGGACATGGTTTGCATGGACCCTTTAGTAGATGACAATGATGATATTTTGAATATGGTAATAGTTATTCATTCTTTATTGGATATCTGCTCTGTTATTTCCAGTATGGACCATGCATTTCATGCCAATACTTGGAAGTTTATAATTAAGCAGAGCCTTAAGCACCAGTCCATAATAAAAAGCCAGTTGAAACACAAAGATATAATTACTAGCTTGTGTGAAGACATTCTTTTCTCCTTCCATTCTTGTTTACAGTTAGCTGAGCAGATGACACAGTCAGATGCACAGGATAATGCTGACTACAGATTATTTCAGAAAACACTCAAATTGTGTCGTTTTTTTGCCAACTCCCTTTTGCACTACGCTAAGGAATTTCTTCCTTTCCTCTCTGATTCTTGCTGTACTTTGCACCAACTGTATCTTCAGATACACAGCAAGTTTCCTCCAAGCCTTTATGCTACCAGGATTTCTAAAGCACACCAAGAGGAAATAGCAGGTGCTTTCCTAGTGACACTGGATCCACTTATCAGTCAGCTGCTCACATTTCAGCCTTTCATGCAGGTGGTTTTGGACAGTAAATTAGACCTGCCATGTGAACTGCAGTTTCCACAATGTCTTCTTCTGGTTGTTGTCATGGATAAGCTGCCATCTCAGCCTAAGGAAGTGCAAACCCTGTGGTGCACAGACAGCCAGGTCTCAGAAACGACAACCAGGATATCTCTACTCAAAGCCGTTTTCTACAGTTTTGAGCAGTGTTCTGGTGAACTCTCTCTACCTGTTCATTTACAGGGATTAAAGAGTAAGGGGAAAGCTGAGGTGGCTGTCACCTTGTATCAGCATGTTTGTGTTCATCTGTGTACATTTATTACTTCCTTTCATCCCTCACTGTTTGCTGAACTGGATGCTGCTCTGCTGAATGCTGTACTTAGTGCTAATATGATCACCTCTTTGTTAGCTATGGATGCATGGTGCTTCCTTGCTCGATATGGGACTGCTGAACTGTGTGCACACCATGTCACCATAGTGGCTCATCTGATAAAGTCATGCCCTGGAGAATGTTATCAACTCATCAACCTATCAATACTGTTGAAGCGTCTCTTTTTCTTCATGGCACCACCCCATCAGCTGGAGTTTATCCAGAAATTTTCCCCAAAAGAAGCAGAAAATCTGCCTCTGTGGCAACATATTTCCTTCCAGGCGTTACCTCCTGAGCTTAGGGAACAAACTGTCCATGAGGTCACCACAGTAGGCACTGCAGAATGCAGGAAATGGCTGAGCAGGAGTCGTACTTTGGGAGAACTAGAATCTCTGAACACAGTACTGTCTGCTTTGCTTGCAGTATGTAATTCTGCTGGTGAAGCTTTGGATACAGGAAAACAAACTGCAATTATCGAAGTTGTGAGTCAGCTTTGGGCTTTTTTAAACATTAAACAGGTAGCAGATCAACCTTATGTTCAACAGACATTCAGCCTTTTACTTCCACTGTTGGGATTTTTCATTCAAACTCTAGATCCTAAACTGATACTTCAGGCAGTAACTTTGCAGACCTCGCTACTTAAATTAGAGCTTCCTGACTATGTTCGTTTGGCAATGTTGGATTTTGTATCTTCTTTAGGAAAACTTTTTATACCTGAAGCTATCCAGGACAGAATTCTGCCCAACCTGTCCTGTATGTTTGCCTTACTGCTAGCTGACAGGAGTTGGCTGCTAGAACAACATACCTTGGAGGCGTTTACTCAGTTCGCTGAGGGAACAAATCATGAAGAGATAGTTCCACAGTGTCTCAGTTCTGAAGAAACTAAGAACAAAGTTGTATCCTTTCTGGAGAAGACTGGGTTTGTAGATGAAACTGAAGCTGCCAAAGTGGAACGTGTGAAACAGGAAAAAGGTATTTTCTGGGAACCCTTTGCTAATGTGACTGTAGAAGAAGCAAAGAGGTCATCTTTACAGCCTTATGCAAAAAGAGCTCGTCAGGAGTTCCCCTGGGAAGAAGAGTACAGGTCAGCGCTGCATACAATAGCAGGGGCTTTGGAAGCAACTGAGTCACTACTCCAAAAGGGTCCTGCTCCAGCCTGGCTTTCAATGGAAATGGAGGCGCTCCAAGAAAGGATGGATAAGCTAAAACGTTACATACATACTCTAGGGTGA
## 8 Sequence unavailable
## 9 Sequence unavailable
## ensembl_transcript_id coding
## 1 ENSG00000000460 ENST00000481744
## 2 ENSG00000000460 ENST00000286031
## 3 ENSG00000000460 ENST00000472795
## 4 ENSG00000000460 ENST00000459772
## 5 ENSG00000000460 ENST00000413811
## 6 ENSG00000000460 ENST00000466580
## 7 ENSG00000000460 ENST00000359326
## 8 ENSG00000000460 ENST00000496973
## 9 ENSG00000000460 ENST00000498289
biocLite("ggbio")
biocLite("Homo.sapiens")
library(ggbio)
## Loading required package: ggplot2
## Warning: replacing previous import by 'BiocGenerics::Position' when loading
## 'ggbio'
## Need specific help about ggbio? try mailing
## the maintainer or visit http://tengfei.github.com/ggbio/
##
## Attaching package: 'ggbio'
## The following objects are masked from 'package:ggplot2':
##
## geom_bar, geom_rect, geom_segment, ggsave, stat_bin,
## stat_identity, xlim
library(Homo.sapiens)
## Loading required package: AnnotationDbi
## Loading required package: OrganismDbi
## Loading required package: GenomicFeatures
## Loading required package: GO.db
## Loading required package: DBI
##
## Loading required package: org.Hs.eg.db
##
## Loading required package: TxDb.Hsapiens.UCSC.hg19.knownGene
## Now getting the GODb Object directly
## Now getting the OrgDb Object directly
## Now getting the TxDb Object directly
We can try plotting the Granges in our summarized experiment, but the Homo.sapiens object names chromosomes as “chrN”, and the names in airway are just the number.
wh = unlist(rowRanges(airway)[5])
p.g <- autoplot(Homo.sapiens, which=wh)
## Error in .local(obj, ...): 1 is not matched with seqlevels of your object, please rename your 'which' arguments
We need to rename the chromosomes.
wh2 <- renameSeqlevels(wh, sub("1","chr1", seqlevels(wh)))
p.g2 <- autoplot(Homo.sapiens, which=wh2)
## Parsing transcripts...
## Parsing exons...
## Parsing cds...
## Parsing utrs...
## ------exons...
## ------cdss...
## ------introns...
## ------utr...
## aggregating...
## Done
## 'select()' returned 1:1 mapping between keys and columns
## "gap" not in any of the valid gene feature terms "cds", "exon", "utr"
## Constructing graphics...
p.g2
## Warning: Removed 2 rows containing missing values (geom_text).
library(limma)
##
## Attaching package: 'limma'
## The following object is masked from 'package:BiocGenerics':
##
## plotMA
design <- model.matrix(~ dex, data=colData(airway))
v <- voom(assays(airway)$counts,design=design)
fit <- lmFit(v, design)
fit <- eBayes(fit)
summary(decideTests(fit))
## (Intercept) dexuntrt
## -1 49118 674
## 0 1666 62495
## 1 13318 933
topTable(fit, coef="dexuntrt")
## logFC AveExpr t P.Value adj.P.Val
## ENSG00000134686 -1.304196 6.837469 -22.28727 1.384466e-09 8.874707e-05
## ENSG00000148175 -1.364975 8.854588 -18.96349 6.315648e-09 1.428428e-04
## ENSG00000162614 -1.945885 7.636408 -17.51779 1.325278e-08 1.428428e-04
## ENSG00000162616 -1.430717 5.903480 -17.27478 1.509630e-08 1.428428e-04
## ENSG00000189221 -3.263290 5.949117 -16.91641 1.835153e-08 1.428428e-04
## ENSG00000152583 -4.480843 4.165462 -17.88614 1.091355e-08 1.428428e-04
## ENSG00000120129 -2.847874 6.643013 -16.56703 2.228368e-08 1.428428e-04
## ENSG00000178695 2.589942 6.450283 16.57994 2.212289e-08 1.428428e-04
## ENSG00000196517 2.289351 3.024889 17.85198 1.111011e-08 1.428428e-04
## ENSG00000125148 -2.136635 7.022630 -15.98043 3.114408e-08 1.814907e-04
## B
## ENSG00000134686 12.547426
## ENSG00000148175 11.263744
## ENSG00000162614 10.516600
## ENSG00000162616 10.307478
## ENSG00000189221 10.103458
## ENSG00000152583 10.049829
## ENSG00000120129 9.977213
## ENSG00000178695 9.974713
## ENSG00000196517 9.831402
## ENSG00000125148 9.673848
plotMA(fit)