2. Software Usage
2.1 Installation
The package can be installed with the following commands:
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("bioAnno")
2.2 Load package
library(bioAnno)
2.3 How to use it
library(bioAnno)
## Build an E. coli annotation package from the KEGG database
## with the fromKEGG function.
fromKEGG(species="eco", install = FALSE)
## #########################################################################
## The bioAnno package downloads and uses KEGG data.Non-academic uses may
## require a KEGG license agreement (details at http://www.kegg.jp/kegg/legal.html)
## The Gene Ontology are downloaded from NCBI.
## #########################################################################
## Creating package in /var/folders/p_/0q7dys0x1g53nw6wypk8_qgm0000gn/T//RtmpLyWnIU/org.eco.eg.db
## ################################################################
## Please find your annotation package in ...
## /var/folders/p_/0q7dys0x1g53nw6wypk8_qgm0000gn/T//RtmpLyWnIU/org.eco.eg.db
## You can install it by using
## install.packages("/var/folders/p_/0q7dys0x1g53nw6wypk8_qgm0000gn/T//RtmpLyWnIU/org.eco.eg.db",repos = NULL,type='source')
## ################################################################
## Here are the tables in the package org.eco.eg.db ...
## gene_info genes go go_all go_bp go_bp_all go_cc go_cc_all go_mf go_mf_all ko map_counts map_metadata metadata path
## ################################################################
## [1] "/var/folders/p_/0q7dys0x1g53nw6wypk8_qgm0000gn/T//RtmpLyWnIU/org.eco.eg.db"
## The call above builds the "org.eco.eg.db" package, which contains
## KEGG and GO annotation. Set install = TRUE to install the
## package directly.
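The build and install steps can also be scripted. The sketch below assumes that the path printed on the `[1]` line of the output above is the return value of fromKEGG, and that KEGG is reachable over the network; the variable name pkg_path is ours and not part of bioAnno.

```r
library(bioAnno)

## Build the package without installing it. We assume the return value
## is the directory of the generated source package.
pkg_path <- fromKEGG(species = "eco", install = FALSE)

## Install from the local source directory, as suggested in the
## message printed by fromKEGG.
install.packages(pkg_path, repos = NULL, type = "source")

## Load the freshly installed annotation package.
library(org.eco.eg.db)
```

Keeping the path in a variable avoids copying the temporary directory name by hand, which changes in every R session.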
## Build an Arabidopsis thaliana annotation package with the
## fromAnnHub function.
fromAnnHub(species="ath", install = FALSE)
## Creating package in /var/folders/p_/0q7dys0x1g53nw6wypk8_qgm0000gn/T//RtmpLyWnIU/org.ath.eg.db
## ################################################################
## Please find your annotation package in ...
## /var/folders/p_/0q7dys0x1g53nw6wypk8_qgm0000gn/T//RtmpLyWnIU/org.ath.eg.db
## You can install it by using
## install.packages("/var/folders/p_/0q7dys0x1g53nw6wypk8_qgm0000gn/T//RtmpLyWnIU/org.ath.eg.db",repos = NULL,type='source')
## ################################################################
## Here are the tables in the package org.ath.eg.db ...
## gene_info genes go go_all go_bp go_bp_all go_cc go_cc_all go_mf go_mf_all map_counts map_metadata metadata path refseq symbol
## ################################################################
2.4 Main Functions
– fromKEGG builds an annotation package by extracting annotation information from the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. Use the KEGG species code as the query name.
– fromNCBI builds an annotation package by extracting annotation information from the NCBI database.
– fromENSEMBL builds an annotation package by extracting annotation information from the Ensembl database. Set plant = TRUE to build an annotation package for a plant species.
– fromAnnHub builds an annotation package with the AnnotationHub package.
– getTable gets an annotation table from a temporary package; the user must provide the temporary path.
3. Using the annotation package you created
An organism-level package (an ‘org’ package) uses a central gene identifier and contains mappings between this identifier and other kinds of identifiers. The most common interface for retrieving data is the select method.
## First build your own annotation package and load it
data(ath)
fromOwn(geneinfo = ath, install = TRUE)
## Please make sure you have Gene Ontology and KEGG pathway
## or KO data.frame ready.
## Creating package in /var/folders/p_/0q7dys0x1g53nw6wypk8_qgm0000gn/T//RtmpLyWnIU/org.species.eg.db
library(org.species.eg.db)
Four common methods work together to provide the select interface. The first is columns, which shows which kinds of annotation can be extracted from the package.
columns(org.species.eg.db)
## [1] "ENTREZID" "EVIDENCE" "EVIDENCEALL" "GID" "GO"
## [6] "GOALL" "ONTOLOGY" "ONTOLOGYALL" "PATH"
The next method is keytypes, which tells you the kinds of things that can be used as keys.
keytypes(org.species.eg.db)
## [1] "ENTREZID" "EVIDENCE" "EVIDENCEALL" "GID" "GO"
## [6] "GOALL" "ONTOLOGY" "ONTOLOGYALL" "PATH"
The third method is keys, which retrieves all the viable keys of a particular type.
key <- keys(org.species.eg.db, keytype = "ENTREZID")
Finally, select extracts data using the values supplied by the other methods.
result <- select(org.species.eg.db, keys = key,
                 columns = c("GID", "GO", "PATH"), keytype = "ENTREZID")
## 'select()' returned 1:1 mapping between keys and columns
head(result)
## ENTREZID GID GO PATH
## 1 10723018 AT4G37553 GO:0008150 01100
## 2 10723019 AT1G27045 GO:0008150 01100
## 3 10723020 AT2G41231 GO:0008150 01100
## 4 10723022 AT5G01542 GO:0008150 01100
## 5 10723023 AT1G24095 GO:0008150 01100
## 6 10723024 AT3G02832 GO:0008150 01100
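The four methods can be combined into one defensive query. This is a minimal sketch, assuming the org.species.eg.db package built above is installed; the names db, wanted, ids and res are ours. It checks the requested columns and key type before calling select, and uses the AnnotationDbi:: prefix because other packages, dplyr for example, also export a function called select.

```r
library(AnnotationDbi)
library(org.species.eg.db)

db <- org.species.eg.db

## Columns we would like to retrieve (illustrative choice).
wanted <- c("GID", "GO", "PATH")

## Fail early if the package does not offer these columns or key type.
stopifnot(all(wanted %in% columns(db)))
stopifnot("ENTREZID" %in% keytypes(db))

## Query only the first few keys to keep the result small.
ids <- head(keys(db, keytype = "ENTREZID"), 5)
res <- AnnotationDbi::select(db, keys = ids, columns = wanted,
                             keytype = "ENTREZID")
res
```

Checking against columns and keytypes first turns a misspelled column name into an immediate, readable error rather than a failure inside select.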
Users can also use mapIds to extract the KEGG pathway for each gene identifier from the annotation package.
KEGG <- mapIds(org.species.eg.db, keys = key, column = "PATH", keytype = "ENTREZID")
head(KEGG)
## 10723018 10723019 10723020 10723022 10723023 10723024
## "01100" "01100" "01100" "01100" "01100" "01100"
mapIds can also be used for ID conversion:
mapIds(org.species.eg.db, keys = key[1:10], column = "GID", keytype = "ENTREZID")
## 10723018 10723019 10723020 10723022 10723023 10723024
## "AT4G37553" "AT1G27045" "AT2G41231" "AT5G01542" "AT1G24095" "AT3G02832"
## 10723025 10723026 10723027 10723028
## "AT4G39838" "AT1G12855" "AT4G22758" "AT3G57965"
The versions of R and of the packages loaded when generating this vignette were:
sessionInfo()
## R version 4.0.2 (2020-06-22)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Catalina 10.15.7
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## attached base packages:
## [1] parallel stats4 stats graphics grDevices utils datasets
## [8] methods base
##
## other attached packages:
## [1] org.species.eg.db_0.0.1 AnnotationDbi_1.50.3 IRanges_2.22.2
## [4] S4Vectors_0.26.1 Biobase_2.48.0 BiocGenerics_0.34.0
## [7] bioAnno_0.99.32 colorout_1.2-2
##
## loaded via a namespace (and not attached):
## [1] Rcpp_1.0.5 GO.db_3.11.4
## [3] prettyunits_1.1.1 png_0.1-7
## [5] Biostrings_2.56.0 assertthat_0.2.1
## [7] digest_0.6.25 mime_0.9
## [9] BiocFileCache_1.12.1 R6_2.4.1
## [11] RSQLite_2.2.0 evaluate_0.14
## [13] httr_1.4.2 pillar_1.4.6
## [15] zlibbioc_1.34.0 rlang_0.4.7
## [17] progress_1.2.2 curl_4.3
## [19] data.table_1.13.0 blob_1.2.1
## [21] R.utils_2.10.1 R.oo_1.24.0
## [23] rmarkdown_2.3 AnnotationHub_2.20.2
## [25] stringr_1.4.0 RCurl_1.98-1.2
## [27] bit_4.0.4 biomaRt_2.44.1
## [29] shiny_1.5.0 compiler_4.0.2
## [31] httpuv_1.5.4 xfun_0.17
## [33] askpass_1.1 pkgconfig_2.0.3
## [35] htmltools_0.5.0 openssl_1.4.2
## [37] tidyselect_1.1.0 KEGGREST_1.28.0
## [39] tibble_3.0.3 interactiveDisplayBase_1.26.3
## [41] XML_3.99-0.5 AnnotationForge_1.30.1
## [43] crayon_1.3.4 dplyr_1.0.2
## [45] dbplyr_1.4.4 later_1.1.0.1
## [47] bitops_1.0-6 R.methodsS3_1.8.1
## [49] rappdirs_0.3.1 jsonlite_1.7.1
## [51] xtable_1.8-4 lifecycle_0.2.0
## [53] DBI_1.1.0 magrittr_1.5
## [55] stringi_1.5.3 XVector_0.28.0
## [57] promises_1.1.1 ellipsis_0.3.1
## [59] generics_0.0.2 vctrs_0.3.4
## [61] tools_4.0.2 bit64_4.0.5
## [63] glue_1.4.2 purrr_0.3.4
## [65] BiocVersion_3.11.1 hms_0.5.3
## [67] fastmap_1.0.1 yaml_2.2.1
## [69] BiocManager_1.30.10 memoise_1.1.0
## [71] knitr_1.29