INSTALACIÓN DE PAQUETERÍA

## 
## Attaching package: 'dplyr'

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

1. IMPORTANDO BASE DE DATOS

Inicalmente vamos a trabajar con la base de datos all de la semana 1. El archivo se llama Rn5K_08_15_all.txt.

w1_all <- read.delim("C:/Users/acard/Downloads/R_project/MICROARREGLO_NAT/DATA/Work Semana 1/w1_all.txt")
kable(head(w1_all,20), caption = "BASE DE DATOS SEMANA 1")

BASE DE DATOS SEMANA 1
Cy3	Cy5	Id	Symbol	Zscore
19.0	15.0	AB008889	Trpc4	-2.0495676
17.0	19.0	AF016182		0.9753379
18.0	18.0	BG375351		0.0074636
21.0	16.5	AJ223599	Kif3c	-2.0911042
20.3	17.4	AF023657	Enman	-1.3339400
19.9	18.0	NM_019315	Kcnn3	-0.8657543
18.0	20.4	U39208		1.0966206
17.0	22.0	AW140637		2.2510665
18.0	21.0	NM_031542	Brca2	1.3488673
21.0	19.0	AB010467	Abcc3	-0.8634525
20.5	21.5	NM_019368	Bet1l	0.4219181
23.0	20.3	NM_019274	Colq	-1.0791715
23.0	22.0	NM_031087	Psen2	-0.3793511
24.0	23.2	BF549571		-0.2875443
23.5	23.9	NM_012581	Hoxa2	0.1543346
22.4	25.5	L22294	Pdk1	1.1353820
28.0	28.0	M84000	Pzp	0.0074636
27.0	30.5	AW918595		1.0681365
29.4	29.2	NM_031332	LOC83500	-0.0519352
27.7	31.4	L07073	Ap3m1	1.0984684

2. ANALISIS DE LOS GENES ANOTADOS CON SIMBOLOS

Esta base de datos tiene 5 columnas

Cy3: que contiene la fluorescencia en el color Cy3.
Cy5: que contiene la fluorescencia en el color Cy5.
Id: que contiene los códigos de identificación para cada uno de los genes/clones amplificados.
Symbol: que contiene los simbolos de los genes (muchos de ellos no tienen esta anotación.
Zscore: contiene el valor de la expresión diferencial escalado en Zscore.

Esta base de datos tiene una dimensión de 5593 registros/genes/clones y 5 variables/columnas

gene.data <- w1_all
dim(gene.data)

## [1] 5593    5

La cantidad de genes anotados con símbolos es de solo 22% aproximadamente

df <- data.frame(anotacion = c("con simbolo", "sin simbolo"),
                 valor = c(length(which(gene.data$Symbol == "")),
                           length(which(gene.data$Symbol != "")))
                 )
df <- df %>%  
mutate(prop = round(valor / nrow(gene.data) *100,2))
  
bp<- ggplot(df, aes(x="", y=prop, fill=anotacion))+
  geom_bar(width = 1, stat = "identity")+
  coord_polar("y", start=0)

bp

kable(df)

anotacion	valor	prop
con simbolo	1230	21.99
sin simbolo	4363	78.01

Por esta razón, y porque los paquetes habitualmente usan la codificación ENTREZ GENE ID para hacer los análisis de enriquecimiento vamos a agregar las anotaciones con las bases de datos org.Rn.eg.db

3. ANOTACIÓN DE LOS CÓDIGOS ENTREZ GENE ID

Primero tenemos que importar la base de datos específica para rata

if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

BiocManager::install("org.Rn.eg.db")

Una vez que se instala esta base de datos, se carga al entorno de trabajo y con la función columns() podemos ver que terminos podemos asociar

library(org.Rn.eg.db)

## Loading required package: AnnotationDbi

## Warning: package 'AnnotationDbi' was built under R version 4.0.3

## Loading required package: stats4

## Loading required package: BiocGenerics

## Warning: package 'BiocGenerics' was built under R version 4.0.3

## Loading required package: parallel

## 
## Attaching package: 'BiocGenerics'

## The following objects are masked from 'package:parallel':
## 
##     clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
##     clusterExport, clusterMap, parApply, parCapply, parLapply,
##     parLapplyLB, parRapply, parSapply, parSapplyLB

## The following objects are masked from 'package:dplyr':
## 
##     combine, intersect, setdiff, union

## The following objects are masked from 'package:stats':
## 
##     IQR, mad, sd, var, xtabs

## The following objects are masked from 'package:base':
## 
##     anyDuplicated, append, as.data.frame, basename, cbind, colnames,
##     dirname, do.call, duplicated, eval, evalq, Filter, Find, get, grep,
##     grepl, intersect, is.unsorted, lapply, Map, mapply, match, mget,
##     order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank,
##     rbind, Reduce, rownames, sapply, setdiff, sort, table, tapply,
##     union, unique, unsplit, which.max, which.min

## Loading required package: Biobase

## Warning: package 'Biobase' was built under R version 4.0.3

## Welcome to Bioconductor
## 
##     Vignettes contain introductory material; view with
##     'browseVignettes()'. To cite Bioconductor, see
##     'citation("Biobase")', and for packages 'citation("pkgname")'.

## Loading required package: IRanges

## Warning: package 'IRanges' was built under R version 4.0.3

## Loading required package: S4Vectors

## Warning: package 'S4Vectors' was built under R version 4.0.3

## 
## Attaching package: 'S4Vectors'

## The following objects are masked from 'package:dplyr':
## 
##     first, rename

## The following object is masked from 'package:base':
## 
##     expand.grid

## 
## Attaching package: 'IRanges'

## The following objects are masked from 'package:dplyr':
## 
##     collapse, desc, slice

## The following object is masked from 'package:grDevices':
## 
##     windows

## 
## Attaching package: 'AnnotationDbi'

## The following object is masked from 'package:dplyr':
## 
##     select

##

columns(org.Rn.eg.db)

##  [1] "ACCNUM"       "ALIAS"        "ENSEMBL"      "ENSEMBLPROT"  "ENSEMBLTRANS"
##  [6] "ENTREZID"     "ENZYME"       "EVIDENCE"     "EVIDENCEALL"  "GENENAME"    
## [11] "GO"           "GOALL"        "IPI"          "ONTOLOGY"     "ONTOLOGYALL" 
## [16] "PATH"         "PFAM"         "PMID"         "PROSITE"      "REFSEQ"      
## [21] "SYMBOL"       "UNIGENE"      "UNIPROT"

Estas diferentes columnas nos dicen que terminos podemos anotar, como por ejemplo

Nombre	definición
“ACCNUM”	Número de acceso del gen
“ENSEMBL”	Código Ensembl
“REFSEQ”	Código de refseq de NCBI
“ENTREZID”	Código Entrez gene id

Como esta base de datos tiene tantos valores de “REFSEQ”, como de “ACCNUM” y GenBank (este último no definido en la base de datos utilizada), vamos a ir anotando uno por uno estos datos.

3.1 Anotacion de códigos REFSEQ

Vamos a crear un objeto temporal temp para ir agregando estos términos

temp <-AnnotationDbi::select(org.Rn.eg.db,
              keys=as.character(gene.data$Id),
              columns="ENTREZID",
              keytype="REFSEQ")

## 'select()' returned 1:1 mapping between keys and columns

temp2 <- temp[is.na(temp$ENTREZID),]
temp <- temp[complete.cases(temp),]
names(temp) <- c("Id", "ENTREZID")
dim(temp)

## [1] 2731    2

kable(head(temp,10))

	Id	ENTREZID
6	NM_019315	54263
9	NM_031542	360254
11	NM_019368	54400
12	NM_019274	29755
13	NM_031087	81751
15	NM_012581	103690123
19	NM_031332	83500
21	NM_017267	29635
26	NM_013105	25642
28	NM_031801	83632

Con esto se logra anotar 2731 de los 5593 genes anotados para el microarreglo.

3.2 Anotación de códigos ACCNUM

Algunos del resto de genes pueden ser anotados con el término ACCUM. Para ello vamos a utilizar los genes no anotados anteriormente (temp2), y sobre generar el nuevo data.frame con las anotaciones.

names(temp2) <- c("Id", "ENTREZID")
temp2 <-AnnotationDbi::select(org.Rn.eg.db,
              keys=as.character(temp2$Id),
              columns="ENTREZID",
              keytype="ACCNUM")

## 'select()' returned 1:1 mapping between keys and columns

temp2 <- temp2[complete.cases(temp2),]
names(temp2) <- c("Id", "ENTREZID")
dim(temp2)

## [1] 1760    2

kable(head(temp2,10))

	Id	ENTREZID
1	AB008889	84494
2	AF016182	286914
4	AJ223599	85248
6	U39208	266689
8	AB010467	140668
10	L22294	116551
11	M84000	252922
13	L07073	171126
15	AF010466	25712
16	L40030	94203

Con esto se logra anotar 1760 genes, que junto con los 2731 anotados anteriormente ya se tiene 4491 de los 5593 genes anotados para el microarreglo.

3.3 Agregando ENTREZID a la base de datos gene.data

Ahora agregaremos los resultados a la base de datos gene.data. Primero fusionamos los dos archivos temporales generados anteriormente (temp, temp2), y luego lo fusionamos con la base de datos gene.data con la función merge() sin eliminar los que aún no tinene ENTREZID asociados

temp <- rbind(temp,temp2)
gene.data <- merge(x = gene.data, y=temp, by = "Id", all = T)
kable(head(gene.data,20), caption = "BASE DE DATOS con ENTREZID")

BASE DE DATOS con ENTREZID
Id	Cy3	Cy5	Zscore	ENTREZID
AA799713	28538.8	28914.0	-0.0650482	NA
AA800019	8004.7	8665.5	1.3238440	NA
AA800172	5058.3	4891.9	-0.2539321	NA
AA818018	31303.2	28960.7	-0.6334669	NA
AA818786	396.0	421.0	0.4746041	NA
AA819151	3979.6	3806.5	-0.4492376	NA
AA850135	4263.8	4096.8	-0.5941062	NA
AA859006	13511.0	10860.9	-1.9669838	NA
AA874881	2167.2	2103.9	-0.4867235	64622
AA875305	275.3	232.0	-1.1845871	NA
AA892272	4244.5	4394.0	0.4663000	NA
AA894165	377.2	374.3	-0.0112549	NA
AA899618	32367.4	30951.9	-0.4526110	NA
AA899923	4122.3	4449.5	0.9817978	NA
AA923857	8109.1	7909.9	-0.3251830	NA
AA924509	109.9	117.9	0.6862849	502374
AA925587	8777.5	7869.0	-1.6998069	NA
AA943126	4805.0	5105.7	0.9408647	NA
AA943594	13860.7	14181.4	0.2731025	NA
AA944017	4619.0	5185.6	1.4256890	NA

Como vemos aún faltan genes para anotar los ENTREZID, que si los cuantificamos con la siguiente linea de código, corresponde a

nrow(gene.data[is.na(gene.data$ENTREZID),])

## [1] 1102

a 1102 genes que aún no se han anotado. Para tratar de no perder estos genes, vamos a tratar de anotarlos utilizando su SIMBOLO.

3.4 Anotación usando simbolos

Primero hacemos un subgrupo de los genes que aún no han sido anotados

temp <- gene.data[is.na(gene.data$ENTREZID),]

Ahora usamos la columna “SYMB” para hacer la anotación

temp <-AnnotationDbi::select(org.Rn.eg.db,
              keys=as.character(temp$Symbol),
              columns="ENTREZID",
              keytype="SYMBOL")

## 'select()' returned many:1 mapping between keys and columns

temp <- temp[!is.na(temp$ENTREZID),]
kable(head(temp,20))

	SYMBOL	ENTREZID
38	Smarcd2	83833
43	Kcnj13	94341
44	Syngap1	192117
48	Slc2a8	85256
49	Cpb2	113936
50	Sval1	680174
55	Grin2a	24409
56	Cdh14	25198
57	Igfals	79438
60	Itgae	83577
63	Gfap	24387
72	Stx12	65033
73	Cdc20	64515
89	Ogfr	83525
92	Egfr	24329
96	Tert	301965
97	Kcnj16	29719
98	Sacm1l	116482
100	Sry	25221
103	Adra2b	24174

dim(temp)

## [1] 81  2

Con esta estrategia logramos anotar 81 genes mas. Ahora agregaremos estos datos a la base de datos gene.data

for (i in 1:nrow(temp)){
  gene.data[which(gene.data$Symbol==temp$SYMBOL[i]),6] <- temp$ENTREZID[i]
}

3.5 Analizando genes anotados

Ahora vamos a revisar cuantos genes han sido anotados

df <- data.frame(anotacion = c("con ENTREZ", "sin ENTREZ"),
                 valor = c(nrow(gene.data[!is.na(gene.data$ENTREZID),]),
                           nrow(gene.data[is.na(gene.data$ENTREZID),]))
                 )
df <- df %>%  
mutate(prop = round(valor / nrow(gene.data) *100,2))
  
bp<- ggplot(df, aes(x="", y=prop, fill=anotacion))+
  geom_bar(width = 1, stat = "identity")+
  coord_polar("y", start=0)

bp

kable(df)

anotacion	valor	prop
con ENTREZ	4572	81.75
sin ENTREZ	1021	18.25

Aún tenemos una perdida de casi el 14% de genes sin anotar. Si buscamos estos Ids faltantes con la herramienta en linea DAVID, tampoco logramos conseguir anotaciones.

4. ANÁLISIS DE DATOS DUPLICADOS

Ahora, aprovechand que ya tenemos los códigos ENTREZ GENE ID, vamos a ver que genes están anotados de forma duplicada. Primero eliminaremos los datos sin código ENTREZ GENE ID, porque todos ellos serían duplicados, para luego usar la función duplicated() podemos encontrar los duplicados.

gene.data.fil <- gene.data[!is.na(gene.data$ENTREZID),]
duplicados <- gene.data.fil[duplicated(gene.data.fil$ENTREZID),]
dim(duplicados)

## [1] 73  6

dim(gene.data.fil)

## [1] 4572    6

De los 5593 genes iniciales, anotamos 4491 (85%) de los genes, de estos 1833 están duplicados, quedando 2658 genes no duplicados

df <- data.frame(grupo = c("No anotados", "Duplicados", "Unicos"),
                 valor = c(1102, 1833, 2658)
                 )
df <- df %>%  
mutate(prop = round((valor/5593) *100,2))
  
bp<- ggplot(df, aes(x="", y=prop, fill=grupo))+
  geom_bar(width = 1, stat = "identity")+
  coord_polar("y", start=0)

bp

kable(df)

grupo	valor	prop
No anotados	1102	19.70
Duplicados	1833	32.77
Unicos	2658	47.52

Ahora vamos a analizar algunos de estos duplicados

kable(head(duplicados,10), caption = "DUPLICADOS")

DUPLICADOS
	Id	Cy3	Cy5	Symbol	Zscore	ENTREZID
145	AB032828	7922.4	7892.5	Epb4.1l3	-0.0382058	116724
386	AF071205	20689.9	23120.5	LOC60428	1.2469631	60427
411	AF084544	2595.4	2544.8		-0.2622435	114122
692	AF277892	25253.1	25523.3	Gnb2	0.1004706	81667
722	AF302842	4124.3	3663.8	Kcnk2	-1.6562332	116489
756	AF322217	4768.3	4868.5		0.2638374	302822
1284	AY125814	31721.0	31417.1		-0.1547005	292657
2030	L19109	4342.7	4394.3		0.1951188	25022
2052	L24776	1290.9	1476.4	Tpm3	1.7900972	117557
2083	L41045	1522.7	1419.4	Pde1c	-1.4858760	81742

Vamos a buscar el gen mas repetido

kable(head(sort(table(duplicados$ENTREZID),decreasing = T)))

Var1	Freq
117557	2
114122	1
114247	1
116489	1
116561	1
116591	1

Este correspondería al código ENTREZID “117557”. Vamos a buscar en la base de datos a que genes se anoto este ENTREZ GENE ID

gene.data[which(gene.data$ENTREZID == "117557"),]

##            Id     Cy3     Cy5 Symbol     Zscore ENTREZID
## 345  AF053359   912.5   931.8         0.3236289   117557
## 2052   L24776  1290.9  1476.4   Tpm3  1.7900972   117557
## 5454   X72859 11534.3 10982.0        -0.4973582   117557

Efectivamente estos tres Ids corresponden al mismo gen de tropomiosina.

Uno hace referencia al gen completo y los otros a fragmentos o isoformas

5. ANOTACIÓN DE GENES CON BIOMART

Primero creamos el objeto Mart para Rattus norvegicus

library(biomaRt)

## Warning: package 'biomaRt' was built under R version 4.0.3

biomaRt::listMarts()

##                biomart                version
## 1 ENSEMBL_MART_ENSEMBL      Ensembl Genes 102
## 2   ENSEMBL_MART_MOUSE      Mouse strains 102
## 3     ENSEMBL_MART_SNP  Ensembl Variation 102
## 4 ENSEMBL_MART_FUNCGEN Ensembl Regulation 102

ensembl <- biomaRt::useMart("ENSEMBL_MART_ENSEMBL")
head(biomaRt::listDatasets(mart = ensembl))

##                      dataset                                  description
## 1   acalliptera_gene_ensembl             Eastern happy genes (fAstCal1.2)
## 2 acarolinensis_gene_ensembl               Anole lizard genes (AnoCar2.0)
## 3  acchrysaetos_gene_ensembl              Golden eagle genes (bAquChr1.2)
## 4  acitrinellus_gene_ensembl               Midas cichlid genes (Midas_v5)
## 5  amelanoleuca_gene_ensembl                        Panda genes (ailMel1)
## 6    amexicanus_gene_ensembl Mexican tetra genes (Astyanax_mexicanus-2.0)
##                  version
## 1             fAstCal1.2
## 2              AnoCar2.0
## 3             bAquChr1.2
## 4               Midas_v5
## 5                ailMel1
## 6 Astyanax_mexicanus-2.0

biomaRt::searchDatasets(mart = ensembl, pattern = "rnorvegicus")

##                      dataset          description  version
## 160 rnorvegicus_gene_ensembl Rat genes (Rnor_6.0) Rnor_6.0

ensembl <- biomaRt::useMart("ensembl", dataset = "rnorvegicus_gene_ensembl")
ensembl

## Object of class 'Mart':
##   Using the ENSEMBL_MART_ENSEMBL BioMart database
##   Using the rnorvegicus_gene_ensembl dataset

5.1 Anotación de terminos GO

Ahora agregamos los términos GO usando la lista de ENTREZ GEN ID que generamos anteiormete.

head(gene.data.fil)

##          Id     Cy3     Cy5    Symbol     Zscore ENTREZID
## 9  AA874881  2167.2  2103.9           -0.4867235    64622
## 16 AA924509   109.9   117.9            0.6862849   502374
## 40 AB000199  4825.7  4055.9      Cca2 -2.8366111   246211
## 41 AB000215 20732.7 21098.9     Smpd3  0.3974930    94338
## 42 AB000216 12556.3 13169.6      Cca3  0.3892495   171440
## 43 AB000280  9452.5  9306.7 LOC246280 -0.2360655   246280

biomaRt::searchFilters(mart = ensembl, pattern = "entrez")

##                          name
## 17 with_entrezgene_trans_name
## 29            with_entrezgene
## 64      entrezgene_trans_name
## 78       entrezgene_accession
## 79              entrezgene_id
##                                                 description
## 17                    With EntrezGene transcript name ID(s)
## 29               With NCBI gene (formerly Entrezgene) ID(s)
## 64        EntrezGene transcript name ID(s) [e.g. Foxb2-202]
## 78 NCBI gene (formerly Entrezgene) accession(s) [e.g. A1bg]
## 79   NCBI gene (formerly Entrezgene) ID(s) [e.g. 100034253]

goannot <- biomaRt::getBM(attributes = c("entrezgene_id", "go_id", "name_1006"),
                          filters = "entrezgene_id", 
                          values = gene.data.fil$ENTREZID,
                          mart = ensembl)

kable(head(goannot,20))

entrezgene_id	go_id	name_1006
100134871	GO:0046872	metal ion binding
100134871	GO:0020037	heme binding
100134871	GO:0005344	oxygen carrier activity
100134871	GO:0019825	oxygen binding
100134871	GO:0015671	oxygen transport
100134871	GO:0098869	cellular oxidant detoxification
100134871	GO:0004601	peroxidase activity
100134871	GO:0031720	haptoglobin binding
100134871	GO:0043177	organic acid binding
100134871	GO:0042744	hydrogen peroxide catabolic process
100134871	GO:0005833	hemoglobin complex
100134871	GO:0031838	haptoglobin-hemoglobin complex
100134871	GO:0031721	hemoglobin alpha binding
100188944	GO:0016020	membrane
100188944	GO:0016021	integral component of membrane
100188944	GO:0032225	regulation of synaptic transmission, dopaminergic
100188944	GO:0042053	regulation of dopamine metabolic process
100188944	GO:0046929	negative regulation of neurotransmitter secretion
100188944	GO:0050884	neuromuscular process controlling posture
100188944	GO:0016787	hydrolase activity

5.2 Anotación de terminos KEGG

Para

pathannot<- AnnotationDbi::select(org.Rn.eg.db,
                                  keys = as.character(gene.data.fil$ENTREZID),
                                  columns = "PATH",
                                  keytype = "ENTREZID")

## 'select()' returned many:many mapping between keys and columns

kable(head(pathannot,20))

ENTREZID	PATH
64622	NA
502374	00230
502374	00240
502374	01100
502374	03020
502374	05016
246211	00120
246211	01100
94338	00600
94338	01100
171440	NA
246280	NA
81826	NA
84609	04360
29744	04360
85271	NA
171134	NA
116718	04146
89789	NA
113959	04080

5.3 Análisis de sobrerrepresentación

5.3.1 Prueba hipergeométrica usando el paquete GOstats

Primero debemos definir los genes de interes (+/- 2Zscore)

target.gene <- subset(x = gene.data.fil, subset = (Zscore <= -2)|(Zscore >=2))

Primero cargamos el paquete GOstats

## Warning: package 'GOstats' was built under R version 4.0.3

## Loading required package: Category

## Warning: package 'Category' was built under R version 4.0.3

## Loading required package: Matrix

## 
## Attaching package: 'Matrix'

## The following object is masked from 'package:S4Vectors':
## 
##     expand

## Loading required package: graph

## Warning: package 'graph' was built under R version 4.0.3

##

## 
## Attaching package: 'GOstats'

## The following object is masked from 'package:AnnotationDbi':
## 
##     makeGOGraph

Ahora que ya tenemos definido los genes objetivos podemos crear el objeto

paramsGO <- new("GOHyperGParams", 
                geneIds = target.gene$ENTREZID,
                universeGeneIds = gene.data.fil$ENTREZID,
                annotation = "org.Rn.eg.db", 
                ontology = "BP", 
                pvalueCutoff = 0.01, 
                conditional = FALSE, 
                testDirection = "over")

## Warning in makeValidParams(.Object): removing duplicate IDs in universeGeneIds

Over.GO <- hyperGTest(paramsGO)
kable(head(summary(Over.GO),20), caption = "GO TERMS")

GO TERMS
GOBPID	Pvalue	OddsRatio	ExpCount	Count	Size	Term
GO:0033084	0.0001451	27.721393	0.3271028	4	7	regulation of immature T cell proliferation in thymus
GO:0033080	0.0002796	20.786070	0.3738318	4	8	immature T cell proliferation in thymus
GO:0033083	0.0002796	20.786070	0.3738318	4	8	regulation of immature T cell proliferation
GO:0033079	0.0004848	16.624876	0.4205607	4	9	immature T cell proliferation
GO:0038007	0.0009379	31.039604	0.2336449	3	5	netrin-activated signaling pathway
GO:0043407	0.0009649	5.244949	1.6355140	7	35	negative regulation of MAP kinase activity
GO:0009153	0.0021734	Inf	0.0934579	2	2	purine deoxyribonucleotide biosynthetic process
GO:0018960	0.0021734	Inf	0.0934579	2	2	4-nitrophenol metabolic process
GO:0072191	0.0021734	Inf	0.0934579	2	2	ureter smooth muscle development
GO:0072193	0.0021734	Inf	0.0934579	2	2	ureter smooth muscle cell differentiation
GO:0072198	0.0021734	Inf	0.0934579	2	2	mesenchymal cell proliferation involved in ureter development
GO:0072199	0.0021734	Inf	0.0934579	2	2	regulation of mesenchymal cell proliferation involved in ureter development
GO:0009123	0.0023372	6.509375	0.9813084	5	21	nucleoside monophosphate metabolic process
GO:0001906	0.0028932	2.654948	5.4672897	13	117	cell killing
GO:0002320	0.0030609	15.512376	0.3271028	3	7	lymphoid progenitor cell differentiation
GO:0009151	0.0030609	15.512376	0.3271028	3	7	purine deoxyribonucleotide metabolic process
GO:0071315	0.0030609	15.512376	0.3271028	3	7	cellular response to morphine
GO:0033081	0.0032013	8.302488	0.6542056	4	14	regulation of T cell differentiation in thymus
GO:1901381	0.0035812	5.783333	1.0747664	5	23	positive regulation of potassium ion transmembrane transport
GO:0003184	0.0047298	12.406931	0.3738318	3	8	pulmonary valve morphogenesis

Finalmente haremos estee análisis para las vías KEGG

paramsKEGG <- new("KEGGHyperGParams", 
                  geneIds = target.gene$ENTREZID,
                  universeGeneIds = gene.data$ENTREZID,
                  annotation = "org.Rn.eg.db",
                  pvalueCutoff = 0.05, 
                  testDirection ="over")

## Warning in makeValidParams(.Object): removing duplicate IDs in universeGeneIds

Over.KEGG <- hyperGTest(paramsKEGG)
kable(head(summary(Over.KEGG),20), caption = "KEEG PATHWAY")

## 
## KEGG.db contains mappings based on older data because the original
##   resource was removed from the the public domain before the most
##   recent update was produced. This package should now be considered
##   deprecated and future versions of Bioconductor may not have it
##   available.  Users who want more current data are encouraged to look
##   at the KEGGREST or reactome.db packages

KEEG PATHWAY
KEGGID	Pvalue	OddsRatio	ExpCount	Count	Size	Term
04330	0.0058763	6.631884	0.7603168	4	19	Notch signaling pathway
00603	0.0213993	12.228723	0.2401000	2	6	Glycosphingolipid biosynthesis - globo series
00051	0.0377866	4.610887	0.7603168	3	19	Fructose and mannose metabolism

5.3.2 Analisis ORA usando paquete topGO

Primero cargamos el paquete

# BiocManager::install("topGO")
library(topGO)

## Warning: package 'topGO' was built under R version 4.0.3

## Loading required package: SparseM

## 
## Attaching package: 'SparseM'

## The following object is masked from 'package:base':
## 
##     backsolve

## 
## groupGOTerms:    GOBPTerm, GOMFTerm, GOCCTerm environments built.

## 
## Attaching package: 'topGO'

## The following object is masked from 'package:GOstats':
## 
##     testName

## The following objects are masked from 'package:Category':
## 
##     ontology, ontology<-, testName

## The following object is masked from 'package:IRanges':
## 
##     members

Generamos la tabla con los términos usando la base de datos org.Rn.eg.db

goannot2<-select(org.Rn.eg.db,
                 keys = as.character(gene.data.fil$ENTREZID),
                 columns="GO",
                 keytype="ENTREZID")

## 'select()' returned many:many mapping between keys and columns

kable(head(goannot2,20), caption = "GO TERMS")

GO TERMS
ENTREZID	GO	EVIDENCE	ONTOLOGY
64622	GO:0003677	IEA	MF
64622	GO:0003682	ISO	MF
64622	GO:0005507	IDA	MF
64622	GO:0005615	IDA	CC
64622	GO:0005615	ISO	CC
64622	GO:0005634	IDA	CC
64622	GO:0005737	IDA	CC
64622	GO:0007614	IDA	BP
64622	GO:0009743	IEP	BP
64622	GO:0010035	IEP	BP
64622	GO:0010629	IDA	BP
64622	GO:0010835	IDA	BP
64622	GO:0010976	IDA	BP
64622	GO:0019934	IDA	BP
64622	GO:0030424	IDA	CC
64622	GO:0030425	IDA	CC
64622	GO:0031668	IEP	BP
64622	GO:0032091	IDA	BP
64622	GO:0032147	IDA	BP
64622	GO:0033484	IDA	BP

Convertimos estos datos en una estructura aceptable para topGO

a <- 1
gene2GO<-c()
genevec <- c()
for (genid in gene.data.fil$ENTREZID){
  genevec[a] <- genid
  gene2GO[a] <- list(goannot2[goannot2$ENTREZID==genid,"GO"])
  a <- a+1
}
names(gene2GO)<-genevec
head(gene2GO)

## $`64622`
##  [1] "GO:0003677" "GO:0003682" "GO:0005507" "GO:0005615" "GO:0005615"
##  [6] "GO:0005634" "GO:0005737" "GO:0007614" "GO:0009743" "GO:0010035"
## [11] "GO:0010629" "GO:0010835" "GO:0010976" "GO:0019934" "GO:0030424"
## [16] "GO:0030425" "GO:0031668" "GO:0032091" "GO:0032147" "GO:0033484"
## [21] "GO:0042277" "GO:0043025" "GO:0043524" "GO:0043524" "GO:0044849"
## [26] "GO:0045773" "GO:0048487" "GO:0050731" "GO:0050805" "GO:0051965"
## [31] "GO:0003677" "GO:0003682" "GO:0005507" "GO:0005615" "GO:0005615"
## [36] "GO:0005634" "GO:0005737" "GO:0007614" "GO:0009743" "GO:0010035"
## [41] "GO:0010629" "GO:0010835" "GO:0010976" "GO:0019934" "GO:0030424"
## [46] "GO:0030425" "GO:0031668" "GO:0032091" "GO:0032147" "GO:0033484"
## [51] "GO:0042277" "GO:0043025" "GO:0043524" "GO:0043524" "GO:0044849"
## [56] "GO:0045773" "GO:0048487" "GO:0050731" "GO:0050805" "GO:0051965"
## 
## $`502374`
## [1] "GO:0005634" "GO:0005665" "GO:0006366"
## 
## $`246211`
##  [1] "GO:0001558" "GO:0003854" "GO:0005789" "GO:0005811" "GO:0005811"
##  [6] "GO:0006694" "GO:0006707" "GO:0008206" "GO:0016021" "GO:0016491"
## [11] "GO:0016616" "GO:0016853" "GO:0035754" "GO:0035754" "GO:0035754"
## [16] "GO:0043231" "GO:0047016" "GO:0047016" "GO:0047016" "GO:0055114"
## [21] "GO:0097421"
## 
## $`94338`
##   [1] "GO:0000137" "GO:0000137" "GO:0000139" "GO:0001501" "GO:0001503"
##   [6] "GO:0001932" "GO:0001932" "GO:0001958" "GO:0001958" "GO:0002063"
##  [11] "GO:0002244" "GO:0002244" "GO:0002685" "GO:0002685" "GO:0003433"
##  [16] "GO:0003433" "GO:0004620" "GO:0004767" "GO:0004767" "GO:0005515"
##  [21] "GO:0005576" "GO:0005737" "GO:0005737" "GO:0005794" "GO:0005886"
##  [26] "GO:0005886" "GO:0006665" "GO:0006672" "GO:0006672" "GO:0006684"
##  [31] "GO:0006684" "GO:0006685" "GO:0006685" "GO:0007165" "GO:0007568"
##  [36] "GO:0014824" "GO:0015774" "GO:0015774" "GO:0016020" "GO:0016021"
##  [41] "GO:0030072" "GO:0030072" "GO:0030282" "GO:0030282" "GO:0030324"
##  [46] "GO:0030509" "GO:0030509" "GO:0032963" "GO:0032963" "GO:0034614"
##  [51] "GO:0035264" "GO:0035264" "GO:0042802" "GO:0042802" "GO:0043491"
##  [56] "GO:0043491" "GO:0045429" "GO:0045840" "GO:0045840" "GO:0046872"
##  [61] "GO:0048008" "GO:0048008" "GO:0048286" "GO:0048286" "GO:0048661"
##  [66] "GO:0048661" "GO:0051216" "GO:0051481" "GO:0060348" "GO:0060541"
##  [71] "GO:0061035" "GO:0061035" "GO:0061751" "GO:0061751" "GO:0070301"
##  [76] "GO:0070301" "GO:0070314" "GO:0070314" "GO:0071286" "GO:0071286"
##  [81] "GO:0071347" "GO:0071356" "GO:0071356" "GO:0071461" "GO:0071461"
##  [86] "GO:0071897" "GO:0071897" "GO:0085029" "GO:0085029" "GO:0090494"
##  [91] "GO:0090520" "GO:0090520" "GO:0097187" "GO:0097187" "GO:0098868"
##  [96] "GO:0098868" "GO:0140014" "GO:0140014" "GO:0140052" "GO:0140052"
## [101] "GO:1900125" "GO:1900126" "GO:1900126" "GO:1901224" "GO:1901653"
## [106] "GO:1901653" "GO:1903543" "GO:2000304"
## 
## $`171440`
## [1] "GO:0005654" "GO:0005654" "GO:0046982" "GO:0097237" "GO:0097237"
## 
## $`246280`
##  [1] "GO:0005290" "GO:0005290" "GO:0006857" "GO:0015031" "GO:0015293"
##  [6] "GO:0015817" "GO:0015817" "GO:0016021" "GO:0022857" "GO:0055085"
## [11] "GO:0089709"

Ahora generamos la lista donde se define que genes son de interes y cuales no.

geneList<-rep(0,length(gene.data.fil$ENTREZID))
geneList[gene.data.fil$ENTREZID %in% target.gene$ENTREZID]<-1
names(geneList)<-genevec
geneList<-factor(geneList)
head(geneList)

##  64622 502374 246211  94338 171440 246280 
##      0      0      1      0      0      0 
## Levels: 0 1

Con estas estructuras podemos construir el objto topGO

GOdata.MF <- new("topGOdata", 
                 ontology = "MF",
                 description = "MF on up/down regulated genes",
                 allGenes = geneList,
                 annot = annFUN.gene2GO,
                 gene2GO = gene2GO)

## 
## Building most specific GOs .....

##  ( 2984 GO terms found. )

## 
## Build GO DAG topology ..........

##  ( 3433 GO terms and 4418 relations. )

## 
## Annotating nodes ...............

##  ( 4300 genes annotated to the GO terms. )

Ahora hacemos algunos análisis estadísticos

resultFisher.MF.classic <- runTest(GOdata.MF,
                                   algorithm = "classic",
                                   statistic = "fisher")

## 
##           -- Classic Algorithm -- 
## 
##       the algorithm is scoring 868 nontrivial nodes
##       parameters: 
##           test statistic: fisher

allRes.MF.classic <- GenTable(GOdata.MF, 
                              classicFisher = resultFisher.MF.classic,
                              topNodes = 20)
resultFisher.MF.weight <- runTest(GOdata.MF, 
                                  algorithm = "weight",
                                  statistic = "fisher")

## 
##           -- Weight Algorithm -- 
## 
##       The algorithm is scoring 868 nontrivial nodes
##       parameters: 
##           test statistic: fisher : ratio

## 
##   Level 12:  9 nodes to be scored.

## 
##   Level 11:  15 nodes to be scored.

## 
##   Level 10:  20 nodes to be scored.

## 
##   Level 9:   58 nodes to be scored.

## 
##   Level 8:   77 nodes to be scored.

## 
##   Level 7:   102 nodes to be scored.

## 
##   Level 6:   194 nodes to be scored.

## 
##   Level 5:   199 nodes to be scored.

## 
##   Level 4:   140 nodes to be scored.

## 
##   Level 3:   44 nodes to be scored.

## 
##   Level 2:   9 nodes to be scored.

## 
##   Level 1:   1 nodes to be scored.

allRes.MF.weight <- GenTable(GOdata.MF, 
                             classicFisher = resultFisher.MF.weight,
                             topNodes = 20)
resultFisher.MF.parentchild <- runTest(GOdata.MF, 
                                       algorithm = "parentchild", 
                                       statistic = "fisher")

## 
##           -- Parent-Child Algorithm -- 
## 
##       the algorithm is scoring 868 nontrivial nodes
##       parameters: 
##           test statistic: fisher : joinFun = union

## 
##   Level 12:  9 nodes to be scored.

## 
##   Level 11:  15 nodes to be scored.

## 
##   Level 10:  20 nodes to be scored.

## 
##   Level 9:   58 nodes to be scored.

## 
##   Level 8:   77 nodes to be scored.

## 
##   Level 7:   102 nodes to be scored.

## 
##   Level 6:   194 nodes to be scored.

## 
##   Level 5:   199 nodes to be scored.

## 
##   Level 4:   140 nodes to be scored.

## 
##   Level 3:   44 nodes to be scored.

## 
##   Level 2:   9 nodes to be scored.

allRes.MF.parentchild <- GenTable(GOdata.MF, 
                                  classicFisher = resultFisher.MF.parentchild,
                                  topNodes = 20)
kable(head(allRes.MF.classic,20), caption = "FISHER TEST")

FISHER TEST
GO.ID	Term	Annotated	Significant	Expected	classicFisher
GO:0019206	nucleoside kinase activity	2	2	0.09	0.0021
GO:0047134	protein-disulfide reductase activity	2	2	0.09	0.0021
GO:0016830	carbon-carbon lyase activity	23	5	1.06	0.0033
GO:0004332	fructose-bisphosphate aldolase activity	3	2	0.14	0.0061
GO:0005042	netrin receptor activity	3	2	0.14	0.0061
GO:0043237	laminin-1 binding	3	2	0.14	0.0061
GO:0030159	signaling receptor complex adaptor activ…	20	4	0.92	0.0118
GO:0016832	aldehyde-lyase activity	4	2	0.18	0.0119
GO:0005547	phosphatidylinositol-3,4,5-trisphosphate…	12	3	0.55	0.0155
GO:0016668	oxidoreductase activity, acting on a sul…	5	2	0.23	0.0192
GO:0043325	phosphatidylinositol-3,4-bisphosphate bi…	5	2	0.23	0.0192
GO:0050693	LBD domain binding	5	2	0.23	0.0192
GO:0019205	nucleobase-containing compound kinase ac…	14	3	0.64	0.0239
GO:0005068	transmembrane receptor protein tyrosine …	6	2	0.28	0.0279
GO:0008236	serine-type peptidase activity	82	8	3.77	0.0335
GO:0016829	lyase activity	82	8	3.77	0.0335
GO:0035259	glucocorticoid receptor binding	16	3	0.74	0.0344
GO:0035591	signaling adaptor activity	28	4	1.29	0.0374
GO:0005092	GDP-dissociation inhibitor activity	7	2	0.32	0.0379
GO:0016831	carboxy-lyase activity	17	3	0.78	0.0404

kable(head(allRes.MF.weight,20), caption = "FISHER WEIGHTED")

FISHER WEIGHTED
GO.ID	Term	Annotated	Significant	Expected	classicFisher
GO:0047134	protein-disulfide reductase activity	2	2	0.09	0.0021
GO:0004332	fructose-bisphosphate aldolase activity	3	2	0.14	0.0061
GO:0043237	laminin-1 binding	3	2	0.14	0.0061
GO:0005042	netrin receptor activity	3	2	0.14	0.0061
GO:0030159	signaling receptor complex adaptor activ…	20	4	0.92	0.0118
GO:0005547	phosphatidylinositol-3,4,5-trisphosphate…	12	3	0.55	0.0155
GO:0043325	phosphatidylinositol-3,4-bisphosphate bi…	5	2	0.23	0.0192
GO:0050693	LBD domain binding	5	2	0.23	0.0192
GO:0008236	serine-type peptidase activity	82	8	3.77	0.0335
GO:0035259	glucocorticoid receptor binding	16	3	0.74	0.0344
GO:0005092	GDP-dissociation inhibitor activity	7	2	0.32	0.0379
GO:0016831	carboxy-lyase activity	17	3	0.78	0.0404
GO:0019206	nucleoside kinase activity	2	2	0.09	0.0458
GO:0047381	dodecanoyl-[acyl-carrier-protein] hydrol…	1	1	0.05	0.0460
GO:0001587	Gq/11-coupled serotonin receptor activit…	1	1	0.05	0.0460
GO:0004878	complement component C5a receptor activi…	1	1	0.05	0.0460
GO:0004136	deoxyadenosine kinase activity	1	1	0.05	0.0460
GO:0004137	deoxycytidine kinase activity	1	1	0.05	0.0460
GO:0004138	deoxyguanosine kinase activity	1	1	0.05	0.0460
GO:0042164	interleukin-12 alpha subunit binding	1	1	0.05	0.0460

kable(head(allRes.MF.parentchild,20), caption = "FISHER PARENTCHILD")

FISHER PARENTCHILD
GO.ID	Term	Annotated	Significant	Expected	classicFisher
GO:0005042	netrin receptor activity	3	2	0.14	0.0017
GO:0047134	protein-disulfide reductase activity	2	2	0.09	0.0095
GO:0005547	phosphatidylinositol-3,4,5-trisphosphate…	12	3	0.55	0.0112
GO:0043325	phosphatidylinositol-3,4-bisphosphate bi…	5	2	0.23	0.0150
GO:0050693	LBD domain binding	5	2	0.23	0.0160
GO:0019205	nucleobase-containing compound kinase ac…	14	3	0.64	0.0160
GO:0035472	choriogonadotropin hormone receptor acti…	1	1	0.05	0.0272
GO:0004964	luteinizing hormone receptor activity	1	1	0.05	0.0272
GO:0038106	choriogonadotropin hormone binding	1	1	0.05	0.0274
GO:0016668	oxidoreductase activity, acting on a sul…	5	2	0.23	0.0289
GO:1905538	polysome binding	1	1	0.05	0.0303
GO:0016500	protein-hormone receptor activity	11	2	0.51	0.0303
GO:0019206	nucleoside kinase activity	2	2	0.09	0.0330
GO:0004471	malate dehydrogenase (decarboxylating) (…	1	1	0.05	0.0333
GO:0004473	malate dehydrogenase (decarboxylating) (…	1	1	0.05	0.0333
GO:0047016	cholest-5-ene-3-beta,7-alpha-diol 3-beta…	1	1	0.05	0.0333
GO:0016830	carbon-carbon lyase activity	23	5	1.06	0.0360
GO:0005251	delayed rectifier potassium channel acti…	21	3	0.97	0.0370
GO:0016774	phosphotransferase activity, carboxyl gr…	1	1	0.05	0.0383
GO:0004001	adenosine kinase activity	1	1	0.05	0.0395

Tambien podemos generar un analisis de los Procesos Biológicos,usando otro algoritmo para el análisis de Fisher

GOdata.BP <- new("topGOdata", 
                 ontology = "BP",
                 description = "BP on up/down regulated genes",
                 allGenes = geneList,
                 annot = annFUN.gene2GO,
                 gene2GO = gene2GO)

## 
## Building most specific GOs .....

##  ( 9766 GO terms found. )

## 
## Build GO DAG topology ..........

##  ( 13682 GO terms and 32061 relations. )

## 
## Annotating nodes ...............

##  ( 4387 genes annotated to the GO terms. )

resultFisher <- runTest(GOdata.BP,
                        algorithm = "elim", 
                        statistic = "fisher")

## 
##           -- Elim Algorithm -- 
## 
##       the algorithm is scoring 4893 nontrivial nodes
##       parameters: 
##           test statistic: fisher
##           cutOff: 0.01

## 
##   Level 19:  1 nodes to be scored    (0 eliminated genes)

## 
##   Level 18:  3 nodes to be scored    (0 eliminated genes)

## 
##   Level 17:  7 nodes to be scored    (0 eliminated genes)

## 
##   Level 16:  13 nodes to be scored   (0 eliminated genes)

## 
##   Level 15:  37 nodes to be scored   (0 eliminated genes)

## 
##   Level 14:  80 nodes to be scored   (3 eliminated genes)

## 
##   Level 13:  153 nodes to be scored  (3 eliminated genes)

## 
##   Level 12:  251 nodes to be scored  (43 eliminated genes)

## 
##   Level 11:  416 nodes to be scored  (45 eliminated genes)

## 
##   Level 10:  556 nodes to be scored  (82 eliminated genes)

## 
##   Level 9:   677 nodes to be scored  (89 eliminated genes)

## 
##   Level 8:   693 nodes to be scored  (111 eliminated genes)

## 
##   Level 7:   704 nodes to be scored  (111 eliminated genes)

## 
##   Level 6:   601 nodes to be scored  (142 eliminated genes)

## 
##   Level 5:   385 nodes to be scored  (151 eliminated genes)

## 
##   Level 4:   199 nodes to be scored  (211 eliminated genes)

## 
##   Level 3:   96 nodes to be scored   (256 eliminated genes)

## 
##   Level 2:   20 nodes to be scored   (256 eliminated genes)

## 
##   Level 1:   1 nodes to be scored    (256 eliminated genes)

resultKS <- runTest(GOdata.BP,
                    algorithm = "elim", 
                    statistic = "KS")

## 
##           -- Elim Algorithm -- 
## 
##       the algorithm is scoring 13682 nontrivial nodes
##       parameters: 
##           test statistic: ks
##           cutOff: 0.01
##           score order: increasing

## 
##   Level 19:  4 nodes to be scored    (0 eliminated genes)

## 
##   Level 18:  18 nodes to be scored   (0 eliminated genes)

## 
##   Level 17:  40 nodes to be scored   (0 eliminated genes)

## 
##   Level 16:  84 nodes to be scored   (0 eliminated genes)

## 
##   Level 15:  178 nodes to be scored  (12 eliminated genes)

## 
##   Level 14:  346 nodes to be scored  (12 eliminated genes)

## 
##   Level 13:  636 nodes to be scored  (60 eliminated genes)

## 
##   Level 12:  1089 nodes to be scored (188 eliminated genes)

## 
##   Level 11:  1515 nodes to be scored (192 eliminated genes)

## 
##   Level 10:  1859 nodes to be scored (230 eliminated genes)

## 
##   Level 9:   1982 nodes to be scored (365 eliminated genes)

## 
##   Level 8:   1892 nodes to be scored (634 eliminated genes)

## 
##   Level 7:   1653 nodes to be scored (1298 eliminated genes)

## 
##   Level 6:   1224 nodes to be scored (1658 eliminated genes)

## 
##   Level 5:   691 nodes to be scored  (2144 eliminated genes)

## 
##   Level 4:   326 nodes to be scored  (3153 eliminated genes)

## 
##   Level 3:   121 nodes to be scored  (3248 eliminated genes)

## 
##   Level 2:   23 nodes to be scored   (3435 eliminated genes)

## 
##   Level 1:   1 nodes to be scored    (4295 eliminated genes)

allRes.BP <- GenTable(GOdata.BP, 
                      classic = resultFisher, 
                      elimKS = resultKS, 
                      orderBy = "elimKS", 
                      topNodes=20)
allRes.BP

##         GO.ID                                        Term Annotated Significant
## 1  GO:0008150                          biological_process      4387         205
## 2  GO:0009987                            cellular process      4216         194
## 3  GO:0045471                         response to ethanol       192           6
## 4  GO:0048546               digestive tract morphogenesis        22           2
## 5  GO:0051384                  response to glucocorticoid       227          13
## 6  GO:0015711                     organic anion transport       261          16
## 7  GO:1904646           cellular response to amyloid-beta        32           0
## 8  GO:1901700      response to oxygen-containing compound      1259          50
## 9  GO:0046854        phosphatidylinositol phosphorylation        22           0
## 10 GO:0007608                 sensory perception of smell        33           0
## 11 GO:0042755                             eating behavior        37           0
## 12 GO:0045039 protein insertion into mitochondrial inn...         4           0
## 13 GO:0042493                            response to drug       450          20
## 14 GO:0006814                        sodium ion transport       147           9
## 15 GO:0007568                                       aging       316           9
## 16 GO:0001975                     response to amphetamine        40           3
## 17 GO:0051984 positive regulation of chromosome segreg...         9           0
## 18 GO:0032094                            response to food        41           0
## 19 GO:0062197        cellular response to chemical stress       216           9
## 20 GO:1904045            cellular response to aldosterone         6           0
##    Expected Rank in elimKS classic  elimKS
## 1    202.63              1    0.06 < 1e-30
## 2    194.73              2    0.79 7.8e-05
## 3      8.87              3    0.88 0.00075
## 4      1.02              4    0.27 0.00097
## 5     10.48              5    0.24 0.00100
## 6     12.06              6    0.14 0.00110
## 7      1.48              7    1.00 0.00114
## 8     58.15              8    0.95 0.00145
## 9      1.02              9    1.00 0.00158
## 10     1.52             10    1.00 0.00165
## 11     1.71             11    1.00 0.00173
## 12     0.18             12    1.00 0.00177
## 13    20.78             13    0.96 0.00203
## 14     6.79             14    0.23 0.00211
## 15    14.60             15    0.96 0.00232
## 16     1.85             16    0.28 0.00253
## 17     0.42             17    1.00 0.00253
## 18     1.89             18    1.00 0.00253
## 19     9.98             19    0.67 0.00259
## 20     0.28             20    1.00 0.00272

# BiocManager::install("Rgraphviz")
library(Rgraphviz)

## Warning: package 'Rgraphviz' was built under R version 4.0.3

## Loading required package: grid

## 
## Attaching package: 'grid'

## The following object is masked from 'package:topGO':
## 
##     depth

## 
## Attaching package: 'Rgraphviz'

## The following objects are masked from 'package:IRanges':
## 
##     from, to

## The following objects are masked from 'package:S4Vectors':
## 
##     from, to

showSigOfNodes(GOdata.BP,
               score(resultKS), 
               firstSigNodes = 5,
               useInfo = 'all')

## $dag
## A graphNEL graph with directed edges
## Number of Nodes = 26 
## Number of Edges = 36 
## 
## $complete.dag
## [1] "A graph with 26 nodes."

Analisis previos base de datos

Guzman-Ruiz,M.A. and Cardenas-Rivera,A.

28 de diciembre de 2020

ANALISIS PRELIMINAR

revisando bases de datos