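Provides drug-disease-target gene data; use the full background database.
http://ctdbase.org/tools/batchQuery.go: searched COVID-19 and batch-downloaded the All, Inferred, and Curated sets. The record counts do not match; filtering by InferenceScore may be needed.
http://ctdbase.org/detail.go?type=disease&acc=MESH%3AD000086382&view=gene: the downloaded file contains no gene information.
In the end, the full background database download at http://ctdbase.org/downloads/#cd was used.
Drug IDs are MeSH IDs, the same as in PubChem; disease IDs are MeSH IDs or OMIM IDs, but there is no record for COVID-19.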
## [1] 14448
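The InferenceScore filtering mentioned above could look roughly like the following Python sketch. It assumes a tab-separated export with a plain header row and the column names `DirectEvidence`, `InferenceGeneSymbol`, and `InferenceScore`; these names, the sample rows, and the cutoff of 5.0 are illustrative assumptions and should be checked against the actual CTD file.

```python
import csv
import io

# Tiny stand-in for a CTD chemical-disease export (tab-separated).
# Chemical names, IDs, and scores are made up for illustration only.
SAMPLE = (
    "ChemicalName\tChemicalID\tDiseaseID\tDirectEvidence\tInferenceGeneSymbol\tInferenceScore\n"
    "ChemA\tD000001\tMESH:D000086382\ttherapeutic\t\t\n"
    "ChemB\tD000002\tMESH:D000086382\t\tGENE1\t12.5\n"
    "ChemC\tD000003\tMESH:D000086382\t\tGENE2\t2.1\n"
)

def split_ctd_rows(handle, min_score):
    """Return (curated, inferred) rows.

    Curated rows have a non-empty DirectEvidence value; inferred rows are
    kept only if their InferenceScore is at least min_score.
    """
    curated, inferred = [], []
    for row in csv.DictReader(handle, delimiter="\t"):
        if row["DirectEvidence"]:
            curated.append(row)
        elif row["InferenceScore"] and float(row["InferenceScore"]) >= min_score:
            inferred.append(row)
    return curated, inferred

# min_score=5.0 is a hypothetical cutoff, not a recommended value.
curated, inferred = split_ctd_rows(io.StringIO(SAMPLE), min_score=5.0)
print(len(curated), len(inferred))
```

Comparing the curated and filtered inferred counts against the numbers shown on the web page is one way to find out which cutoff, if any, the site applies.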
https://www.disgenet.org/covid/genes/summary/
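Virus-host protein interactions (but it does not say which specific viral protein is involved; this information seems to be absent). There is no download link, and the data obtained from the Download page are aggregated statistics, e.g. the number of genes associated with each disease.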
TO DISCUSS.
The database provides the following files (https://www.disgenet.org/static/disgenet_ap1/files/downloads/readme.txt). For Gene-Disease associations:
disease_associations.tsv.gz => Diseases associated to genes from DisGeNET
The columns in the files are:
  diseaseId -> UMLS concept unique identifier
  diseaseName -> Name of the disease
  diseaseType -> The DisGeNET disease type: disease, phenotype and group
  diseaseClass -> The MeSH disease class(es)
  diseaseSemanticType -> The UMLS Semantic Type(s) of the disease
  NofGenes -> Number of genes associated to the disease
  NofPmids -> Number of publications associated to the disease
gene_associations.tsv.gz => Genes associated to Diseases from DisGeNET
The columns in the files are:
  geneId -> NCBI Entrez Gene Identifier
  geneSymbol -> Official Gene Symbol
  DSI -> The Disease Specificity Index for the gene
  DPI -> The Disease Pleiotropy Index for the gene
  PLI -> The probability for the gene of being loss-of-function intolerant, provided by the GNOMAD consortium
  protein_class -> Protein Class identifier according to the Drug Target Ontology
  protein_class_name -> Protein Class according to the Drug Target Ontology
  NofDiseases -> Number of diseases associated to the gene
  NofPmids -> Number of publications associated to the gene
For Variant-Disease associations:
disease_associations.tsv.gz => Diseases associated to variants from DisGeNET
The columns in the files are:
  diseaseId -> UMLS concept unique identifier
  diseaseName -> Name of the disease
  diseaseType -> The DisGeNET disease type: disease, phenotype and group
  diseaseClass -> The MeSH disease class(es)
  diseaseSemanticType -> The UMLS Semantic Type(s) of the disease
  NofSnps -> Number of variants associated to the disease
  NofPmids -> Total number of publications reporting the Variant-Disease association
variant_associations.tsv.gz => Variants associated to diseases from DisGeNET
The columns in the files are:
  snpId -> dbSNP variant Identifier
  chromosome -> Chromosome of the variant
  position -> Position in chromosome
  DSI -> The Disease Specificity Index for the variant
  DPI -> The Disease Pleiotropy Index for the variant
  NofDiseases -> Number of diseases associated to the variant
  NofPmids -> Total number of publications reporting the Variant-Disease association
disease_mappings.tsv.gz => Mappings from UMLS concept unique identifier to disease vocabularies: DO, EFO, HPO, ICD9CM, MSH, NCI, OMIM, and ORDO
variant_to_gene_mappings.tsv.gz => Variant mapped to their corresponding genes, according to dbSNP.
The columns in the files are:
  snpId -> dbSNP variant Identifier
  geneId -> NCBI Entrez Gene Identifier
  geneSymbol -> Official Gene Symbol
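Since these files hold summary statistics keyed by snpId and geneId, one plausible use is to attach gene symbols to the variant table. The Python sketch below joins variant_associations with variant_to_gene_mappings on `snpId`, using the column names from the readme excerpt; the rows, rs numbers, and gene names are placeholders, not real DisGeNET records.

```python
import csv
import io

# Placeholder rows that only mimic the documented column layout.
VARIANT_TO_GENE = (
    "snpId\tgeneId\tgeneSymbol\n"
    "rs0000001\t1\tGENE_A\n"
    "rs0000001\t2\tGENE_B\n"
    "rs0000002\t3\tGENE_C\n"
)
VARIANT_ASSOC = (
    "snpId\tchromosome\tposition\tDSI\tDPI\tNofDiseases\tNofPmids\n"
    "rs0000001\t1\t100\t0.8\t0.1\t2\t3\n"
    "rs0000003\t2\t200\t0.5\t0.2\t1\t1\n"
)

def read_tsv(text):
    """Parse tab-separated text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))

def annotate_variants(assoc_rows, mapping_rows):
    """Add a geneSymbols list to each variant row (empty if unmapped)."""
    genes_by_snp = {}
    for m in mapping_rows:
        genes_by_snp.setdefault(m["snpId"], []).append(m["geneSymbol"])
    return [dict(row, geneSymbols=genes_by_snp.get(row["snpId"], []))
            for row in assoc_rows]

annotated = annotate_variants(read_tsv(VARIANT_ASSOC), read_tsv(VARIANT_TO_GENE))
for row in annotated:
    print(row["snpId"], row["geneSymbols"])
```

A variant can map to several genes, so the sketch keeps a list per variant instead of a single symbol.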
https://www.genecards.org/Search/Keyword?queryString=covid-19
Shows only the list of associated human genes, with no interaction information. Skip?
http://db.idrblab.net/ttd/search/ttd/covid19-target?search_api_fulltext=COVID-19
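Contains drug information for human target proteins, but it only states that each protein is a target, not which viral protein it is a target of.
Can serve as an extension of the DrugBank drug-target data.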
https://www.ncbi.nlm.nih.gov/gene/?term=covid-19
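Only a list of viral proteins. It appears to simply aggregate the protein lists of different databases, and some of those databases may be incomplete. Skip.
OMIM has added COVID records, but only as a list of human genes, with no links to viral proteins. Could be used to extend the network.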
https://www.pharmgkb.org/disease/PA166197121/related#genes
Provides only a list of human genes, again with no specific links to viral proteins. Same as above?
Other downloadable data:
Clinical annotation summaries.
Variant annotation summary.
Relationships summarized from PharmGKB annotations.
Detailed clinical guideline annotations in JSON format.
Drug label annotations in TSV format.
Pathways data in BioPax XML and TSV formats.
A list of variant-drug pairs and level of evidence for all clinical annotations in TSV format.
Protein-Protein Interactions derived from Reactome pathways full dataset.
Code has already been written to convert BioPAX 3 into wide-table form: pathway has_component ; reaction has_input …; reaction has_output …
## [1] 103285 9
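As an illustration of this kind of conversion (not the code used for these notes), the Python sketch below flattens a BioPAX 3 RDF/XML fragment into long rows with the columns class, id, property, property_attr, property_attr_value, and property_value, then derives relation triples. The inline XML is a hand-written fragment, and mapping `left`/`right` to has_input/has_output is an assumption that ignores the reaction's conversion direction.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hand-written BioPAX 3 fragment for illustration only.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:bp="http://www.biopax.org/release/biopax-level3.owl#">
  <bp:Pathway rdf:ID="Pathway1">
    <bp:displayName rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Regulation of Apoptosis</bp:displayName>
    <bp:pathwayComponent rdf:resource="#BiochemicalReaction1"/>
  </bp:Pathway>
  <bp:BiochemicalReaction rdf:ID="BiochemicalReaction1">
    <bp:left rdf:resource="#Protein1"/>
    <bp:right rdf:resource="#Protein2"/>
  </bp:BiochemicalReaction>
</rdf:RDF>"""

def local(name):
    """Strip the '{namespace}' prefix that ElementTree puts on names."""
    return name.rsplit("}", 1)[-1]

def attr_label(name):
    """Render an attribute name as 'rdf:xxx' when it is in the RDF namespace."""
    return "rdf:" + local(name) if name.startswith("{" + RDF_NS + "}") else local(name)

def biopax_to_long(xml_text):
    """One row per (object, property, attribute) found under the RDF root."""
    rows = []
    for node in ET.fromstring(xml_text):
        node_id = node.get("{" + RDF_NS + "}ID")
        if node_id is None:
            continue
        for prop in node:
            # Properties without attributes still get one row.
            for attr, attr_value in list(prop.attrib.items()) or [("", "")]:
                rows.append({
                    "class": local(node.tag),
                    "id": node_id,
                    "property": local(prop.tag),
                    "property_attr": attr_label(attr),
                    "property_attr_value": attr_value,
                    "property_value": (prop.text or "").strip(),
                })
    return rows

# Assumed mapping from BioPAX properties to the relation names in the notes.
EDGE_LABELS = {"pathwayComponent": "has_component",
               "left": "has_input", "right": "has_output"}

def long_to_edges(rows):
    """Turn resource-valued rows into (subject, relation, object) triples."""
    return [(r["id"], EDGE_LABELS[r["property"]], r["property_attr_value"].lstrip("#"))
            for r in rows if r["property"] in EDGE_LABELS]

rows = biopax_to_long(SAMPLE)
edges = long_to_edges(rows)
for edge in edges:
    print(edge)
```

Keeping the long table as the intermediate form means new relation types only need a new entry in `EDGE_LABELS`.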
Parsing the BioPAX file of a single pathway, Regulation of Apoptosis:
https://reactome.org/content/detail/R-HSA-169911
## [1] "3"
## [1] "Regulation of ApoptosisA regulated balance between cell survival and apoptosis is essential for normal\r\ndevelopment and homeostasis of multicellular organisms (see Matsuzawa, 2001). Defects in control of this balance may contribute to autoimmune disease, neurodegeneration and cancer. Protein ubiquitination and degradation is one of the major mechanisms that regulate apoptotic cell death (reviewed in Yang and Yu 2003).Authored: Jakobi, R, 2008-02-05 11:04:14Reviewed: Chang, E, 2008-05-21 00:05:41Edited: Matthews, L, 2008-02-12 16:13:24Edited: Matthews, L, 2008-06-12 00:23:53"
## [2] "Regulation of activated PAK-2p34 by proteasome mediated degradationStimulation of cell death by PAK-2 requires the generation and stabilization of the caspase-activated form, PAK-2p34 (Walter et al., 1998;Jakobi et al., 2003). Levels of proteolytically activated PAK-2p34 protein are controlled by ubiquitin-mediated proteolysis. PAK-2p34 but not full-length PAK-2 is degraded by the 26 S proteasome (Jakobi et al., 2003). It is not known whether ubiquitination and degradation of PAK-2p34 occurs in the cytoplasm or in the nucleus.Authored: Jakobi, R, 2008-02-05 11:04:14Reviewed: Chang, E, 2008-05-21 00:05:41Edited: Matthews, L, 2008-02-03 20:50:13Edited: Matthews, L, 2008-06-12 00:23:53"
## [3] "Regulation of PAK-2p34 activity by PS-GAP/RHG10PS-GAP (RGH10) interacts specifically with caspase-activated PAK-2p34 reducing the ability of PAK-2p34 to induce cell death. This interaction inhibits the kinase activity of PAK-2p34 and changes the localization of PAK-2p34 from the nucleus to the perinuclear region (Koeppel et al., 2004).Authored: Jakobi, R, 2008-02-05 11:04:14Reviewed: Chang, E, 2008-05-21 00:05:41Edited: Matthews, L, 2008-02-03 20:50:13Edited: Matthews, L, 2008-06-12 00:23:53"
## [1] "A regulated balance between cell survival and apoptosis is essential for normal\r\ndevelopment and homeostasis of multicellular organisms (see Matsuzawa, 2001). Defects in control of this balance may contribute to autoimmune disease, neurodegeneration and cancer. Protein ubiquitination and degradation is one of the major mechanisms that regulate apoptotic cell death (reviewed in Yang and Yu 2003)."
## [2] "Authored: Jakobi, R, 2008-02-05 11:04:14"
## [3] "Reviewed: Chang, E, 2008-05-21 00:05:41"
## [4] "Edited: Matthews, L, 2008-02-12 16:13:24"
## [5] "Edited: Matthews, L, 2008-06-12 00:23:53"
## [6] "Stimulation of cell death by PAK-2 requires the generation and stabilization of the caspase-activated form, PAK-2p34 (Walter et al., 1998;Jakobi et al., 2003). Levels of proteolytically activated PAK-2p34 protein are controlled by ubiquitin-mediated proteolysis. PAK-2p34 but not full-length PAK-2 is degraded by the 26 S proteasome (Jakobi et al., 2003). It is not known whether ubiquitination and degradation of PAK-2p34 occurs in the cytoplasm or in the nucleus."
## [7] "Authored: Jakobi, R, 2008-02-05 11:04:14"
## [8] "Reviewed: Chang, E, 2008-05-21 00:05:41"
## [9] "Edited: Matthews, L, 2008-02-03 20:50:13"
## [10] "Edited: Matthews, L, 2008-06-12 00:23:53"
## [11] "PS-GAP (RGH10) interacts specifically with caspase-activated PAK-2p34 reducing the ability of PAK-2p34 to induce cell death. This interaction inhibits the kinase activity of PAK-2p34 and changes the localization of PAK-2p34 from the nucleus to the perinuclear region (Koeppel et al., 2004)."
## [12] "Authored: Jakobi, R, 2008-02-05 11:04:14"
## [13] "Reviewed: Chang, E, 2008-05-21 00:05:41"
## [14] "Edited: Matthews, L, 2008-02-03 20:50:13"
## [15] "Edited: Matthews, L, 2008-06-12 00:23:53"
## class id property property_attr
## 1: BioSource BioSource1 name rdf:datatype
## 2: BioSource BioSource1 xref rdf:resource
## 3: BiochemicalReaction BiochemicalReaction1 comment rdf:datatype
## 4: BiochemicalReaction BiochemicalReaction1 comment rdf:datatype
## 5: BiochemicalReaction BiochemicalReaction1 comment rdf:datatype
## 6: BiochemicalReaction BiochemicalReaction1 comment rdf:datatype
## property_attr_value
## 1: http://www.w3.org/2001/XMLSchema#string
## 2: #UnificationXref2
## 3: http://www.w3.org/2001/XMLSchema#string
## 4: http://www.w3.org/2001/XMLSchema#string
## 5: http://www.w3.org/2001/XMLSchema#string
## 6: http://www.w3.org/2001/XMLSchema#string
## property_value
## 1: Homo sapiens
## 2:
## 3: PAK-2p34 is ubiquitinated prior to degradation (Jakobi et al., 2003). Here, ubiquitination of PAK-2p34 is described as occurring in the cytosol. However, to date it is not known whether this occurs in the nucleus or in the cytoplasm. Evidence for this reaction comes from experiments using both human and rabbit proteins. The polyubiquitin synthesized in the reaction is inferred to contain lysine-48 (K48) linkages because the modified protein is targeted to the proteasome (Komander 2009).
## 4: Authored: Jakobi, R, 2008-02-05 11:04:14
## 5: Reviewed: Chang, E, 2008-05-21 00:05:41
## 6: Edited: Matthews, L, 2008-02-03 20:50:13
## [1] "Regulation of Apoptosis"
## [1] "BiochemicalReaction1" "BiochemicalReaction2" "BiochemicalReaction3"
## [4] "BiochemicalReaction4" "BiochemicalReaction5"
## property_value
## 1: OMA1 hydrolyses OPA1
## class id property property_attr
## 1: Protein Protein17 cellularLocation rdf:resource
## 2: Protein Protein17 comment rdf:datatype
## 3: Protein Protein17 dataSource rdf:resource
## 4: Protein Protein17 displayName rdf:datatype
## 5: Protein Protein17 entityReference rdf:resource
## 6: Protein Protein17 feature rdf:resource
## 7: Protein Protein17 feature rdf:resource
## 8: Protein Protein17 feature rdf:resource
## 9: Protein Protein17 name rdf:datatype
## 10: Protein Protein17 xref rdf:resource
## 11: Protein Protein17 xref rdf:resource
## property_attr_value property_value
## 1: #CellularLocationVocabulary1
## 2: http://www.w3.org/2001/XMLSchema#string Reactome DB_ID: 3095873
## 3: #Provenance1
## 4: http://www.w3.org/2001/XMLSchema#string K48polyUb-p-T402-PAK-2p43
## 5: #ProteinReference1
## 6: #ModificationFeature1
## 7: #ModificationFeature2
## 8: #FragmentFeature16
## 9: http://www.w3.org/2001/XMLSchema#string K48polyUb-phospho-PAK-2p34(Thr402)
## 10: #UnificationXref42
## 11: #UnificationXref43
## property_value
## 1: OPA1
## property_value
## 1: OPA1
## 2: H2O
## property_value
## 1: OPA1(88-194)
## 2: Dynamin-like 120 kDa protein, mitochondrial ecNumber3.6.5.5/ecNumber
## 3: OPA1(195-960)
## 4: Dynamin-like 120 kDa protein, mitochondrial ecNumber3.6.5.5/ecNumber
Restart Neo4j with /etc/init.d/neo4j restart, then open http://127.0.0.1:7474
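Does not provide a full dataset download.
Converting the KGML of a single pathway (Apoptosis) to a data.frame:
This yields a protein interaction table that includes activation and inhibition information.
Downloaded the 347 hsa KEGG KGML files, then combined them into an interaction network.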
## [1] 124718 5
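The KGML-to-table step can be sketched in Python as follows. KGML stores `entry` elements (one entry may list several gene IDs) and `relation` elements whose `subtype` children carry names such as activation or inhibition. The fragment below is hand-written with placeholder gene IDs, and expanding multi-gene entries into all source-target pairs is one possible convention, not necessarily the one used for the counts above.

```python
import xml.etree.ElementTree as ET

# Hand-written KGML fragment; the gene IDs are placeholders.
SAMPLE = """<pathway name="path:hsa04210" org="hsa" number="04210" title="Apoptosis">
  <entry id="1" name="hsa:0001" type="gene"/>
  <entry id="2" name="hsa:0002 hsa:0003" type="gene"/>
  <relation entry1="1" entry2="2" type="PPrel">
    <subtype name="activation" value="--&gt;"/>
  </relation>
  <relation entry1="2" entry2="1" type="PPrel">
    <subtype name="inhibition" value="--|"/>
  </relation>
</pathway>"""

def kgml_to_edges(kgml_text):
    """Return (source, target, relation_type, effect) tuples for gene entries."""
    root = ET.fromstring(kgml_text)
    # Map entry id -> list of gene IDs (the name attribute is space-separated).
    genes = {e.get("id"): e.get("name").split()
             for e in root.iter("entry") if e.get("type") == "gene"}
    edges = []
    for rel in root.iter("relation"):
        sources = genes.get(rel.get("entry1"), [])
        targets = genes.get(rel.get("entry2"), [])
        effects = [s.get("name") for s in rel.iter("subtype")]
        for src in sources:
            for dst in targets:
                for effect in effects:
                    edges.append((src, dst, rel.get("type"), effect))
    return edges

edges = kgml_to_edges(SAMPLE)
for edge in edges:
    print(edge)
```

Running the same function over every downloaded KGML file and concatenating the results would give one directed, signed interaction table.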
ConsensusPathDB (http://cpdb.molgen.mpg.de/): the ConsensusPathDB_human_PPI network in tab-delimited and PSI-MI (level 2.5) formats, plus the CPDB_pathways_genes txt file.
## [1] 554471 9
## [1] 4681 4
BioGRID (Biogrid-all-mitab): PSI-MI 2.5 XML converted to a table.
Interactions only, with no direction.
## [1] 2407713 15
BIOGRID-CORONAVIRUS-4.4.212.tab3.txt
## [1] 35076 37
## [1] 29710 37
https://www.ebi.ac.uk/intact/home
https://www.ebi.ac.uk/intact/search?query=annot:%22dataset:coronavirus%22
The page display matches PSI-MI TAB, but that view cannot be downloaded; only the PSI-MI XML 3 format is offered.
Use the FULL dataset in PSI-MI TAB format instead.
## [1] 1194447 42
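To pull the coronavirus-related rows out of the full PSI-MI TAB file, one option is to filter on the taxonomy columns. The Python sketch below assumes the standard MITAB layout, in which columns 1 and 2 hold the interactor IDs and columns 10 and 11 hold values like `taxid:9606(human)`; the interactor IDs in the sample are placeholders, and the helper `make_row` exists only to build test lines.

```python
import re

def make_row(id_a, id_b, tax_a, tax_b):
    """Build a 15-column MITAB-like line; unused columns are set to '-'."""
    cols = [id_a, id_b] + ["-"] * 7 + [tax_a, tax_b] + ["-"] * 4
    return "\t".join(cols)

# Placeholder interactor IDs; only the taxonomy fields matter here.
SAMPLE_LINES = [
    "#ID(s) interactor A\tID(s) interactor B",
    make_row("uniprotkb:V1", "uniprotkb:H1", "taxid:2697049(SARS-CoV-2)", "taxid:9606(human)"),
    make_row("uniprotkb:H2", "uniprotkb:V1", "taxid:9606(human)", "taxid:2697049(SARS-CoV-2)"),
    make_row("uniprotkb:H1", "uniprotkb:H2", "taxid:9606(human)", "taxid:9606(human)"),
]

def taxid(field):
    """Extract the numeric ID from a value like 'taxid:9606(human)'."""
    match = re.search(r"taxid:(-?\d+)", field)
    return match.group(1) if match else None

def virus_host_pairs(lines, virus_taxid, host_taxid):
    """Return (virus_id, host_id) pairs, whichever side the virus is on."""
    pairs = []
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip the header comment and blank lines
        cols = line.rstrip("\n").split("\t")
        tax_a, tax_b = taxid(cols[9]), taxid(cols[10])
        if (tax_a, tax_b) == (virus_taxid, host_taxid):
            pairs.append((cols[0], cols[1]))
        elif (tax_a, tax_b) == (host_taxid, virus_taxid):
            pairs.append((cols[1], cols[0]))
    return pairs

pairs = virus_host_pairs(SAMPLE_LINES, virus_taxid="2697049", host_taxid="9606")
print(pairs)
```

Normalising each pair to (virus, host) order gives the otherwise undirected table a consistent orientation.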
The data primarily covers protein-protein and several RNA-protein interactions involving SARS-CoV-2 and SARS-CoV. All interactions from the relevant publications are covered in this dataset, including interactions with other organisms.
## 1 Entries found
## Parsing entry 1
## Parsing experiments: ..
## Parsing interactors:
## Parsing interactions:
## ......