1 Introduction

Mining ClinicalTrials.gov for Target Hypotheses: AACT Drug-Targets Analysis

ClinicalTrials.gov was first released in 2000. As of March 2019, ClinicalTrials.gov includes 300,676 research studies in all 50 states and in 208 countries. The CTTI AACT project and database provide a harmonized schema and convenient access. However, major challenges to knowledge discovery from these data remain, such as the lack of standard terminology. To address this, for the use case of elucidating drug target hypotheses, we have used state-of-the-art, domain-specialized text mining with synonym resolution for specific classes of entities: (1) chemicals and (2) diseases. Chemicals are identified and resolved using NextMove LeadMine. Diseases, indications and other phenotypic terms are mined via JensenLab Tagger with the Disease Ontology dictionary, and via NLM-supplied MeSH terms. Protein targets are associated via ChEMBL bioactivities, by cross-referencing molecular structures. Another fundamental challenge is assessing the confidence of inferences drawn from noisy and disparate data. We propose a scoring system for assessing confidence in target hypotheses inferred from aggregated clinical trials, with emphasis on higher-confidence, novel predictions with the potential to illuminate the understudied druggable genome.

1.1 Issues

  • Our prior belief is that target NER is unlikely to be useful, since the descriptive text of clinical trials is generally written not to communicate molecular mechanisms to research scientists, but with a focus on clinical efficacy and safety. As due diligence we perform target NER and, to quantify concordance with or refutation of our prior belief, we also perform target NER on arbitrary non-biomedical text: tweets from the Twitter API for #brexit (26 Nov 2019). We find 8.64 target entities per 1000 chars in the tweets, vs. 6.63 in the clinical trial descriptions. Since the rate in clinical trial text is no higher than in unrelated text, many of the hits are plausibly false positives. While not proof, this supports our belief and our less direct method via chemical NER.
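The comparison above rests on a simple normalization: entity hits per 1000 characters of text. A minimal Python sketch of that arithmetic follows; the function name and the counts are invented for illustration and are not the values behind the reported rates.

```python
def entities_per_1000_chars(n_entities: int, n_chars: int) -> float:
    """Normalize a count of NER hits to a rate per 1000 characters."""
    if n_chars <= 0:
        raise ValueError("n_chars must be positive")
    return 1000.0 * n_entities / n_chars


# Hypothetical counts, chosen only to show the calculation.
rate = entities_per_1000_chars(n_entities=25, n_chars=4000)
print(f"{rate:.2f} entities per 1000 chars")
```

Normalizing by text length makes corpora of very different sizes comparable, which is the point of setting tweets beside trial descriptions.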

1.2 To do

  • We intend to calculate multivariate μ scores to assess and rank disease-target associations, so suitable evidence variables need to be identified.
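As a rough illustration of the idea, a multivariate μ score can be computed from a partial ordering: each item is scored by how many items it dominates minus how many dominate it. The sketch below assumes this common formulation and uses two hypothetical evidence variables per association; the actual variables are still to be chosen.

```python
from typing import Sequence


def mu_scores(rows: Sequence[Sequence[float]]) -> list[int]:
    """Score each row as (#rows it dominates) - (#rows that dominate it).

    Row a dominates row b when a >= b on every variable and a > b on at
    least one. Pairs that cannot be ordered contribute nothing.
    """
    def dominates(a, b):
        pairs = list(zip(a, b))
        return all(x >= y for x, y in pairs) and any(x > y for x, y in pairs)

    scores = []
    for i, a in enumerate(rows):
        score = 0
        for j, b in enumerate(rows):
            if i == j:
                continue
            if dominates(a, b):
                score += 1
            elif dominates(b, a):
                score -= 1
        scores.append(score)
    return scores


# Hypothetical evidence per disease-target association:
# (number of studies, number of publications).
evidence = [(3, 10), (1, 2), (2, 12), (1, 1)]
print(mu_scores(evidence))
```

The first and third associations cannot be ordered against each other (each is higher on one variable), so they tie; this is the behavior that distinguishes μ scores from a weighted sum.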

1.3 Identifier mappings:

NCT_ID →(JensenLab:Tagger)→ DOID
NCT_ID →(AACT)→ MeSH
NCT_ID →(NextMove:LeadMine)→ SMILES
SMILES →(PubChem)→ CID
CID →(PubChem)→ INCHIKEY
INCHIKEY →(ChEMBL)→ MOLECULE_CHEMBL_ID
MOLECULE_CHEMBL_ID →(ChEMBL)→ ACTIVITY_ID
ACTIVITY_ID →(ChEMBL)→ TARGET_CHEMBL_ID
TARGET_CHEMBL_ID →(ChEMBL)→ COMPONENT_ID
COMPONENT_ID →(ChEMBL)→ UNIPROT
ACTIVITY_ID →(ChEMBL)→ DOCUMENT_CHEMBL_ID
DOCUMENT_CHEMBL_ID →(ChEMBL)→ PUBMED_ID
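
Each mapping above is one-to-many, so following a study through to its protein targets means expanding a set of identifiers at every step. A minimal Python sketch of that composition, using placeholder identifiers invented for illustration (the ACTIVITY_ID and COMPONENT_ID steps are collapsed for brevity):

```python
def follow(key, *mappings):
    """Follow a key through successive one-to-many mappings.

    Each mapping is a dict from an identifier to a set of identifiers.
    Identifiers missing from a mapping drop out of the result.
    """
    frontier = {key}
    for mapping in mappings:
        frontier = {value for item in frontier for value in mapping.get(item, ())}
    return frontier


# Placeholder identifiers, not real records.
nct_to_smiles = {"NCT_A": {"SMILES_1", "SMILES_2"}}
smiles_to_cid = {"SMILES_1": {"CID_1"}, "SMILES_2": {"CID_2"}}
cid_to_inchikey = {"CID_1": {"INCHIKEY_1"}}  # CID_2 has no InChIKey here
inchikey_to_molecule = {"INCHIKEY_1": {"MOLECULE_1"}}
molecule_to_target = {"MOLECULE_1": {"TARGET_1", "TARGET_2"}}
target_to_uniprot = {"TARGET_1": {"UNIPROT_1"}, "TARGET_2": {"UNIPROT_2"}}

uniprots = follow("NCT_A", nct_to_smiles, smiles_to_cid, cid_to_inchikey,
                  inchikey_to_molecule, molecule_to_target, target_to_uniprot)
print(sorted(uniprots))
```

Unmapped identifiers are dropped silently here, which mirrors how coverage shrinks at each step of the chain.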

1.4 Input files:

  • (CTTI AACT) aact_studies.tsv
  • (CTTI AACT) aact_drugs.tsv
  • (CTTI AACT) aact_descriptions.tsv
  • (NextMove LeadMine) aact_drugs_leadmine.tsv
  • (PubChem) aact_drugs_smi_pubchem_cid.tsv
  • (PubChem) aact_drugs_smi_pubchem_cid2ink.tsv
  • (ChEMBL) aact_drugs_ink2chembl.tsv
  • (ChEMBL) aact_drugs_chembl_activity.tsv
  • (ChEMBL) aact_drugs_chembl_target_component.tsv
  • (ChEMBL) aact_drugs_chembl_document.tsv
  • (IDG TCRD/Pharos) pharos_targets.tsv
  • (JensenLab Tagger) aact_descriptions_tagger_disease_matches.tsv
  • (JensenLab Dictionary) diseases_entities.tsv

nct_id is the study ID.

## [1] "Mon Jun 27 14:43:52 2022"

2 AACT Db Timestamp

## [1] "AACT database timestamp: 2021-05-18"

3 Input studies and drugs

3.1 Studies

Read file of all studies in AACT.

## [1] "Total studies: 418604 ; unique NCT_IDs: 418604"

3.1.1 Study references

References of type results_reference may offer stronger evidence and higher confidence.

## [1] "references: 726627; NCT_IDs: 147606; PMIDs: 524644; results_references: 0"

3.2 Drugs

Read file of all drugs in AACT.

  • id is AACT INTERVENTION_ID, corresponding with an instance of a drug, dose, delivery, etc. in a study.
  • Note that one study may involve multiple drugs.
  • At this point a “drug” is imprecisely identified by name, generally one of many synonyms.
## [1] "Unique drug names: 116161 ; unique intervention IDs: 320883"

3.3 Studies: Interventional drug studies only

Select only Interventional studies (study_type) associated with drugs (via NCT_ID). Observational studies are excluded from this analysis.

## [1] "Interventional studies: 324012 (77.4%)"
## [1] "Interventional drug studies: 156191 ; unique NCT_IDs: 156191"
Drug studies and drugs, by phase

phase            N_studies  N_drugs
Early Phase 1         2707     4467
Not Applicable       15774    27328
Phase 1              30223    62067
Phase 1/Phase 2       8818    17623
Phase 2              42801    86301
Phase 2/Phase 3       4345     8627
Phase 3              27607    59641
Phase 4              23913    44081
NA                       3    10748

Drug studies and drugs, by overall_status

overall_status           N_studies  N_drugs
Completed                    89594   180172
Recruiting                   19414    39005
Unknown status               14589    26626
Terminated                   13490    26209
Active, not recruiting        6950    15457
Not yet recruiting            5573    10403
Withdrawn                     5381    10112
Enrolling by invitation        604     1006
Suspended                      596     1152
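
The selection and the per-phase counts above can be sketched in Python as follows. This assumes that N_drugs counts intervention records (id) rather than distinct drug names, and the records are toy data with placeholder IDs.

```python
from collections import defaultdict


def phase_counts(studies, drugs):
    """Count interventional drug studies and their drug records per phase."""
    drug_ids = defaultdict(set)
    for drug in drugs:
        drug_ids[drug["nct_id"]].add(drug["id"])

    counts = defaultdict(lambda: {"N_studies": 0, "N_drugs": 0})
    for study in studies:
        if study["study_type"] != "Interventional":
            continue  # observational and other study types are excluded
        ids = drug_ids.get(study["nct_id"])
        if not ids:
            continue  # no drug intervention linked via NCT_ID
        counts[study["phase"]]["N_studies"] += 1
        counts[study["phase"]]["N_drugs"] += len(ids)
    return dict(counts)


# Toy records, invented for illustration.
studies = [
    {"nct_id": "NCT_A", "study_type": "Interventional", "phase": "Phase 2"},
    {"nct_id": "NCT_B", "study_type": "Observational", "phase": None},
    {"nct_id": "NCT_C", "study_type": "Interventional", "phase": "Phase 2"},
    {"nct_id": "NCT_D", "study_type": "Interventional", "phase": "Phase 3"},
]
drugs = [
    {"id": 1, "nct_id": "NCT_A"},
    {"id": 2, "nct_id": "NCT_A"},
    {"id": 3, "nct_id": "NCT_B"},
    {"id": 4, "nct_id": "NCT_C"},
]
print(phase_counts(studies, drugs))
```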

3.4 Drug studies by Phase and Status

3.5 Drug studies and drugs by start_year


4 NextMove LeadMine Chemical NER

AACT drug names are resolved to standard names and to structures, represented as SMILES. Note that one name may include multiple chemicals. This allows cheminformatically rigorous counts of drugs as active pharmaceutical ingredients (APIs).

## [1] "Drug unique SMILES resolved by LeadMine: 5080 ; unique intervention IDs: 207236; unique names: 16114"

4.1 Chemical NER mentions

4.1.1 Totals by merging of synonyms to resolved structure (locally canonical SMILES)

Top 20 drugs by total mentions
smi2img N_mentions names
3411 Abraxane; PACLITAXEL; PACLitaxel; Paclitaxel; Taxol; abraxane; paclitaxel; taxol
3018 CYCLOPHOSPHAMIDE; Ciclophosphamide; Cyclophosphamid; Cyclophosphamide; Cylophosphamide; ciclophosphamide; cyclo phosphamide; cyclophosphamide
2932 CISPLATIN; Cis Platinum; Cis-platinum; Cisplatin; Cisplatine; Cisplatinum; cis Platinum; cis-platinum; cisplatin; cisplatine; cisplatinum
2679 DEXAMETHASONE; Dexamethason; Dexamethasone; Dexamethosone; Maxitrol; OZURDEX; Oradexon; Ozurdex; dexamethason; dexamethasone; dexamethosone
2577 CARBOPLATIN; Carboplatin; Carboplatine; Paraplatin; carboplatin; carboplatine
2042 DOCETAXEL; Docetaxel; docetaxel
1883 METFORMIN; MetFORMIN; Metformin; Metformine; metformin; metformine
1857 GEMCITABINE; Gemcitabin; Gemcitabine; gemcitabin; gemcitabine
1634 CAPECITABINE; Capecitabin; Capecitabine; XELODA; Xeloda; capecitabine; xeloda
1485 BUPIVACAINE; Bupivacain; Bupivacaine; EXPAREL; Exparel; SKY0402; bupivacain; bupivacaine
1409 Cortancyl; Lodotra; Meticorten; PREDNISON; Prednison; Prednisone; RAYOS; prednison; prednisone
1389 0xaliplatin; Eloxatin; OXALIPLATIN; OXAliplatin; Oxaliplatin; Oxaliplatine; eloxatin; oxaliplatin; oxaliplatine
1315 LIDOCAINE; LMX 4; LMX4; Lidocain; Lidocaine; Lidoderm; Lignocain; Lignocaine; Oraqix; lidocain; lidocaine; lignocaine
1301 METHOTREXATE; Methotrexat; Methotrexate; Metoject; methotrexate
1286 NORMAL SALINE; Normal Saline; Normal saline; normal Saline; normal salin; normal saline
1205 ETOPOSIDE; Etoposid; Etoposide; etoposide
1122 DEXMEDETOMIDINE; Dexmedetomidin; Dexmedetomidine; dexmedetomidin; dexmedetomidine
1108 Diprivan; PROPOFOL; Propofol; propofol
1034 CYTARABINE; Cytarabine; Cytosar; DepoCyt; DepoCyte; Depocyt; Depocyte; cytarabine; cytosar
1004 FK-506; FK506; TACROLIMUS; Tacrolimus; tacrolimus
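
The merging step behind this table can be sketched as grouping mentions by their resolved structure and collecting the distinct names seen for each. The Python below is a minimal illustration with placeholder SMILES keys; it is not the code used to produce the table.

```python
from collections import defaultdict


def merge_mentions(mentions):
    """Group (name, smiles) mentions by SMILES.

    Returns (smiles, n_mentions, names) tuples, most-mentioned first, with
    the distinct names joined in sorted order.
    """
    counts = defaultdict(int)
    names = defaultdict(set)
    for name, smiles in mentions:
        counts[smiles] += 1
        names[smiles].add(name)
    rows = [(smi, counts[smi], "; ".join(sorted(names[smi]))) for smi in counts]
    return sorted(rows, key=lambda row: (-row[1], row[0]))


# Placeholder structure keys stand in for canonical SMILES strings.
mentions = [
    ("Taxol", "SMILES_1"),
    ("paclitaxel", "SMILES_1"),
    ("Paclitaxel", "SMILES_1"),
    ("cisplatin", "SMILES_2"),
]
for row in merge_mentions(mentions):
    print(row)
```

Keying on the structure rather than the name is what lets brand names, case variants and misspellings collapse into one count.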

4.1.2 Chemical NER mentions resolved to structures (SMILES)

4.1.3 Chemical NER mentions by intervention ID.

## [1] "Mentions by intervention ID: 190323 / 207236 (91.8%)"

4.1.4 Chemical NER mentions by trial (NCT ID).

## [1] "Mentions by study: 112906 / 121369 (93.0%)"

4.1.5 Chemical NER mentions by drug, i.e. name in AACT.

## [1] "Mentions by drug name: 12128 / 70593 (17.2%)"

5 PubChem:

5.1 Intervention IDs to CIDs from PubChem

Mapping by SMILES is normally problematic without canonicalization, but here we consistently use the same SMILES strings, generated by NextMove LeadMine, which are associated with CIDs via the PubChem REST API.

## [1] "PubChem CIDs with InChIKeys: 4257"
## [1] "PubChem SMILES2CID hits: 4429 / 4429 (100.0%)"
## [1] "Intervention IDs mapped to PubChem CIDs (via SMILES): 174236"
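
For orientation, the two PubChem lookups (SMILES to CID, then CID to InChIKey) correspond to PUG REST requests along the following lines. This Python sketch only builds the URLs; it assumes the PUG REST convention of passing the SMILES as a URL argument so that special characters can be percent-encoded, and the helper names are invented here.

```python
from urllib.parse import urlencode

PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def smiles_to_cid_url(smiles: str) -> str:
    """Build a PUG REST URL requesting the CIDs for a SMILES string."""
    return f"{PUG_REST}/compound/smiles/cids/TXT?{urlencode({'smiles': smiles})}"


def cid_to_inchikey_url(cid: int) -> str:
    """Build a PUG REST URL requesting the InChIKey for a CID."""
    return f"{PUG_REST}/compound/cid/{int(cid)}/property/InChIKey/TXT"


print(smiles_to_cid_url("CCO"))
print(cid_to_inchikey_url(702))
```

For the several thousand structures involved here, batching requests and respecting PubChem's usage limits would be advisable; neither is shown.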

6 IDG/TCRD:

IDG TCRD (Pharos) is used for Target Development Level (TDL) and other target metadata.

7 ChEMBL:

7.1 ChEMBL molecule IDs, and properties (via InChIKeys)

An alternative worth considering is to map via PubChem CIDs and UniChem instead.

## [1] "ChEMBL compounds mapped via InChIKeys: 3759"

7.2 ChEMBL activities for mapped compounds

Select only activities with pChEMBL values, both for relevance to protein targets and for confidence.

## Warning: 13 parsing failures.
##    row        col           expected         actual                                  file
## 210259 text_value 1/0/T/F/TRUE/FALSE Active         'data/aact_drugs_chembl_activity.tsv'
## 210260 text_value 1/0/T/F/TRUE/FALSE Active         'data/aact_drugs_chembl_activity.tsv'
## 367962 text_value 1/0/T/F/TRUE/FALSE Not Determined 'data/aact_drugs_chembl_activity.tsv'
## 367964 text_value 1/0/T/F/TRUE/FALSE Not Determined 'data/aact_drugs_chembl_activity.tsv'
## 368986 text_value 1/0/T/F/TRUE/FALSE Active         'data/aact_drugs_chembl_activity.tsv'
## ...... .......... .................. .............. .....................................
## See problems(...) for more details.
## [1] "ChEMBL activities: 1253035"
## [1] "ChEMBL activities molecules: 3311 ; canonical_smiles: 3311 ; targets: 7457 ; documents: 39931"
## Warning: Ignoring 1090041 observations
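
The pChEMBL filter described above can be sketched as keeping only activity records whose pchembl_value parses as a number. The Python below is a minimal illustration on toy records; the ID values are placeholders, and the record layout is an assumption rather than the layout of the actual input file.

```python
import math


def with_pchembl(activities):
    """Keep activity records that carry a numeric pchembl_value."""
    kept = []
    for activity in activities:
        try:
            value = float(activity.get("pchembl_value"))
        except (TypeError, ValueError):
            continue  # missing, empty, or non-numeric (e.g. "NA")
        if math.isnan(value):
            continue
        kept.append({**activity, "pchembl_value": value})
    return kept


# Toy records, invented for illustration.
activities = [
    {"activity_id": "ACT_1", "pchembl_value": "6.52"},
    {"activity_id": "ACT_2", "pchembl_value": "NA"},
    {"activity_id": "ACT_3", "pchembl_value": None},
    {"activity_id": "ACT_4", "pchembl_value": ""},
    {"activity_id": "ACT_5", "pchembl_value": 7.1},
]
print([a["activity_id"] for a in with_pchembl(activities)])
```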

7.2.1 Activity and molecule counts by assay types

Activity and molecule counts by assay types

assay_type         N_molecule  N_activity
F:Functional             3107      437502
B:Binding                2615      332825
A:ADMET                  2395       93824
T:Toxicity               1546      378589
P:Physicochemical        1411        9188
U:Unclassified            301        1107

7.2.2 Activity and molecule counts by assay source

Activity and molecule counts by assay sources

Source                N_molecule  N_activity
LITERATURE                  3007      582095
SARS_COV_2                  2126       11352
PUBCHEM_BIOASSAY            1790       65875
CARE                        1343        3020
DRUGMATRIX                   539      305785
ASTRAZENECA                  533        1503
COADD                        474        3799
TP_TRANSPORTER               420        3038
PATENT                       344        2766
BINDINGDB                    199         712
TUM_PROTEOMIC_KUSTER         178       51635
EUBOPEN_CGL                  104       49902
TG_GATES                      97      135146
DRUG_PK                       69         860
DNDI                          60         419
SANGER                        58       31730
K4DD                          25         151
HESI                          24         826
MMV_PBOX                      22         156
NOVARTIS                      19          95
HARVARD                       13          42
FDA_APPROVAL                  13         620
GSK_TCMDC                     10          84
ST_JUDE_LEISH                  8          33
OSM                            6          17
WINZ_PLASMO                    4          24
SUPPLEMENTARY                  3          15
GATES_LIBRARY                  3           9
METABOLISM                     2           3
DONATED_PROBES                 2        1176
MMV_MBOX                       1         109
GSK_TCAKS                      1          13
ST_JUDE                        1          16
PKIS2                          1           1
SALVENSIS_LSHTM                1           8

7.3 ChEMBL targets (via activities)

## [1] "ChEMBL target proteins: 5968"
## [1] "ChEMBL target proteins mapped to TCRD (human): 3583"

7.4 ChEMBL targets by organism:

## [1] "Organisms: 350"
Targets by organism (top 10)
organism N_targets Types
Homo sapiens 3597 CHIMERIC PROTEIN; NUCLEIC-ACID; PROTEIN COMPLEX; PROTEIN COMPLEX GROUP; PROTEIN FAMILY; PROTEIN NUCLEIC-ACID COMPLEX; PROTEIN-PROTEIN INTERACTION; SELECTIVITY GROUP; SINGLE PROTEIN
Rattus norvegicus 817 PROTEIN COMPLEX; PROTEIN COMPLEX GROUP; PROTEIN FAMILY; PROTEIN-PROTEIN INTERACTION; SELECTIVITY GROUP; SINGLE PROTEIN
Mus musculus 559 CHIMERIC PROTEIN; PROTEIN COMPLEX; PROTEIN COMPLEX GROUP; PROTEIN FAMILY; PROTEIN-PROTEIN INTERACTION; SELECTIVITY GROUP; SINGLE PROTEIN
Bos taurus 151 PROTEIN COMPLEX; PROTEIN COMPLEX GROUP; PROTEIN FAMILY; SINGLE PROTEIN
Bacteria 67 PROTEIN COMPLEX; PROTEIN NUCLEIC-ACID COMPLEX; SINGLE PROTEIN
Sus scrofa 58 PROTEIN COMPLEX; PROTEIN FAMILY; SINGLE PROTEIN
Escherichia coli (strain K12) 51 PROTEIN COMPLEX; SINGLE PROTEIN
Escherichia coli K-12 40 PROTEIN FAMILY; SINGLE PROTEIN
Escherichia coli 39 PROTEIN COMPLEX; SINGLE PROTEIN
Mycobacterium tuberculosis 39 PROTEIN COMPLEX; SINGLE PROTEIN

7.5 ChEMBL human single-protein targets only, by IDG family.

## [1] "Human targets: 3597"
## [1] "Human single-protein targets: 2102 ; unique UniProts: 1932"
idgFamily          N
Enzyme           692
Kinase           491
NA               335
GPCR             238
Transporter       97
IC                93
Epigenetic        72
TF                38
NR                38
TF; Epigenetic     6
oGPCR              2

7.6 Targets by IDG TDL:


8 JensenLab Tagger Diseases NER

Disease NER uses JensenLab Tagger with the JensenLab DOID entities dictionary, applied to descriptions from the AACT detailed_descriptions table.

## [1] "Total disease mentions: 618528 (in 156191 studies)"

8.1 Disease mention totals by merging to resolved Disease Ontology term (DOID).

Top 20 diseases by total mentions
doid N_mentions terms
DOID:4137 25406 Cerebro vascular disorders; Cerebro-Vascular Accident; Cerebrovascular Accident; Cerebrovascular Accidents; Cerebrovascular Disease; Cerebrovascular accident; Cerebrovascular accidents; Cerebrovasc…
DOID:6193 24725 DIABETES; DIABETES MELLITUS; DIAbetes; DIabetes; Diabetes; Diabetes Mellitus; Diabetes mellitus; diabetes; diabetes Mellitus; diabetes mellitus; diabetes-mellitus
DOID:11203 15551 BREAST CANCER; BReast CAncer; BReast Cancer; BrEast cAncer; Breast Cancer; Breast Cancers; Breast cancer; Breast cancers; Breast tumor; Breast tumors; Breast-cancer; Mammary Tumors; Primary Breast …
DOID:6787 13598 OBESITY; OBesity; Obesity; obEsity; obe-sity; obesities; obesity
DOID:12733 13288 ASTHMA; Asthma; BHR; Bronchial hyper-reactivity; Bronchial hyperreactivity; EIA; Exercise induced asthma; Exercise-induced asthma; asthma; bronchial hyper reactivity; bronchial hyper-reactivity; br…
DOID:0110987 12967 HBP; HTN; HYPERTENSION; High Blood Pressure; High Blood pressure; High blood pressure; High-blood pressure; Hypertension; Hypertensive disease; Hypertensive diseases; high blood Pressure; high bloo…
DOID:0060145 9590 ANALGESIA; Analgesia; an-algesia; analgeSia; analgesia; analgesias
DOID:6195 7730 Diabetes Mellitus Type 2; Diabetes Mellitus Type II; Diabetes Mellitus type 2; Diabetes Mellitus, Type II; Diabetes mellitus Type 2; Diabetes mellitus non-insulin-dependent; Diabetes mellitus type …
DOID:13580 7548 C-HD; CAD; CORONARY ARTERY DISEASE; CORONARY SYNDROME; CORONARY syndrome; ChD; Coronary Heart Disease; Coronary ARtery DIsease; Coronary Artery Disease; Coronary Artery Diseases; Coronary Artery d…
DOID:14308 7058 NSCLC; NSCLCs; Non Small Cell Lung Cancer; Non Small Cell Lung Carcinoma; Non Small Cell Lung cancer; Non small cell lung cancer; Non small-cell lung cancer; Non- small cell lung cancer; Non-Small …
DOID:0110740 6722 Familial Prostate Cancer; HPC; Hereditary prostate cancer; PRostate Cancer; Prostate CAncer; Prostate Cancers; Prostate cancer; Prostatic Neoplasms; Prostatic cancer; hereditary prostate cancer; pr…
DOID:5461 5882 Influenza; inFLUenza; influenza; influenzas
DOID:0112146 5623 DementiA; Dementia; Dementias; dementia; dementias
DOID:2959 5574 SCHIZOPHRENIA; Schizophrenia; schizophrenia
DOID:11212 5276 CANCER; CANcer; CanCer; Cancer s; Cancers; MALIGNANT TUMORS; Malignant Tumor; Malignant neoplasm; Malignant neoplasms; Malignant tumor; Malignant tumors; Primary Cancer; Primary cancer; cancer s; c…
DOID:3068 5117 PNEUMONIA; Pneumonia; Pneumonias; acute pneumonia; pneumonia; pneumonias
DOID:3405 5068 Heart Attack; Heart attack; Heart attacks; MYOCARDIAL INFARCTION; Myocardial Infarct; Myocardial Infarction; Myocardial Infarctions; Myocardial infarct; Myocardial infarction; Myocardial infarction…
DOID:0111853 5056 MALARIA; Malaria; malaria; malarias
DOID:0111670 4748 Allergic Disease; Allergic Diseases; Allergic Syndrome; Allergic disease; Allergic diseases; Allergies; Allergy; Hyper-sensitivity; Hypersensitivity; allergic disease; allergic diseases; allergic d…
DOID:0111089 4710 ADD; ADHD; Attention Deficit Disorder; Attention Deficit Hyperactivity Disorder; Attention Deficit Hyperactivity Disorders; Attention Deficit Hyperactivity disorder; Attention Deficit disorder; Att…

8.2 Disease mentions by study.

Synonym terms are sorted by frequency.

Disease mentions by study (Random sample of studies)
nct_id doid N_mentions disease_terms
NCT01411644 DOID:10561 2 ovarian dysfunction
NCT01411644 DOID:10368 1 amenorrhea
NCT01411644 DOID:2964 2 premature ovarian failure;POF
NCT02086630 DOID:3852 2 AIDS
NCT02535923 DOID:1219 4 psychotic disorders
NCT02627976 DOID:11203 1 breast cancer
NCT03369418 DOID:11723 3 PTSD;post-traumatic stress disorder
NCT03369418 DOID:1171 1 Anxiety
NCT03370458 DOID:0112239 1 diarrhea
NCT03370458 DOID:0110924 1 lactose intolerance
NCT03370458 DOID:0050127 1 sinusitis
NCT03370458 DOID:6575 1 irritable bowel syndrome
NCT03370458 DOID:3674 1 bronchitis
NCT03370458 DOID:3068 1 pneumonia
NCT03748550 DOID:11203 1 breast cancer
NCT04158609 DOID:0111149 1 meconium aspiration syndrome
NCT04216316 DOID:14308 6 NSCLC;non-small cell lung cancer
NCT04216316 DOID:0111988 1 ataxia telangiectasia
NCT05017025 DOID:14308 3 NSCLC;non-small cell lung cancer

9 Enumerate study-drug-disease-target links.

References are included as well.

Since each study may be associated with multiple drugs, targets and diseases, we build a table of all associated combinations, then aggregate by study (NCT_ID). For DOIDs with multiple terms, we keep only the most common term for simplicity.

## [1] "study-disease links: 312435"
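
The enumeration described above amounts to a per-study cross product of drugs and diseases, expanded by each drug's targets, plus a most-frequent-term lookup per DOID. A minimal Python sketch on toy data with placeholder identifiers:

```python
from collections import Counter
from itertools import product


def most_common_term(terms_by_doid):
    """Pick the most frequently mentioned term for each DOID."""
    return {doid: Counter(terms).most_common(1)[0][0]
            for doid, terms in terms_by_doid.items() if terms}


def enumerate_links(study_drugs, study_diseases, drug_targets):
    """List every (study, drug, disease, target) combination."""
    links = []
    for nct_id in sorted(study_drugs.keys() & study_diseases.keys()):
        pairs = product(sorted(study_drugs[nct_id]), sorted(study_diseases[nct_id]))
        for drug, doid in pairs:
            for target in sorted(drug_targets.get(drug, ())):
                links.append((nct_id, drug, doid, target))
    return links


# Toy inputs, invented for illustration.
study_drugs = {"NCT_A": {"DRUG_1", "DRUG_2"}, "NCT_B": {"DRUG_1"}}
study_diseases = {"NCT_A": {"DOID_X"}}
drug_targets = {"DRUG_1": {"UNIPROT_1", "UNIPROT_2"}}
terms = {"DOID_X": ["NSCLC", "NSCLC", "non-small cell lung cancer"]}

print(enumerate_links(study_drugs, study_diseases, drug_targets))
print(most_common_term(terms))
```

Because the link count is a product of three one-to-many relations, it grows quickly, which is consistent with the very large totals reported in this section.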

9.3 PubChem molecules to ChEMBL targets

CID →(PubChem)→ INCHIKEY
INCHIKEY →(ChEMBL)→ MOLECULE_CHEMBL_ID
MOLECULE_CHEMBL_ID →(ChEMBL)→ ACTIVITY_ID

## [1] "CIDs: 4257 ; INCHIKEYs: 4253 ; pairs: 4257"
## [1] "INCHIKEYs: 0 ; MOLECULE_CHEMBL_IDs: 3756 ; pairs: 3759"
## [1] "MOLECULE_CHEMBL_IDs: 3311 ; TARGET_CHEMBL_IDs: 7457 ; ACTIVITY_IDs: 1253035 ; DOCUMENT_CHEMBL_IDs: 39931"
## [1] "CID2UNIPROT links: 174625 ; CIDs: 2985 ; UNIPROTs: 4562"

9.3.2 TDL counts

## [1] "study-drug-disease-target links: 18158275"
## [1] "studies: 36628 ; drugs: 2230 ; diseases: 2107 ; targets: 4347"
TDL counts

idgTDL    N
Tchem   974
NA        7
Tbio    448
Tclin   420
Tdark    17

Sample links to Tdark targets, as the potentially most interesting cases.

Sample of study-drug-disease-target links (N_total = 15300063)
nct_id drug_name CID disease_term doid gene_symbol uniprot idgTDL
NCT05104723 tofacitinib 9926791 lung disease DOID:5489 MDN1 Q9NU22 Tdark
NCT04626024 Nilotinib 3062316 leukemia DOID:0111875 CSNK1A1L Q8N752 Tdark
NCT04012827 doxorubicin 11315474 sarcoma DOID:0111213 ADCK5 Q3MIX3 Tdark
NCT01682083 Trametinib 11707110 melanoma DOID:11516 NA Q6ZSR9 Tdark
NCT04591431 Entrectinib 25141092 intestinal tumors DOID:1984 MDN1 Q9NU22 Tdark
NCT01578668 erlotinib 176870 lung adenocarcinoma DOID:14312 MDN1 Q9NU22 Tdark
NCT04158362 Abemaciclib 46220502 breast cancer DOID:11203 ADCK5 Q3MIX3 Tdark
NCT04626024 Nilotinib 3062316 peripheral artery disease DOID:0050830 NA Q6ZSR9 Tdark
NCT01309607 carboplatin 208908 ovarian cancer DOID:12132 CSNK1A1L Q8N752 Tdark
NCT02605746 ceritinib 57379345 brain cancer DOID:0112206 ADCK1 Q86TW2 Tdark
NCT00495872 Sorafenib 216239 multiple myeloma DOID:6354 NA Q6ZSR9 Tdark
NCT04055220 Regorafenib 11167602 Chordomas DOID:1343 MDN1 Q9NU22 Tdark
NCT00282100 Iressa 123631 liver cirrhosis DOID:2574 ADCK5 Q3MIX3 Tdark
NCT01646450 Icotinib 22024915 NSCLC DOID:14308 ADCK1 Q86TW2 Tdark
NCT00779389 Erlotinib 176870 NSCLC DOID:14308 HSP90AB2P Q58FF8 Tdark
NCT03515200 Ruxolitinib 25126798 ALL DOID:6764 HSP90AB2P Q58FF8 Tdark
NCT01690390 Icotinib 22024915 non-small cell lung cancer DOID:14308 HSP90AB2P Q58FF8 Tdark
NCT02034981 Crizotinib 11626560 Cholangiocarcinoma DOID:2430 CSNK1A1L Q8N752 Tdark
NCT04218071 Ruxolitinib 25126798 myelofibrosis DOID:2462 MDN1 Q9NU22 Tdark
NCT04047758 Letrozole 5330286 cancers DOID:11212 NA Q6ZSR9 Tdark
NCT00863122 lapatinib 208908 facial palsy DOID:10367 ADCK5 Q3MIX3 Tdark
NCT01306058 Sorafenib 216239 varices DOID:5123 CSNK1A1L Q8N752 Tdark
NCT04227327 Abemaciclib 46220502 breast cancer DOID:11203 ADCK5 Q3MIX3 Tdark
NCT03820830 Palbociclib 5330286 breast cancer DOID:11203 MDN1 Q9NU22 Tdark
NCT00314873 Gleevec 5291 thymic tumors DOID:134 NA Q6ZSR9 Tdark

9.4 PubMed references from AACT studies.

## [1] "Study references: 726627 ; PMIDs: 524644 ; studies: 147606"

9.5 PubMed references from ChEMBL activities.

ACTIVITY_ID →(ChEMBL)→ DOCUMENT_CHEMBL_ID
DOCUMENT_CHEMBL_ID →(ChEMBL)→ PUBMED_ID

## [1] "DOCUMENT_CHEMBL_IDs:: 39930 ; PMIDs: 37292"

10 Aggregating, scoring and ranking disease, target associations.

Evidence variables: