Summary

To better understand the level and type of information needed to obtain an R01 for an ‘unstudied’ target, the Illuminating the Druggable Genome Knowledge Management Center (IDG KMC) has started annotating the protein targets that are the main focus of 1482 funded NIH R01 projects with start dates between January 1 and May 20, 2015. A preliminary analysis of the partial annotation results shows that most of the projects focus on well-studied proteins (Tbio and above). We will provide updated results as the annotation progresses in both volume and depth.

Data retrieval

The list of selected projects was downloaded on June 4, 2015 from NIH RePORTER using the following search criteria:

  1. Award Type: New
  2. Activity Code: R01
  3. Project Start Date: >= 01/01/2015

The results of rerunning this query can be retrieved here. Because NIH RePORTER does not allow restricting the query to projects that started before a given date, following this link now also retrieves projects that started after May 20, 2015. That restriction can be applied offline after downloading the CSV file of selected projects, which would also let us extend the original selection with more recent projects if needed (up to June 30, for example).
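The offline date selection could be sketched as follows. This is a minimal illustration, not the script actually used: the column names "Project Number" and "Project Start Date", the MM/DD/YYYY date format, and the sample rows are all assumptions made for the example and should be checked against the real export.

```python
import csv
import io
from datetime import date, datetime


def filter_by_start_date(rows, first, last, column="Project Start Date"):
    """Keep rows whose start date falls within [first, last], inclusive.

    The column name and the MM/DD/YYYY format are assumptions about the
    downloaded file, not documented facts.
    """
    kept = []
    for row in rows:
        started = datetime.strptime(row[column], "%m/%d/%Y").date()
        if first <= started <= last:
            kept.append(row)
    return kept


# Made-up stand-in for the downloaded CSV (hypothetical project numbers).
sample = io.StringIO(
    "Project Number,Project Start Date\n"
    "EXAMPLE-A,01/15/2015\n"
    "EXAMPLE-B,05/20/2015\n"
    "EXAMPLE-C,06/01/2015\n"
)
rows = list(csv.DictReader(sample))

# Original selection window: January 1 to May 20, 2015.
selected = filter_by_start_date(rows, date(2015, 1, 1), date(2015, 5, 20))
print([r["Project Number"] for r in selected])  # ['EXAMPLE-A', 'EXAMPLE-B']
```

Widening the window to June 30 only requires changing the `last` argument, which is how the selection could be extended with more recent projects.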

Data processing

After reading the description page of a funded project (Title, Abstract, Relevance, Project Terms), a trained biologist (JHK) identifies the projects that focus on “illuminating” a single target and adds the relevant IDs (UniProt, gene symbol) to each such project. Additional information for each annotated target (name, family, target development level (TDL), novelty score) is then retrieved from the Target Central Resource Database. The following results are very preliminary, as they are based on only 62 annotated projects out of ~300 reviewed.
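The enrichment step, attaching target metadata to a manually annotated project, amounts to a lookup keyed on the gene symbol. The sketch below is only an illustration under assumptions: `TARGET_INFO` is a hypothetical in-memory stand-in for the database (its two entries copy the TDL and novelty values listed in Annex 1), and `annotate` is an invented helper name, not part of any existing tool.

```python
from dataclasses import dataclass


@dataclass
class TargetInfo:
    tdl: str        # target development level
    novelty: float  # novelty score


# Hypothetical stand-in for a database extract; values copied from Annex 1.
TARGET_INFO = {
    "SNX10": TargetInfo("Tgray", 0.2691090),
    "MYC": TargetInfo("Tbio", 0.0000951),
}


def annotate(project_id, gene_symbol, lookup=TARGET_INFO):
    """Return the project record enriched with TDL and novelty score.

    Unknown symbols are kept, with empty metadata, so they can be reviewed.
    """
    info = lookup.get(gene_symbol)
    return {
        "project": project_id,
        "target": gene_symbol,
        "tdl": info.tdl if info else None,
        "novelty": info.novelty if info else None,
    }


record = annotate("1R01AR064793-01A1", "SNX10")
print(record["target"], record["tdl"], record["novelty"])
```

Returning a record with empty metadata for unknown symbols, rather than raising an error, is a design choice for this sketch: it keeps projects whose target could not be matched visible for later curation.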

Results

Figure 1. Plot of number of funded targets by TDL (please note the lack of Tdark targets, so far)

Figure 2. Histogram of number of funded targets by novelty score

As expected, most of the targets have a low novelty score and appear in hundreds or thousands of published papers, as seen in Figure 2. An example of a recently funded novel target is SNX10 (novelty score 0.269); a much less novel one is MYC (novelty score 0.0000951).
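The counts behind both figures could be derived along the following lines. This is a sketch under assumptions: it uses only a handful of rows copied from Annex 1, it counts each target once even when several projects fund it (KCNQ1 and MYC appear twice), and it bins novelty by order of magnitude, which is a choice made for this example and not necessarily the binning used in Figure 2.

```python
import math
from collections import Counter

# (project, target, TDL, novelty): a few rows copied from Annex 1.
rows = [
    ("1R01AR064793-01A1", "SNX10", "Tgray", 0.2691090),
    ("1R01AR067726-01", "P2RX5", "Tchem", 0.0720210),
    ("1R01GM115189-01", "KCNQ1", "Tclin+", 0.0012237),
    ("1R01NS092570-01", "KCNQ1", "Tclin+", 0.0012237),
    ("1R01CA186707-01A1", "MYC", "Tbio", 0.0000951),
    ("1R01HL126732-01", "MYC", "Tbio", 0.0000951),
]

# Count targets, not projects: keep one entry per gene symbol.
targets = {target: (tdl, novelty) for _, target, tdl, novelty in rows}

# Figure 1 analogue: number of distinct funded targets per TDL.
tdl_counts = Counter(tdl for tdl, _ in targets.values())

# Figure 2 analogue: novelty binned by order of magnitude,
# i.e. floor(log10(score)); the bin -1 covers scores in [0.1, 1).
novelty_bins = Counter(
    math.floor(math.log10(novelty)) for _, novelty in targets.values()
)

print(len(targets), dict(tdl_counts))
print(sorted(novelty_bins.items()))
```

Logarithmic bins are used here because the novelty scores in Annex 1 span more than three orders of magnitude, so equal-width linear bins would place nearly every target in the first bin.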


Annex 1: List of target-annotated R01 projects

Project IC FOA Years Cost Target TDL Novelty
1R01AR064793-01A1 NIAMS PA-13-302 5 3395190 SNX10 Tgray 0.2691090
1R01AR067726-01 NIAMS PA-13-302 4 1408000 P2RX5 Tchem 0.0720210
1R01GM115241-01 NIGMS PA-13-302 4 1219296 WEE2 Tbio 0.0708661
1R01DK103746-01A1 NIDDK PA-13-302 5 2160000 IP6K1 Tgray 0.0493393
1R01AR066741-01A1 NIAMS PA-13-302 5 2022265 ESRP1 Tbio 0.0393137
1R01GM112591-01A1 NIGMS PA-13-302 4 1675600 Pitpna Tgray 0.0277818
1R01GM114075-01 NIGMS PA-13-302 4 1206008 FCER1A Tbio 0.0241786
1R01HL127640-01 NHLBI PA-13-302 4 2535144 TBX20 Tbio 0.0239279
1R01AR063709-01A1 NIAMS PA-13-302 5 1927710 CDH11 Tbio 0.0203741
1R01GM114358-01 NIGMS PA-13-302 5 1534240 TEP1 Tbio 0.0170867
1R01MH104656-01A1 NIMH PA-13-302 5 1758250 GSK3A Tclin+ 0.0164948
1R01AA022414-01A1 NIAAA PA-13-302 5 1998235 UNC13A Tbio 0.0159734
1R01CA193698-01 NCI PA-13-302 5 1778530 SLC25A1 Tbio 0.0153664
1R01AR066003-01A1 NIAMS PA-13-302 5 2017085 MAPKAPK5 Tchem 0.0145794
1R01CA184090-01A1 NCI PA-13-302 5 1812845 USP7 Tchem 0.0107759
1R01CA182435-01A1 NCI PA-13-302 5 1721345 NRP2 Tbio 0.0100418
1R01GM115185-01 NIGMS PA-13-302 4 1609784 CKAP5 Tbio 0.0100370
1R01DA038964-01 NIDA PA-13-302 5 2400000 ARRB2 Tbio 0.0079780
1R01HL122578-01A1 NHLBI PA-13-302 4 1535000 FOXC2 Tbio 0.0077390
1R01CA187975-01A1 NCI PA-13-302 5 1885530 HAVCR2 Tbio 0.0055188
1R01HL126897-01 NHLBI PA-13-302 4 1535000 IL33 Tbio 0.0052793
1R01GM112686-01 NIGMS PA-13-302 5 1482250 POLR2B Tbio 0.0048842
1R01HL126668-01 NHLBI PA-13-302 5 2399900 IL1R1 Tbio 0.0046602
1R01GM113004-01A1 NIGMS PA-13-302 5 1533985 KCNT2 Tgray 0.0043565
1R01MH104488-01A1 NIMH PA-13-302 5 2628365 OTX2 Tbio 0.0041400
1R01CA172105-01A1 NCI PA-13-302 5 2007280 IRF8 Tbio 0.0039329
1R01CA190717-01 NCI PA-13-302 5 1454890 F3 Tchem 0.0028662
1R01GM108908-01A1 NIGMS PA-13-302 4 1736672 H2AFZ Tbio 0.0027880
1R01HL126705-01 NHLBI PA-13-302 4 1560000 TGFBR1 Tchem 0.0025227
1R01HL122662-01A1 NHLBI PA-13-302 4 1579568 NOX4 Tclin 0.0024811
1R01HL127283-01 NHLBI PA-13-302 4 1580960 ITGAV Tchem+ 0.0024554
1R01ES024915-01 NIEHS RFA-ES-13-014 5 2394985 DNMT3A Tclin+ 0.0024428
1R01GM112793-01 NIGMS PA-13-302 4 1230320 TOP2A Tclin+ 0.0017872
1R01AR067925-01 NIAMS PA-13-302 5 1625250 MC1R Tclin+ 0.0017608
1R01EY024929-01 NEI PA-13-302 4 1540000 HSF4 Tbio 0.0017561
1R01CA188520-01A1 NCI PA-13-302 5 1704190 HDAC1 Tclin+ 0.0014524
1R01EY024031-01A1 NEI PA-13-302 5 2531825 BSG Tbio 0.0013430
1R01GM115189-01 NIGMS PA-13-302 4 1128404 KCNQ1 Tclin+ 0.0012237
1R01NS092570-01 NINDS PA-13-302 5 1667970 KCNQ1 Tclin+ 0.0012237
1R01HL127764-01 NHLBI PA-13-302 4 2458836 ADRB2 Tclin+ 0.0011621
1R01GM112690-01 NIGMS PA-13-302 4 1250504 GSK3B Tclin+ 0.0010326
1R01NS092917-01 NINDS PA-13-302 5 1624220 ABCA1 Tclin+ 0.0009236
1R01CA177828-01A1 NCI PA-13-302 5 2850935 IDH1 Tbio 0.0009088
1R01MH106469-01 NIMH PA-13-216 5 2785880 GRM5 Tchem+ 0.0008925
1R01HL127339-01 NHLBI PA-13-302 5 1987500 NF2 Tbio 0.0008236
1R01CA184728-01A1 NCI PA-13-302 5 2007280 GZMB Tchem 0.0006461
1R01HL118334-01A1 NHLBI PA-13-302 5 1890625 CASP1 Tchem+ 0.0006248
1R01NS086301-01A1 NINDS PA-13-302 5 1684375 SLC2A4 Tbio 0.0005521
1R01HL122582-01A1 NHLBI PA-13-302 4 1525000 MYB Tbio 0.0005496
1R01AA022986-01A1 NIAAA PA-13-302 5 2673185 CYP2E1 Tbio 0.0005170
1R01NS091367-01 NINDS PA-13-302 5 1162500 APP Tchem+ 0.0003148
1R01DK102945-01A1 NIDDK PA-13-034 4 1525500 VDR Tclin+ 0.0003004
1R01MH107126-01 NIMH PAR-13-048 5 4578250 COMT Tclin+ 0.0002934
1R01HL128063-01 NHLBI PA-13-302 4 2903260 PTEN Tbio 0.0002879
1R01GM112844-01 NIGMS PA-13-302 4 1489952 HSP90AA1 Tchem 0.0002778
1R01AI116834-01 NIAID PA-13-302 5 2118750 FOXP3 Tbio 0.0002506
1R01AI113009-01A1 NIAID PA-13-302 5 1098335 FOXP3 Tbio 0.0002506
1R01CA184089-01 NCI PA-11-260 5 2292750 KRAS Tbio 0.0001667
1R01CA184510-01A1 NCI PA-13-302 5 1809985 HRAS Tclin 0.0001615
1R01CA188646-01A1 NCI PA-13-302 5 1721345 PGR Tclin+ 0.0001365
1R01CA186707-01A1 NCI PA-13-302 5 1739880 MYC Tbio 0.0000951
1R01HL126732-01 NHLBI PA-13-302 4 1888592 MYC Tbio 0.0000951