Nathan Byers
October 26, 2017
Singhal A, Simmons M, and Lu Z. “Text mining for precision medicine: automating disease-mutation relationship extraction from biomedical literature.” Journal of the American Medical Informatics Association (2016) 23: 766-772.
Objective
Manually annotated text on breast and prostate cancers used to train the model (Doughty et al.)
tmVar tool used to identify mutations in the text (Wei et al.)
DNorm tool used to identify the disease names in the text (Leaman et al.)
Weka machine learning toolkit used for prediction
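The steps above can be sketched as a candidate-generation stage: each mutation mention is paired with each disease mention found in the same text, and each pair gets features that a classifier could score. This is only an illustrative sketch. The `Mention` class, the `candidate_pairs` function, the two features, and the example mentions are all hypothetical; they are not the output format of tmVar or DNorm, nor the feature set used in the paper.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Mention:
    """Hypothetical container for an entity mention (not a tmVar/DNorm format)."""
    text: str
    start: int  # character offset of the mention in the abstract


def candidate_pairs(mutations, diseases):
    """Pair every mutation mention with every disease mention.

    The two features below are illustrative assumptions only.
    """
    # Count how often each disease name is mentioned in the text.
    disease_counts = {}
    for d in diseases:
        disease_counts[d.text] = disease_counts.get(d.text, 0) + 1

    rows = []
    for m, d in product(mutations, diseases):
        rows.append({
            "mutation": m.text,
            "disease": d.text,
            "char_distance": abs(m.start - d.start),
            "disease_frequency": disease_counts[d.text],
        })
    return rows


# Made-up mentions standing in for mutation and disease tagger output.
mutations = [Mention("V600E", 40)]
diseases = [
    Mention("prostate cancer", 10),
    Mention("prostate cancer", 90),
    Mention("breast cancer", 60),
]

rows = candidate_pairs(mutations, diseases)
for row in rows:
    print(row)
```

In a real pipeline, rows like these would be labeled from the annotated corpus and exported to a classifier such as one of those in Weka.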
Metric | EMU* | tmVar+ML |
---|---|---|
Precision | 0.729 | 0.904 |
Recall | 0.803 | 0.856 |
F-measure | 0.764 | 0.880 |
Metric | EMU | tmVar+ML |
---|---|---|
Precision | 0.806 | 0.878 |
Recall | 0.852 | 0.813 |
F-measure | 0.828 | 0.845 |
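As a quick consistency check on the two tables, the F-measure can be recomputed from the precision and recall columns. The sketch below assumes the reported F-measure is the balanced F1 score, F = 2PR / (P + R); the slides do not state this explicitly. The function name `f_measure` and the table labels are illustrative.

```python
def f_measure(precision, recall):
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F-measure) copied from the two tables above.
reported = {
    "first table": {
        "EMU*": (0.729, 0.803, 0.764),
        "tmVar+ML": (0.904, 0.856, 0.880),
    },
    "second table": {
        "EMU": (0.806, 0.852, 0.828),
        "tmVar+ML": (0.878, 0.813, 0.845),
    },
}

for table, systems in reported.items():
    for name, (p, r, f_reported) in systems.items():
        f = f_measure(p, r)
        print(f"{table:12s} {name:9s} recomputed F = {f:.3f} (reported {f_reported:.3f})")
```

Under that assumption the recomputed values agree with the reported ones to within about 0.001, a gap that is consistent with the precision and recall figures having been rounded to three decimals.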
Doughty E, Kertesz-Farkas A, Bodenreider O, et al. “Toward an automatic method for extracting cancer- and other disease-related point mutations from the biomedical literature.” Bioinformatics 2011 27(3):408–415
Singhal A, Simmons M, and Lu Z. “Text mining for precision medicine: automating disease-mutation relationship extraction from biomedical literature.” Journal of the American Medical Informatics Association 2016 23: 766-772
Leaman R, Doğan RI, Lu Z. “DNorm: disease name normalization with pairwise learning to rank.” Bioinformatics 2013 29(22):2909–2917
Wei C-H, Harris BR, Kao H-Y, Lu Z. “tmVar: a text mining approach for extracting sequence variants in biomedical literature.” Bioinformatics 2013 29(11):1433–1439