Using PharmGKB to train text mining approaches for identifying potential gene targets for pharmacogenomic studies

Cited by: 10
Authors
Pakhomov, S. [1]
McInnes, B. T. [1]
Lamba, J. [1]
Liu, Y. [1]
Melton, G. B. [2]
Ghodke, Y. [1]
Bhise, N. [1]
Lamba, V. [1]
Birnbaum, A. K. [1]
Affiliations
[1] Univ Minnesota, Coll Pharm, Minneapolis, MN 55455 USA
[2] Univ Minnesota, Inst Hlth Informat, Minneapolis, MN 55455 USA
Keywords
Pharmacogenomics; Text mining; Support vector machine; Pathway-driven analysis; Gene-drug associations; PharmGKB
DOI
10.1016/j.jbi.2012.04.007
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
The main objective of this study was to investigate the feasibility of using PharmGKB, a pharmacogenomic database, as a source of training data, in combination with the text of MEDLINE abstracts, for a text mining approach to identifying potential gene targets for pathway-driven pharmacogenomics research. We used the manually curated relations between drugs and genes in the PharmGKB database to train a support vector machine predictive model and applied this model prospectively to MEDLINE abstracts. The gene targets suggested by this approach were subsequently manually reviewed. Our quantitative analysis showed that support vector machine classifiers trained on MEDLINE abstracts, with single words (unigrams) used as features and PharmGKB relations used for supervision, achieve an overall sensitivity of 85% and specificity of 69%. The subsequent qualitative analysis showed that gene targets "suggested" by the automatic classifier were not anticipated by expert reviewers but were subsequently found to be relevant to the three drugs that were investigated: carbamazepine, lamivudine and zidovudine. Our results show that this approach is not only feasible but may also find new gene targets not identifiable by other methods, thus making it a valuable tool for pathway-driven pharmacogenomics research. (C) 2012 Elsevier Inc. All rights reserved.
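
The abstract mentions two pieces that are easy to make concrete: single-word (unigram) features and the sensitivity/specificity figures. The following is a minimal sketch of both, not the authors' implementation. The function names, the tokenization rule, and the toy sentence and labels are all assumptions made for illustration, and the support vector machine training step itself is left out.

```python
import re
from collections import Counter


def unigram_features(text):
    """Count single-word tokens (unigrams) in a piece of text.

    Assumption: tokens are lowercase runs of letters and digits; the
    study's actual tokenization is not described in the abstract.
    """
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels.

    Label 1 means "gene is related to the drug", label 0 means "not related".
    Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy sentence, invented for illustration (not taken from the study's data).
features = unigram_features(
    "CYP3A4 metabolizes carbamazepine; carbamazepine induces CYP3A4."
)

# Toy gold labels and classifier outputs, also invented for illustration.
gold = [1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 1, 0]
sens, spec = sensitivity_specificity(gold, predicted)
```

In a setup like the one the abstract describes, the unigram counts would serve as the input vector for a classifier, and the gold labels would come from the curated drug-gene relations in PharmGKB.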
Pages: 862-869
Page count: 8