Text Detective: a rule-based system for gene annotation in biomedical texts

Cited by: 19
Author
Tamames, J [1]
Affiliation
[1] Alma Bioinformat SL, Madrid 28750, Spain
DOI
10.1186/1471-2105-6-S1-S10
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010; 081704;
Abstract
Background: The identification of mentions of genes or gene products in biomedical texts is a critical step in the development of text mining applications in the biosciences. The complexity and ambiguity of gene nomenclature make this a very difficult task. Methods: Here we present a novel approach based on a combination of carefully designed rules and several lexicons of biological concepts, implemented in the Text Detective system. Text Detective is able to normalize the gene mentions it finds by providing the appropriate database reference. Results: In the BioCreAtIvE evaluation, Text Detective achieved 84% precision and 71% recall for task 1A, and 79% precision and 71% recall for mouse genes in task 1B.
Pages: 8