LAITOR - Literature Assistant for Identification of Terms co-Occurrences and Relationships

Times cited: 22
Authors
Barbosa-Silva, Adriano [1, 2, 3]
Soldatos, Theodoros G. [3, 4]
Magalhaes, Ivan L. F. [1]
Pavlopoulos, Georgios A. [3]
Fontaine, Jean-Fred [2]
Andrade-Navarro, Miguel A. [2]
Schneider, Reinhard [3]
Ortega, J. Miguel [1]
Affiliations
[1] Univ Fed Minas Gerais, ICB, Lab Biodados, Dpto Bioquim & Imunol, BR-31270901 Belo Horizonte, MG, Brazil
[2] Max Delbruck Ctr Mol Med, Computat Biol & Data Min Grp, D-13125 Berlin, Germany
[3] EMBL Heidelberg, D-69117 Heidelberg, Germany
[4] LIFE Biosyst GmbH, D-69115 Heidelberg, Germany
Source
BMC BIOINFORMATICS | 2010 / Volume 11
Keywords
MOLECULAR-BIOLOGY; SALICYLIC-ACID; INFORMATION; STRESS; EXTRACTION; NETWORKS; VIEW;
DOI
10.1186/1471-2105-11-70
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: Biological knowledge is recorded in the scientific literature, which often describes the function of genes/proteins (bioentities) in terms of their interactions (biointeractions). Such bioentities are often related to biological concepts of interest that are specific to a particular research field. Studying the current literature on a selected topic deposited in public databases therefore facilitates the generation of novel hypotheses associating a set of bioentities with a common context. Results: We created a text mining system (LAITOR: Literature Assistant for Identification of Terms co-Occurrences and Relationships) that analyses co-occurrences of bioentities, biointeractions, and other biological terms in MEDLINE abstracts. The method accounts for the position of the co-occurring terms within sentences or abstracts. The system detected abstracts mentioning protein-protein interactions in a standard test (BioCreative II IAS test data) with a precision of 0.82-0.89 and a recall of 0.48-0.70. We illustrate the application of LAITOR to the detection of plant response genes in a dataset of 1000 abstracts relevant to the topic. Conclusions: Text mining tools that combine the extraction of interacting bioentities and biological concepts with network displays can help in developing reasonable hypotheses in different scientific backgrounds.
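The idea of sentence-level term co-occurrence mentioned in the abstract can be illustrated with a minimal sketch. This is not LAITOR's implementation: the function name `sentence_cooccurrences`, the naive sentence splitter, the term list, and the example sentences are all assumptions made up for illustration.

```python
import re
from collections import Counter
from itertools import combinations


def sentence_cooccurrences(abstract, terms):
    """Count unordered pairs of terms that appear in the same sentence.

    Hypothetical helper for illustration only; not LAITOR's actual method.
    """
    counts = Counter()
    # Naive split: break after '.', '!' or '?' followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", abstract):
        lowered = sentence.lower()
        # Whole-word, case-insensitive match so "PR1" is not found inside "NPR1".
        found = sorted(
            t for t in terms
            if re.search(r"\b" + re.escape(t.lower()) + r"\b", lowered)
        )
        for pair in combinations(found, 2):
            counts[pair] += 1
    return counts


# Invented example text, not taken from MEDLINE.
example = (
    "NPR1 interacts with TGA2 under salicylic acid treatment. "
    "TGA2 binds the PR1 promoter."
)
pairs = sentence_cooccurrences(example, {"NPR1", "TGA2", "PR1", "salicylic acid"})
for pair, n in sorted(pairs.items()):
    print(pair, n)
```

Counting at the abstract level instead would amount to skipping the sentence split and treating the whole text as one unit; a real system would also need a curated dictionary of bioentity names and synonyms.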
Pages: 10