LAITOR - Literature Assistant for Identification of Terms co-Occurrences and Relationships

Times cited: 22
Authors
Barbosa-Silva, Adriano [1, 2, 3]
Soldatos, Theodoros G. [3, 4]
Magalhaes, Ivan L. F. [1]
Pavlopoulos, Georgios A. [3]
Fontaine, Jean-Fred [2]
Andrade-Navarro, Miguel A. [2]
Schneider, Reinhard [3]
Ortega, J. Miguel [1]
Affiliations
[1] Univ Fed Minas Gerais, ICB, Lab Biodados, Dpto Bioquim & Imunol, BR-31270901 Belo Horizonte, MG, Brazil
[2] Max Delbruck Ctr Mol Med, Computat Biol & Data Min Grp, D-13125 Berlin, Germany
[3] EMBL Heidelberg, D-69117 Heidelberg, Germany
[4] LIFE Biosyst GmbH, D-69115 Heidelberg, Germany
Source
BMC BIOINFORMATICS | 2010 / Volume 11
Keywords
MOLECULAR-BIOLOGY; SALICYLIC-ACID; INFORMATION; STRESS; EXTRACTION; NETWORKS; VIEW
DOI
10.1186/1471-2105-11-70
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Background: Biological knowledge is represented in scientific literature that often describes the function of genes/proteins (bioentities) in terms of their interactions (biointeractions). Such bioentities are often related to biological concepts of interest that are specific to a particular research field. Therefore, studying the current literature on a selected topic deposited in public databases facilitates the generation of novel hypotheses associating a set of bioentities with a common context. Results: We created a text mining system (LAITOR: Literature Assistant for Identification of Terms co-Occurrences and Relationships) that analyses co-occurrences of bioentities, biointeractions, and other biological terms in MEDLINE abstracts. The method accounts for the position of the co-occurring terms within sentences or abstracts. The system detected abstracts mentioning protein-protein interactions in a standard test (BioCreative II IAS test data) with a precision of 0.82-0.89 and a recall of 0.48-0.70. We illustrate the application of LAITOR to the detection of plant response genes in a dataset of 1000 abstracts relevant to the topic. Conclusions: Text mining tools combining the extraction of interacting bioentities and biological concepts with network displays can be helpful in developing reasonable hypotheses in different scientific backgrounds.
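The sentence-level co-occurrence idea mentioned in the abstract can be illustrated with a minimal sketch. This is not LAITOR's implementation: the term dictionaries `BIOENTITIES` and `BIOINTERACTIONS`, the naive sentence splitter, and the example text are all assumptions made up for illustration. The sketch counts pairs of bioentity names that appear in the same sentence as an interaction term.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical term dictionaries (assumption); the real LAITOR dictionaries
# are not given in this record.
BIOENTITIES = {"NPR1", "TGA2", "PR1"}
BIOINTERACTIONS = {"binds", "activates", "interacts"}


def split_sentences(abstract):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", abstract.strip()) if s]


def cooccurrences(abstract):
    """Count bioentity pairs that share a sentence with an interaction term."""
    counts = Counter()
    for sentence in split_sentences(abstract):
        tokens = re.findall(r"[A-Za-z0-9]+", sentence)
        # Unique entity names in this sentence, sorted so pairs are canonical.
        entities = sorted({t for t in tokens if t in BIOENTITIES})
        has_interaction = any(t.lower() in BIOINTERACTIONS for t in tokens)
        if has_interaction:
            for pair in combinations(entities, 2):
                counts[pair] += 1
    return counts


# Made-up example text, used only to exercise the sketch.
example = (
    "NPR1 interacts with TGA2 in the nucleus. "
    "PR1 is induced by salicylic acid. "
    "TGA2 binds NPR1."
)
pair_counts = cooccurrences(example)
print(pair_counts)
```

Restricting the count to sentences that also contain an interaction term is one simple way to account for term position; a looser variant could count pairs across the whole abstract instead.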
Pages: 10