A hybrid approach to extract protein-protein interactions

Cited by: 62
Authors
Bui, Quoc-Chinh [1]
Katrenko, Sophia [1]
Sloot, Peter M. A.
Affiliations
[1] Univ Amsterdam, Inst Informat, Amsterdam, Netherlands
Keywords
EVENT EXTRACTION
DOI
10.1093/bioinformatics/btq620
Chinese Library Classification number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Motivation: Protein-protein interactions (PPIs) play an important role in understanding biological processes. Although recent research in text mining has achieved significant progress in automatic PPI extraction from the literature, the performance of existing systems still needs to be improved.
Results: In this study, we propose a novel two-phase algorithm for extracting PPIs from the literature. First, we automatically categorize the data into subsets based on their semantic properties and extract candidate PPI pairs from these subsets. Second, we apply support vector machines (SVMs) to classify the candidate PPI pairs using features specific to each subset. We obtain promising results on five benchmark datasets (AIMed, BioInfer, HPRD50, IEPA and LLL), with F-scores ranging from 60% to 84%, which are comparable with state-of-the-art PPI extraction systems. Furthermore, our system achieves the best performance in cross-corpora evaluation and comparable performance in terms of computational efficiency.
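The two-phase idea in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the `trigger_pos` routing rule, the two-dimensional feature vectors, the toy labels, and the small hand-written linear SVM (hinge loss with subgradient descent) that stands in for whatever SVM implementation the authors used.

```python
from collections import defaultdict

def categorize(candidate):
    # Phase 1 (hypothetical rule): route a candidate pair to a subset
    # using a coarse cue, here the part of speech of the trigger word.
    return "verb" if candidate["trigger_pos"] == "VB" else "noun"

def train_linear_svm(X, y, epochs=200, lr=0.1, lam=0.01):
    # Subgradient descent on the L2-regularised hinge loss; labels are -1 or +1.
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:
                # Margin violated: step along the hinge-loss subgradient.
                w = [wj - lr * (lam * wj - yi * xj) for wj, xj in zip(w, xi)]
                b += lr * yi
            else:
                # Margin satisfied: only the regulariser shrinks the weights.
                w = [wj - lr * lam * wj for wj in w]
    return w, b

def predict(model, x):
    w, b = model
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1

def train_per_subset(candidates):
    # Phase 2: one classifier per subset, trained only on that subset.
    groups = defaultdict(list)
    for c in candidates:
        groups[categorize(c)].append(c)
    return {
        name: train_linear_svm([c["features"] for c in group],
                               [c["label"] for c in group])
        for name, group in groups.items()
    }

def classify(models, candidate):
    # Route the candidate to its subset, then apply that subset's model.
    return predict(models[categorize(candidate)], candidate["features"])

# Toy training data (invented): the same features carry opposite meaning
# in the two subsets, which is why a separate model per subset helps.
train = [
    {"trigger_pos": "VB", "features": [1.0, 0.0], "label": 1},
    {"trigger_pos": "VB", "features": [0.9, 0.1], "label": 1},
    {"trigger_pos": "VB", "features": [0.0, 1.0], "label": -1},
    {"trigger_pos": "VB", "features": [0.1, 0.9], "label": -1},
    {"trigger_pos": "NN", "features": [0.0, 1.0], "label": 1},
    {"trigger_pos": "NN", "features": [0.2, 0.8], "label": 1},
    {"trigger_pos": "NN", "features": [1.0, 0.0], "label": -1},
    {"trigger_pos": "NN", "features": [0.8, 0.2], "label": -1},
]
models = train_per_subset(train)
```

Calling `classify(models, {"trigger_pos": "VB", "features": [0.8, 0.2]})` and the same call with `"NN"` shows how an identical feature vector can receive different labels depending on the subset it is routed to.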
Pages: 259-265
Page count: 7