A Fast Hierarchical Clustering Algorithm for Functional Modules Discovery in Protein Interaction Networks

Cited by: 167
Authors
Wang, Jianxin [1 ]
Li, Min [1 ]
Chen, Jianer [2 ]
Pan, Yi [3 ]
Affiliations
[1] Cent S Univ, Dept Comp Sci, Sch Informat Sci & Engn, Changsha 410083, Hunan, Peoples R China
[2] Texas A&M Univ, Dept Comp Sci, College Stn, TX 77843 USA
[3] Georgia State Univ, Dept Comp Sci, Atlanta, GA 30302 USA
Funding
US National Institutes of Health; National Natural Science Foundation of China; US National Science Foundation;
Keywords
Protein interaction network; functional module; hierarchical clustering algorithm; Gene Ontology; COMMUNITY STRUCTURE; ORGANIZATION; COMPLEXES; SUBUNITS; CLEAVAGE;
DOI
10.1109/TCBB.2010.75
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Subject classification codes
071010 ; 081704 ;
Abstract
With advances in technologies for predicting protein interactions, huge data sets portrayed as networks have become available. Identifying functional modules in such networks is crucial for understanding the principles of cellular organization and function. However, protein interaction data produced by high-throughput experiments generally have high false-positive rates, which makes it difficult to identify functional modules accurately. In this paper, we propose a fast hierarchical clustering algorithm, HC-PIN, based on a local metric, the edge clustering value, which can be used in both unweighted and weighted networks. HC-PIN is applied to the yeast protein interaction network, and the identified modules are validated against all three types of Gene Ontology (GO) terms: Biological Process, Molecular Function, and Cellular Component. The experimental results show that HC-PIN is not only robust to false positives but can also discover functional modules with low density. The identified modules are statistically significant in terms of all three types of GO annotations. Moreover, by varying the value of its parameter, HC-PIN can uncover the hierarchical organization of functional modules, which approximately corresponds to the hierarchical structure of GO annotations. Compared with previous competing algorithms, HC-PIN is faster and more accurate.
Pages: 607-620
Page count: 14