New directions for diffusion-based network prediction of protein function: incorporating pathways with confidence

Cited by: 105
Authors
Cao, Mengfei [1]
Pietras, Christopher M. [1]
Feng, Xian [1]
Doroschak, Kathryn J. [2]
Schaffner, Thomas [1]
Park, Jisoo [1]
Zhang, Hao [1]
Cowen, Lenore J. [1]
Hescott, Benjamin J. [1]
Affiliations
[1] Tufts Univ, Dept Comp Sci, Medford, MA 02155 USA
[2] Univ Minnesota, Dept Comp Sci, Minneapolis, MN 55455 USA
Keywords
GENE ONTOLOGY; INTERACTOME; DATABASE; TOOL
DOI
10.1093/bioinformatics/btu263
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
070307 [Chemical Biology];
Abstract
Motivation: It has long been hypothesized that incorporating models of network noise, edge directions and known pathway information into the representation of protein-protein interaction (PPI) networks might improve their utility for functional inference. However, a simple way to do this has not been obvious. We find that diffusion state distance (DSD), our recent diffusion-based metric for measuring dissimilarity in PPI networks, has natural extensions that incorporate confidence and edge directions, and that can even express coherent pathways, by calculating DSD on an augmented graph.
Results: We define three incremental versions of DSD, which we term cDSD, caDSD and capDSD, where the capDSD matrix incorporates confidence, known directed edges and pathways into the measure of how similar each pair of nodes is according to the structure of the PPI network. We test four popular function prediction methods (majority vote, weighted majority vote, multi-way cut and functional flow) using these different matrices on the baker's yeast PPI network in cross-validation. The best-performing method is weighted majority vote using capDSD. We then test the performance of our augmented DSD methods on an integrated heterogeneous set of protein association edges from the STRING database. The superior performance of capDSD in this context confirms that treating the pathways as probabilistic units is more powerful than simply incorporating pathway edges independently into the network.
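As a rough illustration of the diffusion-based distance described in the abstract, the sketch below computes a finite-step, confidence-weighted DSD-style matrix on a toy undirected network. It is not the authors' implementation. The function names, the toy confidence values, the choice of `steps=5`, and the use of row-normalized confidence scores as random-walk transition probabilities are all assumptions made for illustration. Edge directions and pathway augmentation (the caDSD and capDSD variants) are not modeled here.

```python
def transition_matrix(weights):
    """Row-normalize a confidence-weighted adjacency matrix (assumed weighting scheme)."""
    P = []
    for row in weights:
        total = sum(row)
        if total == 0:
            raise ValueError("isolated node: random walk is undefined")
        P.append([w / total for w in row])
    return P


def matmul(A, B):
    """Plain-Python matrix product of two list-of-lists matrices."""
    inner, cols = len(B), len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(inner)) for j in range(cols)]
            for i in range(len(A))]


def expected_visits(P, steps):
    """he[i][j] = expected visits to node j by a walk of `steps` steps from i (start counted)."""
    n = len(P)
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # P^0
    he = [row[:] for row in power]
    for _ in range(steps):
        power = matmul(power, P)  # next power of P
        for i in range(n):
            for j in range(n):
                he[i][j] += power[i][j]
    return he


def dsd_matrix(weights, steps=5):
    """Pairwise L1 distance between expected-visit vectors (finite-step DSD-style distance)."""
    he = expected_visits(transition_matrix(weights), steps)
    n = len(he)
    return [[sum(abs(a - b) for a, b in zip(he[u], he[v])) for v in range(n)]
            for u in range(n)]


# Hypothetical 4-protein chain; entries are made-up interaction confidences.
toy_weights = [
    [0.0, 0.9, 0.0, 0.0],
    [0.9, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.2],
    [0.0, 0.0, 0.2, 0.0],
]

if __name__ == "__main__":
    for row in dsd_matrix(toy_weights):
        print(["%.3f" % d for d in row])
```

Because the distance is an L1 norm between per-node visit vectors, the resulting matrix is symmetric with a zero diagonal, so it can be passed directly to a neighbor-based predictor such as a weighted vote over the closest annotated proteins.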
Pages: 219-227
Page count: 9