A regression framework incorporating quantitative and negative interaction data improves quantitative prediction of PDZ domain-peptide interaction from primary sequence

Cited by: 24
Authors
Shao, Xiaojian [1, 2]
Tan, Chris S. H. [2, 3]
Voss, Courtney [4]
Li, Shawn S. C. [4]
Deng, Naiyang [1]
Bader, Gary D. [2, 3]
Affiliations
[1] China Agr Univ, Dept Appl Math, Coll Sci, Beijing 100083, Peoples R China
[2] Univ Toronto, Donnelly Ctr Cellular & Biomol Res, Banting & Best Dept Med Res, Toronto, ON M5S 3E1, Canada
[3] Univ Toronto, Dept Mol Genet, Toronto, ON M5S 3E1, Canada
[4] Univ Western Ontario, Dept Biochem, London, ON N6A 5B8, Canada
Funding
Canadian Institutes of Health Research
Keywords
PROTEIN-INTERACTION NETWORKS; SPECIFICITY; SELECTIVITY; KNOWLEDGE; SITES
DOI
10.1093/bioinformatics/btq657
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Predicting protein interactions involving peptide recognition domains is essential for understanding the many important biological processes they mediate. Considering the binding strength of these interactions helps construct more biologically relevant protein interaction networks that account for cellular context and competition between potential binders.

Results: We developed a novel regression framework that considers both positive (quantitative) and negative (qualitative) interaction data available for mouse PDZ domains to quantitatively predict interactions between PDZ domains, a large peptide recognition domain family, and their peptide ligands using primary sequence information. First, we show that it is possible to learn from existing quantitative and negative interaction data to infer the relative binding strength of interactions involving previously unseen PDZ domains and/or peptides given their primary sequence. Performance was measured using cross-validated hold-out testing and testing with previously unseen PDZ domain-peptide interactions. Second, we find that incorporating negative data improves quantitative interaction prediction. Third, we show that sequence similarity is an important determinant of prediction performance, which suggests that experimentally collecting additional quantitative interaction data for underrepresented PDZ domain subfamilies will improve prediction.
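The abstract does not spell out the regression formulation, so the following is only an illustrative sketch of one way a regression model could learn from quantitative and negative data together. It assumes a linear model, a squared loss on pairs with measured binding strength, and a one-sided penalty that is active only when a known non-binder is predicted above a threshold. The function name `fit_one_sided_regression`, the feature vectors, the threshold, and all toy numbers are hypothetical and are not taken from the paper.

```python
import numpy as np


def fit_one_sided_regression(X_quant, y_quant, X_neg, threshold=0.0,
                             lam=1e-2, lr=0.05, epochs=2000):
    """Fit a linear model by gradient descent (illustrative assumption only).

    Loss = 0.5 * mean squared error on quantitative pairs
         + 0.5 * mean squared excess of negative pairs above `threshold`
         + 0.5 * lam * ||w||^2
    """
    w = np.zeros(X_quant.shape[1])
    b = 0.0
    for _ in range(epochs):
        # Residuals on pairs with a measured binding strength.
        resid = X_quant @ w + b - y_quant
        # Negatives contribute only when predicted above the threshold.
        over = np.maximum(X_neg @ w + b - threshold, 0.0)
        grad_w = (X_quant.T @ resid / len(y_quant)
                  + X_neg.T @ over / len(X_neg)
                  + lam * w)
        grad_b = resid.mean() + over.mean()
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy data: random vectors stand in for sequence-derived features
# of domain-peptide pairs (hypothetical, not real PDZ data).
rng = np.random.default_rng(0)
X_quant = rng.normal(loc=1.0, scale=1.0, size=(40, 3))
y_quant = 1.0 + X_quant @ np.array([1.0, 0.5, 0.0])
X_neg = rng.normal(loc=-1.0, scale=1.0, size=(60, 3))

w, b = fit_one_sided_regression(X_quant, y_quant, X_neg)
pred_quant = X_quant @ w + b
pred_neg = X_neg @ w + b
```

In this sketch the negative pairs never receive a numeric target; they only push the fitted function down where it would otherwise score a known non-binder too high, which is one plausible reading of "incorporating negative data" into a quantitative predictor.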
Pages: 383-390
Page count: 8