Joint Evolutionary Trees: A Large-Scale Method To Predict Protein Interfaces Based on Sequence Sampling

Cited by: 59
Authors
Engelen, Stefan [1, 4]
Trojan, Ladislas A. [1, 4]
Sacquin-Mora, Sophie [2]
Lavery, Richard [3]
Carbone, Alessandra [1, 4]
Affiliations
[1] Univ Paris 06, UMR S511, Paris, France
[2] IBPC, Lab Biochim Theor, Paris, France
[3] Univ Lyon, Inst Biol & Chim Prot, CNRS, IFR 128,UMR 5086, Lyon, France
[4] INSERM, U511, Paris, France
Keywords
SUBSTITUTION MATRICES; STATISTICAL-ANALYSIS; BINDING SURFACES; TRACE ANALYSIS; RESIDUES; SITES; TOOL; CONSERVATION; ALIGNMENT; DOCKING;
DOI
10.1371/journal.pcbi.1000267
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
The Joint Evolutionary Trees (JET) method detects protein interfaces, the core residues involved in the folding process, and residues susceptible to site-directed mutagenesis and relevant to molecular recognition. The approach, based on the Evolutionary Trace (ET) method, introduces a novel way to treat evolutionary information. Families of homologous sequences are analyzed through a Gibbs-like sampling of distance trees to reduce effects of erroneous multiple alignment and impacts of weakly homologous sequences on distance tree construction. The sampling method makes sequence analysis more sensitive to functional and structural importance of individual residues by avoiding effects of the overrepresentation of highly homologous sequences and improves computational efficiency. A carefully designed clustering method is parametrized on the target structure to detect and extend patches on protein surfaces into predicted interaction sites. Clustering takes into account residues' physical-chemical properties as well as conservation. Large-scale application of JET requires the system to be adjustable for different datasets and to guarantee predictions even if the signal is low. Flexibility was achieved by a careful treatment of the number of retrieved sequences, the amino acid distance between sequences, and the selective thresholds for cluster identification. An iterative version of JET (iJET) that guarantees finding the most likely interface residues is proposed as the appropriate tool for large-scale predictions. Tests are carried out on the Huang database of 62 heterodimer, homodimer, and transient complexes and on 265 interfaces belonging to signal transduction proteins, enzymes, inhibitors, antibodies, antigens, and others. A specific set of proteins chosen for their special functional and structural properties illustrate JET behavior on a large variety of interactions covering proteins, ligands, DNA, and RNA. 
JET is compared at a large scale to ET and to Consurf, Rate4Site, siteFiNDER|3D, and SCORECONS on specific structures. A significant improvement in performance and computational efficiency is shown.
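
The abstract states that sampling avoids the overrepresentation of highly homologous sequences but does not give the procedure. The following is only a minimal sketch of one way such a sampling step could look, assuming each retrieved sequence comes with a percent identity to the query. The function name `stratified_sample`, the identity bin edges, and the `per_bin` limit are all hypothetical choices made for illustration; they are not taken from the paper.

```python
import random
from collections import defaultdict


def stratified_sample(hits, bin_edges=(20, 40, 60, 80, 100), per_bin=2, seed=0):
    """Draw at most `per_bin` sequence ids from each identity bin.

    hits      -- list of (seq_id, percent_identity_to_query) pairs
    bin_edges -- hypothetical identity boundaries; hits below the first
                 edge are discarded as too weakly homologous
    """
    rng = random.Random(seed)  # seeded so that a draw is reproducible
    bins = defaultdict(list)
    for seq_id, identity in hits:
        for lo, hi in zip(bin_edges, bin_edges[1:]):
            # Half-open bins [lo, hi), except the last bin also includes 100%.
            in_last_bin = hi == bin_edges[-1] and identity == hi
            if lo <= identity < hi or in_last_bin:
                bins[(lo, hi)].append(seq_id)
                break
    sample = []
    for key in sorted(bins):
        members = bins[key]
        # A crowded bin contributes no more than a sparse one.
        sample.extend(rng.sample(members, min(per_bin, len(members))))
    return sample


if __name__ == "__main__":
    # Ten near-identical homologs, two distant ones, one below the cutoff.
    hits = [(f"high{i}", 95.0) for i in range(10)]
    hits += [("mid", 50.0), ("low", 30.0), ("noise", 5.0)]
    print(stratified_sample(hits))
```

Under these assumptions, the ten near-identical sequences contribute only two members to the sample, so they cannot dominate a distance tree built from it. Repeating the draw with different seeds would give the multiple samples that a Gibbs-like scheme averages over.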
Pages: 17