Periodic distributions of hydrophobic amino acids allows the definition of fundamental building blocks to align distantly related proteins

Cited by: 11
Authors
Baussand, J.
Deremble, C.
Carbone, A.
Affiliations
[1] Univ Paris 06, INSERM, UMRS 511, F-75013 Paris, France
[2] IBPC, Lab Biochim Theor, F-75005 Paris, France
Keywords
sequence alignment; evolution; remote proteins; protein homology; substitution matrices; gaps; secondary structures; hydrophobic blocks
DOI
10.1002/prot.21319
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Several studies on large and small families of proteins have shown, in general, that hydrophobic amino acids are globally conserved even though they are subject to a high rate of substitution. Statistical analysis of amino acid evolution within blocks of hydrophobic amino acids detected in sequences suggests their use as a basic structural pattern to align pairs of proteins of less than 25% sequence identity, with no need to know their 3D structure. The authors present a new global alignment method and an automatic tool for Proteins with HYdrophobic Blocks ALignment (PHYBAL) based on the combinatorics of overlapping hydrophobic blocks. Two substitution matrices modeling different selective pressures inside and outside hydrophobic blocks are constructed, the Inside Hydrophobic Blocks Matrix and the Outside Hydrophobic Blocks Matrix, and a 4D space of gap values is explored. PHYBAL performance is evaluated against the Needleman and Wunsch algorithm run with the Blosum 30, Blosum 45, Blosum 62, Gonnet, HSDM, PAM250, Johnson and Remote Homo matrices. PHYBAL behavior is analyzed on eight randomly selected pairs of proteins of <30% sequence identity that cover a large spectrum of structural properties. It is also validated on two large datasets: the 127 pairs of the Domingues dataset with <30% sequence identity, and 181 pairs drawn from BAliBASE 2.0 and ranked by percentage of identity from 7 to 25%. Results confirm the importance of considering substitution matrices modeling hydrophobic contexts and a 4D space of gap values in aligning distantly related proteins. Two new notions of local and global stability are defined to assess the robustness of an alignment algorithm and the accuracy of PHYBAL. A new notion, the SAD-coefficient, to assess the difficulty of structural alignment is also introduced. PHYBAL has been compared with the Hydrophobic Cluster Analysis and HMMSUM methods.
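The general idea described in the abstract (a global alignment whose substitution and gap scores depend on whether a position lies inside a hydrophobic block) can be illustrated with a minimal sketch. This is not the PHYBAL implementation. The hydrophobic alphabet `VILFMYW`, the block definition (a run of at least two hydrophobic residues), the toy scores in `toy_score`, and the two linear gap penalties `gap_in` and `gap_out` are all assumptions made for illustration. The abstract does not specify the block definition, the matrix values, or the four gap parameters.

```python
from typing import List

# Assumption: an HCA-style hydrophobic alphabet; the abstract does not list one.
HYDROPHOBIC = set("VILFMYW")


def block_mask(seq: str, min_len: int = 2) -> List[bool]:
    """Mark positions that fall in a run of >= min_len hydrophobic residues.

    Hypothetical block definition, used only for this illustration.
    """
    mask = [False] * len(seq)
    i = 0
    while i < len(seq):
        if seq[i] in HYDROPHOBIC:
            j = i
            while j < len(seq) and seq[j] in HYDROPHOBIC:
                j += 1
            if j - i >= min_len:
                for k in range(i, j):
                    mask[k] = True
            i = j
        else:
            i += 1
    return mask


def toy_score(a: str, b: str, inside: bool) -> int:
    """Toy stand-in for the inside/outside substitution matrices (made-up values)."""
    if a == b:
        return 3 if inside else 2
    if inside and a in HYDROPHOBIC and b in HYDROPHOBIC:
        return 1  # hydrophobic-for-hydrophobic swaps are tolerated inside blocks
    return -1


def align_score(s: str, t: str, gap_in: int = -4, gap_out: int = -2) -> int:
    """Needleman-Wunsch global alignment score with context-dependent scoring.

    A residue pair is scored with the 'inside' rule only when both residues
    lie in hydrophobic blocks; a gap is charged according to the context of
    the residue it is aligned against. Linear gap penalties are an assumption.
    """
    ms, mt = block_mask(s), block_mask(t)
    n, m = len(s), len(t)
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = dp[i - 1][0] + (gap_in if ms[i - 1] else gap_out)
    for j in range(1, m + 1):
        dp[0][j] = dp[0][j - 1] + (gap_in if mt[j - 1] else gap_out)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            inside = ms[i - 1] and mt[j - 1]
            diag = dp[i - 1][j - 1] + toy_score(s[i - 1], t[j - 1], inside)
            up = dp[i - 1][j] + (gap_in if ms[i - 1] else gap_out)
            left = dp[i][j - 1] + (gap_in if mt[j - 1] else gap_out)
            dp[i][j] = max(diag, up, left)
    return dp[n][m]
```

Making gaps more expensive inside blocks than outside reflects the intuition that insertions and deletions are less likely in conserved hydrophobic cores. The paper's actual gap scheme explores four values rather than the two used here.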
Pages: 695-708
Number of pages: 14
Related papers
52 in total
[1]   Gapped BLAST and PSI-BLAST: a new generation of protein database search programs [J].
Altschul, SF ;
Madden, TL ;
Schaffer, AA ;
Zhang, JH ;
Zhang, Z ;
Miller, W ;
Lipman, DJ .
NUCLEIC ACIDS RESEARCH, 1997, 25 (17) :3389-3402
[2]   BASIC LOCAL ALIGNMENT SEARCH TOOL [J].
ALTSCHUL, SF ;
GISH, W ;
MILLER, W ;
MYERS, EW ;
LIPMAN, DJ .
JOURNAL OF MOLECULAR BIOLOGY, 1990, 215 (03) :403-410
[3]   DETERMINANTS OF A PROTEIN FOLD - UNIQUE FEATURES OF THE GLOBIN AMINO-ACID-SEQUENCES [J].
BASHFORD, D ;
CHOTHIA, C ;
LESK, AM .
JOURNAL OF MOLECULAR BIOLOGY, 1987, 196 (01) :199-216
[4]   A METHOD TO IDENTIFY PROTEIN SEQUENCES THAT FOLD INTO A KNOWN 3-DIMENSIONAL STRUCTURE [J].
BOWIE, JU ;
LUTHY, R ;
EISENBERG, D .
SCIENCE, 1991, 253 (5016) :164-170
[5]   DECIPHERING THE MESSAGE IN PROTEIN SEQUENCES - TOLERANCE TO AMINO-ACID SUBSTITUTIONS [J].
BOWIE, JU ;
REIDHAAROLSON, JF ;
LIM, WA ;
SAUER, RT .
SCIENCE, 1990, 247 (4948) :1306-1310
[6]   HMMSTR: a hidden Markov model for local sequence-structure correlations in proteins [J].
Bystroff, C ;
Thorsson, V ;
Baker, D .
JOURNAL OF MOLECULAR BIOLOGY, 2000, 301 (01) :173-190
[7]   Prediction of the general transcription factors associated with RNA polymerase II in Plasmodium falciparum: conserved features and differences relative to other eukaryotes, art. no. 100 [J].
Callebaut, I ;
Prat, K ;
Meurice, E ;
Mornon, JP ;
Tomavo, S .
BMC GENOMICS, 2005, 6 (1)
[8]   Deciphering protein sequence information through hydrophobic cluster analysis (HCA): current status and perspectives [J].
Callebaut, I ;
Labesse, G ;
Durand, P ;
Poupon, A ;
Canard, L ;
Chomilier, J ;
Henrissat, B ;
Mornon, JP .
CELLULAR AND MOLECULAR LIFE SCIENCES, 1997, 53 (08) :621-645
[9]   AN IMMUNOPHILIN THAT BINDS M(R) 90,000 HEAT-SHOCK PROTEIN - MAIN STRUCTURAL FEATURES OF A MAMMALIAN P59 PROTEIN [J].
CALLEBAUT, I ;
RENOIR, JM ;
LEBEAU, MC ;
MASSOL, N ;
BURNY, A ;
BAULIEU, EE ;
MORNON, JP .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1992, 89 (14) :6270-6274
[10]   Characterization and molecular cloning of an adenosine kinase from Babesia canis rossi [J].
Carret, C ;
Delbecq, S ;
Labesse, G ;
Carcy, B ;
Precigout, E ;
Moubri, K ;
Schetters, TPM ;
Gorenflot, A .
EUROPEAN JOURNAL OF BIOCHEMISTRY, 1999, 265 (03) :1015-1021