Principal eigenvector of contact matrices and hydrophobicity profiles in proteins

Times cited: 60
Authors
Bastolla, U
Porto, M
Roman, HE
Vendruscolo, M
Affiliations
[1] CSIC, INTA, Ctr Astrobiol, E-28850 Torrejon De Ardoz, Madrid, Spain
[2] Max Planck Inst Phys Komplexer Syst, D-01187 Dresden, Germany
[3] Tech Univ Dresden, Inst Theoret Phys, D-01062 Dresden, Germany
[4] Univ Milan, Dipartimento Fis, I-20133 Milan, Italy
[5] Univ Milan, Ist Nazl Fis Nucl, I-20133 Milan, Italy
[6] Univ Cambridge, Dept Chem, Cambridge CB2 1EW, England
Keywords
vectorial representation of proteins; protein folding; hydrophobicity; contact maps; vectorial protein space
DOI
10.1002/prot.20240
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
With the aim of studying the relationship between protein sequences and their native structures, we adopted vectorial representations for both sequence and structure. The structural representation was based on the principal eigenvector of the fold's contact matrix (PE). As has recently been shown, the PE encodes sufficient information for reconstructing the whole contact matrix. The sequence was represented through a hydrophobicity profile (HP), using a generalized hydrophobicity scale that we obtained from the principal eigenvector of a residue-residue interaction matrix and that we call the interactivity scale. Using this novel scale, we defined the optimal HP of a protein fold and, by means of stability arguments, predicted it to be strongly correlated with the PE of the fold's contact matrix. This prediction was confirmed through an evolutionary analysis, which showed that the PE correlates with the HP of each individual sequence adopting the same fold and, even more strongly, with the average HP of this set of sequences. Thus, protein sequences evolve in such a way that their average HP is close to the optimal one, implying that neutral evolution can be viewed as a kind of motion in sequence space around the optimal HP. Our results indicate that the correlation coefficient between N-dimensional vectors constitutes a natural metric in the vectorial space in which we represent both protein sequences and protein structures, which we call vectorial protein space. In this way, we define a unified framework for sequence-to-sequence, sequence-to-structure, and structure-to-structure alignments. We show that the interactivity scale is nearly optimal both for the comparison of sequences to sequences and for the comparison of sequences to structures. © 2004 Wiley-Liss, Inc.
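The two quantities compared in the abstract, the principal eigenvector (PE) of a contact matrix and a hydrophobicity profile (HP), can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the six-residue contact list is a toy example, the HP values are invented and are not taken from the interactivity scale, and shifted power iteration is just one convenient way to obtain the PE, not necessarily the procedure used by the authors.

```python
import math

def contact_matrix(n, contacts):
    """Build a symmetric binary contact matrix from a list of residue pairs."""
    C = [[0.0] * n for _ in range(n)]
    for i, j in contacts:
        C[i][j] = C[j][i] = 1.0
    return C

def principal_eigenvector(C, iters=1000, tol=1e-12):
    """Power iteration on (C + I).

    Adding the identity leaves the eigenvectors unchanged but keeps the
    iteration from oscillating when the contact graph is bipartite.
    Assumes the contact graph is connected, so the PE is unique.
    """
    n = len(C)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iters):
        w = [v[i] + sum(C[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        w = [x / norm for x in w]
        if max(abs(a - b) for a, b in zip(v, w)) < tol:
            return w
        v = w
    return v

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Toy fold (hypothetical): residue 0 is the most buried, with three contacts.
contacts = [(0, 3), (0, 4), (0, 5), (1, 4), (2, 5)]
C = contact_matrix(6, contacts)
pe = principal_eigenvector(C)

# Invented per-residue hydrophobicity values, NOT the interactivity scale.
hp = [0.9, 0.2, 0.3, 0.5, 0.7, 0.6]

r = pearson(pe, hp)
print("PE:", [round(x, 3) for x in pe])
print("correlation(PE, HP):", round(r, 3))
```

In this toy case the PE is largest at the residue with the most contacts, and the correlation coefficient plays the role of the similarity measure that the abstract proposes for comparing a sequence profile to a structure profile.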
Pages: 22-30
Number of pages: 9