Relationship between multiple sequence alignments and quality of protein comparative models

Cited by: 43
Authors
Cozzetto, D [1]
Tramontano, A [1]
Affiliations
[1] Univ Roma La Sapienza, Dept Biochem Sci, I-00185 Rome, Italy
Keywords
comparative modeling; CASP experiments; multiple sequence alignment; protein structure evolution
DOI
10.1002/prot.20284
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular biology]
Subject classification codes
071010; 081704
Abstract
Comparative modeling is the method of choice, whenever applicable, for protein structure prediction, not only because of its higher accuracy compared to alternative methods, but also because it is possible to estimate a priori the quality of the models that it can produce, thereby allowing the usefulness of a model for a given application to be assessed beforehand. By and large, the quality of a comparative model depends on two factors: the extent of structural divergence between the target and the template and the quality of the sequence alignment between the two protein sequences. The latter is usually derived from a multiple sequence alignment (MSA) of as many proteins of the family as possible, and its accuracy depends on the number and similarity distribution of the sequences of the protein family. Here we describe a method to evaluate the expected difficulty, and by extension accuracy, of a comparative model on the basis of the MSA used to build it. The parameter that we derive is used to compare the results obtained in the last two editions of the Critical Assessment of Methods for Structure Prediction (CASP) experiment as a function of the difficulty of the modeling exercise. Our analysis demonstrates that the improvement in the scope and quality of comparative models between the two experiments is largely due to the increased number of available protein sequences and to the consequent increased chance that a large and appropriately spaced set of protein sequences homologous to the proteins of interest is available. (C) 2004 Wiley-Liss, Inc.
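The abstract does not define the difficulty parameter, so the following is only a minimal sketch of what "number and similarity distribution of the sequences" could look like in practice. It assumes a toy MSA given as equal-length gapped strings, and it uses the largest jump in percent identity to the target as a rough, hypothetical proxy for how well spaced the family is. The function names (`pairwise_identity`, `identity_profile`, `largest_gap`) are illustrative and are not the authors' method.

```python
from typing import Dict, List


def pairwise_identity(a: str, b: str) -> float:
    """Fraction of identical residues over columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not columns:
        return 0.0
    return sum(x == y for x, y in columns) / len(columns)


def identity_profile(msa: Dict[str, str], target: str) -> List[float]:
    """Sorted identities between the target and every other sequence in the MSA."""
    ref = msa[target]
    return sorted(
        pairwise_identity(ref, seq) for name, seq in msa.items() if name != target
    )


def largest_gap(profile: List[float], top: float = 1.0) -> float:
    """Largest jump between consecutive identity values, ending at `top`.

    `top` stands for the target itself (identity 1.0). A large jump suggests
    a sparsely populated similarity range (hypothetical proxy, not from the paper).
    """
    points = profile + [top]
    if len(points) < 2:
        return top
    return max(hi - lo for lo, hi in zip(points, points[1:]))


# Toy alignment, invented for illustration only.
toy_msa = {
    "target": "ACDEFGHIKL",
    "hom1": "ACDEFGHIKV",
    "hom2": "ACDE-GHLKV",
    "hom3": "TCDQ-GALKV",
}

profile = identity_profile(toy_msa, "target")
print("sequences:", len(profile), "largest identity gap:", round(largest_gap(profile), 3))
```

A family with many homologs and a small largest gap would, under this assumption, correspond to an easier alignment problem than a family with few sequences clustered at low identity.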
Pages: 151-157
Number of pages: 7