AN ASSESSMENT OF AMINO-ACID EXCHANGE MATRICES IN ALIGNING PROTEIN SEQUENCES - THE TWILIGHT ZONE REVISITED

Cited by: 142
Authors
VOGT, G [1]
ETZOLD, T [1]
ARGOS, P [1]
Affiliations
[1] EUROPEAN MOLEC BIOL LAB, D-69012 HEIDELBERG, GERMANY
Keywords
SEQUENCE ALIGNMENT; RESIDUE EXCHANGE WEIGHTS; GAP PENALTIES; PROTEIN FAMILIES
DOI
10.1006/jmbi.1995.0340
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
The sensitivity of most protein sequence alignment methods depends strongly on the quality of the comparison matrices used. These matrices, which assign weights or similarity scores to every possible amino acid substitution pair, are used to differentiate among the various possible alignments of two or more sequences. There are many ways to generate these exchange weights, and new matrices are constantly being published. There has been no overall assessment of these matrices when applied in different alignment techniques, over many protein folds and families (both close and distant), and with several gap penalty values. In this work, a set of amino acid sequences matched by superposition of known protein tertiary topologies is used to test the alignment accuracy of the different method/matrix/penalty combinations. The comparisons show relatively similar results for the top-scoring matrices, a preference for the global alignment method of Needleman and Wunsch, and the importance of matrix modification and optimized gap penalties. The relationship between the percentage identity in a resulting alignment and the level of correctness to be expected is given for the top-performing matrix, resulting in a better definition of the so-called "twilight zone". Estimates are made for the probability that two sequences, aligned at a certain level of residue percentage identity, are in fact unrelated.
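To make the method/matrix/penalty terminology in the abstract concrete, the following is a minimal sketch of Needleman-Wunsch global alignment in Python. It is an illustration only and not the authors' implementation: `toy_exchange_score` is a hypothetical stand-in for a real exchange matrix, the linear gap penalty of -2 is an arbitrary assumption, and `percent_identity` uses one of several possible conventions (identical pairs divided by aligned, non-gap pairs).

```python
from typing import Callable, Tuple


def toy_exchange_score(a: str, b: str) -> int:
    """Hypothetical stand-in for an exchange matrix: +2 identity, -1 otherwise."""
    return 2 if a == b else -1


def needleman_wunsch(
    s: str,
    t: str,
    score: Callable[[str, str], int] = toy_exchange_score,
    gap: int = -2,  # assumed linear gap penalty, one charge per gap position
) -> Tuple[str, str, int]:
    """Return one optimal global alignment of s and t and its score."""
    n, m = len(s), len(t)
    # dp[i][j] = best score for aligning the prefixes s[:i] and t[:j]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(
                dp[i - 1][j - 1] + score(s[i - 1], t[j - 1]),  # substitution
                dp[i - 1][j] + gap,  # gap in t
                dp[i][j - 1] + gap,  # gap in s
            )
    # Trace back from the bottom-right corner to recover one optimal path.
    a, b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1] + score(s[i - 1], t[j - 1]):
            a.append(s[i - 1])
            b.append(t[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and dp[i][j] == dp[i - 1][j] + gap:
            a.append(s[i - 1])
            b.append("-")
            i -= 1
        else:
            a.append("-")
            b.append(t[j - 1])
            j -= 1
    return "".join(reversed(a)), "".join(reversed(b)), dp[n][m]


def percent_identity(a: str, b: str) -> float:
    """Identical pairs as a percentage of aligned (non-gap) pairs."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)
```

Calling `needleman_wunsch("HEAGAWGHEE", "PAWHEAE")` returns two gapped strings of equal length plus the alignment score. Swapping in a different `score` function or `gap` value corresponds to changing the matrix or penalty in the combinations the abstract compares.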
Pages: 816-831
Page count: 16