A STRUCTURAL BASIS FOR SEQUENCE COMPARISONS - AN EVALUATION OF SCORING METHODOLOGIES

Cited by: 242
Authors
JOHNSON, MS
OVERINGTON, JP
Affiliations
[1] The Imperial Cancer Research Fund Unit of Structural Molecular Biology, Department of Crystallography, Birkbeck College, University of London, Malet Street, London WC1E 7HX
[2] Pfizer Central Research, Sandwich, Kent
Keywords
AMINO ACID SCORING MATRICES; STRUCTURAL ALIGNMENTS; SEQUENCE ALIGNMENTS; DATA BANK SEARCHES; GLOBINS;
DOI
10.1006/jmbi.1993.1548
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
A residue-exchange matrix has been derived that is suitable for comparison of amino acid sequences. This matrix is based on the tabulation of 207,795 amino acid replacements observed in 65 homologous sets of structurally aligned three-dimensional structures (235 proteins). The majority of the data is from structural comparisons where there is between 15 and 40% sequence identity. As a result, a scoring matrix such as the one devised here should provide a sensitive basis for the comparison of amino acid sequences and the search for homologous sequences in amino acid databases. In order to assess the value of this matrix we have made a comparative analysis with 12 other published scoring matrices that have been used for the alignment of protein amino acid sequences. We find that the matrix derived here is among the better performers in terms of alignment significance, detection of homologous sequences and the accuracy of alignments. © 1993 Academic Press Limited.
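The tabulation step described in the abstract can be illustrated with a minimal sketch. It counts residue replacements column by column in a toy alignment and converts the counts to log-odds scores. The log-odds conversion is a common generic choice and is an assumption here; the abstract does not state how the authors turned their counts into scores. The function names `tabulate_replacements` and `log_odds` and the toy alignment are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import log2


def tabulate_replacements(alignment):
    """Count unordered residue pairs, column by column, over all sequence pairs.

    `alignment` is a list of equal-length strings; '-' marks a gap.
    """
    counts = Counter()
    for seq_a, seq_b in combinations(alignment, 2):
        for a, b in zip(seq_a, seq_b):
            if a == "-" or b == "-":
                continue  # gapped positions carry no replacement information
            counts[tuple(sorted((a, b)))] += 1
    return counts


def log_odds(counts):
    """Convert pair counts to log2(observed / expected) scores.

    Assumption: expected pair frequencies come from the residue
    frequencies in the same data (a generic log-odds scheme, not
    necessarily the one used in the paper).
    """
    total_pairs = sum(counts.values())
    residue_freq = Counter()
    for (a, b), n in counts.items():
        residue_freq[a] += n
        residue_freq[b] += n
    total_residues = 2 * total_pairs

    scores = {}
    for (a, b), n in counts.items():
        observed = n / total_pairs
        p_a = residue_freq[a] / total_residues
        p_b = residue_freq[b] / total_residues
        # An unordered pair of different residues can arise in two orders.
        expected = p_a * p_b if a == b else 2 * p_a * p_b
        scores[(a, b)] = log2(observed / expected)
    return scores


if __name__ == "__main__":
    toy_alignment = ["ACD", "ACE", "A-D"]  # hypothetical structural alignment
    pair_counts = tabulate_replacements(toy_alignment)
    for pair, score in sorted(log_odds(pair_counts).items()):
        print(pair, pair_counts[pair], round(score, 3))
```

A real derivation would run the same counting over many structurally aligned families and would need a policy for pairs that are never observed, which this sketch simply leaves out of the score table.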
Pages: 716-738
Number of pages: 23