Analysis of amino acid indices and mutation matrices for sequence comparison and structure prediction of proteins

Cited by: 297
Authors
Tomii, K [1]
Kanehisa, M [1]
Affiliations
[1] KYOTO UNIV, INST CHEM RES, UJI, KYOTO 611, JAPAN
Source
PROTEIN ENGINEERING | 1996 / Vol. 9 / No. 01
Keywords
cluster analysis; database; PAM; sequence alignment; similarity matrix;
DOI
10.1093/protein/9.1.27
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular biology];
Discipline classification code
071010 ; 081704 ;
Abstract
An amino acid index is a set of 20 numerical values representing any of the different physicochemical and biochemical properties of amino acids. As a follow-up to the previous study, we have increased the size of the database, which currently contains 402 published indices, and re-performed the single-linkage cluster analysis. The results basically confirmed the previous findings. Another important feature of amino acids that can be represented numerically is the similarity between them. Thus, a similarity matrix, also called a mutation matrix, is a set of 20 × 20 numerical values used for protein sequence alignments and similarity searches. We have collected 42 published matrices, performed hierarchical cluster analyses and identified several clusters corresponding to the nature of the data set and the method used for constructing the mutation matrix. Further, we have tried to reproduce each mutation matrix by the combination of amino acid indices in order to understand which properties of amino acids are reflected most. There was a relationship between the PAM units of Dayhoff's mutation matrix and the volume and hydrophobicity of amino acids. The database of 402 amino acid indices and 42 amino acid mutation matrices is made publicly available on the Internet.
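The single-linkage cluster analysis of amino acid indices mentioned in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the three toy indices are invented numbers rather than published indices, the distance 1 - |r| (with r the Pearson correlation over the 20 amino acids) is one plausible choice, and the names `pearson`, `distance`, `single_linkage` and the cut-off of 0.3 are hypothetical.

```python
from itertools import combinations
from math import sqrt

AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"  # one value per residue in each index

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def distance(x, y):
    """Assumed dissimilarity: strongly (anti)correlated indices are close."""
    return 1.0 - abs(pearson(x, y))

def single_linkage(indices, threshold):
    """Merge clusters while the closest pair of members is within threshold.

    indices maps an index name to its list of 20 values.
    Returns a list of sets of index names.
    """
    clusters = [{name} for name in indices]
    while len(clusters) > 1:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            # Single linkage: cluster distance is the minimum member distance.
            d = min(distance(indices[p], indices[q])
                    for p in clusters[a] for q in clusters[b])
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        if d > threshold:
            break
        clusters[a] |= clusters[b]
        del clusters[b]  # safe because a < b
    return clusters

# Toy data only (NOT real published indices): 20 made-up values each.
toy_a = [float(i) for i in range(20)]
TOY_INDICES = {
    "toy_index_a": toy_a,
    "toy_index_a_rescaled": [2.0 * v + 1.0 for v in toy_a],  # same shape
    "toy_index_b": [float((7 * i) % 20) for i in range(20)],  # different shape
}
assert all(len(v) == len(AMINO_ACIDS) for v in TOY_INDICES.values())

if __name__ == "__main__":
    for cluster in single_linkage(TOY_INDICES, threshold=0.3):
        print(sorted(cluster))
```

With these toy values the rescaled index should fall into the same cluster as the index it was derived from, while the third index stays separate at this cut-off. The result depends on the assumed distance and threshold.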
Pages: 27-36
Number of pages: 10