Analysis of amino acid indices and mutation matrices for sequence comparison and structure prediction of proteins

Cited by: 297
Authors
Tomii, K [1]
Kanehisa, M [1]
Affiliations
[1] KYOTO UNIV, INST CHEM RES, UJI, KYOTO 611, JAPAN
Source
PROTEIN ENGINEERING | 1996 / Vol. 9 / No. 01
Keywords
cluster analysis; database; PAM; sequence alignment; similarity matrix;
DOI
10.1093/protein/9.1.27
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification codes
071010 ; 081704 ;
Abstract
An amino acid index is a set of 20 numerical values representing any of the different physicochemical and biochemical properties of amino acids. As a follow-up to the previous study, we have increased the size of the database, which currently contains 402 published indices, and repeated the single-linkage cluster analysis. The results basically confirmed the previous findings. Another important feature of amino acids that can be represented numerically is the similarity between them. Thus, a similarity matrix, also called a mutation matrix, is a set of 20 × 20 numerical values used for protein sequence alignments and similarity searches. We have collected 42 published matrices, performed hierarchical cluster analyses and identified several clusters corresponding to the nature of the data set and the method used for constructing the mutation matrix. Further, we have tried to reproduce each mutation matrix by a combination of amino acid indices in order to understand which properties of amino acids are reflected most. There was a relationship between the PAM units of Dayhoff's mutation matrix and the volume and hydrophobicity of amino acids. The database of 402 amino acid indices and 42 amino acid mutation matrices is made publicly available on the Internet.
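To make the single-linkage cluster analysis mentioned in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It treats each amino acid index as a vector of 20 values and merges clusters by the smallest pairwise distance. The distance 1 − |r| (r being the Pearson correlation coefficient) is an assumption made here for illustration; the abstract does not state which measure was used. The four vectors named `toy_a` to `toy_d` are synthetic placeholders, not entries from the published database.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def distance(x, y):
    """Assumed distance: 1 - |r|, so anti-correlated indices count as close."""
    return 1.0 - abs(pearson(x, y))


def single_linkage(indices):
    """Agglomerative single-linkage clustering.

    `indices` maps a name to a list of 20 values.
    Returns the merges in order as (cluster_a, cluster_b, distance) tuples.
    """
    clusters = [(name,) for name in indices]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: cluster distance is the minimum member distance.
                d = min(
                    distance(indices[a], indices[b])
                    for a in clusters[i]
                    for b in clusters[j]
                )
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


# Synthetic placeholder vectors (20 values each); NOT real amino acid indices.
toy_indices = {
    "toy_a": [float(v) for v in range(20)],
    "toy_b": [2.0 * v + 1.0 for v in range(20)],   # linear rescaling of toy_a
    "toy_c": [float(v % 2) for v in range(20)],
    "toy_d": [1.0 - (v % 2) for v in range(20)],   # sign-flipped toy_c
}

merges = single_linkage(toy_indices)
for left, right, d in merges:
    print(left, right, round(d, 4))
```

With this distance, a scale and its sign-reversed or rescaled counterpart fall into the same cluster at distance zero, which is the behaviour one would want when, for example, a hydrophobicity scale and a hydrophilicity scale encode the same property.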
Pages: 27 / 36
Number of pages: 10
Related papers
53 items in total
[1]   AMINO-ACID SUBSTITUTION MATRICES FROM AN INFORMATION THEORETIC PERSPECTIVE [J].
ALTSCHUL, SF .
JOURNAL OF MOLECULAR BIOLOGY, 1991, 219 (03) :555-565
[2]  
[Anonymous], ATLAS PROTEIN SEQUEN
[3]  
[Anonymous], 1978, Atlas of protein sequence and structure
[4]   AMINO-ACID SUBSTITUTION DURING FUNCTIONALLY CONSTRAINED DIVERGENT EVOLUTION OF PROTEIN SEQUENCES [J].
BENNER, SA ;
COHEN, MA ;
GONNET, GH .
PROTEIN ENGINEERING, 1994, 7 (11) :1323-1332
[5]   A METHOD TO IDENTIFY PROTEIN SEQUENCES THAT FOLD INTO A KNOWN 3-DIMENSIONAL STRUCTURE [J].
BOWIE, JU ;
LUTHY, R ;
EISENBERG, D .
SCIENCE, 1991, 253 (5016) :164-170
[6]  
Chou P Y, 1978, Adv Enzymol Relat Areas Mol Biol, V47, P45
[7]   NEW ALIGNMENT STRATEGY FOR TRANSMEMBRANE PROTEINS [J].
CSERZO, M ;
BERNASSAU, JM ;
SIMON, I ;
MAIGRET, B .
JOURNAL OF MOLECULAR BIOLOGY, 1994, 243 (03) :388-396
[8]  
CSERZO M, 1989, INT J PEPT PROT RES, V34, P184
[9]  
Dayhoff M., 1978, ATLAS PROTEIN SEQ ST, V5, P353
[10]  
DAYHOFF MO, 1978, ATLAS PROTEIN SEQ S3, V5, P363