AMINO-ACID SUBSTITUTION MATRICES FROM PROTEIN BLOCKS

Cited by: 4268
Authors
HENIKOFF, S
HENIKOFF, JG
Affiliations
[1] Basic Sciences Division, Fred Hutchinson Cancer Research Ctr., Howard Hughes Medical Institute, Seattle
Keywords
AMINO ACID SEQUENCE; ALIGNMENT ALGORITHMS; DATA BASE SEARCHING
DOI
10.1073/pnas.89.22.10915
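Chinese Library Classification (CLC) codes
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences]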
Subject classification codes
07; 0710; 09
Abstract
Methods for alignment of protein sequences typically measure similarity by using a substitution matrix with scores for all possible exchanges of one amino acid with another. The most widely used matrices are based on the Dayhoff model of evolutionary rates. Using a different approach, we have derived substitution matrices from about 2000 blocks of aligned sequence segments characterizing more than 500 groups of related proteins. This led to marked improvements in alignments and in searches using queries from each of the groups.
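The abstract does not spell out how scores are obtained from blocks, so the following is only a minimal sketch, assuming the usual log-odds construction: count residue pairs within each column of an ungapped block, compare the observed pair frequency with the frequency expected from the residue frequencies alone, and round twice the base-2 logarithm of that ratio (half-bit units). The function names and the three-segment toy block are hypothetical, and the sketch leaves out any grouping of near-identical sequences before counting.

```python
from collections import Counter
from itertools import combinations
from math import log2


def pair_counts(block):
    """Count unordered residue pairs in every column of an ungapped block."""
    counts = Counter()
    for column in zip(*block):
        for a, b in combinations(column, 2):
            counts[tuple(sorted((a, b)))] += 1
    return counts


def log_odds_scores(block):
    """Return {(a, b): score} with a <= b, in half-bit units.

    Only pairs that are actually observed in the block receive a score.
    """
    counts = pair_counts(block)
    total = sum(counts.values())
    # Observed frequency of each unordered pair.
    q = {pair: n / total for pair, n in counts.items()}
    # Residue frequencies implied by the pair frequencies.
    p = Counter()
    for (a, b), freq in q.items():
        if a == b:
            p[a] += freq
        else:
            p[a] += freq / 2
            p[b] += freq / 2
    scores = {}
    for (a, b), freq in q.items():
        # Expected frequency if residues paired independently.
        expected = p[a] * p[b] if a == b else 2 * p[a] * p[b]
        scores[(a, b)] = round(2 * log2(freq / expected))
    return scores


# Hypothetical toy block: three aligned segments of equal length, no gaps.
toy_block = ["ACA", "ACD", "ADD"]
scores = log_odds_scores(toy_block)
```

In this toy block the A/A pair is seen more often than chance predicts and scores positive, while the A/D pair is seen less often and scores negative. A real matrix would be built from many blocks and would need a policy for pairs that are never observed.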
Pages: 10915-10919
Number of pages: 5