AMINO-ACID SUBSTITUTION MATRICES FROM AN INFORMATION THEORETIC PERSPECTIVE

Cited by: 417
Author
ALTSCHUL, SF
Affiliation
[1] National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda
Keywords
HOMOLOGY; SEQUENCE COMPARISON; STATISTICAL SIGNIFICANCE; ALIGNMENT ALGORITHMS; PATTERN RECOGNITION
DOI
10.1016/0022-2836(91)90193-A
Chinese Library Classification (CLC)
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
Protein sequence alignments have become an important tool for molecular biologists. Local alignments are frequently constructed with the aid of a "substitution score matrix" that specifies a score for aligning each pair of amino acid residues. Over the years, many different substitution matrices have been proposed, based on a wide variety of rationales. Statistical results, however, demonstrate that any such matrix is implicitly a "log-odds" matrix, with a specific target distribution for aligned pairs of amino acid residues. In the light of information theory, it is possible to express the scores of a substitution matrix in bits and to see that different matrices are better adapted to different purposes. The most widely used matrix for protein sequence comparison has been the PAM-250 matrix. It is argued that for database searches the PAM-120 matrix generally is more appropriate, while for comparing two specific proteins with suspected homology the PAM-200 matrix is indicated. Examples discussed include the lipocalins, human α1B-glycoprotein, the cystic fibrosis transmembrane conductance regulator and the globins. © 1991.
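The abstract's statement that a substitution matrix is implicitly a "log-odds" matrix, with scores expressible in bits, can be illustrated with a small sketch. It assumes the usual log-odds form s(a, b) = log2(q_ab / (p_a p_b)), where q_ab is the target frequency of the aligned pair and p_a, p_b are background frequencies. The two-letter alphabet, the frequencies, and the function names below are hypothetical and chosen only for illustration; they are not taken from the paper or from any PAM matrix.

```python
import math

def log_odds_bits(target, background):
    """Return scores s[a][b] = log2(q_ab / (p_a * p_b)), i.e. scores in bits."""
    return {
        a: {b: math.log2(target[a][b] / (background[a] * background[b]))
            for b in background}
        for a in background
    }

def relative_entropy_bits(target, scores):
    """Expected score per aligned pair under the target distribution, in bits."""
    return sum(target[a][b] * scores[a][b] for a in target for b in target[a])

# Hypothetical two-letter alphabet; all numbers are illustrative assumptions.
background = {"A": 0.5, "B": 0.5}           # p_a, sums to 1
target = {                                   # q_ab, sums to 1 over all pairs
    "A": {"A": 0.4, "B": 0.1},
    "B": {"A": 0.1, "B": 0.4},
}

scores = log_odds_bits(target, background)
H = relative_entropy_bits(target, scores)

for a in scores:
    print(a, {b: round(s, 3) for b, s in scores[a].items()})
print("relative entropy (bits per aligned pair):", round(H, 3))
```

In this toy case, identities occur more often among target pairs than chance predicts and receive positive scores, while mismatches receive negative ones. The relative entropy is the average information per aligned position, which is one way to read the abstract's claim that different matrices suit different purposes.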
Pages: 555-565
Page count: 11