Representation of amino acids as five-bit or three-bit patterns for filtering protein databases

Cited by: 7
Authors
Coghlan, A
Mac Dónaill, DA [1]
Buttimore, NH
Affiliations
[1] Univ Dublin Trinity Coll, Dept Chem, Dublin 2, Ireland
[2] Univ Dublin Trinity Coll, Dept Genet, Dublin 2, Ireland
[3] Univ Dublin Trinity Coll, Sch Math, Dublin 2, Ireland
DOI
10.1093/bioinformatics/17.8.676
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: We propose representing amino acids by bit-patterns so they may be used in a filter algorithm for similarity searches over protein databases, to rapidly eliminate non-homologous regions of database sequences. The filter algorithm would be based on dynamic programming optimization. It would have the advantage over previous filter algorithms that its substitution scoring function distinguishes between conservative and non-conservative amino acid substitutions.
Results: Simulated annealing was used to search for the best five-bit or three-bit patterns to represent amino acids, where similar amino acids were given similar bit-patterns. The similarity between amino acids was estimated from the BLOSUM45 matrix. Representing amino acids by these five-bit and three-bit patterns, the Escherichia coli PhoE precursor and the bacteriophage PA2 LC precursor were aligned. The alignments were nearly the same as those obtained when BLOSUM45 was used to score substitutions.
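The idea that similar amino acids receive similar bit-patterns can be illustrated with a minimal sketch. It assumes that pattern similarity is measured by the number of matching bits (five minus the Hamming distance), which the abstract does not state explicitly. The codes in `TOY_CODES` and the helper names `hamming` and `bit_score` are hypothetical and chosen only for illustration; they are not the patterns or functions reported in the paper.

```python
# Hypothetical 5-bit codes, for illustration only. They are NOT the
# patterns found by simulated annealing in the paper.
TOY_CODES = {
    "L": 0b00000,
    "I": 0b00001,
    "V": 0b00011,
    "D": 0b11100,
    "E": 0b11110,
}
N_BITS = 5


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


def bit_score(x: str, y: str) -> int:
    """Assumed substitution score: count of matching bits (0..N_BITS)."""
    return N_BITS - hamming(TOY_CODES[x], TOY_CODES[y])


print(bit_score("L", "I"))  # conservative substitution in this toy code: 4
print(bit_score("L", "D"))  # non-conservative substitution in this toy code: 2
```

Because the score reduces to an XOR followed by a bit count, a filter built this way could in principle compare residues with a few machine instructions while still ranking conservative substitutions above non-conservative ones.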
Pages: 676-685
Page count: 10