DbClustal: rapid and reliable global multiple alignments of protein sequences detected by database searches

Cited by: 149
Authors
Thompson, JD [1]
Plewniak, F [1]
Thierry, JC [1]
Poch, O [1]
Affiliations
[1] ULP, INSERM, CNRS, Inst Genet & Biol Mol & Cellulaire, Lab Biol & Gen, F-67404 Illkirch Graffenstaden, France
DOI
10.1093/nar/28.15.2919
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
DbClustal addresses the important problem of the automatic multiple alignment of the top scoring full-length sequences detected by a database homology search. By combining the advantages of both local and global alignment algorithms into a single system, DbClustal is able to provide accurate global alignments of highly divergent, complex sequence sets. Local alignment information is incorporated into a ClustalW global alignment in the form of a list of anchor points between pairs of sequences. The method is demonstrated using anchors supplied by the Blast post-processing program, Ballast. The rapidity and reliability of DbClustal have been demonstrated using the recently annotated Pyrococcus abyssi proteome where the number of alignments with totally misaligned sequences was reduced from 20% to <2%. A web site has been implemented proposing BlastP database searches with automatic alignment of the top hits by DbClustal.
Pages: 2919-2926
Page count: 8
Related papers
26 in total
  • [1] Gapped BLAST and PSI-BLAST: a new generation of protein database search programs
    Altschul, SF
    Madden, TL
    Schaffer, AA
    Zhang, JH
    Zhang, Z
    Miller, W
    Lipman, DJ
    [J]. NUCLEIC ACIDS RESEARCH, 1997, 25 (17) : 3389 - 3402
  • [2] Histone Sequence Database: new histone fold family members
    Baxevanis, AD
    Landsman, D
    [J]. NUCLEIC ACIDS RESEARCH, 1998, 26 (01) : 372 - 375
  • [3] A symmetric-iterated multiple alignment of protein sequences
    Brocchieri, L
    Karlin, S
    [J]. JOURNAL OF MOLECULAR BIOLOGY, 1998, 276 (01) : 249 - 264
  • [4] Combining many multiple alignments in one improved alignment
    Bucka-Lassen, K
    Caprani, O
    Hein, J
    [J]. BIOINFORMATICS, 1999, 15 (02) : 122 - 130
  • [5] Significant improvement in accuracy of multiple protein sequence alignments by iterative refinement as assessed by reference to structural alignments
    Gotoh, O
    [J]. JOURNAL OF MOLECULAR BIOLOGY, 1996, 264 (04) : 823 - 838
  • [6] Automated protein sequence database classification. I. Integration of compositional similarity search, local similarity search, and multiple sequence alignment
    Gracy, J
    Argos, P
    [J]. BIOINFORMATICS, 1998, 14 (02) : 164 - 173
  • [7] EbEST: An automated tool using expressed sequence tags to delineate gene structure
    Jiang, J
    Jacob, HJ
    [J]. GENOME RESEARCH, 1998, 8 (03) : 268 - 275
  • [8] Hidden Markov models for detecting remote protein homologies
    Karplus, K
    Barrett, C
    Hughey, R
    [J]. BIOINFORMATICS, 1998, 14 (10) : 846 - 856
  • [9] Koretke KK, 1999, PROTEINS, P141
  • [10] Detecting subtle sequence signals: a Gibbs sampling strategy for multiple alignment
    Lawrence, CE
    Altschul, SF
    Boguski, MS
    Liu, JS
    Neuwald, AF
    Wootton, JC
    [J]. SCIENCE, 1993, 262 (5131) : 208 - 214