DbClustal: rapid and reliable global multiple alignments of protein sequences detected by database searches

Cited by: 149
Authors
Thompson, JD [1]
Plewniak, F [1]
Thierry, JC [1]
Poch, O [1]
Affiliation
[1] ULP, INSERM, CNRS, Inst Genet & Biol Mol & Cellulaire, Lab Biol & Gen, F-67404 Illkirch Graffenstaden, France
DOI
10.1093/nar/28.15.2919
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
DbClustal addresses the important problem of the automatic multiple alignment of the top scoring full-length sequences detected by a database homology search. By combining the advantages of both local and global alignment algorithms into a single system, DbClustal is able to provide accurate global alignments of highly divergent, complex sequence sets. Local alignment information is incorporated into a ClustalW global alignment in the form of a list of anchor points between pairs of sequences. The method is demonstrated using anchors supplied by the Blast post-processing program, Ballast. The rapidity and reliability of DbClustal have been demonstrated using the recently annotated Pyrococcus abyssi proteome, where the proportion of alignments with totally misaligned sequences was reduced from 20% to <2%. A web site has been implemented that offers BlastP database searches with automatic alignment of the top hits by DbClustal.
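The idea of feeding local anchor points into a global alignment can be illustrated with a small sketch. The code below is not DbClustal's implementation: it is a plain Needleman-Wunsch pairwise aligner in which any residue pair listed as an anchor receives an extra score. The function name `anchored_global_align`, the `anchor_bonus` parameter, the linear gap penalty, and the match/mismatch scores are all assumptions made for illustration; the abstract only states that anchors are supplied as a list of points between pairs of sequences.

```python
def anchored_global_align(a, b, anchors=(), match=1, mismatch=-1,
                          gap=-2, anchor_bonus=10):
    """Global (Needleman-Wunsch) alignment with a bonus for anchored pairs.

    anchors: iterable of 0-based (i, j) pairs meaning a[i] should pair with b[j].
    All scoring values are illustrative assumptions, not DbClustal's parameters.
    """
    anchor_set = set(anchors)
    n, m = len(a), len(b)

    def pair_score(i, j):
        # Substitution score for a[i] vs b[j], boosted if (i, j) is an anchor.
        s = match if a[i] == b[j] else mismatch
        if (i, j) in anchor_set:
            s += anchor_bonus
        return s

    # score[i][j] = best score aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + pair_score(i - 1, j - 1),  # pair residues
                score[i - 1][j] + gap,                           # gap in b
                score[i][j - 1] + gap,                           # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0
                and score[i][j] == score[i - 1][j - 1] + pair_score(i - 1, j - 1)):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))
```

Calling `anchored_global_align("GAAAA", "AAAAG")` without anchors pairs the sequences position by position, whereas passing `anchors=[(0, 4)]` with a sufficiently large `anchor_bonus` pulls the two G residues into the same column at the cost of extra gaps. In this toy setting, that is how a trusted local match can steer a global alignment of divergent sequences.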
Pages: 2919-2926
Page count: 8