BAliBASE 3.0: Latest developments of the multiple sequence alignment benchmark

Cited by: 295
Authors
Thompson, JD
Koehl, P
Ripp, R
Poch, O
Affiliations
[1] Univ Strasbourg 1, INSERM, CNRS, Inst Genet & Biol Mol & Cellulaire, Dept Biol & Ge, F-67404 Illkirch Graffenstaden, France
[2] Univ Calif Davis, Genome Ctr, Davis, CA 95616 USA
[3] Univ Calif Davis, Dept Comp Sci, Davis, CA 95616 USA
Keywords
alignment accuracy; alignment reliability; reference alignment; program evaluation; program comparison; structure superposition;
DOI
10.1002/prot.20527
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Multiple sequence alignment is one of the cornerstones of modern molecular biology. It is used to identify conserved motifs, to determine protein domains, in 2D/3D structure prediction by homology and in evolutionary studies. Recently, high-throughput technologies such as genome sequencing and structural proteomics have led to an explosion in the amount of sequence and structure information available. In response, several new multiple alignment methods have been developed that improve both the efficiency and the quality of protein alignments. Consequently, the benchmarks used to evaluate and compare these methods must also evolve. We present here the latest release of the most widely used multiple alignment benchmark, BAliBASE, which provides high quality, manually refined, reference alignments based on 3D structural superpositions. Version 3.0 of BAliBASE includes new, more challenging test cases, representing the real problems encountered when aligning large sets of complex sequences. Using a novel, semiautomatic update protocol, the number of protein families in the benchmark has been increased and representative test cases are now available that cover most of the protein fold space. The total number of proteins in BAliBASE has also been significantly increased from 1444 to 6255 sequences. In addition, full-length sequences are now provided for all test cases, which represent difficult cases for both global and local alignment programs. Finally, the BAliBASE Web site (http://www-bio3d-igbmc.u-strasbg.fr/balibase) has been completely redesigned to provide a more user-friendly, interactive interface for the visualization of the BAliBASE reference alignments and the associated annotations.
Pages: 127-136
Page count: 10