Align-m - a new algorithm for multiple alignment of highly divergent sequences

Cited by: 52
Authors
Van Walle, I
Lasters, I
Wyns, L
Affiliations
[1] Free Univ Brussels, Dept Ultrastruct, B-1050 Brussels, Belgium
[2] Algon NV, B-9052 Ghent, Belgium
DOI
10.1093/bioinformatics/bth116
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Motivation: Multiple alignment of highly divergent sequences is a challenging problem for which available programs tend to perform poorly. Generally, this is due to a scoring function that does not describe biological reality accurately enough, or to a heuristic that cannot explore the solution space efficiently enough. To address this, we present a new program, Align-m, that uses a non-progressive local approach to guide a global alignment.
Results: Two large test sets were used that represent the entire SCOP classification and cover sequence similarities between 0 and 50% identity. Performance was compared with the publicly available algorithms ClustalW, T-Coffee and DiAlign. In general, Align-m has comparable or slightly higher accuracy in terms of correctly aligned residues, especially for distantly related sequences. Importantly, it aligns far fewer residues incorrectly, with average differences of over 15% compared with some of the other algorithms.
Pages: 1428-1435
Number of pages: 8