Multiple sequence alignment with arbitrary gap costs: Computing an optimal solution using polyhedral combinatorics

被引:12
作者
Althaus, E
Caprara, A
Lenhof, HP
Reinert, K
机构
[1] Int Comp Sci Inst, Berkeley, CA 94704 USA
[2] Univ Bologna, DEIS, I-40136 Bologna, Italy
[3] Univ Saarland, Zentrum Bioinformat, D-66123 Saarbrucken, Germany
[4] Celera Genom, Informat Res, Rockville, MD 20850 USA
关键词
D O I
10.1093/bioinformatics/18.suppl_2.S4
中图分类号
Q5 [生物化学];
学科分类号
071010 ; 081704 ;
摘要
Multiple sequence alignment is one of the dominant problems in computational molecular biology. Numerous scoring functions and methods have been proposed, most of which result in NP-hard problems. In this paper we propose for the first time a general formulation for multiple alignment with arbitrary gap-costs based on an integer linear program (ILP). In addition we describe a branch-and-cut algorithm to effectively solve the ILP to optimality. We evaluate the performances of our approach in terms of running time and quality of the alignments using the BAliBase database of reference alignments. The results show that our implementation ranks amongst the best programs developed so far.
引用
收藏
页码:S4 / S16
页数:13
相关论文
共 26 条
  • [1] ALTHAUS E, 2000, P 4 ANN INT C COMP M, P15
  • [2] ALTSCHUL SF, 1990, J MOL BIOL, V215, P403, DOI 10.1006/jmbi.1990.9999
  • [3] [Anonymous], DIMACS SERIES DISCRE
  • [4] [Anonymous], 1978, Atlas of protein sequence and structure
  • [5] Christof T., 1997, P 1 ANN INT C COMP M, P84
  • [6] Alignment of whole genomes
    Delcher, AL
    Kasif, S
    Fleischmann, RD
    Peterson, J
    White, O
    Salzberg, SL
    [J]. NUCLEIC ACIDS RESEARCH, 1999, 27 (11) : 2369 - 2376
  • [7] SEQUENCE COMPARISON WITH MIXED CONVEX AND CONCAVE COSTS
    EPPSTEIN, D
    [J]. JOURNAL OF ALGORITHMS, 1990, 11 (01) : 85 - 101
  • [8] AN IMPROVED ALGORITHM FOR MATCHING BIOLOGICAL SEQUENCES
    GOTOH, O
    [J]. JOURNAL OF MOLECULAR BIOLOGY, 1982, 162 (03) : 705 - 708
  • [9] Gupta S K, 1995, J Comput Biol, V2, P459, DOI 10.1089/cmb.1995.2.459
  • [10] Gusfield D, 1997, ALGORITHMS STRINGS T