EXACT AND APPROXIMATION ALGORITHMS FOR SORTING BY REVERSALS, WITH APPLICATION TO GENOME REARRANGEMENT

被引:182
作者
KECECIOGLU, J [1 ]
SANKOFF, D [1 ]
机构
[1] UNIV MONTREAL,CTR RECH MATH,MONTREAL H3C 3J7,PQ,CANADA
关键词
COMPUTATIONAL BIOLOGY; APPROXIMATION ALGORITHMS; BRANCH-AND-BOUND ALGORITHMS; EXPERIMENTAL ANALYSIS OF ALGORITHMS; EDIT DISTANCE; PERMUTATIONS; SORTING BY REVERSALS; CHROMOSOME INVERSIONS; GENOME REARRANGEMENTS;
D O I
10.1007/BF01188586
中图分类号
TP31 [计算机软件];
学科分类号
081202 ; 0835 ;
摘要
Motivated by the problem in computational biology of reconstructing the series of chromosome inversions by which one organism evolved from another, we consider the problem of computing the shortest series of reversals that transform one permutation to another. The permutations describe the order of genes on corresponding chromosomes, and a reversal takes an arbitrary substring of elements, and reverses their order. For this problem, we develop two algorithms: a greedy approximation algorithm, that finds a solution provably close to optimal in O(n(2)) time and O(n) space for n-element permutations, and a branch-and-bound exact algorithm, that finds an optimal solution in O(mL(n, n)) time and O(n(2)) space, where m is the size of the branch-and-bound search tree, and L(n, n) is the time tc, solve a linear program of n variables and n constraints. The greedy algorithm is the first to come within a constant factor of the optimum; it guarantees a solution that uses no more than twice the minimum number of reversals. The lower and upper bounds of the branch-and-bound algorithm are a novel application of maximum-weight matchings, shortest paths, and linear programming. In a series of experiments, we study the performance of an implementation on random permutations, and permutations generated by random reversals. For permutations differing by k random reversals, we find that the average upper bound on reversal distance estimates k to within one reversal for k < 1/2n and n less than or equal to 100. For the difficult case of random permutations, we find that the average difference between the upper and lower bounds is less than three reversals for n less than or equal to 50. Due to the tightness of these bounds, we can solve, to optimality, problems on 30 elements in a few minutes of computer time. This approaches the scale of mitochondrial genomes.
引用
收藏
页码:180 / 210
页数:31
相关论文
共 28 条
  • [21] EVOLUTIONARY SIGNIFICANCE OF INVERSIONS IN LEGUME CHLOROPLAST DNAS
    PALMER, JD
    OSORIO, B
    THOMPSON, WF
    [J]. CURRENT GENETICS, 1988, 14 (01) : 65 - 74
  • [22] GENE ORDER COMPARISONS FOR PHYLOGENETIC INFERENCE - EVOLUTION OF THE MITOCHONDRIAL GENOME
    SANKOFF, D
    LEDUC, G
    ANTOINE, N
    PAQUIN, B
    LANG, BF
    CEDERGREN, R
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1992, 89 (14) : 6575 - 6579
  • [23] A LOCAL ALGORITHM FOR DNA-SEQUENCE ALIGNMENT WITH INVERSIONS
    SCHONIGER, M
    WATERMAN, MS
    [J]. BULLETIN OF MATHEMATICAL BIOLOGY, 1992, 54 (04) : 521 - 536
  • [24] Sessions Stanley K., 1990, P156
  • [25] THE STRING-TO-STRING CORRECTION PROBLEM WITH BLOCK MOVES
    TICHY, WF
    [J]. ACM TRANSACTIONS ON COMPUTER SYSTEMS, 1984, 2 (04): : 309 - 321
  • [26] WAGNER RA, 1983, TIME WARPS STRING ED, P215
  • [27] THE CHROMOSOME INVERSION PROBLEM
    WATTERSON, GA
    EWENS, WJ
    HALL, TE
    MORGAN, A
    [J]. JOURNAL OF THEORETICAL BIOLOGY, 1982, 99 (01) : 1 - 7
  • [28] BIZARRE TRANSFER-RNAS INFERRED FROM DNA-SEQUENCES OF MITOCHONDRIAL GENOMES OF NEMATODE WORMS
    WOLSTENHOLME, DR
    MACFARLANE, JL
    OKIMOTO, R
    CLARY, DO
    WAHLEITHNER, JA
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1987, 84 (05) : 1324 - 1328