Sorting by transpositions

被引:262
作者
Bafna, V
Pevzner, PA
机构
[1] SmithKline Beecham Pharmaceut, Bioinformat, King Of Prussia, PA 19406 USA
[2] DIMACS, Piscataway, NJ USA
[3] Penn State Univ, University Pk, PA 16802 USA
[4] Univ So Calif, Dept Math & Comp Sci, Los Angeles, CA 90089 USA
关键词
computational molecular biology; genome rearrangements; transpositions; the symmetric group; approximation algorithm;
D O I
10.1137/S089548019528280X
中图分类号
O29 [应用数学];
学科分类号
070104 ;
摘要
Sequence comparison in computational molecular biology is a powerful tool for deriving evolutionary and functional relationships between genes. However, classical alignment algorithms handle only local mutations (i.e., insertions, deletions, and substitutions of nucleotides) and ignore global rearrangements (i.e., inversions and transpositions of long fragments). As a result, the applications of sequence alignment to analyze highly rearranged genomes (i.e., herpes viruses or plant mitochondrial DNA) are rather limited. The paper addresses the problem of genome comparison versus classical gene comparison and presents algorithms to analyze rearrangements in genomes evolving by transpositions. In the simplest form the problem corresponds to sorting by transpositions, i.e., sorting of an array using transpositions of arbitrary fragments. We derive lower bounds on transposition distance between permutations and present approximation algorithms for sorting by transpositions. The algorithms also imply a nontrivial upper bound on the transposition diameter of the symmetric group. Finally, we formulate two biological problems in genome rearrangements and describe the first algorithmic steps toward their solution.
引用
收藏
页码:224 / 240
页数:17
相关论文
共 24 条
[1]   SORTING BY INSERTION OF LEADING ELEMENTS [J].
AIGNER, M ;
WEST, DB .
JOURNAL OF COMBINATORIAL THEORY SERIES A, 1987, 45 (02) :306-309
[2]   Genome rearrangements and sorting by reversals [J].
Bafna, V ;
Pevzner, PA .
SIAM JOURNAL ON COMPUTING, 1996, 25 (02) :272-289
[3]  
BAFNA V, 1995, MOL BIOL EVOL, V12, P239
[4]  
COHEN D, 1993, UNPUB IMPROVED BOUND
[5]   A GENETIC-LINKAGE MAP OF THE MOUSE - CURRENT APPLICATIONS AND FUTURE-PROSPECTS [J].
COPELAND, NG ;
JENKINS, NA ;
GILBERT, DJ ;
EPPIG, JT ;
MALTAIS, LJ ;
MILLER, JC ;
DIETRICH, WF ;
WEAVER, A ;
LINCOLN, SE ;
STEEN, RG ;
STEIN, LD ;
NADEAU, JH ;
LANDER, ES .
SCIENCE, 1993, 262 (5130) :57-66
[6]  
DAVISSON M T, 1987, Genomics, V1, P213, DOI 10.1016/0888-7543(87)90047-4
[7]   THE HUMAN PSEUDOAUTOSOMAL GM-CSF RECEPTOR ALPHA-SUBUNIT GENE IS AUTOSOMAL IN MOUSE [J].
DISTECHE, CM ;
BRANNAN, CI ;
LARSEN, A ;
ADLER, DA ;
SCHORDERET, DF ;
GEARING, D ;
COPELAND, NG ;
JENKINS, NA ;
PARK, LS .
NATURE GENETICS, 1992, 1 (05) :333-336
[8]   THE MINIMUM-LENGTH GENERATOR SEQUENCE PROBLEM IS NP-HARD [J].
EVEN, S ;
GOLDREICH, O .
JOURNAL OF ALGORITHMS, 1981, 2 (03) :311-313
[10]   BOUNDS FOR SORTING BY PREFIX REVERSAL [J].
GATES, WH ;
PAPADIMITRIOU, CH .
DISCRETE MATHEMATICS, 1979, 27 (01) :47-57