Whole genome duplications and contracted breakpoint graphs

被引:14
作者
Alekseyev, Max A. [1 ]
Pevzner, Pavel A. [1 ]
机构
[1] Univ Calif San Diego, Dept Comp Sci & Engn, La Jolla, CA 92093 USA
关键词
genome duplication; genome halving; genome rearrangement; breakpoint graph; de Bruijn graph;
D O I
10.1137/05064727X
中图分类号
TP301 [理论、方法];
学科分类号
081202 ;
摘要
The genome halving problem, motivated by the whole genome duplication events in molecular evolution, was solved by El-Mabrouk and Sankoff in the pioneering paper [SIAM J. Comput., 32 (2003), pp. 754-792]. The El-Mabrouk-Sankoff algorithm is rather complex, inspiring a quest for a simpler solution. An alternative approach to the genome halving problem based on the notion of the contracted breakpoint graph was recently proposed in [M. A. Alekseyev and P. A. Pevzner, IEEE/ACM Trans. Comput. Biol. Bioinformatics, 4 (2007), pp. 98-107]. This new technique reveals that while the El-Mabrouk-Sankoff result is correct in most cases, it does not hold in the case of unichromosomal genomes. This raises a problem of correcting a flaw in the El-Mabrouk-Sankoff analysis and devising an algorithm that deals adequately with all genomes. In this paper we efficiently classify all genomes into two classes and show that while the El-Mabrouk-Sankoff theorem holds for the first class, it is incorrect for the second class. The crux of our analysis is a new combinatorial invariant defined on duplicated permutations. Using this invariant we were able to come up with a full proof of the genome halving theorem and a polynomial algorithm for the genome halving problem.
引用
收藏
页码:1748 / 1763
页数:16
相关论文
共 42 条
[1]  
Alekseyev MA, 2007, PROCEEDINGS OF THE EIGHTEENTH ANNUAL ACM-SIAM SYMPOSIUM ON DISCRETE ALGORITHMS, P665
[2]   Colored de Bruijn graphs and the genome halving problem [J].
Alekseyev, Max A. ;
Pevzner, Pavel A. .
IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2007, 4 (01) :98-107
[3]   A linear-time algorithm for computing inversion distance between signed permutations with an experimental study [J].
Bader, DA ;
Moret, BME ;
Yan, M .
JOURNAL OF COMPUTATIONAL BIOLOGY, 2001, 8 (05) :483-491
[4]   Genome rearrangements and sorting by reversals [J].
Bafna, V ;
Pevzner, PA .
SIAM JOURNAL ON COMPUTING, 1996, 25 (02) :272-289
[5]   Genome rearrangement distances and gene order phylogeny in γ-proteobacteria [J].
Belda, E ;
Moya, A ;
Silva, FJ .
MOLECULAR BIOLOGY AND EVOLUTION, 2005, 22 (06) :1456-1467
[6]  
Bergeron A, 2004, LECT NOTES COMPUT SC, V3109, P388
[7]  
BERGERON A, 2001, LECT NOTES COMPUTER, V2089, P106
[8]  
Bourque G, 2005, LECT NOTES COMPUT SC, V3678, P21
[9]   Comparative architectures of mammalian and chicken genomes reveal highly variable rates of genomic rearrangements across different lineages [J].
Bourque, G ;
Zdobnov, EM ;
Bork, P ;
Pevzner, PA ;
Tesler, G .
GENOME RESEARCH, 2005, 15 (01) :98-110
[10]   Reconstructing the genomic architecture of ancestral mammals: Lessons from human, mouse, and rat genomes [J].
Bourque, G ;
Pevzner, PA ;
Tesler, G .
GENOME RESEARCH, 2004, 14 (04) :507-516