Genomic distances under deletions and insertions

被引:40
作者
Marron, M [1 ]
Swenson, KM [1 ]
Moret, BME [1 ]
机构
[1] Univ New Mexico, Dept Comp Sci, Albuquerque, NM 87131 USA
关键词
D O I
10.1016/j.tcs.2004.02.039
中图分类号
TP301 [理论、方法];
学科分类号
081202 ;
摘要
As more and more genomes are sequenced, evolutionary biologists are becoming increasingly interested in evolution at the level of whole genomes, in scenarios in which the genome evolves through insertions, deletions, and movements of genes along its chromosomes. In the mathematical model pioneered by Sankoff and others, a unichromosomal genome is represented by a signed permutation of a multiset of genes; Hannenhalli and Pevzner showed that the edit distance between two signed permutations of the same set can be computed in polynomial time when all operations are inversions. El-Mabrouk extended that result to allow deletions (or conversely, a limited form of insertions which forbids duplications). In this paper, we extend El-Mabrouk's work to handle duplications as well as insertions and present an alternate framework for computing (near) minimal edit sequences involving insertions, deletions, and inversions. We derive an error bound for our polynomial-time distance computation under various assumptions and present preliminary experimental results that suggest that performance in practice may be excellent, within a few percent of the actual distance. (C) 2004 Elsevier B.V. All rights reserved.
引用
收藏
页码:347 / 360
页数:14
相关论文
共 19 条
  • [1] [Anonymous], EL C COMP COMPL
  • [2] A linear-time algorithm for computing inversion distance between signed permutations with an experimental study
    Bader, DA
    Moret, BME
    Yan, M
    [J]. JOURNAL OF COMPUTATIONAL BIOLOGY, 2001, 8 (05) : 483 - 491
  • [3] Bryant D., 2000, Comparative Genomics: Empirical and Analytical Approaches to Gene Order Dynamics, Map Alignment, and the Evolution of Gene Families, DOI DOI 10.1007/978-94-011-4309-7
  • [4] CAPRARA A, 2001, LECT NOTES COMPUTER, V2149, P238
  • [5] Caprara A, 1999, P 3 ANN INT C COMP M, P84
  • [6] Caprara A., 1997, P 1 ANN INT C COMP M, P75, DOI DOI 10.1145/267521.267531
  • [7] Downie SR., 1992, Molecular Systematics of Plants, P14, DOI DOI 10.1007/978-1-4615-3276-7_2
  • [8] El-Mabrouk N, 2000, LECT NOTES COMPUT SC, V1848, P222
  • [9] Hannenhalli S., 1995, Proceedings of the Twenty-Seventh Annual ACM Symposium on the Theory of Computing, P178, DOI 10.1145/225058.225112
  • [10] LAGERGREN J, 1 IMA RECOMB SAT WOR