Cactus: Algorithms for genome multiple sequence alignment

Cited by: 158
Authors
Paten, Benedict [1]
Earl, Dent [1]
Nguyen, Ngan [1]
Diekhans, Mark [1]
Zerbino, Daniel [1]
Haussler, David [1]
Affiliations
[1] Univ Calif Santa Cruz, Ctr Biomol Sci & Engn, Santa Cruz, CA 95064 USA
Keywords
REARRANGEMENTS; VERTEBRATE; ELEMENTS; BROWSER; GRAPHS; DNA;
DOI
10.1101/gr.123356.111
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Much attention has been given to the problem of creating reliable multiple sequence alignments in a model incorporating substitutions, insertions, and deletions. Far less attention has been paid to the problem of optimizing alignments in the presence of more general rearrangement and copy number variation. Using Cactus graphs, recently introduced for representing sequence alignments, we describe two complementary algorithms for creating genomic alignments. We have implemented these algorithms in the new "Cactus" alignment program. We test Cactus using the Evolver genome evolution simulator, a comprehensive new tool for simulation, and show using these and existing simulations that Cactus significantly outperforms all of its peers. Finally, we make an empirical assessment of Cactus's ability to properly align genes and find interesting cases of intra-gene duplication within the primates.
Pages: 1512-1528
Page count: 17