Mugsy: fast multiple alignment of closely related whole genomes

Citations: 312
Authors
Angiuoli, Samuel V. [1, 2]
Salzberg, Steven L. [1]
Affiliations
[1] Univ Maryland, Ctr Bioinformat & Computat Biol, College Pk, MD 20742 USA
[2] Univ Maryland, Sch Med, Inst Genome Sci, Baltimore, MD 21201 USA
Funding
National Institutes of Health (USA)
Keywords
SEQUENCE ALIGNMENT; MOUSE; EVOLUTION; DATABASE; LESSONS;
DOI
10.1093/bioinformatics/btq665
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: The relative ease and low cost of current generation sequencing technologies have led to a dramatic increase in the number of sequenced genomes for species across the tree of life. This increasing volume of data requires tools that can quickly compare multiple whole-genome sequences, millions of base pairs in length, to aid in the study of populations, pan-genomes, and genome evolution.
Results: We present a new multiple alignment tool for whole genomes named Mugsy. Mugsy is computationally efficient and can align 31 Streptococcus pneumoniae genomes in less than 2 hours, producing alignments that compare favorably to other tools. Mugsy is also the fastest program evaluated for the multiple alignment of assembled human chromosome sequences from four individuals. Mugsy does not require a reference sequence, can align mixtures of assembled draft and completed genome data, and is robust in identifying a rich complement of genetic variation including duplications, rearrangements, and large-scale gain and loss of sequence.
Pages: 334-342
Page count: 9