Multiple whole-genome alignments without a reference organism

Cited by: 51
Authors
Dubchak, Inna [3, 4]
Poliakov, Alexander [3]
Kislyuk, Andrey [5]
Brudno, Michael [1, 2]
Affiliations
[1] Univ Toronto, Banting & Best Dept Med Res, Dept Comp Sci, Toronto, ON M5R 3G4, Canada
[2] Univ Toronto, Ctr Anal Genome Evolut & Funct, Toronto, ON M5R 3G4, Canada
[3] Univ Calif Berkeley, Lawrence Berkeley Lab, Genome Sci Div, Berkeley, CA 94720 USA
[4] DOE Joint Genome Inst, Walnut Creek, CA 94598 USA
[5] Georgia Inst Technol, Dept Comp Sci, Atlanta, GA 30332 USA
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
CONSERVED NONCODING SEQUENCES; GENE PREDICTION; MOUSE; ELEMENTS; REARRANGEMENT; TOOLS; MODEL; RAT;
DOI
10.1101/gr.081778.108
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Subject classification codes
071010; 081704
Abstract
Multiple sequence alignments have become one of the most commonly used resources in genomics research. Most algorithms for multiple alignment of whole genomes either rely on a reference genome, against which all of the other sequences are laid out, or require a one-to-one mapping between the nucleotides of the genomes, which prevents the alignment of recently duplicated regions. Both approaches have drawbacks for whole-genome comparisons. In this paper we present a novel symmetric alignment algorithm. The resulting alignments not only represent all of the genomes equally well, but also include all relevant duplications that occurred since the divergence from the last common ancestor. Our algorithm, implemented as a part of the VISTA Genome Pipeline (VGP), was used to align seven vertebrate and six Drosophila genomes. The resulting whole-genome alignments demonstrate a higher sensitivity and specificity than the pairwise alignments previously available through the VGP and have higher exon alignment accuracy than comparable public whole-genome alignments. Of the multiple alignment methods tested, ours performed the best at aligning genes from multigene families, perhaps the most challenging test for whole-genome alignments. Our whole-genome multiple alignments are available through the VISTA Browser at http://genome.lbl.gov/vista/index.shtml.
Pages: 682-689
Page count: 8