Aligning multiple genomic sequences with the threaded blockset aligner

Cited by: 1066
Authors
Blanchette, M
Kent, WJ
Riemer, C
Elnitski, L
Smit, AFA
Roskin, KM
Baertsch, R
Rosenbloom, K
Clawson, H
Green, ED
Haussler, D
Miller, W [1]
Affiliations
[1] Penn State Univ, Ctr Comparat Genom & Bioinformat, University Pk, PA 16802 USA
[2] Univ Calif Santa Cruz, Howard Hughes Med Inst, Santa Cruz, CA 95064 USA
[3] Univ Calif Santa Cruz, Ctr Biomol Sci & Engn, Santa Cruz, CA 95064 USA
[4] Inst Syst Biol, Seattle, WA 98103 USA
[5] NHGRI, Genome Technol Branch, NIH, Bethesda, MD 20892 USA
[6] NHGRI, NIH Intramural Sequencing Ctr, NIH, Bethesda, MD 20892 USA
DOI
10.1101/gr.1933104
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
We define a "threaded blockset," which is a novel generalization of the classic notion of a multiple alignment. A new computer program called TBA (for "threaded blockset aligner") builds a threaded blockset under the assumption that all matching segments occur in the same order and orientation in the given sequences; inversions and duplications are not addressed. TBA is designed to be appropriate for aligning many, but by no means all, megabase-sized regions of multiple mammalian genomes. The output of TBA can be projected onto any genome chosen as a reference, thus guaranteeing that different projections present consistent predictions of which genomic positions are orthologous. This capability is illustrated using a new visualization tool to view TBA-generated alignments of vertebrate Hox clusters from both the mammalian and fish perspectives. Experimental evaluation of alignment quality, using a program that simulates evolutionary change in genomic sequences, indicates that TBA is more accurate than earlier programs. To perform the dynamic-programming alignment step, TBA runs a stand-alone program called MULTIZ, which can be used to align highly rearranged or incompletely sequenced genomes. We describe our use of MULTIZ to produce the whole-genome multiple alignments at the Santa Cruz Genome Browser.
Pages: 708-715
Page count: 8