Sequencing of natural strains of Arabidopsis thaliana with short reads

Cited by: 342
Authors
Ossowski, Stephan [1]
Schneeberger, Korbinian [1]
Clark, Richard M. [1]
Lanz, Christa [1]
Warthmann, Norman [1]
Weigel, Detlef [1]
Affiliations
[1] Max Planck Inst Dev Biol, Dept Mol Biol, D-72076 Tubingen, Germany
DOI
10.1101/gr.080200.108
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Whole-genome hybridization studies have suggested that the nuclear genomes of accessions (natural strains) of Arabidopsis thaliana can differ by several percent of their sequence. To examine this variation, and as a first step in the 1001 Genomes Project for this species, we produced 15- to 25-fold coverage in Illumina sequencing-by-synthesis (SBS) reads for the reference accession, Col-0, and two divergent strains, Bur-0 and Tsu-1. We aligned reads to the reference genome sequence to assess data quality metrics and to detect polymorphisms. Alignments revealed 823,325 unique single nucleotide polymorphisms (SNPs) and 79,961 unique 1- to 3-bp indels in the divergent accessions at a specificity of >99%, and over 2000 potential errors in the reference genome sequence. We also identified >3.4 Mb of the Bur-0 and Tsu-1 genomes as being either extremely dissimilar, deleted, or duplicated relative to the reference genome. To obtain sequences for these regions, we incorporated the Velvet assembler into a targeted de novo assembly method. This approach yielded 10,921 high-confidence contigs that were anchored to flanking sequences and harbored indels as large as 641 bp. Our methods are broadly applicable for polymorphism discovery in moderate to large genomes even at highly diverged loci, and we established by subsampling the Illumina SBS coverage depth required to inform a broad range of functional and evolutionary studies. Our pipeline for aligning reads and predicting SNPs and indels, SHORE, is available for download at http://1001genomes.org.
Pages: 2024 - 2033
Page count: 10
Related papers
24 in total
  • [1] Analysis of the genome sequence of the flowering plant Arabidopsis thaliana
    Kaul, S
    Koo, HL
    Jenkins, J
    Rizzo, M
    Rooney, T
    Tallon, LJ
    Feldblyum, T
    Nierman, W
    Benito, MI
    Lin, XY
    Town, CD
    Venter, JC
    Fraser, CM
    Tabata, S
    Nakamura, Y
    Kaneko, T
    Sato, S
    Asamizu, E
    Kato, T
    Kotani, H
    Sasamoto, S
    Ecker, JR
    Theologis, A
    Federspiel, NA
    Palm, CJ
    Osborne, BI
    Shinn, P
    Conway, AB
    Vysotskaia, VS
    Dewar, K
    Conn, L
    Lenz, CA
    Kim, CJ
    Hansen, NF
    Liu, SX
    Buehler, E
    Altafi, H
    Sakano, H
    Dunn, P
    Lam, B
    Pham, PK
    Chao, Q
    Nguyen, M
    Yu, GX
    Chen, HM
    Southwick, A
    Lee, JM
    Miranda, M
    Toriumi, MJ
    Davis, RW
    [J]. NATURE, 2000, 408 (6814) : 796 - 815
  • [2] Autoimmune response as a mechanism for a Dobzhansky-Muller-type incompatibility syndrome in plants
    Bomblies, Kirsten
    Lempe, Janne
    Epple, Petra
    Warthmann, Norman
    Lanz, Christa
    Dangl, Jeffery L.
    Weigel, Detlef
    [J]. PLOS BIOLOGY, 2007, 5 (09) : 1962 - 1972
  • [3] Large-scale identification of single-feature polymorphisms in complex genomes
    Borevitz, JO
    Liang, D
    Plouffe, D
    Chang, HS
    Zhu, T
    Weigel, D
    Berry, CC
    Winzeler, E
    Chory, J
    [J]. GENOME RESEARCH, 2003, 13 (03) : 513 - 523
  • [4] Genome-wide patterns of single-feature polymorphism in Arabidopsis thaliana
    Borevitz, Justin O.
    Hazen, Samuel P.
    Michael, Todd P.
    Morris, Geoffrey P.
    Baxter, Ivan R.
    Hu, Tina T.
    Chen, Huaming
    Werner, Jonathan D.
    Nordborg, Magnus
    Salt, David E.
    Kay, Steve A.
    Chory, Joanne
    Weigel, Detlef
    Jones, Jonathan D. G.
    Ecker, Joseph R.
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2007, 104 (29) : 12057 - 12062
  • [5] ALLPATHS: De novo assembly of whole-genome shotgun microreads
    Butler, Jonathan
    MacCallum, Iain
    Kleber, Michael
    Shlyakhter, Ilya A.
    Belmonte, Matthew K.
    Lander, Eric S.
    Nusbaum, Chad
    Jaffe, David B.
    [J]. GENOME RESEARCH, 2008, 18 (05) : 810 - 820
  • [6] Short read fragment assembly of bacterial genomes
    Chaisson, Mark J.
    Pevzner, Pavel A.
    [J]. GENOME RESEARCH, 2008, 18 (02) : 324 - 330
  • [7] Common sequence polymorphisms shaping genetic diversity in Arabidopsis thaliana
    Clark, Richard M.
    Schweikert, Gabriele
    Toomajian, Christopher
    Ossowski, Stephan
    Zeller, Georg
    Shinn, Paul
    Warthmann, Norman
    Hu, Tina T.
    Fu, Glenn
    Hinds, David A.
    Chen, Huaming
    Frazer, Kelly A.
    Huson, Daniel H.
    Schoelkopf, Bernhard
    Nordborg, Magnus
    Raetsch, Gunnar
    Ecker, Joseph R.
    Weigel, Detlef
    [J]. SCIENCE, 2007, 317 (5836) : 338 - 342
  • [8] Emrich SJ, 2007, GENETICS, V175, P429, DOI 10.1534/genetics.106.064006
  • [9] De novo bacterial genome sequencing: Millions of very short reads assembled on a desktop computer
    Hernandez, David
    Francois, Patrice
    Farinelli, Laurent
    Osteras, Magne
    Schrenzel, Jacques
    [J]. GENOME RESEARCH, 2008, 18 (05) : 802 - 809
  • [10] Whole-genome sequencing and variant discovery in C. elegans
    Hillier, LaDeana W.
    Marth, Gabor T.
    Quinlan, Aaron R.
    Dooling, David
    Fewell, Ginger
    Barnett, Derek
    Fox, Paul
    Glasscock, Jarret I.
    Hickenbotham, Matthew
    Huang, Weichun
    Magrini, Vincent J.
    Richt, Ryan J.
    Sander, Sacha N.
    Stewart, Donald A.
    Stromberg, Michael
    Tsung, Eric F.
    Wylie, Todd
    Schedl, Tim
    Wilson, Richard K.
    Mardis, Elaine R.
    [J]. NATURE METHODS, 2008, 5 (02) : 183 - 188