De novo assembly of human genomes with massively parallel short read sequencing

Cited by: 2257
Authors
Li, Ruiqiang [1, 2]
Zhu, Hongmei [1]
Ruan, Jue [1]
Qian, Wubin [1]
Fang, Xiaodong [1]
Shi, Zhongbin [1]
Li, Yingrui [1]
Li, Shengting [1]
Shan, Gao [1]
Kristiansen, Karsten [1, 2]
Li, Songgang [1]
Yang, Huanming [1]
Wang, Jian [1]
Wang, Jun [1, 2]
Affiliations
[1] Beijing Genomics Institute at Shenzhen, Shenzhen 518083, People's Republic of China
[2] University of Copenhagen, Department of Biology, DK-2200 Copenhagen, Denmark
Funding
National Natural Science Foundation of China;
Keywords
SHORT DNA-SEQUENCES; ALIGNMENT; MILLIONS; PROGRAM;
DOI
10.1101/gr.097261.109
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology];
Subject classification code
071010 ; 081704 ;
Abstract
Next-generation massively parallel DNA sequencing technologies provide ultrahigh throughput at a substantially lower unit data cost; however, the reads they produce are very short, which makes de novo assembly extremely challenging. Here, we describe a novel method for de novo assembly of large genomes from short read sequences. We successfully assembled both the Asian and African human genome sequences, achieving N50 contig sizes of 7.4 and 5.9 kilobases (kb) and N50 scaffold sizes of 446.3 and 61.9 kb, respectively. The development of this de novo short read assembly method creates new opportunities for building reference sequences and carrying out accurate analyses of unexplored genomes in a cost-effective way.
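The abstract reports assembly quality as N50 contig and scaffold sizes. As a reminder of what that statistic measures, the following is a minimal sketch of the usual N50 definition: the length L such that sequences of length L or longer together cover at least half of the total assembled bases. The function name `n50` and the toy lengths are hypothetical and chosen only for illustration; they are not taken from the paper, and this is not the authors' code.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig or scaffold lengths.

    Assumes the common definition: sort lengths from longest to shortest
    and return the length at which the running total first reaches at
    least half of the summed length of all sequences.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("n50() requires at least one length")

    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length

    # Unreachable for non-empty input with positive lengths.
    raise ValueError("lengths must be positive integers")


if __name__ == "__main__":
    # Hypothetical toy contig lengths (bases), not data from the paper.
    toy_contigs = [2, 3, 4, 5, 6, 7, 8, 9, 10]
    print(n50(toy_contigs))
```

In this toy set the total length is 54, and the three longest contigs (10, 9, 8) sum to 27, exactly half, so the N50 is 8. A larger N50 means that more of the assembly sits in long, contiguous pieces.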
Pages: 265-272
Page count: 8