Structural variation in two human genomes mapped at single-nucleotide resolution by whole genome de novo assembly

Cited by: 97
Authors
Li, Yingrui [1]
Zheng, Hancheng [1]
Luo, Ruibang [1, 2, 3]
Wu, Honglong [1, 4]
Zhu, Hongmei [1]
Li, Ruiqiang [1]
Cao, Hongzhi [1, 4]
Wu, Boxin [1]
Huang, Shujia [1, 2]
Shao, Haojing [1, 2]
Ma, Hanzhou [1, 2]
Zhang, Fan [1, 2]
Feng, Shuijian [1]
Zhang, Wei [1]
Du, Hongli [2]
Tian, Geng [1]
Li, Jingxiang [1]
Zhang, Xiuqing [1]
Li, Songgang [1]
Bolund, Lars [1, 5]
Kristiansen, Karsten [1, 6]
De Smith, Adam J. [7]
Blakemore, Alexandra I. F. [7]
Coin, Lachlan J. M. [8]
Yang, Huanming [1]
Wang, Jian [1]
Wang, Jun [1, 6, 9]
Affiliations
[1] BGI Shenzhen, Shenzhen, Peoples R China
[2] S China Univ Technol, Sch Biosci & Biotechnol, Guangzhou, Peoples R China
[3] Univ Hong Kong, Dept Comp Sci, Hong Kong, Peoples R China
[4] Shenzhen Univ Med Sch, Genome Res Inst, Shenzhen, Peoples R China
[5] Univ Aarhus, Inst Human Genet, Aarhus, Denmark
[6] Univ Copenhagen, Dept Biol, Copenhagen, Denmark
[7] Univ London Imperial Coll Sci Technol & Med, Dept Genom Common Dis, Sch Publ Hlth, London, England
[8] Univ London Imperial Coll Sci Technol & Med, Dept Epidemiol & Biostat, Sch Publ Hlth, London, England
[9] Univ Copenhagen, Novo Nordisk Fdn Ctr Basic Metab Res, Copenhagen, Denmark
Funding
National Natural Science Foundation of China; Wellcome Trust (UK); Medical Research Council (UK);
Keywords
SHORT READ ALIGNMENT; COMPLEX TRAITS; COPY-NUMBER; SEQUENCE; INSERTIONS; ALGORITHMS; DELETIONS; GENES; TOOL; DNA;
DOI
10.1038/nbt.1904
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification code
071005; 0836; 090102; 100705;
Abstract
Here we use whole-genome de novo assembly of second-generation sequencing reads to map structural variation (SV) in an Asian genome and an African genome. Our approach identifies small- and intermediate-size homozygous variants (1-50 kb), including insertions, deletions, inversions and their precise breakpoints, and, in contrast to other methods, can resolve complex rearrangements. In total, we identified 277,243 SVs ranging in length from 1-23 kb. Validation using computational and experimental methods suggests that we achieve an overall false-positive rate of <6% and a false-negative rate of <10% in genomic regions that can be assembled, which outperforms other methods. Analysis of the SVs in the genomes of 106 individuals sequenced as part of the 1000 Genomes Project suggests that SVs account for a greater fraction of the diversity between individuals than do single-nucleotide polymorphisms (SNPs). These findings demonstrate that whole-genome de novo assembly is a feasible approach to deriving more comprehensive maps of genetic variation.
Pages: 723 / +
Page count: 10