Genome-wide mapping and assembly of structural variant breakpoints in the mouse genome

Cited by: 200
Authors
Quinlan, Aaron R. [1]
Clark, Royden A. [1]
Sokolova, Svetlana [1]
Leibowitz, Mitchell L. [1]
Zhang, Yujun [2]
Hurles, Matthew E. [2]
Mell, Joshua C. [3]
Hall, Ira M. [1,4]
Affiliations
[1] Univ Virginia, Sch Med, Dept Biochem & Mol Genet, Charlottesville, VA 22908 USA
[2] Wellcome Trust Sanger Inst, Cambridge CB10 1SA, England
[3] Univ British Columbia, Dept Zool, Vancouver, BC V6T 3Z4, Canada
[4] Univ Virginia, Ctr Publ Hlth Genom, Charlottesville, VA 22908 USA
Keywords
COPY NUMBER VARIATION; SEGMENTAL DUPLICATIONS; COMPLEX REARRANGEMENTS; MOBILE ELEMENTS; FINE-SCALE; ARCHITECTURE; MECHANISMS; SEQUENCE; ALGORITHM; EVOLUTION;
DOI
10.1101/gr.102970.109
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Structural variation (SV) is a rich source of genetic diversity in mammals, but due to the challenges associated with mapping SV in complex genomes, basic questions regarding their genomic distribution and mechanistic origins remain unanswered. We have developed an algorithm (HYDRA) to localize SV breakpoints by paired-end mapping, and a general approach for the genome-wide assembly and interpretation of breakpoint sequences. We applied these methods to two inbred mouse strains: C57BL/6J and DBA/2J. We demonstrate that HYDRA accurately maps diverse classes of SV, including those involving repetitive elements such as transposons and segmental duplications; however, our analysis of the C57BL/6J reference strain shows that incomplete reference genome assemblies are a major source of noise. We report 7196 SVs between the two strains, more than two-thirds of which are due to transposon insertions. Of the remainder, 59% are deletions (relative to the reference), 26% are insertions of unlinked DNA, 9% are tandem duplications, and 6% are inversions. To investigate the origins of SV, we characterized 3316 breakpoint sequences at single-nucleotide resolution. We find that approximately 16% of non-transposon SVs have complex breakpoint patterns consistent with template switching during DNA replication or repair, and that this process appears to preferentially generate certain classes of complex variants. Moreover, we find that SVs are significantly enriched in regions of segmental duplication, but that this effect is largely independent of DNA sequence homology and thus cannot be explained by non-allelic homologous recombination (NAHR) alone. This result suggests that the genetic instability of such regions is often the cause rather than the consequence of duplicated genomic architecture.
Pages: 623-635
Number of pages: 13