Copy number variation detection and genotyping from exome sequence data

Cited by: 477
Authors
Krumm, Niklas [1]
Sudmant, Peter H. [1]
Ko, Arthur [1]
O'Roak, Brian J. [1]
Malig, Maika [1]
Coe, Bradley P. [1]
Quinlan, Aaron R. [3]
Nickerson, Deborah A. [1]
Eichler, Evan E. [1, 4]
Affiliations
[1] Univ Washington, Sch Med, Dept Genome Sci, Seattle, WA 98195 USA
[2] NHLBI, NHLBI Exome Sequencing Project, NIH, Bethesda, MD 20892 USA
[3] Univ Virginia, Ctr Publ Hlth Gen, Dept Publ Hlth Sci, Charlottesville, VA 22908 USA
[4] Univ Washington, Howard Hughes Med Inst, Seattle, WA 98195 USA
Keywords
STRUCTURAL VARIATION; VARIANTS;
DOI
10.1101/gr.138115.112
Chinese Library Classification (CLC)
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification codes
071010 ; 081704 ;
Abstract
While exome sequencing is readily amenable to single-nucleotide variant discovery, the sparse and nonuniform nature of the exome capture reaction has hindered exome-based detection and characterization of genic copy number variation. We developed a novel method using singular value decomposition (SVD) normalization to discover rare genic copy number variants (CNVs) as well as genotype copy number polymorphic (CNP) loci with high sensitivity and specificity from exome sequencing data. We estimate the precision of our algorithm using 122 trios (366 exomes) and show that this method can be used to reliably predict (94% overall precision) both de novo and inherited rare CNVs involving three or more consecutive exons. We demonstrate that exome-based genotyping of CNPs strongly correlates with whole-genome data (median r² = 0.91), especially for loci with fewer than eight copies, and can estimate the absolute copy number of multi-allelic genes with high accuracy (78% call level). The resulting user-friendly computational pipeline, CoNIFER (copy number inference from exome reads), can reliably be used to discover disruptive genic CNVs missed by standard approaches and should have broad application in human genetic studies of disease.
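The abstract names SVD normalization but does not spell out the steps. The following is a minimal sketch of the general idea only, under stated assumptions: read depth is arranged as an exons-by-samples matrix, each exon is z-scored across samples, and the strongest singular components (taken here to represent capture and batch noise shared across samples) are zeroed before the matrix is rebuilt. The function name `svd_normalize`, the z-scoring step, the parameter `n_remove`, and the synthetic input are all illustrative assumptions, not CoNIFER's actual interface or defaults.

```python
import numpy as np


def svd_normalize(depth, n_remove=1):
    """Zero the n_remove largest singular components of a z-scored
    exons-by-samples depth matrix (hypothetical helper, not CoNIFER's API)."""
    depth = np.asarray(depth, dtype=float)

    # Z-score each exon (row) across samples so exons are comparable.
    mean = depth.mean(axis=1, keepdims=True)
    std = depth.std(axis=1, keepdims=True)
    std[std == 0] = 1.0  # avoid division by zero for constant exons
    z = (depth - mean) / std

    # Thin SVD: z = U @ diag(s) @ Vt, singular values sorted descending.
    u, s, vt = np.linalg.svd(z, full_matrices=False)

    # Drop the strongest components, assumed to capture systematic noise.
    s = s.copy()
    s[:n_remove] = 0.0

    # Rebuild the matrix from the remaining components.
    return (u * s) @ vt


# Synthetic RPKM-like values: 200 exons by 30 samples (made-up data).
rng = np.random.default_rng(0)
depth = rng.gamma(shape=20.0, scale=5.0, size=(200, 30))
normalized = svd_normalize(depth, n_remove=2)
```

In a sketch like this, a run of consecutive exons with strongly negative or positive normalized values in one sample would be the candidate signal for a deletion or duplication; how many components to remove is a tuning choice that the abstract does not specify.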
Pages: 1525-1532
Page count: 8