CNVkit: Genome-Wide Copy Number Detection and Visualization from Targeted DNA Sequencing

Cited by: 1424
Authors
Talevich, Eric [1, 2, 3]
Shain, A. Hunter [1, 2, 3]
Botton, Thomas [1, 2, 3]
Bastian, Boris C. [1, 2, 3]
Affiliations
[1] Univ Calif San Francisco, Dept Dermatol, San Francisco, CA 94143 USA
[2] Univ Calif San Francisco, Dept Pathol, San Francisco, CA 94143 USA
[3] Univ Calif San Francisco, Helen Diller Family Comprehens Canc Ctr, San Francisco, CA 94143 USA
Funding
National Institutes of Health (USA);
Keywords
DISCOVERY; AMPLIFICATION; HYBRIDIZATION; SEGMENTATION; VARIANTS; ACCURATE; MUTATION; CANCER; TOOLS; BIAS;
DOI
10.1371/journal.pcbi.1004873
Chinese Library Classification
Q5 [Biochemistry];
Discipline classification code
070307 [Chemical Biology];
Abstract
Germline copy number variants (CNVs) and somatic copy number alterations (SCNAs) are of significant importance in syndromic conditions and cancer. Massively parallel sequencing is increasingly used to infer copy number information from variations in the read depth in sequencing data. However, this approach has limitations in the case of targeted resequencing, which leaves gaps in coverage between the regions chosen for enrichment and introduces biases related to the efficiency of target capture and library preparation. We present a method for copy number detection, implemented in the software package CNVkit, that uses both the targeted reads and the nonspecifically captured off-target reads to infer copy number evenly across the genome. This combination achieves both exon-level resolution in targeted regions and sufficient resolution in the larger intronic and intergenic regions to identify copy number changes. In particular, we successfully inferred copy number at equivalent to 100-kilobase resolution genome-wide from a platform targeting as few as 293 genes. After normalizing read counts to a pooled reference, we evaluated and corrected for three sources of bias that explain most of the extraneous variability in the sequencing read depth: GC content, target footprint size and spacing, and repetitive sequences. We compared the performance of CNVkit to copy number changes identified by array comparative genomic hybridization. We packaged the components of CNVkit so that it is straightforward to use and provides visualizations, detailed reporting of significant features, and export options for integration into existing analysis pipelines. CNVkit is freely available from https://github.com/etal/cnvkit.
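The normalization steps named in the abstract (read depth compared against a pooled reference, then removal of a GC-content trend) can be illustrated with a minimal sketch. This is not CNVkit's actual implementation: the function names `log2_ratios` and `correct_gc_bias`, the pseudocount, and the rolling-median window are all assumptions chosen for illustration, and only NumPy is used.

```python
import numpy as np


def log2_ratios(sample_depth, reference_depth, pseudocount=0.01):
    """Log2 ratio of sample depth to pooled-reference depth per bin.

    The pseudocount (an assumed value) avoids log(0) in empty bins.
    Ratios are median-centered so a copy-neutral bin sits near 0.
    """
    sample = np.asarray(sample_depth, dtype=float)
    ref = np.asarray(reference_depth, dtype=float)
    ratios = np.log2((sample + pseudocount) / (ref + pseudocount))
    return ratios - np.median(ratios)


def correct_gc_bias(ratios, gc_fraction, window=5):
    """Subtract a rolling-median trend of the ratios ordered by GC content.

    Hypothetical simplification: bins are sorted by GC fraction, a rolling
    median over `window` bins (odd) estimates the bias, and that trend is
    subtracted. In practice the window would span many bins so that
    localized copy number changes are not removed along with the bias.
    """
    ratios = np.asarray(ratios, dtype=float)
    order = np.argsort(gc_fraction, kind="stable")
    sorted_ratios = ratios[order]
    half = window // 2
    # Pad with edge values so every bin has a full window.
    padded = np.pad(sorted_ratios, half, mode="edge")
    trend = np.array(
        [np.median(padded[i:i + window]) for i in range(len(sorted_ratios))]
    )
    corrected = np.empty_like(ratios)
    corrected[order] = sorted_ratios - trend  # restore original bin order
    return corrected
```

Under these assumptions, a bin with twice the reference depth yields a log2 ratio close to 1 after centering, and a bias that depends only on GC content is flattened toward 0 by the correction.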
Pages: 18