ITD assembler: an algorithm for internal tandem duplication discovery from short-read sequencing data

Cited by: 12
Authors
Rustagi, Navin [1 ,2 ]
Hampton, Oliver A. [1 ,3 ]
Li, Jie [1 ,4 ]
Xi, Liu [1 ]
Gibbs, Richard A. [1 ,3 ]
Plon, Sharon E. [1 ,3 ,5 ]
Kimmel, Marek [1 ]
Wheeler, David A. [1 ,3 ]
Affiliations
[1] Baylor Coll Med, Human Genome Sequencing Ctr, Houston, TX 77030 USA
[2] Rice Univ, Dept Stat, Houston, TX 77251 USA
[3] Baylor Coll Med, Dept Mol & Human Genet, Houston, TX 77030 USA
[4] Cent S Univ, Xiangya Hosp, Dept Dermatol, Changsha, Hunan, Peoples R China
[5] Texas Childrens Hosp, Dept Pediat Hematol Oncol, Houston, TX 77030 USA
Source
BMC BIOINFORMATICS | 2016 / Vol. 17
Keywords
Tandem duplication; De Bruijn graphs; Assembly; FLT3; Data mining; Cancer genetics; AML; Clustering; Somatic mutations; ACUTE MYELOID-LEUKEMIA; INSERTIONS; MUTATIONS; ALIGNMENT; KINASE; GROWTH;
DOI
10.1186/s12859-016-1031-8
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: Detection of tandem duplication within coding exons, referred to as internal tandem duplication (ITD), remains challenging because ITD-containing reads align poorly to the reference genome. There is a critical need for efficient methods to recover these important mutational events. Results: In this paper we introduce ITD Assembler, a novel approach that rapidly evaluates all unmapped and partially mapped reads from whole exome NGS data, using a De Bruijn graph approach to select reads that harbor cycles of appropriate length, followed by assembly using overlap-layout-consensus. We tested ITD Assembler on The Cancer Genome Atlas AML dataset as a truth set. ITD Assembler identified the highest percentage of reported FLT3-ITDs when compared to other ITD detection algorithms, and discovered additional ITDs in FLT3, KIT, CEBPA, WT1 and other genes. Evidence of polymorphic ITDs in 54 genes was also found. Novel ITDs were validated by analyzing the corresponding RNA sequencing data. Conclusions: ITD Assembler is a highly sensitive tool that can detect partial, large and complex tandem duplications. This study highlights the need to look more effectively for ITDs in other cancers and Mendelian diseases.
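The read-selection idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation; it only shows why a tandem duplication produces a cycle in a read's De Bruijn graph. Walking a read k-mer by k-mer traces a path through the graph, and revisiting a k-mer closes a cycle whose length equals the distance between the two visits. The function names, the k-mer size `k`, and the `min_len`/`max_len` thresholds are all hypothetical choices made for illustration.

```python
def kmer_cycle_lengths(read, k):
    """Return the set of cycle lengths found in one read's De Bruijn graph.

    Consecutive k-mers of the read form a path in the graph. Seeing a
    k-mer again closes a cycle; its length is the offset between the
    current position and the previous occurrence of that k-mer.
    """
    last_seen = {}   # k-mer -> most recent start position in the read
    lengths = set()
    for i in range(len(read) - k + 1):
        kmer = read[i:i + k]
        if kmer in last_seen:
            lengths.add(i - last_seen[kmer])
        last_seen[kmer] = i
    return lengths


def has_candidate_itd(read, k=10, min_len=15, max_len=300):
    """Flag a read whose graph has a cycle in a chosen length range.

    k, min_len and max_len are illustrative assumptions, not values
    taken from the paper.
    """
    return any(min_len <= n <= max_len for n in kmer_cycle_lengths(read, k))


# Synthetic example: a 20-base unit duplicated in tandem between flanks.
prefix = "GGATCCTAGA"
unit = "ACGTTGCAAGGCTTACCGAT"
suffix = "TTCAGGACTA"
duplicated_read = prefix + unit + unit + suffix
normal_read = prefix + unit + suffix
```

In this toy case the duplicated read yields a cycle whose length matches the 20-base repeat unit, while the read without the duplication yields none. A real pipeline would apply such a filter only to the unmapped and partially mapped reads and would then pass the selected reads to an assembler.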
Pages: 8