Pegasus: a comprehensive annotation and prediction tool for detection of driver gene fusions in cancer

Cited by: 52
Authors
Abate, Francesco [1, 2, 3]
Zairis, Sakellarios [2]
Ficarra, Elisa [3]
Acquaviva, Andrea [3]
Wiggins, Chris H. [2, 6, 7]
Frattini, Veronique [5]
Lasorella, Anna [5]
Iavarone, Antonio [5]
Inghirami, Giorgio [4]
Rabadan, Raul [1, 2]
Affiliations
[1] Columbia Univ, Dept Biomed Informat, New York, NY 10032 USA
[2] Columbia Univ, Ctr Computat Biol & Bioinformat, New York, NY 10032 USA
[3] Politecn Torino, Dept Control & Comp Engn, I-10129 Turin, Italy
[4] Univ Turin, Ctr Expt Res & Med Studies, Dept Pathol, Turin, Italy
[5] Columbia Univ, Med Ctr, Inst Canc Genet, New York, NY 10032 USA
[6] Columbia Univ, Fu Fdn Sch Engn & Appl Sci, Dept Appl Phys & Appl Math, New York, NY 10027 USA
[7] Columbia Univ, Inst Data Sci & Engn, New York, NY 10027 USA
Keywords
Gene fusion; Next-generation sequencing; Machine learning; IDENTIFICATION; FRAMEWORK; TRANSCRIPTION; LANDSCAPE; DISCOVERY;
DOI
10.1186/s12918-014-0097-z
CLC number
Q [Biological Sciences];
Discipline classification code
090105 [Crop Production Systems and Ecological Engineering];
Abstract
Background: The extraordinary success of imatinib in the treatment of BCR-ABL1 associated cancers underscores the need to identify novel functional gene fusions in cancer. RNA sequencing offers a genome-wide view of expressed transcripts, uncovering biologically functional gene fusions. Although several bioinformatics tools are already available for the detection of putative fusion transcripts, candidate event lists are plagued with non-functional read-through events, reverse transcriptase template switching events, incorrect mapping, and other systematic errors. Such lists lack any indication of oncogenic relevance, and they are too large for exhaustive experimental validation.

Results: We have designed and implemented a pipeline, Pegasus, for the annotation and prediction of biologically functional gene fusion candidates. Pegasus provides a common interface for various gene fusion detection tools, reconstruction of novel fusion proteins, reading-frame-aware annotation of preserved/lost functional domains, and data-driven classification of oncogenic potential. Pegasus dramatically streamlines the search for oncogenic gene fusions, bridging the gap between raw RNA-Seq data and a final, tractable list of candidates for experimental validation.

Conclusion: We show the effectiveness of Pegasus in predicting new driver fusions in 176 RNA-Seq samples of glioblastoma multiforme (GBM) and 23 cases of anaplastic large cell lymphoma (ALCL).
Pages: 14