TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions

Cited by: 9716
Authors
Kim, Daehwan [1, 2, 3]
Pertea, Geo [3]
Trapnell, Cole [5, 6]
Pimentel, Harold [7]
Kelley, Ryan [8]
Salzberg, Steven L. [3, 4]
Affiliations
[1] Univ Maryland, Ctr Bioinformat & Computat Biol, College Pk, MD 20742 USA
[2] Univ Maryland, Dept Comp Sci, College Pk, MD 20742 USA
[3] Johns Hopkins Univ, Sch Med, McKusick Nathans Inst Genet Med, Ctr Computat Biol, Baltimore, MD 21205 USA
[4] Johns Hopkins Univ, Bloomberg Sch Publ Hlth, Dept Biostat, Baltimore, MD 21205 USA
[5] Broad Inst MIT & Harvard, Cambridge Ctr 7, Cambridge, MA 02142 USA
[6] Harvard Univ, Dept Stem Cell & Regenerat Biol, Cambridge, MA 02142 USA
[7] Univ Calif Berkeley, Dept Elect Engn & Comp Sci, Berkeley, CA 94720 USA
[8] Illumina Inc, San Diego, CA 92122 USA
Source
GENOME BIOLOGY | 2013 / Vol. 14 / Issue 04
Keywords
PSEUDOGENES; ULTRAFAST;
DOI
10.1186/gb-2013-14-4-r36
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification code
071005; 0836; 090102; 100705;
Abstract
TopHat is a popular spliced aligner for RNA-sequence (RNA-seq) experiments. In this paper, we describe TopHat2, which incorporates many significant enhancements to TopHat. TopHat2 can align reads of various lengths produced by the latest sequencing technologies, while allowing for variable-length indels with respect to the reference genome. In addition to de novo spliced alignment, TopHat2 can align reads across fusion breaks, which can occur after genomic translocations. TopHat2 combines the ability to identify novel splice sites with direct mapping to known transcripts, producing sensitive and accurate alignments, even for highly repetitive genomes or in the presence of pseudogenes. TopHat2 is available at http://ccb.jhu.edu/software/tophat.
Pages: 13