Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation

Cited by: 12210
Authors
Trapnell, Cole [1, 2, 3]
Williams, Brian A. [4, 5]
Pertea, Geo [3]
Mortazavi, Ali [4, 5]
Kwan, Gordon [4, 5]
van Baren, Marijke J. [6]
Salzberg, Steven L. [2, 3]
Wold, Barbara J. [4, 5]
Pachter, Lior [1, 7, 8]
Affiliations
[1] Univ Calif Berkeley, Dept Math, Berkeley, CA 94720 USA
[2] Univ Maryland, Dept Comp Sci, College Pk, MD 20742 USA
[3] Univ Maryland, Ctr Bioinformat & Computat Biol, College Pk, MD 20742 USA
[4] CALTECH, Div Biol, Pasadena, CA 91125 USA
[5] CALTECH, Beckman Inst, Pasadena, CA 91125 USA
[6] Washington Univ, Genome Sci Ctr, St Louis, MO USA
[7] Univ Calif Berkeley, Dept Mol & Cell Biol, Berkeley, CA 94720 USA
[8] Univ Calif Berkeley, Dept Comp Sci, Berkeley, CA 94720 USA
Funding
US National Institutes of Health (NIH)
Keywords
EXPRESSION; GENOME; ALIGNMENT; ARRAYS; MYOD
DOI
10.1038/nbt.1621
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Discipline classification codes
071005; 0836; 090102; 100705
Abstract
High-throughput mRNA sequencing (RNA-Seq) promises simultaneous transcript discovery and abundance estimation (1-3). However, this would require algorithms that are not restricted by prior gene annotations and that account for alternative transcription and splicing. Here we introduce such algorithms in an open-source software program called Cufflinks. To test Cufflinks, we sequenced and analyzed >430 million paired 75-bp RNA-Seq reads from a mouse myoblast cell line over a differentiation time series. We detected 13,692 known transcripts and 3,724 previously unannotated ones, 62% of which are supported by independent expression data or by homologous genes in other species. Over the time series, 330 genes showed complete switches in the dominant transcription start site (TSS) or splice isoform, and we observed more subtle shifts in 1,304 other genes. These results suggest that Cufflinks can illuminate the substantial regulatory flexibility and complexity in even this well-studied model of muscle development and that it can improve transcriptome-based genome annotation.
Pages: 511 / U174
Page count: 8
Related papers
29 in total
[1]   Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments [J].
Bullard, James H. ;
Purdom, Elizabeth ;
Hansen, Kasper D. ;
Dudoit, Sandrine .
BMC BIOINFORMATICS, 2010, 11
[2]   Stem cell transcriptome profiling via massive-scale mRNA sequencing [J].
Cloonan, Nicole ;
Forrest, Alistair R. R. ;
Kolle, Gabriel ;
Gardiner, Brooke B. A. ;
Faulkner, Geoffrey J. ;
Brown, Mellissa K. ;
Taylor, Darrin F. ;
Steptoe, Anita L. ;
Wani, Shivangi ;
Bethel, Graeme ;
Robertson, Alan J. ;
Perkins, Andrew C. ;
Bruce, Stephen J. ;
Lee, Clarence C. ;
Ranade, Swati S. ;
Peckham, Heather E. ;
Manning, Jonathan M. ;
McKernan, Kevin J. ;
Grimmond, Sean M. .
NATURE METHODS, 2008, 5 (07) :613-619
[3]   miR-145 and miR-143 regulate smooth muscle cell fate and plasticity [J].
Cordes, Kimberly R. ;
Sheehy, Neil T. ;
White, Mark P. ;
Berry, Emily C. ;
Morton, Sarah U. ;
Muth, Alecia N. ;
Lee, Ting-Hein ;
Miano, Joseph M. ;
Ivey, Kathryn N. ;
Srivastava, Deepak .
NATURE, 2009, 460 (7256) :705-U80
[4]   FHL3 binds MyoD and negatively regulates myotube formation [J].
Cottle, Denny L. ;
McGrath, Meagan J. ;
Cowling, Belinda S. ;
Coghill, Imogen D. ;
Brown, Susan ;
Mitchell, Christina A. .
JOURNAL OF CELL SCIENCE, 2007, 120 (08) :1423-1435
[5]   Annotating genomes with massive-scale RNA sequencing [J].
Denoeud, France ;
Aury, Jean-Marc ;
Da Silva, Corinne ;
Noel, Benjamin ;
Rogier, Odile ;
Delledonne, Massimo ;
Morgante, Michele ;
Valle, Giorgio ;
Wincker, Patrick ;
Scarpelli, Claude ;
Jaillon, Olivier ;
Artiguenave, Francois .
GENOME BIOLOGY, 2008, 9 (12)
[6]   A DECOMPOSITION THEOREM FOR PARTIALLY ORDERED SETS [J].
DILWORTH, RP .
ANNALS OF MATHEMATICS, 1950, 51 (01) :161-166
[7]   TRANSCRIPTIONAL AND POSTTRANSCRIPTIONAL CONTROL OF C-MYC DURING MYOGENESIS - ITS MESSENGER-RNA REMAINS INDUCIBLE IN DIFFERENTIATED CELLS AND DOES NOT SUPPRESS THE DIFFERENTIATED PHENOTYPE [J].
ENDO, T ;
NADALGINARD, B .
MOLECULAR AND CELLULAR BIOLOGY, 1986, 6 (05) :1412-1421
[8]   Viral population estimation using pyrosequencing [J].
Eriksson, Nicholas ;
Pachter, Lior ;
Mitsuya, Yumi ;
Rhee, Soo-Yon ;
Wang, Chunlin ;
Gharizadeh, Baback ;
Ronaghi, Mostafa ;
Shafer, Robert W. ;
Beerenwinkel, Niko .
PLOS COMPUTATIONAL BIOLOGY, 2008, 4 (05)
[9]  
Fuglede B, 2004, 2004 IEEE INTERNATIONAL SYMPOSIUM ON INFORMATION THEORY, PROCEEDINGS, P31
[10]   Chromatin signature reveals over a thousand highly conserved large non-coding RNAs in mammals [J].
Guttman, Mitchell ;
Amit, Ido ;
Garber, Manuel ;
French, Courtney ;
Lin, Michael F. ;
Feldser, David ;
Huarte, Maite ;
Zuk, Or ;
Carey, Bryce W. ;
Cassady, John P. ;
Cabili, Moran N. ;
Jaenisch, Rudolf ;
Mikkelsen, Tarjei S. ;
Jacks, Tyler ;
Hacohen, Nir ;
Bernstein, Bradley E. ;
Kellis, Manolis ;
Regev, Aviv ;
Rinn, John L. ;
Lander, Eric S. .
NATURE, 2009, 458 (7235) :223-227