RNA-SeQC: RNA-seq metrics for quality control and process optimization

Citations: 585
Authors
DeLuca, David S. [1]
Levin, Joshua Z. [1]
Sivachenko, Andrey [1]
Fennell, Timothy [1]
Nazaire, Marc-Danie [1]
Williams, Chris [1]
Reich, Michael [1]
Winckler, Wendy [1]
Getz, Gad [1]
Affiliations
[1] Broad Institute of MIT and Harvard, Cambridge, MA, USA
Funding
National Institutes of Health (USA)
DOI
10.1093/bioinformatics/bts196
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
RNA-seq, the application of next-generation sequencing to RNA, provides transcriptome-wide characterization of cellular activity. Assessment of sequencing performance and library quality is critical to the interpretation of RNA-seq data, yet few tools exist to address this issue. We introduce RNA-SeQC, a program which provides key measures of data quality. These metrics include yield, alignment and duplication rates; GC bias, rRNA content, regions of alignment (exon, intron and intragenic), continuity of coverage, 3'/5' bias and count of detectable transcripts, among others. The software provides multi-sample evaluation of library construction protocols, input materials and other experimental parameters. The modularity of the software enables pipeline integration and the routine monitoring of key measures of data quality such as the number of alignable reads, duplication rates and rRNA contamination. RNA-SeQC allows investigators to make informed decisions about sample inclusion in downstream analysis. In summary, RNA-SeQC provides quality control measures critical to experiment design, process optimization and downstream computational analysis.
Pages: 1530-1532
Page count: 3