Characterization and improvement of RNA-Seq precision in quantitative transcript expression profiling

Cited by: 104
Authors
Labaj, Pawel P. [1]
Leparc, German G. [1]
Linggi, Bryan E. [2]
Markillie, Lye Meng [2]
Wiley, H. Steven [2]
Kreil, David P. [1]
Affiliations
[1] Boku Univ Vienna, Chair Bioinformat, A-1190 Vienna, Austria
[2] Pacific NW Natl Lab, Environm Mol Sci Lab, Richland, WA 99352 USA
Keywords
DIFFERENTIAL EXPRESSION; ALIGNMENT; ULTRAFAST; TOOL;
DOI
10.1093/bioinformatics/btr247
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification codes
071010 ; 081704 ;
Abstract
Motivation: Measurement precision determines the power of any analysis to reliably identify significant signals, such as in screens for differential expression, independent of whether the experimental design incorporates replicates or not. With the compilation of large-scale RNA-Seq datasets with technical replicate samples, however, we can now, for the first time, perform a systematic analysis of the precision of expression level estimates from massively parallel sequencing technology. This in turn allows us to consider how precision can be improved by computational or experimental means.

Results: We report on a comprehensive study of target identification and measurement precision, including their dependence on transcript expression levels, read depth and other parameters. In particular, an impressive recall of 84% of the estimated true transcript population could be achieved with 331 million 50 bp reads, with diminishing returns from longer read lengths and even smaller gains from increased sequencing depths. Most of the measurement power (75%) is spent on only 7% of the known transcriptome, however, making less strongly expressed transcripts harder to measure. Consequently, <30% of all transcripts could be quantified reliably with a relative error <20%. Based on established tools, we then introduce a new approach for mapping and analysing sequencing reads that yields substantially improved performance in gene expression profiling, increasing the number of transcripts that can reliably be quantified to over 40%. Extrapolations to higher sequencing depths highlight the need for efficient complementary steps. In the discussion, we outline possible experimental and computational strategies for further improvements in quantification precision.
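
The abstract's criterion of "quantified reliably with a relative error <20%" can be illustrated with a small sketch. This is not the authors' method: it assumes, purely for illustration, that relative error is taken as the coefficient of variation (sample standard deviation divided by mean) across technical replicates. The function names and the toy expression values are hypothetical.

```python
from statistics import mean, stdev


def relative_error(replicates):
    """Coefficient of variation of one transcript's replicate measurements.

    Assumption for this sketch: relative error = sample SD / mean.
    """
    m = mean(replicates)
    if m == 0:
        return float("inf")  # no signal at all: treat as not quantifiable
    return stdev(replicates) / m


def fraction_reliable(expression, threshold=0.20):
    """Fraction of transcripts whose relative error is below the threshold."""
    reliable = [t for t, reps in expression.items()
                if relative_error(reps) < threshold]
    return len(reliable) / len(expression)


# Made-up normalized expression values for three technical replicates.
toy = {
    "tx_high": [1000.0, 1020.0, 980.0],   # strongly expressed, tight replicates
    "tx_mid": [50.0, 65.0, 35.0],         # moderate expression, noisier
    "tx_low": [2.0, 0.0, 5.0],            # weakly expressed, very noisy
    "tx_absent": [0.0, 0.0, 0.0],         # never detected
}

print(fraction_reliable(toy))
```

In this toy data only the strongly expressed transcript passes the 20% threshold, which mirrors the abstract's point that weakly expressed transcripts are harder to measure precisely.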
Pages: I383-I391
Number of pages: 9