voom: precision weights unlock linear model analysis tools for RNA-seq read counts

Cited by: 4127
Authors
Law, Charity W. [1, 2]
Chen, Yunshun [1, 2]
Shi, Wei [1, 3]
Smyth, Gordon K. [1, 4]
Affiliations
[1] Walter & Eliza Hall Inst Med Res, Bioinformat Div, Parkville, Vic 3052, Australia
[2] Univ Melbourne, Dept Med Biol, Parkville, Vic 3010, Australia
[3] Univ Melbourne, Dept Comp & Informat Syst, Parkville, Vic 3010, Australia
[4] Univ Melbourne, Dept Math & Stat, Parkville, Vic 3010, Australia
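Keywords
DIFFERENTIAL EXPRESSION ANALYSIS; GENE-EXPRESSION; STATISTICAL TESTS; NORMALIZATION; LIKELIHOOD; VARIANCE; TRANSCRIPTOMES; BIOCONDUCTOR; MICROARRAYS; DISPERSION
DOI
10.1186/gb-2014-15-2-r29
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Discipline classification code
071005 [Microbiology]; 090105 [Crop Production Systems and Ecological Engineering]
Abstract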
New normal linear modeling strategies are presented for analyzing read counts from RNA-seq experiments. The voom method estimates the mean-variance relationship of the log-counts, generates a precision weight for each observation and enters these into the limma empirical Bayes analysis pipeline. This opens access for RNA-seq analysts to a large body of methodology developed for microarrays. Simulation studies show that voom performs as well as or better than count-based RNA-seq methods even when the data are generated according to the assumptions of the earlier methods. Two case studies illustrate the use of linear modeling and gene set testing methods.
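The weighting idea summarized in the abstract can be illustrated with a minimal sketch. This is not the limma implementation: it assumes a single-group (intercept-only) design, replaces the smooth mean-variance trend with a straight-line least-squares fit, and floors the trend at a small positive value. The function name `voom_like_weights`, the floor `1e-3`, and the toy count matrix are hypothetical, chosen only for illustration.

```python
import math
from statistics import mean, stdev


def voom_like_weights(counts):
    """Sketch of precision weights for log-counts (one group, no covariates).

    counts: list of genes, each a list of read counts per sample.
    Returns (logcpm, weights), both gene-by-sample nested lists.
    """
    n_samples = len(counts[0])
    lib_sizes = [sum(gene[i] for gene in counts) for i in range(n_samples)]

    # log2 counts per million, offset by 0.5 and 1 so zero counts are defined
    logcpm = [
        [math.log2((r + 0.5) / (lib_sizes[i] + 1.0) * 1e6) for i, r in enumerate(gene)]
        for gene in counts
    ]

    # Intercept-only model: fitted value is the gene mean,
    # residual standard deviation is the sample standard deviation.
    fitted = [mean(row) for row in logcpm]
    sqrt_sd = [math.sqrt(stdev(row)) for row in logcpm]

    # Move from the log-cpm scale to an average log2-count scale
    mean_log_lib = mean(math.log2(size + 1.0) for size in lib_sizes)
    x = [f + mean_log_lib - math.log2(1e6) for f in fitted]

    # Assumption: a straight line stands in for the smooth trend
    xbar, ybar = mean(x), mean(sqrt_sd)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, sqrt_sd)) / sxx
    intercept = ybar - slope * xbar

    def trend(log_count):
        # Floor keeps the predicted sqrt-sd positive (hypothetical safeguard)
        return max(intercept + slope * log_count, 1e-3)

    # Weight = inverse predicted variance = 1 / (predicted sqrt-sd)^4,
    # evaluated at each observation's fitted log2 count.
    weights = [
        [1.0 / trend(f + math.log2(size + 1.0) - math.log2(1e6)) ** 4 for size in lib_sizes]
        for f in fitted
    ]
    return logcpm, weights


# Toy data: 4 genes (rows) by 4 samples (columns), increasing expression level
counts = [
    [5, 8, 3, 6],
    [50, 62, 45, 70],
    [500, 520, 480, 610],
    [5000, 5100, 4900, 5300],
]
logcpm, weights = voom_like_weights(counts)
```

In this toy example the low-count gene has noisier log-counts than the high-count gene, so its observations receive smaller weights in a subsequent weighted linear model fit.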
Pages: 17