Statistical methods for identifying differentially expressed genes in RNA-Seq experiments

Cited by: 33
Authors
Fang, Zhide [1]
Martin, Jeffrey [2, 3]
Wang, Zhong [2, 3, 4]
Affiliations
[1] Louisiana State Univ, Hlth Sci Ctr, Sch Publ Hlth, Biostat Program, New Orleans, LA 70112 USA
[2] Univ Calif Berkeley, Lawrence Berkeley Natl Lab, Genom Div, Berkeley, CA 94720 USA
[3] Joint Genome Inst, Dept Energy, Walnut Creek, CA 94598 USA
[4] DOE Joint Genome Inst, Walnut Creek, CA 94598 USA
Keywords
QUANTIFICATION; NORMALIZATION; POWERFUL; TESTS; SAGE;
DOI
10.1186/2045-3701-2-26
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
RNA sequencing (RNA-Seq) is rapidly replacing microarrays for profiling gene expression, with much improved accuracy and sensitivity. One of the most common questions in a typical gene profiling experiment is how to identify a set of transcripts that are differentially expressed between experimental conditions. Some of the statistical methods developed for microarray data analysis can be applied to RNA-Seq data with or without modifications. Recently, several additional methods have been developed specifically for RNA-Seq data sets. This review gives an in-depth account of these statistical methods, with the goal of providing a comprehensive guide for choosing appropriate metrics for RNA-Seq statistical analyses.
Pages: 8