A two-parameter generalized Poisson model to improve the analysis of RNA-seq data

Cited by: 93
Authors
Srivastava, Sudeep [1]
Chen, Liang [1]
Affiliations
[1] University of Southern California, Department of Biological Sciences, Los Angeles, CA 90089, USA
Funding
National Institutes of Health (USA)
Keywords
DIFFERENTIAL EXPRESSION; HUMAN TRANSCRIPTOME; GENOME; NORMALIZATION; ALIGNMENT;
DOI
10.1093/nar/gkq670
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Subject classification codes
071010; 081704
Abstract
Deep sequencing of RNAs (RNA-seq) has been a useful tool to characterize and quantify transcriptomes. However, there are significant challenges in the analysis of RNA-seq data, such as how to separate signals from sequencing bias and how to perform reasonable normalization. Here, we focus on a fundamental question in RNA-seq analysis: the distribution of the position-level read counts. Specifically, we propose a two-parameter generalized Poisson (GP) model for the position-level read counts. We show that the GP model fits the data much better than the traditional Poisson model. Based on the GP model, we can better estimate gene or exon expression, perform a more reasonable normalization across different samples, and improve the identification of differentially expressed genes and differentially spliced exons. The usefulness of the GP model is demonstrated by applications to multiple RNA-seq data sets.
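The abstract does not state the GP formula or the fitting procedure. As a rough illustration only, the sketch below assumes the standard Consul and Jain form of the generalized Poisson distribution, P(X = x) = theta * (theta + x * lam)^(x - 1) * exp(-theta - x * lam) / x!, which has mean theta / (1 - lam) and variance theta / (1 - lam)^3. The function names (gp_log_pmf, gp_moment_fit) are hypothetical, and the method-of-moments fit is a simplification chosen for brevity, not necessarily the estimator used in the paper.

```python
import math


def gp_log_pmf(x, theta, lam):
    """Log-probability of count x under GP(theta, lam).

    Assumes theta > 0 and 0 <= lam < 1 (lam = 0 reduces to the Poisson).
    """
    if x < 0:
        return float("-inf")
    return (
        math.log(theta)
        + (x - 1) * math.log(theta + x * lam)
        - theta
        - x * lam
        - math.lgamma(x + 1)  # log(x!)
    )


def gp_moment_fit(counts):
    """Method-of-moments estimate of (theta, lam) from position-level counts.

    Uses mean = theta / (1 - lam) and variance = theta / (1 - lam)**3.
    Assumption for this sketch: when the sample is not over-dispersed
    (variance <= mean), fall back to a plain Poisson fit with lam = 0.
    """
    n = len(counts)
    if n < 2:
        raise ValueError("need at least two counts")
    mean = sum(counts) / n
    if mean <= 0:
        raise ValueError("mean count must be positive")
    var = sum((c - mean) ** 2 for c in counts) / (n - 1)
    if var <= mean:
        return mean, 0.0
    ratio = math.sqrt(mean / var)  # equals 1 - lam
    return mean * ratio, 1.0 - ratio


if __name__ == "__main__":
    # Hypothetical read counts at the positions of one exon.
    counts = [0, 1, 2, 3, 10, 2, 0, 6]
    theta, lam = gp_moment_fit(counts)
    loglik = sum(gp_log_pmf(c, theta, lam) for c in counts)
    print(f"theta={theta:.3f} lam={lam:.3f} loglik={loglik:.3f}")
```

A positive lam indicates over-dispersion relative to the Poisson, which is the kind of departure the abstract says the second parameter is meant to capture.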
Pages: e170
Number of pages: 15