Moderated statistical tests for assessing differences in tag abundance

Cited by: 599
Authors
Robinson, Mark D.
Smyth, Gordon K. [1]
Affiliations
[1] Royal Melbourne Hosp, Walter & Eliza Hall Inst Med Res, Bioinformat Div, Parkville, Vic 3050, Australia
[2] Univ Melbourne, Dept Med Biol, Parkville, Vic 3010, Australia
Funding
UK Medical Research Council
DOI
10.1093/bioinformatics/btm453
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Digital gene expression (DGE) technologies measure gene expression by counting sequence tags. They are sensitive technologies for measuring gene expression on a genomic scale, without the need for prior knowledge of the genome sequence. As the cost of sequencing DNA decreases, the number of DGE datasets is expected to grow dramatically. Various tests of differential expression have been proposed for replicated DGE data using binomial, Poisson, negative binomial or pseudo-likelihood (PL) models for the counts, but none of these are usable when the number of replicates is very small.
Results: We develop tests using the negative binomial distribution to model overdispersion relative to the Poisson, and use conditional weighted likelihood to moderate the level of overdispersion across genes. Not only is our strategy applicable even with the smallest number of libraries, but it also proves to be more powerful than previous strategies when more libraries are available. The methodology is equally applicable to other counting technologies, such as proteomic spectral counts.
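Illustration (not part of the original record): the abstract refers to the negative binomial distribution as a model for overdispersion relative to the Poisson. The sketch below assumes the common mean-dispersion parameterization, in which a count with mean `mu` and dispersion `phi` has variance `mu + phi * mu**2`, so that `phi = 0` would recover the Poisson variance `mu`. The function names `nb_pmf` and `nb_moments` and the values of `mu` and `phi` are hypothetical, chosen only for the example. It does not implement the conditional weighted likelihood estimation described in the abstract; it only checks the variance relationship numerically.

```python
import math


def nb_pmf(k, mu, phi):
    """Negative binomial probability of count k, given mean mu and dispersion phi.

    Assumed parameterization: size r = 1/phi, success probability r / (r + mu).
    """
    r = 1.0 / phi
    log_p = (
        math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
        + r * math.log(r / (r + mu))
        + k * math.log(mu / (r + mu))
    )
    return math.exp(log_p)


def nb_moments(mu, phi, kmax=5000):
    """Mean and variance obtained by summing the pmf over counts 0..kmax-1."""
    probs = [nb_pmf(k, mu, phi) for k in range(kmax)]
    mean = sum(k * p for k, p in enumerate(probs))
    var = sum((k - mean) ** 2 * p for k, p in enumerate(probs))
    return mean, var


# Hypothetical tag with mean count 20 and dispersion 0.2.
mu, phi = 20.0, 0.2
mean, var = nb_moments(mu, phi)

poisson_var = mu               # variance a Poisson model would assume
nb_var = mu + phi * mu ** 2    # variance under the negative binomial model

print(f"numerical mean = {mean:.4f}, numerical variance = {var:.4f}")
print(f"Poisson variance = {poisson_var:.1f}, NB variance = {nb_var:.1f}")
```

Under these assumed values the negative binomial variance is several times the Poisson variance, which is the kind of extra between-library variability that a Poisson-based test would ignore.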
Pages: 2881-2887
Page count: 7