Accurate ranking of differentially expressed genes by a distribution-free shrinkage approach

Cited by: 131
Authors
Opgen-Rhein, Rainer [1]
Strimmer, Korbinian [1]
Affiliations
[1] Univ Munich, Dept Stat, D-80539 Munich, Germany
Source
Statistical Applications in Genetics and Molecular Biology | 2007 / Vol. 6
Keywords
high-dimensional case-control data; James-Stein shrinkage; limited-translation; quasi-empirical Bayes; regularized t statistic; variance shrinkage;
DOI
10.2202/1544-6115.1252
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular biology];
Subject classification code
071010 ; 081704 ;
Abstract
High-dimensional case-control analysis is encountered in many different settings in genomics. In order to rank genes accordingly, many different scores have been proposed, ranging from ad hoc modifications of the ordinary t statistic to complicated hierarchical Bayesian models. Here, we introduce the "shrinkage t" statistic that is based on a novel and model-free shrinkage estimate of the variance vector across genes. This is derived in a quasi-empirical Bayes setting. The new rank score is fully automatic and requires no specification of parameters or distributions. It is computationally inexpensive and can be written analytically in closed form. Using a series of synthetic and three real expression data sets, we studied the quality of gene rankings produced by the "shrinkage t" statistic. The new score consistently leads to highly accurate rankings for the complete range of investigated data sets and all considered scenarios for across-gene variance structures.
Pages: 20