Robust semiparametric mixing for detecting differentially expressed genes in microarray experiments

Cited by: 3
Authors
Alfo, Marco [1]
Farcomeni, Alessio [1]
Tardella, Luca [1]
Affiliations
[1] Univ Roma La Sapienza, Dipartimento Stat Probabil & Stat Applicate, I-00185 Rome, Italy
Keywords
microarray data; up-regulated genes; mixture models; counting distribution; false discovery rate
DOI
10.1016/j.csda.2006.08.009
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
An important goal of microarray studies is the detection of genes that show significant changes in observed expression when two or more classes of biological samples, such as treatment and control, are compared. Under the c-fold rule, a gene is declared differentially expressed if its average expression level varies by more than a constant factor c between treatment and control (typically c = 2). Although widely used, this simple rule is not completely convincing. By modeling this filter, a binary variable is defined at the gene × experiment level, allowing for a more powerful treatment of the corresponding information. A gene-specific random term is introduced to control for both dependence among genes and variability with respect to the c-fold threshold. Inference is carried out via a two-level finite mixture model under a likelihood approach. Parameter estimates are then also derived using the counting distribution under a Bayesian nonparametric approach, which makes it possible to keep an error rate for erroneous discoveries under control. The effectiveness of both proposed approaches is illustrated through a large-scale simulation study and a well-known benchmark data set. (C) 2006 Elsevier B.V. All rights reserved.
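As an illustration of the c-fold rule described in the abstract, the following is a minimal Python sketch and not the authors' implementation. It assumes raw-scale (not log-transformed), strictly positive expression values arranged as genes × experiments, with treatment and control paired within each experiment. The function names `c_fold_rule` and `c_fold_indicators` are hypothetical, and treating the binary variable as an up-regulation indicator is an assumption suggested only by the keyword "up-regulated genes".

```python
import numpy as np


def c_fold_rule(treatment, control, c=2.0):
    """Classic c-fold rule: flag a gene when its average expression
    differs by more than a factor c between treatment and control,
    in either direction. Inputs are genes x experiments arrays."""
    mean_t = np.asarray(treatment, dtype=float).mean(axis=1)
    mean_c = np.asarray(control, dtype=float).mean(axis=1)
    ratio = mean_t / mean_c
    return (ratio > c) | (ratio < 1.0 / c)


def c_fold_indicators(treatment, control, c=2.0):
    """Binary variable at the gene x experiment level (assumed form):
    1 where the treatment/control ratio in that experiment exceeds c."""
    t = np.asarray(treatment, dtype=float)
    k = np.asarray(control, dtype=float)
    return (t / k > c).astype(int)


# Toy data: 3 genes, 3 paired experiments (values are made up).
treatment = [[8, 9, 10], [4, 5, 3], [1, 1, 1]]
control = [[2, 4, 4], [4, 4, 4], [4, 4, 4]]

flags = c_fold_rule(treatment, control)            # one flag per gene
indicators = c_fold_indicators(treatment, control)  # one 0/1 per gene and experiment
counts = indicators.sum(axis=1)                     # exceedances per gene
```

The per-gene counts of exceedances are the kind of summary a mixture model over genes could take as input. How the paper actually builds and fits its two-level mixture is not specified in this record.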
Pages: 5253-5265
Number of pages: 13