Noise sampling method: an ANOVA approach allowing robust selection of differentially regulated genes measured by DNA microarrays

Cited by: 44
Authors
Draghici, S
Kulaeva, O
Hoff, B
Petrov, A
Shams, S
Tainsky, MA
Affiliations
[1] Wayne State Univ, Dept Comp Sci, Detroit, MI 48202 USA
[2] BioDiscovery Inc, Marina Del Rey, CA 90292 USA
[3] Wayne State Univ, Karmanos Canc Inst, Detroit, MI 48201 USA
DOI
10.1093/bioinformatics/btg165
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: A crucial step in microarray data analysis is the selection of subsets of interesting genes from the initial set of genes. In many cases, especially when comparing a specific condition to a reference, the genes of interest are those that are differentially expressed. Two common methods for gene selection are: (a) selection by fold difference (at least n-fold variation) and (b) selection by altered ratio (at least n standard deviations away from the mean ratio).
Results: The novel method proposed here is based on ANOVA and uses replicate spots to estimate an empirical distribution of the noise. The measured intensity range is divided into a number of intervals, and a noise distribution is constructed for each interval. Bootstrapping is used to map the desired confidence levels from the noise distribution of a given interval to the measured log ratios in that interval. When applied to individual arrays that have replicate spots, the method can also calculate an overall width of the noise distribution, which can be used as an indicator of array quality. We compared this method with the fold change and unusual ratio methods. We also discuss the relationship with an ANOVA model proposed by Churchill et al. In silico experiments were performed while controlling the degree of regulation as well as the amount of noise. These experiments show that the performance of the classical methods can be very unsatisfactory. We also compared the results of the 2-fold method with the results of the noise sampling method using pre- and post-immortalization cell lines derived from the MDAH041 fibroblasts hybridized on Affymetrix GeneChip arrays. The 2-fold method reported 198 genes as upregulated and 493 genes as downregulated. The noise sampling method reported 98 genes as upregulated and 240 genes as downregulated at the 99.99% confidence level. The methods agreed on 221 downregulated genes and 66 upregulated genes. Fourteen genes from the subset of genes reported by both methods were all confirmed by Q-RT-PCR. Alternative assays on various subsets of genes on which the two methods disagreed suggested that the noise sampling method is likely to provide fewer false positives.
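The contrast between a fixed fold threshold and intensity-dependent noise thresholds, as described in the abstract, can be illustrated with a small sketch. This is not the authors' implementation. It assumes the noise samples are already available as residuals (a replicate's log ratio minus that gene's mean log ratio), which skips the ANOVA fitting step entirely. It uses equal-width intensity bins and a simple bootstrap of the empirical quantiles. All function names, parameters, and the toy data are hypothetical.

```python
import math
import random


def fold_change_select(log_ratios, n_fold=2.0):
    """Classical selection: keep genes with |log2 ratio| >= log2(n_fold)."""
    threshold = math.log2(n_fold)
    return [i for i, m in enumerate(log_ratios) if abs(m) >= threshold]


def _bin_index(x, lo, hi, n_bins):
    """Map a log intensity to one of n_bins equal-width intervals."""
    if hi <= lo:
        return 0
    k = int((x - lo) / (hi - lo) * n_bins)
    return min(max(k, 0), n_bins - 1)


def _quantile(sorted_vals, q):
    """Linear-interpolation quantile of an already sorted list."""
    pos = q * (len(sorted_vals) - 1)
    low, high = math.floor(pos), math.ceil(pos)
    frac = pos - low
    return sorted_vals[low] * (1 - frac) + sorted_vals[high] * frac


def noise_thresholds(noise, n_bins=2, confidence=0.95, n_boot=200, seed=0):
    """noise: list of (log_intensity, residual) pairs.

    Returns (lo, hi, limits), where limits[k] is the bootstrapped
    (lower, upper) noise bound for interval k, or None if it is empty.
    """
    rng = random.Random(seed)
    lo = min(a for a, _ in noise)
    hi = max(a for a, _ in noise)
    bins = [[] for _ in range(n_bins)]
    for a, r in noise:
        bins[_bin_index(a, lo, hi, n_bins)].append(r)
    tail = (1.0 - confidence) / 2.0
    limits = []
    for residuals in bins:
        if not residuals:
            limits.append(None)
            continue
        lowers, uppers = [], []
        for _ in range(n_boot):
            # Resample the noise with replacement and record its tails.
            sample = sorted(rng.choices(residuals, k=len(residuals)))
            lowers.append(_quantile(sample, tail))
            uppers.append(_quantile(sample, 1.0 - tail))
        limits.append((sum(lowers) / n_boot, sum(uppers) / n_boot))
    return lo, hi, limits


def noise_sampling_select(genes, noise, **kwargs):
    """genes: list of (log_intensity, log_ratio) pairs.

    A gene is selected if its log ratio falls outside the noise bounds
    of the intensity interval it belongs to.
    """
    lo, hi, limits = noise_thresholds(noise, **kwargs)
    selected = []
    for i, (a, m) in enumerate(genes):
        bounds = limits[_bin_index(a, lo, hi, len(limits))]
        if bounds is not None and (m < bounds[0] or m > bounds[1]):
            selected.append(i)
    return selected


# Toy data (made up): wide noise at low intensity, narrow noise at high.
noise = [(6 + i % 3, -1.5 + 3.0 * i / 49) for i in range(50)]
noise += [(12 + i % 3, -0.2 + 0.4 * i / 49) for i in range(50)]
genes = [(7, 1.2), (13, 1.2), (13, 0.05), (13, 0.5)]

by_fold = fold_change_select([m for _, m in genes])
by_noise = noise_sampling_select(genes, noise, n_bins=2, confidence=0.95)
```

In this toy setup, the fixed 2-fold rule treats a log ratio of 1.2 the same way at every intensity, whereas the binned noise bounds accept it only where the replicate noise is narrow, and they can also flag a smaller change (0.5) that sits well outside the noise at high intensity.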
Pages: 1348-1359
Page count: 12
Related papers
44 in total
[1]   Identification of the SAAT gene involved in strawberry flavor biogenesis by use of DNA microarrays [J].
Aharoni, A ;
Keizer, LCP ;
Bouwmeester, HJ ;
Sun, ZK ;
Alvarez-Huerta, M ;
Verhoeven, HA ;
Blaas, J ;
van Houwelingen, AMML ;
De Vos, RCH ;
van der Voet, H ;
Jansen, RC ;
Guis, M ;
Mol, J ;
Davis, RW ;
Schena, M ;
van Tunen, AJ ;
O'Connell, AP .
PLANT CELL, 2000, 12 (05) :647-661
[2]  
BISCHOFF F, 1990, CANCER RES, V24, P7979
[3]   Calculation of the minimum number of replicate spots required for detection of significant gene expression fold change in microarray experiments [J].
Black, MA ;
Doerge, RW .
BIOINFORMATICS, 2002, 18 (12) :1609-1616
[4]   Gene expression data analysis [J].
Brazma, A ;
Vilo, J .
FEBS LETTERS, 2000, 480 (01) :17-24
[5]   Significance and statistical errors in the analysis of DNA microarray data [J].
Brody, JP ;
Williams, BA ;
Wold, BJ ;
Quake, SR .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2002, 99 (20) :12975-12978
[6]  
Chen Y, 1997, J Biomed Opt, V2, P364, DOI 10.1117/12.281504
[7]   Computational methods for the identification of differential and coordinated gene expression [J].
Claverie, JM .
HUMAN MOLECULAR GENETICS, 1999, 8 (10) :1821-1832
[8]  
DeRisi J, 1996, NAT GENET, V14, P457
[9]   Exploring the metabolic and genetic control of gene expression on a genomic scale [J].
DeRisi, JL ;
Iyer, VR ;
Brown, PO .
SCIENCE, 1997, 278 (5338) :680-686
[10]   Statistical intelligence: effective analysis of high-density microarray data [J].
Draghici, S .
DRUG DISCOVERY TODAY, 2002, 7 (11) :S55-S63