Evaluation of microarray data normalization procedures using spike-in experiments

Cited by: 14
Authors
Ryden, Patrik [1]
Andersson, Henrik
Landfors, Mattias
Naslund, Linda
Hartmanova, Blanka
Noppa, Laila
Sjostedt, Anders
Affiliations
[1] Umea Univ, Dept Clin Microbiol, Div Clin Bacteriol, SE-90187 Umea, Sweden
[2] Umea Univ, Dept Math & Math Stat, SE-90187 Umea, Sweden
[3] Univ Def, Proteome Ctr Study Intracellular Parasitism Bacte, Fac Mil Hlth Sci, Hradec Kralove 50001, Czech Republic
[4] Swedish Def Res Agcy, Div NBC Def, Dept Med Countermeasures, SE-90182 Umea, Sweden
DOI
10.1186/1471-2105-7-300
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Subject classification code
071010 ; 081704 ;
Abstract
Background: Recently, a large number of methods for the analysis of microarray data have been proposed, but there are few comparisons of their relative performance. By using so-called spike-in experiments, it is possible to characterize the analyzed data and thereby enable comparisons of different analysis methods.
Results: A spike-in experiment using eight in-house produced arrays was used to evaluate established and novel methods for filtration, background adjustment, scanning, channel adjustment, and censoring. The S-plus package EDMA, a stand-alone tool providing characterization of analyzed cDNA-microarray data obtained from spike-in experiments, was developed and used to evaluate 252 normalization methods. For all analyses, the sensitivities at low false positive rates were observed together with estimates of the overall bias and the standard deviation. In general, there was a trade-off between the ability of the analyses to identify differentially expressed genes (i.e. the analyses' sensitivities) and their ability to provide unbiased estimators of the desired ratios. Virtually all analyses underestimated the magnitude of the regulations; often less than 50% of the true regulations were observed. Moreover, the bias depended on the underlying mRNA concentration; low concentrations resulted in high bias. Many of the analyses had relatively low sensitivities, but analyses that used either the constrained model (i.e. a procedure that combines data from several scans) or partial filtration (a novel method for treating data from so-called not-found spots) had, with few exceptions, high sensitivities. These methods gave considerably higher sensitivities than some commonly used analysis methods.
Conclusion: The use of spike-in experiments is a powerful approach for evaluating microarray preprocessing procedures. Analyzed data are characterized by properties of the observed log-ratios and the analysis' ability to detect differentially expressed genes. If bias is not a major problem, we recommend the use of either the CM-procedure or partial filtration.
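The three quantities named in the abstract (bias, standard deviation, and sensitivity at a low false positive rate) can be illustrated with a minimal Python sketch. This is not the EDMA package and not the authors' procedure: the function name `evaluate_spike_in`, its arguments, and the use of the absolute log-ratio as the ranking statistic are all assumptions made for illustration. It only assumes that a spike-in experiment supplies a known true log-ratio for every gene.

```python
import math
import statistics


def evaluate_spike_in(observed, true, fpr=0.05):
    """Summarize normalized spike-in data (hypothetical helper, not EDMA).

    observed: observed log2-ratios, one per gene
    true:     known log2-ratios from the spike-in design (0 = not regulated)
    fpr:      allowed fraction of false positives among non-regulated genes
    """
    regulated = [(o, t) for o, t in zip(observed, true) if t != 0]
    null = [o for o, t in zip(observed, true) if t == 0]

    # Bias: mean signed error in the direction of the true regulation.
    # A negative value means the regulation is underestimated on average.
    bias = statistics.mean(
        math.copysign(1.0, t) * (o - t) for o, t in regulated
    )

    # Spread of the observed log-ratios for genes that should be unchanged.
    sd = statistics.stdev(null)

    # Assumption: genes are ranked by |log-ratio|. The threshold is set so
    # that at most floor(fpr * n_null) non-regulated genes exceed it.
    allowed = math.floor(fpr * len(null) + 1e-9)
    null_abs = sorted((abs(o) for o in null), reverse=True)
    threshold = null_abs[allowed]

    # Sensitivity: fraction of truly regulated genes called at that threshold.
    hits = sum(1 for o, _ in regulated if abs(o) > threshold)
    sensitivity = hits / len(regulated)

    return {"bias": bias, "sd": sd, "sensitivity": sensitivity}


if __name__ == "__main__":
    # Made-up illustration data, not taken from the paper.
    obs = [1.0, 0.8, -1.2, 0.1, 0.1, -0.2, 0.05, 0.3, -0.1, 0.0]
    tru = [2.0, 2.0, -2.0, 2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
    print(evaluate_spike_in(obs, tru, fpr=0.2))
```

A negative `bias` together with a high `sensitivity` would correspond to the trade-off described in the abstract: regulated genes are detected, but the size of their regulation is underestimated.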
Pages: 17