Microarray background correction: maximum likelihood estimation for the normal-exponential convolution

Cited by: 144
Authors
Silver, Jeremy D. [1 ,2 ]
Ritchie, Matthew E. [3 ]
Smyth, Gordon K. [1 ]
Affiliations
[1] Walter & Eliza Hall Inst Med Res, Bioinformat Div, Parkville, Vic 3050, Australia
[2] Univ Copenhagen, Dept Biostat, DK-1014 Copenhagen K, Denmark
[3] Univ Cambridge, Dept Oncol, Cambridge CB2 0RE, England
Funding
Medical Research Council (UK); National Health and Medical Research Council (Australia)
Keywords
2-color microarray; Background correction; Maximum likelihood; Nelder-Mead algorithm; Newton-Raphson algorithm; Normal-exponential convolution; GENE-EXPRESSION;
DOI
10.1093/biostatistics/kxn042
Chinese Library Classification number
Q [Biological sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
Background correction is an important preprocessing step for microarray data that attempts to adjust the data for the ambient intensity surrounding each feature. The "normexp" method models the observed pixel intensities as the sum of 2 random variables, one normally distributed and the other exponentially distributed, representing background noise and signal, respectively. Using a saddle-point approximation, Ritchie and others (2007) found normexp to be the best background correction method for 2-color microarray data. This article develops the normexp method further by improving the estimation of the parameters. A complete mathematical development is given of the normexp model and the associated saddle-point approximation. Some subtle numerical programming issues are solved which caused the original normexp method to fail occasionally when applied to unusual data sets. A practical and reliable algorithm is developed for exact maximum likelihood estimation (MLE) using high-quality optimization software and using the saddle-point estimates as starting values. "MLE" is shown to outperform heuristic estimators proposed by other authors, both in terms of estimation accuracy and in terms of performance on real data. The saddle-point approximation is an adequate replacement in most practical situations. The performance of normexp for assessing differential expression is improved by adding a small offset to the corrected intensities.
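The model described in the abstract can be written down in a few lines. The following is a minimal sketch, not the authors' implementation: it assumes the background is N(mu, sigma^2) and the signal is exponential with mean alpha, and the symbols mu, sigma, alpha and all function names are this sketch's own notation. It shows the log-density of the normal-exponential convolution, the log-likelihood that a maximum likelihood routine would maximize, and the corrected intensity taken as the conditional expectation of the signal given the observation, with an optional offset. The optimization step itself and the saddle-point approximation are omitted.

```python
import math

SQRT2 = math.sqrt(2.0)


def normexp_logpdf(x, mu, sigma, alpha):
    """Log-density of X = B + S, with B ~ N(mu, sigma^2) and S ~ Exp(mean alpha).

    f(x) = (1/alpha) * exp(sigma^2 / (2 alpha^2) - (x - mu) / alpha)
           * Phi((x - mu - sigma^2 / alpha) / sigma)
    """
    z = (x - mu - sigma ** 2 / alpha) / sigma
    # Phi(z) via erfc. This underflows for z below about -37, so a
    # production version would need a tail approximation (assumption:
    # moderate inputs only in this sketch).
    log_Phi = math.log(0.5 * math.erfc(-z / SQRT2))
    return (-math.log(alpha) + sigma ** 2 / (2.0 * alpha ** 2)
            - (x - mu) / alpha + log_Phi)


def normexp_loglik(xs, mu, sigma, alpha):
    """Log-likelihood of independent intensities xs; MLE maximizes this."""
    return sum(normexp_logpdf(x, mu, sigma, alpha) for x in xs)


def normexp_signal(x, mu, sigma, alpha, offset=0.0):
    """Corrected intensity E[S | X = x], plus an optional offset.

    Given X = x, S is a normal with mean m = x - mu - sigma^2/alpha and
    standard deviation sigma, truncated to S > 0.
    """
    m = x - mu - sigma ** 2 / alpha
    z = m / sigma
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    Phi = 0.5 * math.erfc(-z / SQRT2)
    return m + sigma * phi / Phi + offset


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only for illustration.
    mu, sigma, alpha = 100.0, 10.0, 50.0
    for x in (50.0, 100.0, 200.0, 1000.0):
        print(x, normexp_signal(x, mu, sigma, alpha, offset=50.0))
```

The corrected value is always positive, even when the observed intensity lies below the estimated background mean, which is what allows log-ratios to be formed for every spot. Adding the offset moves the corrected intensities away from zero before they are log-transformed.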
Pages: 352-363
Number of pages: 12
Related papers
19 entries in total
[1]  
BARNDORFFNIELSE.OE, 1981, ASYMPTOTIC TECHNIQUE
[2]  
BOLSTAD B, 2004, THESIS U CALIFORNIA
[3]   ROBUST LOCALLY WEIGHTED REGRESSION AND SMOOTHING SCATTERPLOTS [J].
CLEVELAND, WS .
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 1979, 74 (368) :829-836
[4]   Enhanced identification and biological validation of differential gene expression via Illumina whole-genome expression arrays through the use of the model-based background correction methodology [J].
Ding, Liang-Hao ;
Xie, Yang ;
Park, Seongmi ;
Xiao, Guanghua ;
Story, Michael D. .
NUCLEIC ACIDS RESEARCH, 2008, 36 (10)
[5]   affy - analysis of Affymetrix GeneChip data at the probe level [J].
Gautier, L ;
Cope, L ;
Bolstad, BM ;
Irizarry, RA .
BIOINFORMATICS, 2004, 20 (03) :307-315
[6]  
Gay D.M., 1990, Computing Science Technical Report, V153, P1
[7]   COMPUTING OPTIMAL LOCALLY CONSTRAINED STEPS [J].
GAY, DM .
SIAM JOURNAL ON SCIENTIFIC AND STATISTICAL COMPUTING, 1981, 2 (02) :186-197
[8]   SUBROUTINES FOR UNCONSTRAINED MINIMIZATION USING A MODEL TRUST-REGION APPROACH [J].
GAY, DM .
ACM TRANSACTIONS ON MATHEMATICAL SOFTWARE, 1983, 9 (04) :503-524
[9]   Statistical analysis of an RNA titration series evaluates microarray precision and sensitivity on a whole-array basis [J].
Holloway, Andrew J. ;
Oshlack, Alicia ;
Diyagama, Dileepa S. ;
Bowtell, David D. L. ;
Smyth, Gordon K. .
BMC BIOINFORMATICS, 2006, 7 (1)
[10]   Exploration, normalization, and summaries of high density oligonucleotide array probe level data [J].
Irizarry, RA ;
Hobbs, B ;
Collin, F ;
Beazer-Barclay, YD ;
Antonellis, KJ ;
Scherf, U ;
Speed, TP .
BIOSTATISTICS, 2003, 4 (02) :249-264