Orthogonal projections to latent structures as a strategy for microarray data normalization

Cited by: 58
Authors
Bylesjo, Max [1]
Eriksson, Daniel
Sjodin, Andreas
Jansson, Stefan
Moritz, Thomas
Trygg, Johan
Affiliations
[1] Umea Univ, Dept Chem, Res Grp Chemometr, S-90187 Umea, Sweden
[2] Swedish Univ Agr Sci, Umea Plant Sci Ctr, Dept Forest Genet & Plant Physiol, S-90183 Umea, Sweden
[3] Umea Univ, Umea Plant Sci Ctr, Dept Plant Physiol, S-90187 Umea, Sweden
DOI
10.1186/1471-2105-8-207
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Background: During generation of microarray data, various forms of systematic biases are frequently introduced, which limit the accuracy and precision of the results. In order to properly estimate biological effects, these biases must be identified and discarded.
Results: We introduce a normalization strategy for multi-channel microarray data based on orthogonal projections to latent structures (OPLS), a multivariate regression method. The effect of applying the normalization methodology on single-channel Affymetrix data as well as dual-channel cDNA data is illustrated. We provide a parallel comparison to a wide range of commonly employed normalization methods with diverse properties and strengths based on sensitivity and specificity from external (spike-in) controls. On the illustrated data sets, the OPLS normalization strategy exhibits leading average true negative and true positive rates in comparison to other evaluated methods.
Conclusion: The OPLS methodology identifies joint variation within biological samples to enable the removal of sources of variation that are non-correlated (orthogonal) to the within-sample variation. This ensures that structured variation related to the underlying biological samples is separated from the remaining, bias-related sources of systematic variation. As a consequence, the methodology does not require any explicit knowledge regarding the presence or characteristics of certain biases. Furthermore, there is no underlying assumption that the majority of elements should be non-differentially expressed, making it applicable to specialized boutique arrays.
Pages: 10
Related papers
34 in total
[1]  
[Anonymous], 2003, STAT APPL GENET MOL
[2]   Global analysis of carbohydrate utilization by Lactobacillus acidophilus using cDNA microarrays [J].
Barrangou, R ;
Azcarate-Peril, MA ;
Duong, T ;
Conners, SB ;
Kelly, RM ;
Klaenhammer, TR .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2006, 103 (10) :3816-3821
[3]   Controlling the false discovery rate - a practical and powerful approach to multiple testing [J].
Benjamini, Y ;
Hochberg, Y .
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 1995, 57 (01) :289-300
[4]   A comparison of normalization methods for high density oligonucleotide array data based on variance and bias [J].
Bolstad, BM ;
Irizarry, RA ;
Åstrand, M ;
Speed, TP .
BIOINFORMATICS, 2003, 19 (02) :185-193
[5]   OPLS discriminant analysis: combining the strengths of PLS-DA and SIMCA classification [J].
Bylesjo, Max ;
Rantalainen, Mattias ;
Cloarec, Olivier ;
Nicholson, Jeremy K. ;
Holmes, Elaine ;
Trygg, Johan .
JOURNAL OF CHEMOMETRICS, 2006, 20 (8-10) :341-351
[6]   Fundamentals of experimental design for cDNA microarrays [J].
Churchill, GA .
NATURE GENETICS, 2002, 32 (Suppl 4) :490-495
[7]   Triple-target microarray experiments: a novel experimental strategy [J].
Forster, T ;
Costa, Y ;
Roy, D ;
Cooke, HJ ;
Maratou, K .
BMC GENOMICS, 2004, 5 (1)
[8]   Model selection and efficiency testing for normalization of cDNA microarray data [J].
Futschik, M ;
Crompton, T .
GENOME BIOLOGY, 2004, 5 (08)
[9]   Three color cDNA microarrays: quantitative assessment through the use of fluorescein-labeled probes [J].
Hessner, MJ ;
Wang, XJ ;
Hulse, K ;
Meyer, L ;
Wu, Y ;
Nye, S ;
Guo, SW ;
Ghosh, S .
NUCLEIC ACIDS RESEARCH, 2003, 31 (04) :e14
[10]  
Huber Wolfgang, 2002, Bioinformatics, V18 Suppl 1, pS96