Characterizing dye bias in microarray experiments

Cited by: 46

Authors:

Dobbin, KK [1]

Kawasaki, ES

Petersen, DW

Simon, RM

Affiliations:

[1] NCI, Biometr Res Branch, NIH, Bethesda, MD 20892 USA

[2] NCI, Ctr Adv Technol, NIH, Bethesda, MD 20892 USA

Source:

BIOINFORMATICS | 2005 / Vol. 21 / Issue 10

Keywords:

DOI:

10.1093/bioinformatics/bti378

Chinese Library Classification (CLC) number:

Q5 [Biochemistry];

Discipline classification code:

071010 ; 081704 ;

Abstract:

Motivation: Spot intensity serves as a proxy for gene expression in dual-label microarray experiments. Dye bias is defined as an intensity difference between samples labeled with different dyes attributable to the dyes instead of the gene expression in the samples. Dye bias that is not removed by array normalization can introduce bias into comparisons between samples of interest. But if the bias is consistent across samples for the same gene, it can be corrected by proper experimental design and analysis. If the dye bias is not consistent across samples for the same gene, but is different for different samples, then removing the bias becomes more problematic, perhaps indicating a technical limitation to the ability of fluorescent signals to accurately represent gene expression. Thus, it is important to characterize dye bias to determine: (1) whether it will be removed for all genes by array normalization, (2) whether it will not be removed by normalization but can be removed by proper experimental design and analysis and (3) whether dye bias correction is more problematic than either of these and is not easily removable.

Results: We analyzed two large (each > 27 arrays) tissue culture experiments with extensive dye swap arrays to better characterize dye bias. Indirect, amino-allyl labeling was used in both experiments. We found that post-normalization dye bias that is consistent across samples does appear to exist for many genes, and that controlling and correcting for this type of dye bias in design and analysis is advisable. The extent of this type of dye bias remained unchanged under a wide range of normalization methods (median-centering, various loess normalizations) and statistical analysis techniques (parametric, rank based, permutation based, etc.). We also found dye bias related to the individual samples for a much smaller subset of genes. But these sample-specific dye biases appeared to have minimal impact on estimated gene-expression differences between the cell lines.
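The abstract's point that a gene-specific dye bias which is consistent across samples can be handled by design and analysis can be illustrated with a dye-swap pair. The sketch below is only an illustration under a simple additive model on the log-ratio scale, which is an assumption and not necessarily the model fitted in the paper. The function name `dye_swap_estimates` and all numeric values are hypothetical.

```python
import numpy as np

def dye_swap_estimates(m_forward, m_reversed):
    """Split dye-swap log-ratios into expression difference and dye bias.

    Assumed model (illustrative only), per gene g:
      forward array:  M_fwd = (a_g - b_g) + d_g
      swapped array:  M_rev = (b_g - a_g) + d_g
    where a_g - b_g is the true log expression difference between the
    two samples and d_g is a gene-specific dye bias that is the same on
    both arrays.
    """
    m_forward = np.asarray(m_forward, dtype=float)
    m_reversed = np.asarray(m_reversed, dtype=float)
    # The dye bias has the same sign on both arrays, so it cancels
    # in the half-difference.
    expression_diff = (m_forward - m_reversed) / 2.0
    # The expression difference flips sign with the swap, so it
    # cancels in the half-sum, leaving the dye bias.
    dye_bias = (m_forward + m_reversed) / 2.0
    return expression_diff, dye_bias

# Hypothetical noise-free values for three genes.
true_diff = np.array([1.0, 0.0, -2.0])
gene_dye_bias = np.array([0.5, -0.25, 0.0])

m_forward = true_diff + gene_dye_bias
m_reversed = -true_diff + gene_dye_bias

est_diff, est_bias = dye_swap_estimates(m_forward, m_reversed)
```

Under this assumed model the consistent bias cancels exactly. A bias that differed between the two samples would not cancel this way, which matches the abstract's description of that case as more problematic.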


Pages: 2430-2437

Page count: 8
