Which is better for cDNA-microarray-based classification: ratios or direct intensities

Cited by: 19
Authors
Attoor, S
Dougherty, ER [1]
Chen, Y
Bittner, ML
Trent, JM
Affiliations
[1] Texas A&M Univ, Dept Elect Engn, College Stn, TX 77843 USA
[2] Univ Texas, MD Anderson Canc Ctr, Dept Pathol, Houston, TX 77030 USA
[3] NHGRI, NIH, Bethesda, MD 20892 USA
[4] Translat Genom Res Inst, Phoenix, AZ 85004 USA
DOI
10.1093/bioinformatics/bth272
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010; 081704
Abstract
Motivation: There are two general methods for making gene-expression microarrays: one is to hybridize a single test set of labeled targets to the probe, and measure the background-subtracted intensity at each probe site; the other is to hybridize both a test and a reference set of differentially labeled targets to a single detector array, and measure the ratio of the background-subtracted intensities at each probe site. Which method is better depends on the variability in the cell system and the random factors resulting from the microarray technology. It also depends on the purpose for which the microarray is being used. Classification is a fundamental application and it is the one considered here.
Results: This paper describes a model-based simulation paradigm that compares the classification accuracy provided by these methods over a variety of noise types and presents the results of a study modeled on noise typical of cDNA microarray data. The model consists of four parts: (1) the measurement equation for genes in the reference state; (2) the measurement equation for genes in the test state; (3) the ratio and normalization procedure for a dual-channel system; and (4) the intensity and normalization procedure for a single-channel system. In the reference state, the mean intensities are modeled as a shifted exponential distribution, and the intensity for a particular gene is modeled via a normal distribution, Normal(I, alpha I), about its mean intensity I, with alpha being the coefficient of variation of the cell system. In the test state, some genes have their intensities up-regulated by a random factor. The model includes a number of random factors affecting intensity measurement: deposition gain d, labeling gain, and post-image-processing residual noise.
The key conclusion resulting from the study is that the coefficient of variation governing the randomness of the intensities and the deposition gain are the most important factors for determining whether a single-channel or dual-channel system provides superior classification, and the decision region in the alpha-d plane is approximately linear.
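The model outlined in the abstract can be illustrated with a rough sketch. This is not the authors' implementation: the function names, every default parameter value, the uniform up-regulation factor, the Gaussian form of the gains, and the choice to let both channels of a spot share one deposition gain are all assumptions made for illustration, and the normalization steps (parts 3 and 4 of the model) are left out.

```python
import random

def simulate_gene(rng, alpha, shift=100.0, scale=1000.0, p_up=0.1, max_fold=3.0):
    """Draw (reference, test, fold) for one gene. All defaults are illustrative."""
    # Reference-state mean intensity: shifted exponential (shift and scale assumed).
    mean_ref = shift + rng.expovariate(1.0 / scale)
    # Test state: a fraction of genes is up-regulated by a random factor (assumed uniform).
    fold = rng.uniform(1.5, max_fold) if rng.random() < p_up else 1.0
    mean_test = fold * mean_ref
    # Cell-system variability: Normal(I, alpha * I) about each mean intensity I.
    ref = rng.gauss(mean_ref, alpha * mean_ref)
    test = rng.gauss(mean_test, alpha * mean_test)
    return ref, test, fold

def measure(rng, ref, test, d_sd, label_sd=0.05, resid_sd=10.0):
    """Return (dual-channel ratio, single-channel intensity) for one spot."""
    # Assumption: one deposition gain per spot, shared by both channels.
    dep = max(rng.gauss(1.0, d_sd), 0.05)
    # Assumption: independent labeling gain per channel.
    gain_test = max(rng.gauss(1.0, label_sd), 0.05)
    gain_ref = max(rng.gauss(1.0, label_sd), 0.05)
    # Additive residual noise; clamp at 1.0 so the ratio stays positive and finite.
    test_meas = max(dep * gain_test * test + rng.gauss(0.0, resid_sd), 1.0)
    ref_meas = max(dep * gain_ref * ref + rng.gauss(0.0, resid_sd), 1.0)
    return test_meas / ref_meas, test_meas

if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(5):
        ref, test, fold = simulate_gene(rng, alpha=0.1)
        ratio, intensity = measure(rng, ref, test, d_sd=0.2)
        print(f"fold={fold:.2f} ratio={ratio:.2f} intensity={intensity:.1f}")
```

Under these assumptions the shared deposition gain cancels in the ratio but not in the single-channel intensity, while the ratio picks up the reference channel's own cell-system variability. That trade-off is one plausible reading of why alpha and d drive the comparison.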
Pages: 2513-2520
Page count: 8