Characterization of variability in large-scale gene expression data: Implications for study design

Cited by: 147
Authors
Novak, JP
Sladek, R
Hudson, TJ
Affiliations
[1] McGill Univ, Montreal Genome Ctr, Montreal, PQ H3G 1A4, Canada
[2] MIT, Whitehead Inst, Ctr Genome Res, Cambridge, MA 02139 USA
Keywords
oligonucleotide microarrays; gene expression microarray analysis
DOI
10.1006/geno.2001.6675
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Discipline classification codes
071005; 0836; 090102; 100705
Abstract
Large-scale gene expression measurement techniques provide a unique opportunity to gain insight into biological processes under normal and pathological conditions. To interpret the changes in expression profiles for thousands of genes, we face the nontrivial problem of understanding the significance of these changes. In practice, the sources of background variability in expression data can be divided into three categories: technical, physiological, and sampling. To assess the relative importance of these sources of background variation, we generated replicate gene expression profiles on high-density Affymetrix GeneChip oligonucleotide arrays, using either identical RNA samples or RNA samples obtained under similar biological states. We derived a novel measure of dispersion in two-way comparisons, using a linear characteristic function. When comparing expression profiles from replicate tests using the same RNA sample (a test for technical variability), we observed a level of dispersion similar to the pattern obtained with RNA samples from replicate cultures of the same cell line (a test for physiological variability). On the other hand, a higher level of dispersion was observed when tissue samples of different animals were compared (an example of sampling variability). This implies that, in experiments in which samples from different subjects are used, the variation induced by the stimulus may be masked by non-stimuli-related differences in the subjects' biological state. These analyses underscore the need for replica experiments to reliably interpret large-scale expression data sets, even with simple microarray experiments.
Pages: 104-113
Page count: 10
Related papers
17 in total
[1]   Fluorescent cDNA microarray hybridization reveals complexity and heterogeneity of cellular genotoxic stress responses [J].
Amundson, SA ;
Bittner, M ;
Chen, YD ;
Trent, J ;
Meltzer, P ;
Fornace, AJ .
ONCOGENE, 1999, 18 (24) :3666-3672
[2]  
[Anonymous], BIOMETRY
[3]   A Bayesian framework for the analysis of microarray expression data: regularized t-test and statistical inferences of gene changes [J].
Baldi, P ;
Long, AD .
BIOINFORMATICS, 2001, 17 (06) :509-519
[4]  
Carlisle AJ, 2000, MOL CARCINOGEN, V28, P12, DOI 10.1002/(SICI)1098-2744(200005)28:1<12::AID-MC3>3.0.CO;2-Q
[6]  
Chen Y, 1997, J Biomed Opt, V2, P364, DOI 10.1117/12.281504
[7]   Computational methods for the identification of differential and coordinated gene expression [J].
Claverie, JM .
HUMAN MOLECULAR GENETICS, 1999, 8 (10) :1821-1832
[8]   Expression analysis with oligonucleotide microarrays reveals that MYC regulates genes involved in growth, cell cycle, signaling, and adhesion [J].
Coller, HA ;
Grandori, C ;
Tamayo, P ;
Colbert, T ;
Lander, ES ;
Eisenman, RN ;
Golub, TR .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2000, 97 (07) :3260-3265
[9]   Functional discovery via a compendium of expression profiles [J].
Hughes, TR ;
Marton, MJ ;
Jones, AR ;
Roberts, CJ ;
Stoughton, R ;
Armour, CD ;
Bennett, HA ;
Coffey, E ;
Dai, HY ;
He, YDD ;
Kidd, MJ ;
King, AM ;
Meyer, MR ;
Slade, D ;
Lum, PY ;
Stepaniants, SB ;
Shoemaker, DD ;
Gachotte, D ;
Chakraburtty, K ;
Simon, J ;
Bard, M ;
Friend, SH .
CELL, 2000, 102 (01) :109-126
[10]   Importance of replication in microarray gene expression studies: Statistical methods and evidence from repetitive cDNA hybridizations [J].
Lee, MLT ;
Kuo, FC ;
Whitmore, GA ;
Sklar, J .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2000, 97 (18) :9834-9839