Multivariate exploratory tools for microarray data analysis

Cited by: 36
Authors
Szabo, A
Boucher, K
Jones, D
Tsodikov, AD
Klebanov, LB
Yakovlev, AY
Affiliations
[1] Univ Utah, Huntsman Canc Inst, Salt Lake City, UT 84112 USA
[2] Univ Utah, Dept Oncol Sci, Salt Lake City, UT 84112 USA
[3] Charles Univ, Dept Probabil & Stat, CZ-18675 Prague, Czech Republic
[4] Univ Rochester, Dept Biostat & Computat Biol, Rochester, NY 14642 USA
Keywords
cross-validation; differential expression; permutation test; probability distance; random search; sets of genes
DOI
10.1093/biostatistics/4.4.555
Chinese Library Classification
Q [Biological sciences]
Discipline classification codes
07; 0710; 09
Abstract
The ultimate success of microarray technology in basic and applied biological sciences depends critically on the development of statistical methods for gene expression data analysis. The most widely used tests for differential expression of genes are essentially univariate. Such tests disregard the multidimensional structure of microarray data. Multivariate methods are needed to utilize the information hidden in gene interactions and hence to provide more powerful and biologically meaningful methods for finding subsets of differentially expressed genes. The objective of this paper is to develop methods of multidimensional search for biologically significant genes, considering expression signals as mutually dependent random variables. To attain these ends, we consider the utility of a pertinent distance between random vectors and its empirical counterpart constructed from gene expression data. The distance furnishes exploratory procedures aimed at finding a target subset of differentially expressed genes. To determine the size of the target subset, we resort to successive elimination of smaller subsets resulting from each step of a random search algorithm based on maximization of the proposed distance. Different stopping rules associated with this procedure are evaluated. The usefulness of the proposed approach is illustrated with an application to the analysis of two sets of gene expression data.
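The search idea described in the abstract can be illustrated with a small sketch. The abstract does not name the distance, so this example assumes an energy-type distance between two samples, 2E|X-Y| - E|X-X'| - E|Y-Y'| with the Euclidean norm, and a simple random search that swaps one gene at a time and keeps the swap if the distance grows. The function names, the swap rule, and the iteration count are hypothetical choices for illustration, not the authors' implementation.

```python
import numpy as np

def pairwise_mean_dist(a, b):
    """Mean Euclidean distance between all rows of a (n, k) and b (m, k)."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2)).mean()

def energy_distance(x, y):
    """Assumed empirical distance: 2E|X-Y| - E|X-X'| - E|Y-Y'|."""
    return (2.0 * pairwise_mean_dist(x, y)
            - pairwise_mean_dist(x, x)
            - pairwise_mean_dist(y, y))

def random_search(x, y, k, n_iter=2000, seed=0):
    """Look for a k-gene subset that maximizes the distance between groups.

    x, y: arrays of shape (samples, genes), one per group.
    Each step swaps one gene in the current subset for a random outside
    gene and keeps the swap only if the distance increases (hypothetical
    rule, chosen for simplicity).
    """
    rng = np.random.default_rng(seed)
    n_genes = x.shape[1]
    subset = rng.choice(n_genes, size=k, replace=False)
    best = energy_distance(x[:, subset], y[:, subset])
    for _ in range(n_iter):
        cand = subset.copy()
        outside = np.setdiff1d(np.arange(n_genes), subset)
        cand[rng.integers(k)] = rng.choice(outside)
        d = energy_distance(x[:, cand], y[:, cand])
        if d > best:
            subset, best = cand, d
    return np.sort(subset), best

if __name__ == "__main__":
    # Synthetic data: 20 genes, the first 3 shifted in the second group.
    rng = np.random.default_rng(1)
    x = rng.normal(size=(30, 20))
    y = rng.normal(size=(30, 20))
    y[:, :3] += 4.0
    print(random_search(x, y, k=3))
```

The sketch fixes the subset size k in advance. The successive elimination of subsets and the stopping rules mentioned in the abstract, as well as any permutation-based significance assessment, are not covered here.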
Pages: 555-567
Number of pages: 13