Hotelling's T² multivariate profiling for detecting differential expression in microarrays

Cited by: 75
Authors
Lu, Y
Liu, PY
Xiao, P
Deng, HW [1]
Affiliations
[1] Xi An Jiao Tong Univ, Minist Educ, Key Lab Biomed Informat Engn, Xian 710049, Peoples R China
[2] Xi An Jiao Tong Univ, Sch Life Sci & Technol, Inst Mol Genet, Xian 710049, Peoples R China
[3] Hunan Normal Univ, Coll Life Sci, Lab Mol & Stat Genet, Changsha 410081, Hunan, Peoples R China
[4] Creighton Univ, Osteoporosis Res Ctr, Omaha, NE 68131 USA
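Funding
US National Institutes of Health; National Natural Science Foundation of China
Keywords
DOI
10.1093/bioinformatics/bti496
Chinese Library Classification number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract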
The most widely used statistical methods for finding differentially expressed genes (DEGs) are essentially univariate. In this study, we present a new T² statistic for analyzing microarray data. We implemented our method using a multiple forward search (MFS) algorithm that is designed for selecting a subset of feature vectors in high-dimensional microarray datasets. The proposed T² statistic is a corollary to that originally developed for multivariate analyses and possesses two prominent statistical properties. First, our method takes into account the multidimensional structure of microarray data. Using the information hidden in gene interactions allows for finding genes whose differential expression is not marginally detectable by univariate testing methods. Second, the statistic has a close relationship to discriminant analyses for classification of gene expression patterns. Our search algorithm sequentially maximizes the gene expression difference/distance between two groups of genes. Including such a set of DEGs among the initial feature variables may increase the power of classification rules. We validated our method using a spike-in HGU95 dataset from Affymetrix. The utility of the new method was demonstrated by application to the analyses of gene expression patterns in human liver cancers and breast cancers. Extensive bioinformatics analyses and cross-validation of DEGs identified in the application datasets showed the significant advantages of our new algorithm.
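The abstract does not give the form of the statistic or the details of the MFS algorithm, so the following is only a minimal sketch under stated assumptions: it uses the textbook two-sample Hotelling's T² with a pooled covariance matrix, and a single greedy forward pass that adds, at each step, the gene that most increases T². The function names `hotelling_t2` and `forward_search`, the use of a pseudo-inverse, and the `max_genes` stopping rule are illustrative choices, not the authors' implementation.

```python
import numpy as np


def hotelling_t2(x, y):
    """Textbook two-sample Hotelling's T^2.

    x, y: arrays of shape (n1, p) and (n2, p); rows are samples, columns are genes.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n1, n2 = x.shape[0], y.shape[0]
    diff = x.mean(axis=0) - y.mean(axis=0)
    # Pooled within-group covariance; atleast_2d handles the single-gene case.
    s1 = np.atleast_2d(np.cov(x, rowvar=False))
    s2 = np.atleast_2d(np.cov(y, rowvar=False))
    pooled = ((n1 - 1) * s1 + (n2 - 1) * s2) / (n1 + n2 - 2)
    # Assumption: a pseudo-inverse is used so a singular covariance does not raise.
    return float(n1 * n2 / (n1 + n2) * diff @ np.linalg.pinv(pooled) @ diff)


def forward_search(x, y, max_genes):
    """Greedy forward selection: repeatedly add the gene that maximizes T^2.

    Hypothetical single-pass simplification; keep max_genes well below
    n1 + n2 - 2 so the pooled covariance stays invertible.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    selected = []
    remaining = list(range(x.shape[1]))
    best = 0.0
    while remaining and len(selected) < max_genes:
        scores = [
            hotelling_t2(x[:, selected + [g]], y[:, selected + [g]])
            for g in remaining
        ]
        i = int(np.argmax(scores))
        best = scores[i]
        selected.append(remaining.pop(i))
    return selected, best
```

For a single gene the statistic reduces to the square of the pooled-variance two-sample t statistic, which is one way to see why a multivariate T² can pick up genes that are only detectable jointly with others.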
Pages: 3105-3113
Page count: 9