Gene selection with multiple ordering criteria

Cited by: 28
Authors
Chen, James J.
Tsai, Chen-An
Tzeng, ShengLi
Chen, Chun-Houh
Affiliations
[1] Acad Sinica, Inst Stat Sci, Taipei 115, Taiwan
[2] US FDA, Div Biometry & Risk Assessment, Natl Ctr Toxicol Res, Jefferson, AR 72079 USA
DOI
10.1186/1471-2105-8-74
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Background: A microarray study may yield different sets of differentially expressed genes depending on the selection criterion used. For example, fold-change and p-value are two common criteria for selecting differentially expressed genes under two experimental conditions, and they often produce incompatible gene sets. Also, in a two-factor experiment, say treatment by time, the investigator may be interested in a single gene list that responds to both treatment and time effects.
Results: We propose three layer-ranking algorithms, point-admissible, line-admissible (convex), and Pareto, to provide a preference gene list from multiple gene lists generated by different ranking criteria. Using the public colon data as an example, the layer-ranking algorithms are applied to three univariate ranking criteria: fold-change, p-value, and frequency of selection by the SVM-RFE classifier. A simulation experiment shows that, for experiments with small or moderate sample sizes (fewer than 20 per group) and fold changes of 4 or less, the two-dimensional (p-value and fold-change) convex layer ranking selects differentially expressed genes with generally lower FDR and higher power than the standard p-value ranking. Three applications are presented. The first illustrates the use of layer rankings to potentially improve predictive accuracy. The second applies them to a two-factor experiment involving two dose levels and two time points, selecting differentially expressed genes related to the dose and time effects. In the third, the layer rankings are applied to a benchmark data set consisting of three dilution concentrations, to provide a ranking system for a long list of differentially expressed genes generated from the three dilution concentrations.
Conclusion: The layer-ranking algorithms help investigators select the most promising genes from multiple gene lists generated by different filter, normalization, or analysis methods for various objectives.
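To make the idea of layer ranking concrete, here is a minimal Python sketch of Pareto layering over two criteria. It is an illustration under stated assumptions, not the authors' implementation. It assumes each gene has a score per criterion oriented so that larger is better (for example, absolute log fold-change and -log10 p-value), and it uses the standard definition of Pareto dominance. The function names `dominates` and `pareto_layers` and the example scores are hypothetical.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is at least as good as b on every criterion and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_layers(scores: List[Sequence[float]]) -> List[int]:
    """Assign each gene a layer number; layer 1 is the non-dominated set.

    Assumption: larger scores are better on every criterion.
    """
    remaining = set(range(len(scores)))
    layer = [0] * len(scores)
    current = 1
    while remaining:
        # Genes not dominated by any other remaining gene form the current layer.
        front = {
            i for i in remaining
            if not any(dominates(scores[j], scores[i]) for j in remaining if j != i)
        }
        for i in front:
            layer[i] = current
        # Peel the layer off and repeat on what is left.
        remaining -= front
        current += 1
    return layer


# Hypothetical (|log2 fold-change|, -log10 p-value) scores for five genes.
example_scores = [(3.0, 1.0), (1.0, 4.0), (2.0, 2.0), (0.5, 0.5), (2.5, 0.8)]
example_layers = pareto_layers(example_scores)
print(example_layers)
```

In this toy example the first three genes each win on a different trade-off between fold-change and p-value, so none dominates another and all land in layer 1. The remaining two are peeled off in later layers. The quadratic scan per layer is chosen for readability, not speed.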
Pages: 17