Small sample issues for microarray-based classification

Cited by: 103
Authors
Dougherty, ER [1]
Affiliations
[1] Texas A&M Univ, Dept Elect Engn, College Stn, TX 77843 USA
Source
COMPARATIVE AND FUNCTIONAL GENOMICS | 2001 / Vol. 2 / Issue 01
Keywords
classification; gene expression; genomics; microarrays; pattern recognition
DOI
10.1002/cfg.62
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
In order to study the molecular biological differences between normal and diseased tissues, it is desirable to perform classification among diseases and stages of disease using microarray-based gene-expression values. Owing to the limited number of microarrays typically used in these studies, serious issues arise with respect to the design, performance and analysis of classifiers based on microarray data. This paper reviews some fundamental issues facing small-sample classification: classification rules, constrained classifiers, error estimation and feature selection. It discusses both unconstrained and constrained classifier design from sample data, and the contributions to classifier error from constrained optimization and lack of optimality owing to design from sample data. The difficulty with estimating classifier error when confined to small samples is addressed, particularly estimating the error from training data. The impact of small samples on the ability to include more than a few variables as classifier features is explained. Copyright (C) 2001 John Wiley & Sons, Ltd.
Pages: 28-34
Page count: 7