SPARSE LINEAR DISCRIMINANT ANALYSIS BY THRESHOLDING FOR HIGH DIMENSIONAL DATA

Cited by: 164
Authors
Shao, Jun [1 ,2 ]
Wang, Yazhen [2 ]
Deng, Xinwei [2 ]
Wang, Sijian [2 ]
Affiliations
[1] East China Normal University, Shanghai, People's Republic of China
[2] University of Wisconsin, Department of Statistics, Madison, WI 53706, USA
Funding
National Science Foundation (USA)
Keywords
Classification; high dimensionality; misclassification rate; normality; optimal classification rule; sparse estimates
DOI
10.1214/10-AOS870
Chinese Library Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Subject Classification Codes
020208; 070103; 0714
Abstract
In many social, economic, biological and medical studies, one objective is to classify a subject into one of several classes based on a set of variables observed from the subject. Because the probability distribution of the variables is usually unknown, the rule of classification is constructed using a training sample. The well-known linear discriminant analysis (LDA) works well when the number of variables used for classification is much smaller than the training sample size. Because of advances in technology, modern statistical studies often face classification problems in which the number of variables is much larger than the sample size, and the LDA may perform poorly. We explore when and why the LDA has poor performance and propose a sparse LDA that is asymptotically optimal under some sparsity conditions on the unknown parameters. As an illustration of application, we discuss an example of classifying human cancer into two classes of leukemia based on a set of 7,129 genes and a training sample of size 72. A simulation is also conducted to check the performance of the proposed method.
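To make the idea in the abstract concrete, the following is a minimal NumPy sketch of a thresholded LDA rule: the pooled sample covariance and the sample mean difference are hard-thresholded before forming the usual linear discriminant. The function names (sparse_lda_fit, sparse_lda_predict) and the sqrt(2 log p / n)-type threshold levels are illustrative assumptions, not the exact constants or procedure of the paper.

```python
import numpy as np

def sparse_lda_fit(X1, X2, t_mean=None, t_cov=None):
    """Fit a thresholded LDA rule from two training samples.

    X1, X2 : arrays of shape (n1, p) and (n2, p), one row per subject.
    t_mean, t_cov : hard thresholds for the mean difference and the
        covariance entries (illustrative defaults of order sqrt(log p / n)).
    """
    n1, p = X1.shape
    n2, _ = X2.shape
    n = n1 + n2

    mu1, mu2 = X1.mean(axis=0), X2.mean(axis=0)
    delta = mu1 - mu2  # sample mean difference

    # pooled sample covariance
    S = ((n1 - 1) * np.cov(X1, rowvar=False) +
         (n2 - 1) * np.cov(X2, rowvar=False)) / (n - 2)

    # illustrative threshold levels (assumption, not the paper's constants)
    if t_mean is None:
        t_mean = np.sqrt(2.0 * np.log(p) / n)
    if t_cov is None:
        t_cov = np.sqrt(2.0 * np.log(p) / n)

    # hard-threshold small mean differences to zero
    delta_sparse = np.where(np.abs(delta) > t_mean, delta, 0.0)

    # hard-threshold small covariance entries, keeping the diagonal intact
    S_sparse = np.where(np.abs(S) > t_cov, S, 0.0)
    np.fill_diagonal(S_sparse, np.diag(S))

    # discriminant direction; pseudo-inverse guards against singularity
    w = np.linalg.pinv(S_sparse) @ delta_sparse
    b = w @ (mu1 + mu2) / 2.0
    return w, b

def sparse_lda_predict(X, w, b):
    """Assign class 1 when the discriminant score exceeds the midpoint."""
    return np.where(X @ w > b, 1, 2)
```

The pseudo-inverse is used here because the thresholded covariance can still be singular when p exceeds n; the paper's asymptotic optimality results concern the rule built from sparse estimates under the stated sparsity conditions, not this particular implementation.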
Pages: 1241-1265
Number of pages: 25