The maximal data piling direction for discrimination

Cited by: 50
Authors
Ahn, Jeongyoun [1 ]
Marron, J. S. [2 ]
Affiliations
[1] Univ Georgia, Dept Stat, Athens, GA 30602 USA
[2] Univ N Carolina, Dept Stat & Operat Res, Chapel Hill, NC 27599 USA
Funding
US National Science Foundation;
Keywords
Classification; Fisher's linear discrimination; High dimension, low sample size; Maximal data piling; Support vector machine; Geometric representation; High-dimension;
DOI
10.1093/biomet/asp084
Chinese Library Classification code
Q [Biological sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
We study a discriminant direction vector that generally exists only in high-dimension, low sample size settings. Projections of data onto this direction vector take on only two distinct values, one for each class. There exist infinitely many such directions in the subspace generated by the data; but the maximal data piling vector has the longest distance between the projections. This paper investigates mathematical properties and classification performance of this discrimination method.
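The data piling property described in the abstract can be illustrated numerically. The abstract gives no formula, so the sketch below assumes the form usually attributed to this method: the direction is proportional to the pseudo-inverse of the total scatter matrix (centered at the global mean) applied to the difference of the class means. The function name `mdp_direction`, the simulated data, and the `rcond` cutoff are illustrative choices, not taken from the record.

```python
import numpy as np

def mdp_direction(X1, X2):
    """Assumed form of the maximal data piling direction.

    X1, X2: arrays of shape (n_k, d), one row per observation,
    with d larger than the total sample size n1 + n2.
    """
    X = np.vstack([X1, X2])
    Xc = X - X.mean(axis=0)            # center at the global mean
    total_scatter = Xc.T @ Xc          # d x d, rank at most n - 1
    mean_diff = X1.mean(axis=0) - X2.mean(axis=0)
    # Explicit cutoff so numerically zero singular values are discarded.
    w = np.linalg.pinv(total_scatter, rcond=1e-10) @ mean_diff
    return w / np.linalg.norm(w)       # unit-length direction

# Simulated high-dimension, low sample size data (hypothetical example).
rng = np.random.default_rng(0)
d, n1, n2 = 200, 10, 12
X1 = rng.normal(loc=0.5, size=(n1, d))
X2 = rng.normal(loc=-0.5, size=(n2, d))

w = mdp_direction(X1, X2)
p1, p2 = X1 @ w, X2 @ w                # projections of each class

# Each class should collapse to a single projected value.
print("class 1 spread:", p1.max() - p1.min())
print("class 2 spread:", p2.max() - p2.min())
print("gap between piles:", p1[0] - p2[0])
```

With more dimensions than observations, the within-class spreads printed above should be zero up to floating-point error, while the gap between the two piles stays positive. When the dimension is smaller than the sample size, exact piling generally fails, which matches the abstract's statement that the direction exists only in high-dimension, low sample size settings.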
Pages: 254-259
Page count: 6