Semi-supervised local Fisher discriminant analysis for dimensionality reduction

Cited by: 26
Authors
Masashi Sugiyama
Tsuyoshi Idé
Shinichi Nakajima
Jun Sese
Affiliations
[1] Tokyo Institute of Technology, Department of Computer Science
[2] IBM Research, Department of Information Science
[3] Tokyo Research Laboratory
[4] Nikon Corporation
[5] Ochanomizu University
Source
Machine Learning | 2010 / Vol. 78
Keywords
Semi-supervised learning; Dimensionality reduction; Cluster assumption; Local Fisher discriminant analysis; Principal component analysis
DOI
Not available
Abstract
When only a small number of labeled samples are available, supervised dimensionality reduction methods tend to perform poorly because of overfitting. In such cases, unlabeled samples could be useful in improving the performance. In this paper, we propose a semi-supervised dimensionality reduction method which preserves the global structure of unlabeled samples in addition to separating labeled samples in different classes from each other. The proposed method, which we call SEmi-supervised Local Fisher discriminant analysis (SELF), has an analytic form of the globally optimal solution and it can be computed based on eigen-decomposition. We show the usefulness of SELF through experiments with benchmark and real-world document classification datasets.
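The idea in the abstract, combining class separation on labeled samples with preservation of the global structure of all samples and solving the result by eigen-decomposition, can be illustrated with a minimal sketch. This is not the authors' SELF implementation: it assumes ordinary (non-local) Fisher scatter matrices, a PCA-style total scatter matrix, and a hypothetical trade-off parameter `beta` in (0, 1]; the function name `semi_supervised_projection` is also made up for illustration.

```python
import numpy as np

def semi_supervised_projection(X_labeled, y, X_unlabeled, beta=0.5, n_components=2):
    """Blend Fisher-style class scatter with PCA-style total scatter.

    Illustrative assumption, not the paper's exact formulation:
    `beta` (hypothetical) trades off the supervised and unsupervised parts,
    and must be > 0 so the regularized within-class matrix is invertible.
    """
    X_all = np.vstack([X_labeled, X_unlabeled])
    d = X_all.shape[1]

    # Total scatter over labeled and unlabeled samples (the PCA part).
    centered = X_all - X_all.mean(axis=0)
    S_total = centered.T @ centered / len(X_all)

    # Within- and between-class scatter over labeled samples only.
    mean_labeled = X_labeled.mean(axis=0)
    S_within = np.zeros((d, d))
    S_between = np.zeros((d, d))
    for c in np.unique(y):
        Xc = X_labeled[y == c]
        mu = Xc.mean(axis=0)
        diff = Xc - mu
        S_within += diff.T @ diff
        gap = (mu - mean_labeled)[:, None]
        S_between += len(Xc) * (gap @ gap.T)
    S_within /= len(X_labeled)
    S_between /= len(X_labeled)

    # Regularized matrices: the unsupervised terms stabilize the supervised ones.
    S_rb = (1.0 - beta) * S_between + beta * S_total
    S_rw = (1.0 - beta) * S_within + beta * np.eye(d)

    # Solve S_rb v = lambda * S_rw v by whitening with the Cholesky factor of S_rw,
    # which turns it into an ordinary symmetric eigenproblem.
    L = np.linalg.cholesky(S_rw)
    L_inv = np.linalg.inv(L)
    M = L_inv @ S_rb @ L_inv.T
    eigvals, eigvecs = np.linalg.eigh((M + M.T) / 2.0)

    # Keep the directions with the largest generalized eigenvalues.
    order = np.argsort(eigvals)[::-1][:n_components]
    return L_inv.T @ eigvecs[:, order]
```

In this sketch, `beta=1.0` discards the labels and returns the leading principal directions of all samples, while values of `beta` close to 0 rely almost entirely on the few labeled samples. Low-dimensional embeddings are obtained as `X @ W`, where `W` is the returned matrix.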
Pages: 35-61
Number of pages: 26