The generalized LASSO

Cited by: 203
Authors
Roth, V [1 ]
Affiliation
[1] Univ Bonn, Dept Comp Sci 3, D-53117 Bonn, Germany
Source
IEEE TRANSACTIONS ON NEURAL NETWORKS | 2004, Vol. 15, No. 1
Keywords
kernel regression; probabilistic interpretation; robust loss functions; sparsity; support vector machines (SVMs);
DOI
10.1109/TNN.2003.809398
CLC number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In the last few years, the support vector machine (SVM) method has motivated new interest in kernel regression techniques. Although the SVM has been shown to exhibit excellent generalization properties in many experiments, it suffers from several drawbacks, both of a theoretical and a technical nature: the absence of probabilistic outputs, the restriction to Mercer kernels, and the steep growth of the number of support vectors with increasing size of the training set. In this paper, we present a different class of kernel regressors that effectively overcome the above problems. We call this approach generalized LASSO regression. It has a clear probabilistic interpretation, can handle learning sets that are corrupted by outliers, produces extremely sparse solutions, and is capable of dealing with large-scale problems. For regression functionals which can be modeled as iteratively reweighted least-squares (IRLS) problems, we present a highly efficient algorithm with guaranteed global convergence. This defines a unified framework for sparse regression models in the very rich class of IRLS models, including various types of robust regression models and logistic regression. Performance studies for many standard benchmark datasets effectively demonstrate the advantages of this model over related approaches.
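The IRLS idea underlying the abstract can be illustrated with a minimal sketch (this is an illustrative reweighting scheme, not the paper's actual algorithm): the l1 penalty |w_i| of the LASSO is majorized at each step by a quadratic term w_i^2 / (|w_i_old| + eps), so each iteration reduces to a weighted ridge regression with a closed-form solution. The function name, data, and parameter values below are hypothetical.

```python
import numpy as np

def irls_lasso(X, y, lam=1.0, n_iter=50, eps=1e-6):
    """Approximate LASSO via iteratively reweighted least squares:
    each iteration solves a ridge problem whose per-coefficient
    penalty weight is lam / (|w_old| + eps)."""
    w = np.linalg.lstsq(X, y, rcond=None)[0]  # least-squares start
    for _ in range(n_iter):
        D = np.diag(lam / (np.abs(w) + eps))   # reweighted penalty
        w = np.linalg.solve(X.T @ X + D, X.T @ y)
    w[np.abs(w) < 1e-4] = 0.0  # zero out numerically tiny weights
    return w

# Toy example: only the first two of five features are informative.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.1 * rng.standard_normal(100)
w = irls_lasso(X, y, lam=1.0)
print(w)  # the three noise-feature weights shrink toward zero
```

The reweighting makes the effective ridge penalty on a coefficient grow as that coefficient shrinks, which drives uninformative weights to zero and yields the sparse solutions the abstract refers to.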
Pages: 16-28 (13 pages)