Convex multi-task feature learning

Cited by: 948
Authors
Argyriou, Andreas [1]
Evgeniou, Theodoros [2]
Pontil, Massimiliano [1]
Affiliations
[1] UCL, Dept Comp Sci, London WC1E 6BT, England
[2] INSEAD, F-77300 Fontainebleau, France
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK;
Keywords
Collaborative filtering; Inductive transfer; Kernels; Multi-task learning; Regularization; Transfer learning; Vector-valued functions;
DOI
10.1007/s10994-007-5040-8
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We present a method for learning sparse representations shared across multiple tasks. This method is a generalization of the well-known single-task 1-norm regularization. It is based on a novel non-convex regularizer which controls the number of learned features common across the tasks. We prove that the method is equivalent to solving a convex optimization problem for which there is an iterative algorithm which converges to an optimal solution. The algorithm has a simple interpretation: it alternately performs a supervised and an unsupervised step, where in the former step it learns task-specific functions and in the latter step it learns common-across-tasks sparse representations for these functions. We also provide an extension of the algorithm which learns sparse nonlinear representations using kernels. We report experiments on simulated and real data sets which demonstrate that the proposed method can both improve the performance relative to learning each task independently and lead to a few learned features common across related tasks. Our algorithm can also be used, as a special case, to simply select (not learn) a few common variables across the tasks.
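The alternating scheme described in the abstract can be illustrated with a short sketch. The abstract does not give the update formulas, so the code below rests on assumptions: square loss for every task, a ridge-style supervised step penalized by the inverse of a shared matrix `D`, and an unsupervised step that sets `D` to the trace-normalized matrix square root of `W W^T + eps * I`. The function name `alternating_mtl` and the parameters `gamma`, `eps`, and `n_iter` are illustrative, not taken from this record.

```python
import numpy as np


def alternating_mtl(Xs, ys, gamma=1.0, eps=1e-6, n_iter=50):
    """Sketch of alternating multi-task feature learning (assumed square loss).

    Xs: list of (m_t, d) input arrays, one per task.
    ys: list of (m_t,) target arrays, one per task.
    Returns W (d, T), one weight column per task, and D (d, d),
    a symmetric positive definite matrix with trace 1 shared by all tasks.
    """
    d = Xs[0].shape[1]
    T = len(Xs)
    D = np.eye(d) / d          # start from an isotropic D with trace 1
    W = np.zeros((d, T))
    for _ in range(n_iter):
        # Supervised step: with D fixed, each task is a generalized
        # ridge regression with penalty gamma * w^T D^{-1} w.
        D_inv = np.linalg.inv(D)
        for t, (X, y) in enumerate(zip(Xs, ys)):
            W[:, t] = np.linalg.solve(X.T @ X + gamma * D_inv, X.T @ y)
        # Unsupervised step: with W fixed, update the shared matrix to the
        # trace-normalized square root of W W^T (eps keeps D invertible).
        S = W @ W.T + eps * np.eye(d)
        evals, evecs = np.linalg.eigh(S)
        root = (evecs * np.sqrt(np.clip(evals, 0.0, None))) @ evecs.T
        D = root / np.trace(root)
    return W, D
```

Under these assumptions, the eigenvectors of `D` with non-negligible eigenvalues play the role of the learned features shared across tasks; when few eigenvalues dominate, the tasks share a low-dimensional representation.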
Pages: 243-272
Page count: 30