Tree-Based Ranking Methods

被引：40

作者：

Clemencon, Stephan ^{[1
]}

Vayatis, Nicolas ^{[2
,3
]}

机构：

[1] Telecom Paristech TSI, LTCI, Inst Telecom, CNRS,UMR 5141 46, F-75634 Paris, France

[2] ENS, F-94235 Cachan, France

[3] UniverSud, CMLA, UMR 8536, CNRS, F-94235 Cachan, France

来源：

IEEE TRANSACTIONS ON INFORMATION THEORY | 2009年 / 55卷 / 09期

关键词：

Adaptive piecewise linear approximation; AUC criterion; bipartite ranking problem; decision Tree; ROC curve; AREA;

D O I：

10.1109/TIT.2009.2025558

中图分类号：

TP [自动化技术、计算机技术];

学科分类号：

0812 ;

摘要：

This paper investigates how recursive partitioning methods can be adapted to the bipartite ranking problem. In ranking, the pursued goal is global: based on past data, define an order on the whole input space chi, so that positive instances take up the top ranks with maximum probability. The most natural way to order all instances consists of projecting the input data onto the real line through a real-valued scoring function s and use the natural order on R. The accuracy of the ordering induced by a candidate s is classically measured in terms of the ROC curve or the AUC. Here we discuss the design of tree-structured scoring functions obtained by recursively maximizing the AUC criterion. The connection with recursive piecewise linear approximation of the optimal ROC curve both in the L-1-sense and in the L-infinity-sense is highlighted. A novel tree-based algorithm for ranking, called TREERANK, is proposed. Consistency results and generalization bounds of functional nature are established for this ranking method, when considering either the L-1 or L-infinity distance. We also describe committee-based learning procedures using TREERANK as a "base ranker," in order to overcome obvious drawbacks of such a top-down partitioning technique. Simulation results on artificial data are also displayed.

引用

页码：4316 / 4336

页数：21

共 32 条

[1] Agarwal S, 2005, J MACH LEARN RES, V6, P393
[2] [Anonymous], C M
[3] [Anonymous], 2003, Journal of machine learning research
[4] [Anonymous], 1975, SIGNAL DETECTION THE
[5] [Anonymous], 2013, A Probabilistic Theory of Pattern Recognition
[6] Breiman L, 1996, MACH LEARN, V24, P123, DOI 10.1023/A:1018054314350
[7] Breiman L., 1984, BIOMETRICS, V40, P874, DOI 10.1201/9781315139470
[8] Ranking and scoring using empirical risk minimization
Clémençon, S
Lugosi, G
Vayatis, N
[J]. LEARNING THEORY, PROCEEDINGS, 2005, 3559 : 1 - 15
[9] CLEMENCON S, 2009, J MACH LEARN RES P A, V5, P89
[10] Ranking and empirical minimization of U-statistics
Clemencon, Stephan
Lugosi, Gabor
Vayatis, Nicolas
[J]. ANNALS OF STATISTICS, 2008, 36 (02) : 844 - 874

← 1 2 3 4 →