Comparing top k lists

被引:383
作者
Fagin, R [1 ]
Kumar, R [1 ]
Sivakumar, D [1 ]
机构
[1] IBM Corp, Almaden Res Ctr, San Jose, CA 95120 USA
关键词
triangle inequality; polygonal inequality; metric; near metric; distance measures; top k list; rank aggregation;
D O I
10.1137/S0895480102412856
中图分类号
O29 [应用数学];
学科分类号
070104 ;
摘要
Motivated by several applications, we introduce various distance measures between "top k lists." Some of these distance measures are metrics, while others are not. For each of these latter distance measures, we show that they are "almost" a metric in the following two seemingly unrelated aspects: (i) they satisfy a relaxed version of the polygonal ( hence, triangle) inequality, and (ii) there is a metric with positive constant multiples that bound our measure above and below. This is not a coincidence - we show that these two notions of almost being a metric are the same. Based on the second notion, we de. ne two distance measures to be equivalent if they are bounded above and below by constant multiples of each other. We thereby identify a large and robust equivalence class of distance measures. Besides the applications to the task of identifying good notions of (dis) similarity between two top k lists, our results imply polynomial-time constant-factor approximation algorithms for the rank aggregation problem with respect to a large class of distance measures.
引用
收藏
页码:134 / 160
页数:27
相关论文
共 16 条
[1]   PERFORMANCE GUARANTEES FOR APPROXIMATION ALGORITHMS DEPENDING ON PARAMETRIZED TRIANGLE INEQUALITIES [J].
ANDREAE, T ;
BANDELT, HJ .
SIAM JOURNAL ON DISCRETE MATHEMATICS, 1995, 8 (01) :1-16
[2]   Performance guarantees for the TSP with a parameterized triangle inequality [J].
Bender, MA ;
Chekuri, C .
INFORMATION PROCESSING LETTERS, 2000, 73 (1-2) :17-21
[3]  
Carmel D., 2001, SIGIR Forum, P43, DOI 10.1145/383952.383958
[4]  
Charikar M, 2002, LECT NOTES COMPUT SC, V2380, P693
[5]  
Critchlow DE, 1980, LECT NOTES STAT
[6]   SPEARMANS FOOTRULE AS A MEASURE OF DISARRAY [J].
DIACONIS, P ;
GRAHAM, RL .
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-METHODOLOGICAL, 1977, 39 (02) :262-268
[7]  
Diaconis PW, 1988, IMS LECT NOTES MONOG, V11
[8]  
Dugundji J., 1966, TOPOLOGY
[9]  
Dwork C., 2001, P 10 INT C WORLD WID, P613, DOI DOI 10.1145/371920.372165
[10]  
Fagin R, 2003, SIAM PROC S, P28