Probabilistic relevance ranking for collaborative filtering

Cited by: 37
Authors
Wang, Jun [1]
Robertson, Stephen [2]
de Vries, Arjen P. [3]
Reinders, Marcel J. T. [4]
Affiliations
[1] UCL, Ipswich IP5 3RE, Suffolk, England
[2] Microsoft Res, Cambridge, England
[3] CWI, NL-1009 AB Amsterdam, Netherlands
[4] Delft Univ Technol, Delft, Netherlands
Source
INFORMATION RETRIEVAL | 2008 / Vol. 11 / Issue 06
Keywords
collaborative filtering; recommender systems; Probability Ranking Principle; relevance ranking; personalization
DOI
10.1007/s10791-008-9060-1
Chinese Library Classification
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Collaborative filtering is concerned with making recommendations about items to users. Most formulations of the problem are specifically designed for predicting user ratings, assuming past data of explicit user ratings is available. However, in practice we may only have implicit evidence of user preference; and furthermore, a better view of the task is of generating a top-N list of items that the user is most likely to like. In this regard, we argue that collaborative filtering can be directly cast as a relevance ranking problem. We begin with the classic Probability Ranking Principle of information retrieval, proposing a probabilistic item ranking framework. In the framework, we derive two different ranking models, showing that despite their common origin, different factorizations reflect two distinctive ways to approach item ranking. For the model estimations, we limit our discussions to implicit user preference data, and adopt an approximation method introduced in the classic text retrieval model (i.e. the Okapi BM25 formula) to effectively decouple frequency counts and presence/absence counts in the preference data. Furthermore, we extend the basic formula by proposing the Bayesian inference to estimate the probability of relevance (and non-relevance), which largely alleviates the data sparsity problem. Apart from a theoretical contribution, our experiments on real data sets demonstrate that the proposed methods perform significantly better than other strong baselines.
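The two estimation ideas named in the abstract can be illustrated with a minimal sketch. This is not the paper's actual ranking formula: the function names, the default `k1 = 1.2`, and the symmetric Beta(1, 1) prior are assumptions chosen for illustration only. The first function shows the standard BM25-style saturation, which separates presence/absence from raw frequency counts by letting repeated interactions contribute less and less. The second shows a generic Bayesian (Beta-prior) estimate of a probability, which stays well defined even when almost no observations are available.

```python
def saturate(count: float, k1: float = 1.2) -> float:
    """BM25-style saturation of a frequency count (illustrative, hypothetical helper).

    Returns 0 when the count is 0, 1 when the count is 1, and approaches
    k1 + 1 as the count grows, so the first interaction matters most.
    """
    return count * (k1 + 1.0) / (count + k1)


def smoothed_probability(hits: int, trials: int,
                         alpha: float = 1.0, beta: float = 1.0) -> float:
    """Posterior mean of a Bernoulli rate under a Beta(alpha, beta) prior.

    With sparse data (few trials) the estimate falls back toward the prior
    mean alpha / (alpha + beta) instead of an unreliable 0 or 1.
    """
    return (hits + alpha) / (trials + alpha + beta)


if __name__ == "__main__":
    # Play counts of an item by one user: repeated plays add less and less.
    for plays in (0, 1, 2, 10, 100):
        print(plays, round(saturate(plays), 3))

    # No observations at all: the estimate is the prior mean.
    print(smoothed_probability(0, 0))
    # Three positive observations out of four.
    print(round(smoothed_probability(3, 4), 3))
```

A smaller `k1` makes the score saturate faster, so it behaves more like a pure presence/absence indicator; a larger `k1` keeps it closer to the raw count.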
Pages: 477-497
Page count: 21