WEMAREC: Accurate and Scalable Recommendation through Weighted and Ensemble Matrix Approximation

Cited by: 36

Authors:

Chen, Chao [1]

Li, Dongsheng [2]

Zhao, Yingying [1]

Lv, Qin [3]

Shang, Li [3]

Affiliations:

[1] Tongji Univ, Shanghai 201804, Peoples R China

[2] IBM Res China, Shanghai 201203, Peoples R China

[3] Univ Colorado, Boulder, CO 80309 USA

Source:

SIGIR 2015: PROCEEDINGS OF THE 38TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL | 2015

Funding:

U.S. National Science Foundation;

Keywords:

Recommendation; matrix approximation; weighted; ensemble;

DOI:

10.1145/2766462.2767718

Chinese Library Classification number:

TP301 [Theory and methods];

Discipline classification code:

081202;

Abstract:

Matrix approximation is one of the most effective methods for collaborative filtering-based recommender systems. However, the high computational complexity of matrix factorization on large datasets limits its scalability. Prior solutions have adopted co-clustering methods to partition a large matrix into a set of smaller submatrices, which can then be processed in parallel to improve scalability. The drawback is lower recommendation accuracy, because each submatrix contains only a subset of the user-item rating information. This paper presents WEMAREC, a weighted and ensemble matrix approximation method for accurate and scalable recommendation. It builds upon the intuition that (sub)matrices containing more frequent samples of a certain user, item, or rating tend to make more reliable rating predictions for that specific user, item, or rating. WEMAREC consists of two important components: (1) a weighting strategy that is computed based on the rating distribution in each submatrix and applied to approximate a single matrix containing those submatrices; and (2) an ensemble strategy that leverages user-specific and item-specific rating distributions to combine the approximation matrices of multiple sets of co-clustering results. Evaluations using real-world datasets demonstrate that WEMAREC outperforms state-of-the-art matrix approximation methods in recommendation accuracy (0.5-11.9% on the MovieLens dataset and 2.2-13.1% on the Netflix dataset) with a 3-10x improvement in scalability.
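The intuition in the abstract, that a submatrix seeing a rating value more often should be trusted more when it predicts that value, can be illustrated with a small sketch. This is not the paper's formulation: the abstract gives no formulas, so the linear weight form, the `beta` parameter, and the function names `rating_distribution`, `frequency_weight`, and `ensemble_predict` are all assumptions made for illustration. The sketch also uses only the rating distribution and ignores the user-specific and item-specific distributions the abstract mentions.

```python
from collections import Counter


def rating_distribution(ratings):
    """Empirical distribution of rating values in one submatrix.

    `ratings` is a list of (user, item, rating) triples.
    """
    counts = Counter(r for _, _, r in ratings)
    total = sum(counts.values())
    return {r: c / total for r, c in counts.items()}


def frequency_weight(pred, dist, beta=1.0):
    """Hypothetical weight (an assumption, not the paper's formula).

    The weight grows with how often the rounded prediction occurs
    among the ratings of the submatrix that produced it.
    """
    return 1.0 + beta * dist.get(round(pred), 0.0)


def ensemble_predict(candidates):
    """Weighted average of (prediction, weight) pairs."""
    total_weight = sum(w for _, w in candidates)
    if total_weight == 0:
        raise ValueError("at least one candidate needs a positive weight")
    return sum(p * w for p, w in candidates) / total_weight


# Toy data: two submatrices from two different co-clusterings,
# each predicting a rating for the same (user, item) pair.
sub_a = [("u1", "i1", 5), ("u1", "i2", 4), ("u2", "i1", 4), ("u2", "i2", 4)]
sub_b = [("u1", "i3", 3), ("u3", "i1", 1), ("u3", "i3", 5), ("u1", "i4", 5)]

pred_a, pred_b = 4.2, 3.0  # made-up predictions from each local model
w_a = frequency_weight(pred_a, rating_distribution(sub_a))
w_b = frequency_weight(pred_b, rating_distribution(sub_b))
combined = ensemble_predict([(pred_a, w_a), (pred_b, w_b)])
print(combined)
```

In this toy case the first submatrix contains the rating value 4 in three of its four entries, so its prediction of 4.2 receives the larger weight and the combined prediction is pulled toward it rather than sitting at the plain average of 3.6.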

Pages: 303-312

Page count: 10
