Multi-view clustering

Citations: 549
Authors
Bickel, S [1]
Scheffer, T [1]
Affiliation
[1] Humboldt Univ, Dept Comp Sci, D-10099 Berlin, Germany
Source
FOURTH IEEE INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS | 2004
DOI
10.1109/ICDM.2004.10095
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
We consider clustering problems in which the available attributes can be split into two independent subsets, such that either subset suffices for learning. Example applications of this multi-view setting include clustering of web pages, which have an intrinsic view (the pages themselves) and an extrinsic view (e.g., anchor texts of inbound hyperlinks). Multi-view learning has so far been studied in the context of classification. We develop and study partitioning and agglomerative hierarchical multi-view clustering algorithms for text data. We find empirically that the multi-view versions of k-Means and EM greatly improve on their single-view counterparts. By contrast, we obtain negative results for agglomerative hierarchical multi-view clustering. Our analysis explains this surprising phenomenon.
Pages: 19-26
Page count: 8