CABOSFV algorithm for high dimensional sparse data clustering

Cited by: 6
Authors
Wu, S [1]
Gao, XD [1]
Affiliation
[1] Univ Sci & Technol Beijing, Sch Management, Beijing 100083, Peoples R China
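Source
JOURNAL OF UNIVERSITY OF SCIENCE AND TECHNOLOGY BEIJING | 2004 / Vol. 11 / Issue 03
Keywords
clustering; data mining; sparse; high dimensionality
DOI
Not available
Chinese Library Classification number
T [Industrial Technology]
Discipline classification code
08
Abstract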
The Clustering Algorithm Based On Sparse Feature Vector (CABOSFV) is proposed for clustering high-dimensional binary sparse data. The algorithm compresses the data effectively by means of a 'Sparse Feature Vector', which greatly reduces the data scale, and it obtains the clustering result with only one scan of the data. Both theoretical analysis and empirical tests show that CABOSFV has low computational complexity. The algorithm finds clusters in large high-dimensional datasets efficiently and handles noise effectively.
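The abstract describes the approach only at a high level: each cluster is kept as a compact summary, and every object is assigned during a single scan. The Python sketch below illustrates that general idea and should not be read as the authors' exact method. This record does not define the Sparse Feature Vector, so the summary fields (`count`, `shared`, `differing`), the `dissimilarity` formula, and the `threshold` parameter are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ClusterSummary:
    """Hypothetical per-cluster summary (assumed; not defined in this record)."""
    count: int            # number of objects in the cluster
    shared: frozenset     # attributes equal to 1 in every member
    differing: frozenset  # attributes equal to 1 in some but not all members
    members: list         # object ids, kept only to report the result


def merged(summary, obj_id, attrs):
    """Return the summary that would result from adding one object."""
    seen_as_one = summary.shared | summary.differing
    new_shared = summary.shared & attrs
    new_differing = (seen_as_one | attrs) - new_shared
    return ClusterSummary(summary.count + 1, new_shared, new_differing,
                          summary.members + [obj_id])


def dissimilarity(summary):
    """Assumed measure: differing attributes relative to size and shared attributes."""
    if not summary.shared:
        return float("inf")  # nothing in common, never an acceptable cluster
    return len(summary.differing) / (summary.count * len(summary.shared))


def one_pass_cluster(objects, threshold):
    """Assign each object once, using only the cluster summaries."""
    clusters = []
    for obj_id, attrs in objects:
        attrs = frozenset(attrs)
        best_index, best = None, None
        for i, cluster in enumerate(clusters):
            candidate = merged(cluster, obj_id, attrs)
            score = dissimilarity(candidate)
            if score <= threshold and (best is None or score < dissimilarity(best)):
                best_index, best = i, candidate
        if best is None:
            # No existing cluster accepts the object: start a new one.
            clusters.append(ClusterSummary(1, attrs, frozenset(), [obj_id]))
        else:
            clusters[best_index] = best
    return clusters


# Each object is the set of attribute indices whose value is 1.
data = [("a", {1, 2, 3}), ("b", {1, 2, 3, 4}), ("c", {10, 11}), ("d", {10, 11, 12})]
result = one_pass_cluster(data, threshold=0.5)
```

Because each object is compared only against the summaries and never against earlier objects, the raw data is read once, which is the property the abstract emphasizes.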
Pages: 283-288
Number of pages: 6