Clustering categorical data sets using tabu search techniques

Cited by: 62
Authors
Ng, MK [1]
Wong, JC [1]
Affiliation
[1] Univ Hong Kong, Dept Math, Hong Kong, Hong Kong, Peoples R China
Keywords
clustering; k-means; k-modes; tabu search; numeric data; categorical data
DOI
10.1016/S0031-3203(02)00021-3
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Clustering methods partition a set of objects into clusters such that objects in the same cluster are more similar to each other than to objects in different clusters, according to some defined criteria. The fuzzy k-means-type algorithm is best suited for implementing this clustering operation because of its effectiveness in clustering data sets. However, it works only on numeric values, which limits its use because data sets often contain categorical values. In this paper, we present a tabu search based clustering algorithm that extends the k-means paradigm to categorical domains and to domains with both numeric and categorical values. Using tabu search based techniques, our algorithm can explore the solution space beyond local optimality, with the aim of finding a global solution of the fuzzy clustering problem. The clustering results produced by the proposed algorithm are found to be very high in accuracy. (C) 2002 Pattern Recognition Society. Published by Elsevier Science Ltd. All rights reserved.
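
To make the idea in the abstract concrete, the following is a minimal illustrative sketch, not the authors' algorithm. It assumes hard (non-fuzzy) cluster assignments, the simple matching dissimilarity commonly used in k-modes, and a basic tabu list with an aspiration rule. All function names and parameters (`mismatch`, `modes`, `cost`, `tabu_cluster`, `tenure`, `iters`) are hypothetical and chosen for illustration only.

```python
import random
from collections import Counter


def mismatch(a, b):
    """Simple matching dissimilarity: number of attributes where a and b differ."""
    return sum(x != y for x, y in zip(a, b))


def modes(data, labels, k):
    """Mode of each cluster: most frequent category per attribute (None if empty)."""
    result = []
    for c in range(k):
        members = [row for row, lab in zip(data, labels) if lab == c]
        if not members:
            result.append(None)
            continue
        result.append(tuple(Counter(col).most_common(1)[0][0] for col in zip(*members)))
    return result


def cost(data, labels, k):
    """Total dissimilarity between every object and the mode of its own cluster."""
    m = modes(data, labels, k)
    return sum(mismatch(row, m[lab]) for row, lab in zip(data, labels))


def tabu_cluster(data, k, iters=200, tenure=5, seed=0):
    """Assumed simplification: tabu search over single-object reassignments."""
    rng = random.Random(seed)
    labels = [rng.randrange(k) for _ in data]
    best, best_cost = labels[:], cost(data, labels, k)
    tabu = {}  # (object index, cluster) -> last iteration at which the move is forbidden
    for it in range(iters):
        candidates = []
        for i in range(len(data)):
            for c in range(k):
                if c == labels[i]:
                    continue
                trial = labels[:]
                trial[i] = c
                trial_cost = cost(data, trial, k)
                is_tabu = tabu.get((i, c), -1) >= it
                # Aspiration rule: a tabu move is allowed if it beats the best cost so far.
                if not is_tabu or trial_cost < best_cost:
                    candidates.append((trial_cost, i, c))
        if not candidates:
            break
        # Take the best admissible move even if it worsens the current solution;
        # this is what lets the search leave a local optimum.
        trial_cost, i, c = min(candidates)
        tabu[(i, labels[i])] = it + tenure  # forbid moving object i straight back
        labels[i] = c
        if trial_cost < best_cost:
            best, best_cost = labels[:], trial_cost
    return best, best_cost


data = [("a", "x", "p"), ("a", "x", "q"), ("a", "x", "p"),
        ("b", "y", "q"), ("b", "y", "p"), ("b", "y", "q")]
best_labels, best_cost = tabu_cluster(data, k=2)
```

The sketch re-evaluates the full cost for every candidate move, which is only practical for very small data sets; it is meant to show the accept-worse-moves-but-remember-them mechanism rather than an efficient or fuzzy formulation.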
Pages: 2783-2790
Number of pages: 8