Categorical data visualization and clustering using subjective factors

Cited by: 29
Authors
Chang, CH [1]
Ding, ZK [1]
Affiliations
[1] Natl Cent Univ, Dept Comp Sci & Informat Engn, Taoyuan 320, Taiwan
Keywords
data mining; cluster analysis; categorical data; cluster visualization
DOI
10.1016/j.datak.2004.09.001
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Clustering is an important data mining problem. However, most earlier work on clustering focused on numeric attributes, whose values have a natural ordering. Recently, clustering data with categorical attributes, whose values have no natural ordering, has received more attention. A common issue in cluster analysis is that there is no single correct answer for the number of clusters, since cluster analysis involves subjective human judgement. Interactive visualization is one method that lets users decide on proper clustering parameters. In this paper, a new clustering approach called CDCS (Categorical Data Clustering with Subjective factors) is introduced, in which a visualization tool for clustered categorical data is developed such that the result of adjusting parameters is instantly reflected. The experiment shows that CDCS generates high-quality clusters compared to other typical algorithms. © 2004 Elsevier B.V. All rights reserved.
Pages: 243-262
Number of pages: 20