An optimization algorithm for clustering using weighted dissimilarity measures

Cited by: 269
Authors
Chan, EY
Ching, WK
Ng, MK
Huang, JZ
Affiliations
[1] Univ Hong Kong, Dept Math, Hong Kong, Hong Kong, Peoples R China
[2] Univ Hong Kong, E Business Technol Inst, Hong Kong, Hong Kong, Peoples R China
Keywords
clustering; data mining; optimization; attributes weights
DOI
10.1016/j.patcog.2003.11.003
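Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 [Pattern recognition and intelligent systems]; 0812 [Computer science and technology]; 0835 [Software engineering]; 1405 [Intelligence science and technology]
Abstract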
One of the main problems in cluster analysis is the weighting of attributes so as to discover structures that may be present. By using weighted dissimilarity measures for objects, a new approach is developed, which allows the use of the k-means-type paradigm to efficiently cluster large data sets. The optimization algorithm is presented and the effectiveness of the algorithm is demonstrated with both synthetic and real data sets. © 2004 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved.
Pages: 943-952
Page count: 10
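Illustrative sketch
The abstract describes a k-means-type procedure in which attributes carry weights inside the dissimilarity measure. The abstract does not give the objective function or the update rules, so the following is only a minimal sketch of one common formulation, not the authors' algorithm. It assumes numeric attributes, squared Euclidean differences, a single global weight vector that sums to 1, and an exponent `beta > 1`. The function name `weighted_kmeans` and all of its parameters are hypothetical.

```python
import numpy as np

def weighted_kmeans(X, k, beta=2.0, n_iter=50, seed=0, init_centers=None):
    """Attribute-weighted k-means sketch (assumed formulation, see lead-in).

    Dissimilarity: d(x, z) = sum_j w_j**beta * (x_j - z_j)**2, with sum_j w_j = 1.
    """
    X = np.asarray(X, dtype=float)
    n, m = X.shape
    rng = np.random.default_rng(seed)
    if init_centers is None:
        # Pick k distinct objects as starting centers.
        centers = X[rng.choice(n, size=k, replace=False)].copy()
    else:
        centers = np.array(init_centers, dtype=float)
    weights = np.full(m, 1.0 / m)  # start with equal attribute weights
    labels = None

    for _ in range(n_iter):
        # Step 1: assign each object to the center with the smallest
        # weighted dissimilarity. diff2 has shape (n, k, m).
        diff2 = (X[:, None, :] - centers[None, :, :]) ** 2
        dist = diff2 @ (weights ** beta)  # shape (n, k)
        new_labels = dist.argmin(axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break  # assignments are stable
        labels = new_labels

        # Step 2: update each non-empty cluster center to its members' mean.
        for l in range(k):
            members = X[labels == l]
            if len(members) > 0:
                centers[l] = members.mean(axis=0)

        # Step 3: update attribute weights from within-cluster dispersion.
        # Attributes with small dispersion D_j receive large weights:
        # w_j = D_j**(-1/(beta-1)) / sum_t D_t**(-1/(beta-1)).
        D = ((X - centers[labels]) ** 2).sum(axis=0)
        D = np.maximum(D, 1e-12)  # guard against zero dispersion
        inv = D ** (-1.0 / (beta - 1.0))
        weights = inv / inv.sum()

    return labels, centers, weights
```

In this sketch, an attribute along which the clusters are compact ends up with a weight near 1, while an attribute that varies widely inside every cluster is pushed toward 0. Whether the weights should be global or specific to each cluster is a design choice that the abstract does not settle.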