ON K-MEDOID CLUSTERING OF LARGE DATA SETS WITH THE AID OF A GENETIC ALGORITHM - BACKGROUND, FEASIBILITY AND COMPARISON

Cited by: 63
Authors
LUCASIUS, CB [1]
DANE, AD [1]
KATEMAN, G [1]
Affiliation
[1] CATHOLIC UNIV NIJMEGEN, FAC SCI, ANALYT CHEM LAB, 6525 ED NIJMEGEN, NETHERLANDS
Keywords
DATA REDUCTION; K-MEDOID CLUSTERING; GENETIC ALGORITHMS; SUBSET SELECTION
DOI
10.1016/0003-2670(93)80130-D
Chinese Library Classification (CLC) number
O65 [Analytical Chemistry]
Discipline classification codes
070302; 081704
Abstract
A novel approach to the problem of k-medoid clustering of large data sets is presented, using a genetic algorithm. Genetic algorithms comprise a family of optimization methods based loosely upon principles of natural evolution. They have proven to be especially suited to tackling complex, large-scale optimization problems efficiently, including a rapidly growing variety of problems of practical utility. Our pilot study emphasizes the feasibility of GCA, our genetic algorithm for k-medoid clustering of large data sets, and provides some background information to elucidate how it differs from traditional approaches. The experimental part of this study is based on artificial data sets and includes a comparison with CLARA, another recently introduced approach to k-medoid clustering of large data sets. Results indicate that GCA accomplishes a better sampling of the combinatorial search space.
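To make the idea in the abstract concrete, the following is a minimal sketch of how k-medoid clustering can be posed as a subset-selection problem and searched with a simple genetic algorithm. It is not the authors' GCA: the function names (`medoid_cost`, `crossover`, `mutate`, `ga_k_medoids`), the truncation-style selection, the union-based crossover, and all parameter defaults are assumptions chosen only for illustration, and Euclidean distance is assumed as the dissimilarity measure.

```python
import math
import random


def medoid_cost(points, medoids):
    # Total distance from every point to its nearest medoid (lower is better).
    return sum(min(math.dist(p, points[m]) for m in medoids) for p in points)


def crossover(a, b, k, rng):
    # Child draws k distinct indices from the union of both parents' medoids.
    return tuple(sorted(rng.sample(sorted(set(a) | set(b)), k)))


def mutate(ind, n, rng, rate=0.2):
    # With probability `rate`, swap one medoid for an index not in the individual.
    if rng.random() >= rate:
        return ind
    chosen = set(ind)
    chosen.remove(rng.choice(ind))
    candidates = [i for i in range(n) if i not in ind]  # requires n > k
    chosen.add(rng.choice(candidates))
    return tuple(sorted(chosen))


def ga_k_medoids(points, k, pop_size=30, generations=60, seed=0):
    # Each individual is a sorted tuple of k point indices (the candidate medoids).
    rng = random.Random(seed)
    n = len(points)
    pop = [tuple(sorted(rng.sample(range(n), k))) for _ in range(pop_size)]
    for _ in range(generations):
        # Keep the better half unchanged (elitism), refill with mutated offspring.
        pop.sort(key=lambda ind: medoid_cost(points, ind))
        elite = pop[: pop_size // 2]
        children = [
            mutate(crossover(*rng.sample(elite, 2), k, rng), n, rng)
            for _ in range(pop_size - len(elite))
        ]
        pop = elite + children
    return min(pop, key=lambda ind: medoid_cost(points, ind))


if __name__ == "__main__":
    # Two well-separated artificial clusters of three points each.
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    best = ga_k_medoids(data, k=2)
    print(best, medoid_cost(data, best))
```

In this sketch the search space is the set of all k-element subsets of the data, which is the combinatorial space the abstract refers to; the cost function is evaluated on the full data set rather than on a sample.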
Pages: 647-669
Page count: 23