Data clustering: A review

Cited by: 7853

Authors:

Jain, AK

Murty, MN

Flynn, PJ

Affiliations:

[1] Michigan State Univ, Dept Comp Sci, E Lansing, MI 48824 USA

[2] Indian Inst Sci, Dept Comp Sci & Automat, Bangalore 560012, Karnataka, India

[3] Ohio State Univ, Dept Elect Engn, Columbus, OH 43210 USA

Source:

ACM COMPUTING SURVEYS | 1999 / Vol. 31 / No. 03

Keywords:

algorithms; cluster analysis; clustering applications; exploratory data analysis; incremental clustering; similarity indices; unsupervised learning;

DOI:

10.1145/331499.331504

Chinese Library Classification:

TP301 [Theory, Methods];

Subject classification code:

081202;

Abstract:

Clustering is the unsupervised classification of patterns (observations, data items, or feature vectors) into groups (clusters). The clustering problem has been addressed in many contexts and by researchers in many disciplines; this reflects its broad appeal and usefulness as one of the steps in exploratory data analysis. However, clustering is a combinatorially difficult problem, and differences in assumptions and contexts across communities have made the transfer of useful generic concepts and methodologies slow to occur. This paper presents an overview of pattern clustering methods from a statistical pattern recognition perspective, with the goal of providing useful advice and references to fundamental concepts accessible to the broad community of clustering practitioners. We present a taxonomy of clustering techniques and identify cross-cutting themes and recent advances. We also describe some important applications of clustering algorithms, such as image segmentation, object recognition, and information retrieval.

Citation

Pages: 264-323

Page count: 60

