Fast multiscale clustering and manifold identification

被引:56
作者
Kushnir, Dan [1 ]
Galun, Meirav [1 ]
Brandt, Achi [1 ]
机构
[1] Weizmann Inst Sci, Dept Comp Sci & Appl Math, IL-76100 Rehovot, Israel
基金
以色列科学基金会;
关键词
algebraic multigrid (AMG); aggregation; graph partitioning; similarity-based clustering; manifold; data analysis; astrophysical models;
D O I
10.1016/j.patcog.2006.04.007
中图分类号
TP18 [人工智能理论];
学科分类号
081104 [模式识别与智能系统]; 0812 [计算机科学与技术]; 0835 [软件工程]; 1405 [智能科学与技术];
摘要
We present a novel multiscale clustering algorithm inspired by algebraic multigrid techniques. Our method begins with assembling data points according to local similarities. It uses an aggregation process to obtain reliable scale-dependent global properties, which arise from the local similarities. As the aggregation process proceeds, these global properties affect the formation of coherent clusters. The global features that can be utilized are for example density, shape, intrinsic dimensionality and orientation. The last three features are a part of the manifold identification process which is performed in parallel to the clustering process. The algorithm detects clusters that are distinguished by their multiscale nature, separates between clusters with different densities, and identifies and resolves intersections between clusters. The algorithm is tested on synthetic and real data sets, its running time complexity is linear in the size of the data set. (c) 2006 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved.
引用
收藏
页码:1876 / 1891
页数:16
相关论文
共 34 条
[1]
[Anonymous], 2000, Geometry, Spinors and Applications
[2]
Laplacian eigenmaps for dimensionality reduction and data representation [J].
Belkin, M ;
Niyogi, P .
NEURAL COMPUTATION, 2003, 15 (06) :1373-1396
[3]
Data clustering using a model granular magnet [J].
Blatt, M ;
Wiseman, S ;
Domany, E .
NEURAL COMPUTATION, 1997, 9 (08) :1805-1842
[4]
Brandt A., 1982, Report
[5]
Brandt A., 2003, Multilevel Optimization in VLSICAD, P1
[6]
Cox T. F., 1994, MULTIDIMENSIONAL SCA
[7]
MAXIMUM LIKELIHOOD FROM INCOMPLETE DATA VIA EM ALGORITHM [J].
DEMPSTER, AP ;
LAIRD, NM ;
RUBIN, DB .
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-METHODOLOGICAL, 1977, 39 (01) :1-38
[8]
Duda R.O., 2001, Pattern Classification, V2nd
[9]
Bagging for path-based clustering [J].
Fischer, B ;
Buhmann, JM .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2003, 25 (11) :1411-1415
[10]
FISCHER B, 2004, NEWS PHYSL SCI, V16