HOW TO CHOOSE A REPRESENTATIVE SUBSET FROM A SET OF DATA IN MULTIDIMENSIONAL SPACE

Cited by: 13
Author
CHAUDHURI, BB
Affiliation
[1] Electronics and Communication Sciences Unit, Indian Statistical Institute, 203 B.T. Road, Calcutta 700 035
Keywords
CLUSTERING; SEEDPOINT SELECTION; PATTERN RECOGNITION;
DOI
10.1016/0167-8655(94)90151-1
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Given a set of N points in multi-dimensional space, it may be necessary to choose a subset of n representative points. For example, in clustering problems, it is necessary to choose a few seed points around which the clusters may grow. This problem may be posed as that of choosing one out of every k data when $\lfloor N/n \rfloor = k$. In our proposed method, the data points are ordered by decreasing density. The datum topping the ordered list is chosen and its k-1 nearest neighbours are deleted from the ordered list. From the remaining data, the one currently topping the list is chosen. The process is repeated until the data are exhausted. The problem of a more general choice of n is also addressed.
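
The selection loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the abstract does not say how density is estimated, so the sketch assumes a simple estimate (the inverse of the mean distance to the nearest neighbours), and it assumes that the k-1 neighbours to delete are taken from the points still on the list. The function name `select_representatives` and the parameter layout are hypothetical.

```python
import math


def select_representatives(points, n):
    """Return indices of roughly n representative points (density-ordered thinning)."""
    N = len(points)
    if not 1 <= n <= N:
        raise ValueError("n must satisfy 1 <= n <= N")
    k = N // n  # one representative is chosen out of every k data

    # Pairwise Euclidean distances; O(N^2) is acceptable for a small sketch.
    dist = [[math.dist(p, q) for q in points] for p in points]

    # ASSUMPTION: density is approximated by the inverse of the mean distance
    # to the m nearest neighbours. The abstract does not name an estimator.
    m = max(1, min(k, N - 1))

    def density(i):
        nearest = sorted(dist[i][j] for j in range(N) if j != i)[:m]
        if not nearest:  # a single-point data set has no neighbours
            return 0.0
        mean = sum(nearest) / len(nearest)
        return math.inf if mean == 0 else 1.0 / mean

    # Order the data by decreasing density.
    order = sorted(range(N), key=density, reverse=True)

    alive = set(range(N))  # points still on the ordered list
    chosen = []
    for i in order:
        if i not in alive:
            continue  # already deleted as a neighbour of an earlier choice
        chosen.append(i)
        alive.discard(i)
        # ASSUMPTION: the k-1 nearest neighbours are taken among surviving points.
        neighbours = sorted(alive, key=lambda j: dist[i][j])[: k - 1]
        alive.difference_update(neighbours)
    return chosen


# Two well-separated groups of three points each; ask for n = 2 representatives.
data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
chosen = select_representatives(data, 2)
```

Under these assumptions each pass removes k points from the list, so the sketch returns exactly n indices when N is a multiple of n and may return slightly more otherwise; the more general choice of n mentioned in the abstract is not covered here.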
Pages: 893-899
Page count: 7
Related papers
6 items in total
[1] ASTRAHAN MM, 1970, AD709067
[2] BALL GH, 1967, AD822174
[3] CHAUDHURI D, 1993, UNPUB IEEE T PATTERN
[4] MACQUEEN JB, 1967, AD669871, P281
[5] MCRAE DJ, 1971, BEHAV SCI, V16, P423
[6] PRAKASA RAO BLS, 1983, NONPARAMETRIC FUNCTI