Probability density estimation from optimally condensed data samples

Cited by: 129
Authors
Girolami, M [1]
He, C [1]
Affiliations
[1] Univ Paisley, Sch Informat & Commun Technol, Appl Computat Intelligence Res Unit, Paisley PA1 2BE, Renfrew, Scotland
Keywords
kernel density estimation; Parzen window; data condensation; sparse representation
DOI
10.1109/TPAMI.2003.1233899
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The requirement to reduce the computational cost of evaluating a point probability density estimate when employing a Parzen window estimator is a well-known problem. This paper presents the Reduced Set Density Estimator, a kernel-based density estimator that employs a small percentage of the available data sample and is optimal in the L_2 sense. While requiring only O(N^2) optimization routines to estimate the required kernel weighting coefficients, the proposed method provides similar levels of performance accuracy and sparseness of representation as Support Vector Machine density estimation, which requires O(N^3) optimization routines and which has previously been shown to consistently outperform Gaussian Mixture Models. It is also demonstrated that the proposed density estimator consistently provides superior density estimates for similar levels of data reduction to that provided by the recently proposed Density-Based Multiscale Data Condensation algorithm and, in addition, has comparable computational scaling. An additional advantage of the proposed method is that no extra free parameters, such as regularization, bin width, or condensation ratios, are introduced, making this method a very simple and straightforward approach to providing a reduced set density estimator with comparable accuracy to that of the full sample Parzen density estimator.
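The idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes a 1-D Gaussian kernel, a fixed bandwidth `h`, a sample-average estimate of the cross term in the integrated squared error, and a plain projected-gradient solver chosen here only for brevity. All function names are hypothetical. The weights are constrained to be non-negative and to sum to one, so points whose weight is driven to zero drop out of the estimator.

```python
import math
import random


def gauss(u, h):
    """1-D Gaussian kernel of width h evaluated at offset u."""
    return math.exp(-0.5 * (u / h) ** 2) / (h * math.sqrt(2.0 * math.pi))


def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = sorted(v, reverse=True)
    cumsum, theta = 0.0, 0.0
    for j, uj in enumerate(u, start=1):
        cumsum += uj
        t = (cumsum - 1.0) / j
        if uj - t > 0.0:
            theta = t  # threshold from the last index that passes the test
    return [max(vi - theta, 0.0) for vi in v]


def matvec(C, w):
    return [sum(cij * wj for cij, wj in zip(row, w)) for row in C]


def ise_objective(w, C, d):
    """w'Cw - 2 w'd: integrated squared error up to a term independent of w."""
    return sum(wi * (cwi - 2.0 * di) for wi, cwi, di in zip(w, matvec(C, w), d))


def fit_reduced_set(x, h, iters=300):
    """Assumed formulation: minimise w'Cw - 2 w'd over the probability simplex."""
    n = len(x)
    # Integral of a product of two width-h Gaussians is a width-sqrt(2)*h Gaussian.
    s = math.sqrt(2.0) * h
    C = [[gauss(xi - xj, s) for xj in x] for xi in x]
    # Sample-average estimate of E[K_h(x, x_i)], i.e. the Parzen estimate at x_i.
    d = [sum(gauss(xi - xj, h) for xj in x) / n for xi in x]
    # Gradient is 2(Cw - d); the largest row sum of C bounds its top eigenvalue.
    row_sum_max = max(sum(row) for row in C)
    w = [1.0 / n] * n  # start from the full-sample Parzen weights
    for _ in range(iters):
        Cw = matvec(C, w)
        w = project_to_simplex(
            [wi - (cwi - di) / row_sum_max for wi, cwi, di in zip(w, Cw, d)]
        )
    return w, C, d


def density(t, x, w, h):
    """Evaluate the weighted kernel estimate at t, skipping zero-weight points."""
    return sum(wi * gauss(t - xi, h) for xi, wi in zip(x, w) if wi > 0.0)


random.seed(0)
x = [random.gauss(-2.0, 0.5) for _ in range(30)]
x += [random.gauss(1.5, 1.0) for _ in range(30)]
h = 0.5  # bandwidth picked by hand for this toy sample
w, C, d = fit_reduced_set(x, h)
kept = sum(1 for wi in w if wi > 0.0)
print(f"points with non-zero weight: {kept} of {len(x)}")
print(f"reduced-set density at 0.0: {density(0.0, x, w, h):.4f}")
```

Evaluating `density` costs one kernel evaluation per retained point instead of one per sample point, which is the saving the abstract refers to. How many points are retained depends on the data and the bandwidth.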
Pages: 1253-1264
Number of pages: 12