Bayesian classification for data from the same unknown class

Cited by: 26
Authors
Huang, HJ [1]
Hsu, CN
Affiliations
[1] Natl Chiao Tung Univ, Dept Comp & Informat Sci, Hsinchu 300, Taiwan
[2] Acad Sinica, Inst Informat Sci, Taipei 115, Taiwan
Source
IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART B-CYBERNETICS | 2002 / Vol. 32 / No. 02
Keywords
classification; machine learning; naive Bayes classifier; speaker recognition;
DOI
10.1109/3477.990870
CLC number
TP [Automation technology, computer technology];
Discipline classification code
0812 [Computer science and technology];
Abstract
In this paper, we address the problem of how to classify a set of query vectors that belong to the same unknown class. Sets of data known to be sampled from the same class are naturally available in many application domains, such as speaker recognition. We refer to these sets as homologous sets. We show how to take advantage of homologous sets in classification to obtain improved accuracy over classifying each query vector individually. Our method, called homologous naive Bayes (HNB), is based on the naive Bayes classifier, a simple algorithm shown to be effective in many application domains. HNB uses a modified classification procedure that classifies multiple instances as a single unit. Compared with a voting method and several other variants of naive Bayes classification, HNB significantly outperforms these methods in a variety of test data sets, even when the number of query vectors in the homologous sets is small. We also report a successful application of HNB to speaker recognition. Experimental results show that HNB can achieve classification accuracy comparable to the Gaussian mixture model (GMM), the most widely used speaker recognition approach, while using less time for both training and classification.
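The abstract does not state HNB's decision rule, so the following is only a minimal sketch of the general idea of classifying a homologous set as a single unit versus voting over per-vector decisions. It assumes that the set is scored jointly by applying the class prior once and summing naive Bayes log-likelihoods over every query vector; the paper's actual procedure may differ. The Gaussian feature model, the function names, and `TOY_MODEL` are all hypothetical and chosen for illustration.

```python
import math
from collections import Counter


def log_gaussian(x, mean, var):
    """Log density of a univariate Gaussian (assumed feature model)."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def log_likelihood(vec, params):
    """Naive Bayes log-likelihood: features are independent given the class."""
    return sum(log_gaussian(x, m, v) for x, (m, v) in zip(vec, params))


def classify_single(vec, model):
    """Ordinary naive Bayes decision for one query vector."""
    return max(
        model,
        key=lambda c: math.log(model[c]["prior"])
        + log_likelihood(vec, model[c]["params"]),
    )


def classify_by_voting(vectors, model):
    """Baseline: classify each vector separately, then take the majority."""
    votes = Counter(classify_single(v, model) for v in vectors)
    return votes.most_common(1)[0][0]


def classify_as_unit(vectors, model):
    """Assumed joint rule: one prior term plus log-likelihoods of all vectors."""
    def score(c):
        return math.log(model[c]["prior"]) + sum(
            log_likelihood(v, model[c]["params"]) for v in vectors
        )
    return max(model, key=score)


# Hypothetical two-class model with one feature: (mean, variance) per feature.
TOY_MODEL = {
    "a": {"prior": 0.5, "params": [(0.0, 1.0)]},
    "b": {"prior": 0.5, "params": [(1.0, 1.0)]},
}

# Three query vectors assumed to come from the same unknown class.
QUERY_SET = [[0.6], [0.6], [-2.0]]

if __name__ == "__main__":
    print(classify_by_voting(QUERY_SET, TOY_MODEL))  # b
    print(classify_as_unit(QUERY_SET, TOY_MODEL))    # a
```

In this toy case two vectors lean weakly toward class `b` and one leans strongly toward class `a`. Voting discards the strength of the evidence and returns `b`, while the joint score accumulates it and returns `a`.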
Pages: 137-145
Page count: 9