Component-based discriminative classification for hidden Markov models

Cited by: 20
Authors
Bicego, Manuele [1, 2]
Pekalska, Elzbieta [3]
Tax, David M. J. [4]
Duin, Robert P. W. [4]
Affiliations
[1] Univ Verona, Dept Comp Sci, I-37134 Verona, Italy
[2] Univ Sassari, DEIR, I-07100 Sassari, Italy
[3] Univ Manchester, Sch Comp Sci, Manchester M13 9PL, Lancs, England
[4] Delft Univ Technol, NL-2628 CD Delft, Netherlands
Funding
Engineering and Physical Sciences Research Council (UK);
Keywords
Hidden Markov models; Discriminative classification; Dimensionality reduction; Hybrid models; Generative embeddings;
DOI
10.1016/j.patcog.2009.03.023
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Hidden Markov models (HMMs) have been successfully applied to a wide range of sequence modeling problems. In the classification context, one of the simplest approaches is to train a single HMM per class. A test sequence is then assigned to the class whose HMM yields the maximum a posteriori (MAP) probability. This generative scenario works well when the models are correctly estimated. However, the results can become poor when improper models are employed, due to the lack of prior knowledge, poor estimates, violated assumptions or insufficient training data. To improve the results in these cases we propose to combine the descriptive strengths of HMMs with discriminative classifiers. This is achieved by training feature-based classifiers in an HMM-induced vector space defined by specific components of individual hidden Markov models. We introduce four major ways of building such vector spaces and study which trained combiners are useful in which context. Moreover, we motivate and discuss the merit of our method in comparison to dynamic kernels, in particular, to the Fisher Kernel approach. (C) 2009 Elsevier Ltd. All rights reserved.
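The generative baseline described in the abstract (one HMM per class, MAP assignment) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the HMMs are discrete-output with hand-set toy parameters, and all names (`log_likelihood`, `embed`, `map_classify`, `MODELS`) are hypothetical. The `embed` function maps a sequence to its vector of per-class log-likelihoods, which is one simple example of an HMM-induced vector space; the abstract does not say whether it matches any of the paper's four constructions.

```python
import numpy as np


def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: log P(obs | HMM) for a discrete-output HMM.

    start: (S,) initial state probabilities
    trans: (S, S) transition matrix, rows sum to 1
    emit:  (S, K) emission matrix, rows sum to 1
    """
    alpha = start * emit[:, obs[0]]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step, then weight by the emission probability.
            alpha = (alpha @ trans) * emit[:, obs[t]]
        scale = alpha.sum()
        log_lik += np.log(scale)
        alpha = alpha / scale  # rescale to avoid numerical underflow
    return log_lik


def embed(obs, models):
    """Map a sequence to the vector of its log-likelihoods under each class HMM."""
    return np.array([log_likelihood(obs, *m) for m in models])


def map_classify(obs, models, log_priors):
    """Generative MAP rule: argmax over classes of log-likelihood + log prior."""
    return int(np.argmax(embed(obs, models) + log_priors))


# Toy parameters (assumed for illustration only): class 0 mostly emits
# symbol 0, class 1 mostly emits symbol 1.
START = np.array([0.5, 0.5])
TRANS = np.array([[0.9, 0.1], [0.1, 0.9]])
MODELS = [
    (START, TRANS, np.array([[0.9, 0.1], [0.8, 0.2]])),
    (START, TRANS, np.array([[0.1, 0.9], [0.2, 0.8]])),
]
LOG_PRIORS = np.log(np.array([0.5, 0.5]))

label = map_classify([0, 0, 0, 0, 1, 0], MODELS, LOG_PRIORS)
features = embed([0, 0, 0, 0, 1, 0], MODELS)
```

In the hybrid setting the abstract proposes, vectors such as `features` would be computed for every training sequence and passed to a feature-based discriminative classifier instead of being fed to the `argmax` rule.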
Pages: 2637-2648
Page count: 12
Related papers
64 in total
[1]  
ALTUN Y, 2003, ICML, P3
[2]   Natural gradient works efficiently in learning [J].
Amari, S .
NEURAL COMPUTATION, 1998, 10 (02) :251-276
[3]  
Andreu G, 1997, 1997 IEEE INTERNATIONAL CONFERENCE ON NEURAL NETWORKS, VOLS 1-4, P1341, DOI 10.1109/ICNN.1997.616230
[4]  
[Anonymous], 1961, Adaptive Control Processes: a Guided Tour, DOI 10.1515/9781400874668
[5]  
[Anonymous], 2002, THESIS U NEW S WALES
[6]  
Arica N, 2000, INT C PATT RECOG, P924, DOI 10.1109/ICPR.2000.905592
[7]   Modeling splicing sites with pairwise correlations [J].
Arita, M ;
Tsuda, K ;
Asai, K .
BIOINFORMATICS, 2002, 18 :S27-S34
[8]  
BAHL L, 1986, P INT C AC SPEECH SI, V1, P49, DOI 10.1109/ICASSP.1986.1169179
[9]  
Baum L.E., 1970, INEQUALITIES, V3, P1
[10]   AN INEQUALITY WITH APPLICATIONS TO STATISTICAL ESTIMATION FOR PROBABILISTIC FUNCTIONS OF MARKOV PROCESSES AND TO A MODEL FOR ECOLOGY [J].
BAUM, LE ;
EAGON, JA .
BULLETIN OF THE AMERICAN MATHEMATICAL SOCIETY, 1967, 73 (03) :360-&