Use of generalized dynamic feature parameters for speech recognition

Cited by: 12
Authors
Chengalvarayan, R [1]
Deng, L [1]
Affiliations
[1] UNIV WATERLOO, DEPT ELECT & COMP ENGN, WATERLOO, ON N2L 3G1, CANADA
Source
IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING | 1997 / Vol. 5 / No. 3
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
DOI
10.1109/89.568730
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
In this study, a new hidden Markov model that integrates generalized dynamic feature parameters into the model structure is developed and evaluated using maximum-likelihood (ML) and minimum-classification-error (MCE) pattern recognition approaches. In addition to the motivation of direct minimization of error rate, the MCE approach automatically eliminates the necessity of artificial constraints, which were essential for the model formulation based on the ML approach, on the weighting functions in the definition of the generalized dynamic parameters. We design the loss function for minimizing error rate specifically for the new model, and derive an analytical form of the gradient of the loss function that enables the implementation of the MCE approach. The convergence property of the training procedure based on the MCE approach is investigated, and the experimental results from a standard TIMIT phonetic classification task demonstrate a 13.4% error rate reduction compared with the ML approach.
Pages: 232-242
Number of pages: 11
Related papers
22 in total
[1]   A THEORY OF ADAPTIVE PATTERN CLASSIFIERS [J].
AMARI, S .
IEEE TRANSACTIONS ON ELECTRONIC COMPUTERS, 1967, EC16 (03) :299-+
[2]  
APPLEBAUM T, 1991, P IEEE INT C AC SPEE, V2, P985
[3]   A MAXIMUM-LIKELIHOOD APPROACH TO CONTINUOUS SPEECH RECOGNITION [J].
BAHL, LR ;
JELINEK, F ;
MERCER, RL .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 1983, 5 (02) :179-190
[4]  
Baum L.E., 1972, Inequalities III: Proceedings of the Third Symposium on Inequalities, V3, P1
[5]  
CHANG PC, 1992, P ICASSP 92 SAN FRAN, V1, P493
[6]  
CHENGALVARAYAN R, 1995, P IEEE INT C AC SPEE, V1, P373
[7]  
CHENGALVARAYAN R, 1995, THESIS U WATERLOO WA
[8]  
CHOW W, INT J PATTERN RECOGN, V8, P5
[9]  
DENG D, 1990, COMPUTER SPEECH LANG, V4, P345
[10]   CONTEXT-DEPENDENT MARKOV MODEL STRUCTURED BY LOCUS EQUATIONS - APPLICATIONS TO PHONETIC CLASSIFICATION [J].
DENG, L ;
BRAAM, D .
JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1994, 96 (04) :2008-2025