Improved speech modelling and recognition using a new training algorithm based on outlier-emphasis for nonstationary state HMM

Cited by: 1

Authors:

Chengalvarayan, R [1]

Affiliations:

[1] Lucent Technol, Bell Labs, Speech Proc Grp, Naperville, IL 60566 USA

Source:

SPEECH COMMUNICATION | 1998 / Vol. 26 / Issue 03

Keywords:

speech signal; speech recognition; discriminative training; hidden Markov models; outlier-emphasis; non-stationary states;

DOI:

10.1016/S0167-6393(98)00057-0

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

In this study, we develop a modified maximum likelihood algorithm for optimally estimating the state-dependent polynomial parameters in the nonstationary-state HMM. The newly devised training method controls the influence of outliers in the training data on the constructed models. For an alphabet recognition task, outlier emphasis resulted in improved performance. An error rate reduction of 14% is achieved for the linear trend and 7.5% is obtained for the stationary-state HMMs over the conventional models trained by the Viterbi algorithm based on the joint-state maximum likelihood criterion. The properties of the nonstationary-state HMM trained with the proposed approach are analysed by examining goodness-of-fit of the real speech data to the polynomial trajectories in the model. (C) 1998 Elsevier Science B.V. All rights reserved.

Citation

Pages: 191-201

Page count: 11

28 references in total

[1] A THEORY OF ADAPTIVE PATTERN CLASSIFIERS [J].

AMARI, S .

IEEE TRANSACTIONS ON ELECTRONIC COMPUTERS, 1967, EC16 (03) :299-+

[2] ARSLAN LM, 1996, P ICASSP, V2, P589

[3] BAHL L, 1986, P INT C AC SPEECH SI, V1, P49, DOI 10.1109/ICASSP.1986.1169179

[4] BROWN PF, 1987, THESIS CARNEGIE MELL

[5] An N-Best Candidates-Based Discriminative Training for Speech Recognition Applications [J].

Chen, Jung-Kuei ;

Soong, Frank K. .

IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 1994, 2 (01) :206-216

[6] Use of generalized dynamic feature parameters for speech recognition [J].

Chengalvarayan, R ;

Deng, L .

IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 1997, 5 (03) :232-242

[7] CHENGALVARAYAN R, 1997, P ICASSP, V2, P1415

[8] CHENGALVARAYAN R, 1996, P ICSLP, V2, P1049

[9] Chou W., 1994, International Journal of Pattern Recognition and Artificial Intelligence, V8, P5, DOI 10.1142/S0218001494000024

[10] COMPARISON OF PARAMETRIC REPRESENTATIONS FOR MONOSYLLABIC WORD RECOGNITION IN CONTINUOUSLY SPOKEN SENTENCES [J].

DAVIS, SB ;

MERMELSTEIN, P .

IEEE TRANSACTIONS ON ACOUSTICS SPEECH AND SIGNAL PROCESSING, 1980, 28 (04) :357-366
