Performance of HMM-based speech recognizers with discriminative state-weights

Cited by: 5

Authors:

Kwon, OW [1]

Un, CK [1]

Affiliations:

[1] KOREA ADV INST SCI & TECHNOL, DEPT ELECT ENGN, COMMUN RES LAB, TAEJON 305701, SOUTH KOREA

Source:

SPEECH COMMUNICATION | 1996 / Vol. 19 / No. 03

Keywords:

speech recognition; hidden Markov models;

DOI:

10.1016/0167-6393(96)00035-0

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

In this paper, assuming that the score of a speech utterance is a weighted sum of hidden Markov model (HMM) log state-likelihoods, we propose a new method of finding discriminative state-weights recursively using the generalized probabilistic descent method. By constraining the sum of the state-weights to the number of states in a recognition unit, the method can be implemented with only minor modification of the conventional parameter estimation method and the Viterbi algorithm. Compared with previous approaches, it does not increase complexity, and it can be applied to continuous speech recognition as well as isolated word recognition. To evaluate the performance of the state-weighted HMM recognizer, we perform two kinds of experiments, with phoneme-based and word-based state-weights, using various speech databases. Experimental results show that the recognizers with phoneme-based and word-based state-weights achieved 20% and 50% decreases in word error rate, respectively, for isolated word recognition, and a 5% decrease for continuous speech recognition. Our approach yields recognition accuracies comparable to those of previous approaches for continuous speech recognition, but it is much simpler to implement.
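
As an illustration of the scoring rule the abstract describes, the following is a minimal sketch, not the authors' implementation. It assumes the utterance score is the sum, over frames, of the state-weight times the log state-likelihood along a fixed state alignment, and it ignores transition probabilities. The function names `normalize_weights` and `weighted_path_score` and the array layout are hypothetical, and the weight training by generalized probabilistic descent is not shown.

```python
import numpy as np


def normalize_weights(raw_weights):
    """Rescale positive weights so that they sum to the number of states.

    This mirrors the constraint named in the abstract; how the paper
    enforces it during training is not specified here (assumption).
    """
    w = np.asarray(raw_weights, dtype=float)
    return w * (len(w) / w.sum())


def weighted_path_score(frame_loglik, state_path, weights):
    """Weighted sum of log state-likelihoods along a state alignment.

    frame_loglik: array of shape (T, N), log-likelihood of frame t in state j
    state_path:   length-T sequence of state indices (e.g. a Viterbi alignment)
    weights:      length-N array of state-weights
    """
    frame_loglik = np.asarray(frame_loglik, dtype=float)
    path = np.asarray(state_path, dtype=int)
    weights = np.asarray(weights, dtype=float)
    # Pick the log-likelihood of each frame in its aligned state.
    per_frame = frame_loglik[np.arange(len(path)), path]
    # Scale each frame's term by the weight of its state and accumulate.
    return float(np.sum(weights[path] * per_frame))


if __name__ == "__main__":
    loglik = np.array([[-1.0, -5.0, -9.0],
                       [-6.0, -2.0, -7.0],
                       [-8.0, -3.0, -4.0]])
    path = [0, 1, 2]
    w = normalize_weights([1.0, 1.0, 2.0])
    print(weighted_path_score(loglik, path, np.ones(3)))  # unweighted baseline
    print(weighted_path_score(loglik, path, w))           # state-weighted score
```

With all weights equal to one, the score reduces to the ordinary unweighted log-likelihood sum, which is why a conventional Viterbi recognizer would need only a small change under this assumption.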

Citation

Pages: 197-205

Page count: 9

9 references in total

[1] A THEORY OF ADAPTIVE PATTERN CLASSIFIERS [J].

AMARI, S .

IEEE TRANSACTIONS ON ELECTRONIC COMPUTERS, 1967, EC16 (03) :299-+

[2] Discriminative Analysis of Distortion Sequences in Speech Recognition [J].

Chang, Pao-Chung ;

Chen, Sin-Horng ;

Juang, Biing-Hwang .

IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 1993, 1 (03) :326-333

[3]

Chou W., 1992, P IEEE ICASSP 92, P473

[4]

CHOU W, 1993, P IEEE INT C AC SPEE, V2, P652

[5] Multilayer perceptrons for state-dependent weightings of HMM likelihoods [J].

Chung, YJ ;

Un, CK .

SPEECH COMMUNICATION, 1996, 18 (01) :79-89

[6]

Lee C. H., 1990, Computer Speech and Language, V4, P127, DOI 10.1016/0885-2308(90)90002-N

[7] A FRAME-SYNCHRONOUS NETWORK SEARCH ALGORITHM FOR CONNECTED WORD RECOGNITION [J].

LEE, CH ;

RABINER, LR .

IEEE TRANSACTIONS ON ACOUSTICS SPEECH AND SIGNAL PROCESSING, 1989, 37 (11) :1649-1658

[8] Speech Recognition Using Weighted HMM and Subspace Projection Approaches [J].

Su, Keh-Yih ;

Lee, Chin-Hui .

IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 1994, 2 (01) :69-79

[9]

WOLFERSTETTER F, 1994, P INT C SPOI LANG PR, P219
