Maximum Mutual Information Neural Networks for Hybrid Connectionist-HMM Speech Recognition Systems

Cited by: 24

Authors:

Rigoll, Gerhard [1]

Affiliations:

[1] Univ Duisburg, Dept Comp Sci, Fac Elect Engn, Duisburg, Germany

Source:

IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING | 1994 / Vol. 2 / No. 01

DOI:

10.1109/89.260360

Chinese Library Classification:

O42 [Acoustics];

Subject classification codes:

070206 ; 082403 ;

Abstract:

This paper proposes a novel approach for a hybrid connectionist-hidden Markov model (HMM) speech recognition system based on the use of a neural network as a vector quantizer. The neural network is trained with a new learning algorithm offering the following innovations: 1) It is an unsupervised learning algorithm for perceptron-like neural networks that are usually trained in supervised mode. 2) Information theory principles are used as learning criteria, making the network especially suitable for combination with an HMM-based speech recognition system. 3) The neural network is trained not with the standard error-backpropagation algorithm but with a newly developed self-organizing learning approach. The hybrid system with the neural vector quantizer achieves a 25% error reduction compared with the same HMM system using a standard k-means vector quantizer. The training algorithm can be further refined by combining unsupervised and supervised learning algorithms. Finally, it is demonstrated how the new learning approach can be applied to multiple-feature hybrid speech recognition systems, using a joint information theory-based optimization procedure for the multiple neural codebooks, resulting in a 30% error reduction.
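The abstract names information theory as the learning criterion, and the title names maximum mutual information, but the record does not give the exact objective. As a rough illustration only, the sketch below estimates the mutual information between vector-quantizer labels and class labels from paired samples. The function name `mutual_information`, the toy data, and the choice of a count-based estimate in bits are all assumptions made for this example, not the paper's algorithm.

```python
import math
from collections import Counter


def mutual_information(labels, classes):
    """Estimate I(Y; W) in bits from paired samples (hypothetical helper).

    labels  -- codebook indices produced by a vector quantizer
    classes -- class identities (e.g. phoneme labels) for the same frames
    """
    if len(labels) != len(classes):
        raise ValueError("labels and classes must have the same length")
    n = len(labels)
    joint = Counter(zip(labels, classes))  # counts of (y, w) pairs
    count_y = Counter(labels)              # marginal counts of y
    count_w = Counter(classes)             # marginal counts of w
    mi = 0.0
    for (y, w), c in joint.items():
        # p(y, w) * log2( p(y, w) / (p(y) * p(w)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (count_y[y] * count_w[w]))
    return mi


# Toy data (assumed for illustration): four frames from two classes.
classes = ["a", "a", "b", "b"]
informative = [0, 0, 1, 1]    # labels that track the class exactly
uninformative = [0, 1, 0, 1]  # labels independent of the class

mi_good = mutual_information(informative, classes)
mi_bad = mutual_information(uninformative, classes)
print(mi_good, mi_bad)
```

Under this toy setup, a quantizer whose labels track the classes scores higher than one whose labels are independent of them, which is the sense in which a mutual-information criterion would favor one codebook over another.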

Pages: 175-184

Page count: 10
