Maximum Mutual Information Neural Networks for Hybrid Connectionist-HMM Speech Recognition Systems

Cited by: 24
Authors
Rigoll, Gerhard [1]
Affiliations
[1] Univ Duisburg, Dept Comp Sci, Fac Elect Engn, Duisburg, Germany
Source
IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING | 1994 / Vol. 2 / Issue 01
DOI
10.1109/89.260360
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206; 082403;
Abstract
This paper proposes a novel approach for a hybrid connectionist-hidden Markov model (HMM) speech recognition system based on the use of a neural network as a vector quantizer. The neural network is trained with a new learning algorithm offering the following innovations: 1) It is an unsupervised learning algorithm for perceptron-like neural networks, which are usually trained in supervised mode. 2) Information theory principles are used as learning criteria, making the network especially suitable for combination with an HMM-based speech recognition system. 3) The neural network is trained not with the standard error-backpropagation algorithm but with a newly developed self-organizing learning approach. The use of the hybrid system with the neural vector quantizer results in a 25% error reduction compared with the same HMM system using a standard k-means vector quantizer. The training algorithm can be further refined by using a combination of unsupervised and supervised learning algorithms. Finally, it is demonstrated how the new learning approach can be applied to multiple-feature hybrid speech recognition systems, using a joint information theory-based optimization procedure for the multiple neural codebooks, resulting in a 30% error reduction.
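As a rough illustration of the information-theoretic quantity named in the title, the sketch below estimates the mutual information between vector-quantizer labels and class labels from paired discrete samples. It is not the paper's training algorithm, and this record does not state the exact criterion used; the function name `mutual_information` and the toy label sequences are assumptions made for illustration only.

```python
import math
from collections import Counter


def mutual_information(labels, classes):
    """Estimate I(labels; classes) in bits from paired discrete samples.

    `labels` stands in for codebook indices emitted by a vector quantizer,
    `classes` for the class identity of each frame (both hypothetical here).
    """
    n = len(labels)
    joint = Counter(zip(labels, classes))
    label_counts = Counter(labels)
    class_counts = Counter(classes)
    mi = 0.0
    for (y, w), count in joint.items():
        # p(y, w) / (p(y) * p(w)) simplifies to count * n / (count_y * count_w)
        ratio = count * n / (label_counts[y] * class_counts[w])
        mi += (count / n) * math.log2(ratio)
    return mi


# Toy data (assumed): four frames drawn from two classes.
classes = [0, 0, 1, 1]
informative = [0, 0, 1, 1]    # quantizer labels that track the classes
uninformative = [0, 1, 0, 1]  # quantizer labels independent of the classes

mi_informative = mutual_information(informative, classes)
mi_uninformative = mutual_information(uninformative, classes)
```

On this toy data, the labels that track the classes carry more information about them than the labels that do not, which is the sense in which a quantizer can be scored by an information-theoretic criterion rather than by distortion alone.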
Pages: 175-184
Page count: 10