A speech recognition method based on the sequential multi-layer perceptrons

Cited by: 16
Authors
Chen, WY
Chen, SH
Lin, CJ
Affiliations
[1] IND TECHNOL RES INST, HSINCHU, TAIWAN
[2] NATL CHIAO TUNG UNIV, HSINCHU, TAIWAN
Keywords
neural network; generalized probabilistic descent; multi-layer perceptrons; hidden Markov models; speech recognition; dynamic programming
DOI
10.1016/0893-6080(95)00140-9
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This study proposes a novel speech recognition method based on multi-layer perceptrons (MLPs). The method combines the dynamic time warping capability of hidden Markov models (HMMs) directly with the discriminant-based learning of MLPs, so that a sequence of MLPs (SMLP) can serve as a word recognizer. Each MLP acts as a state recognizer that distinguishes one acoustic event, and the word recognizer is formed by cascading all state recognizers in series. The system gains the advantages of both HMM and MLP methods by training the SMLP with an algorithm that combines a dynamic programming (DP) procedure with a generalized probabilistic descent (GPD) algorithm. Two sub-syllable SMLP-based schemes are also studied by applying the method to the recognition of isolated Mandarin digits. Simulation results confirm that their performance is comparable to that of a well-modeled continuous Gaussian mixture density HMM trained with the minimum error criterion. The SMLP not only requires fewer trainable parameters than the HMM system but is also more convenient for analysing internal features. With the aid of internal feature selection, the least useful parameters of the SMLP can be discarded relatively easily without affecting its performance. Copyright (C) 1996 Elsevier Science Ltd
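The alignment step implied by the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each state recognizer has already produced a per-frame score (higher is better), and that frames are assigned to states left to right, starting in the first state and ending in the last. The function name `align_frames_to_states` and the score layout are hypothetical, and the GPD training step is not shown.

```python
import math


def align_frames_to_states(scores):
    """Align frames to a left-to-right chain of state recognizers.

    scores[t][s] is an assumed score of state recognizer s on frame t
    (higher is better). Returns (best_total, path), where path[t] is the
    state assigned to frame t. Each frame either stays in the current
    state or advances to the next one.
    """
    num_frames = len(scores)
    num_states = len(scores[0])
    if num_frames < num_states:
        raise ValueError("need at least as many frames as states")

    neg_inf = -math.inf
    best = [[neg_inf] * num_states for _ in range(num_frames)]
    back = [[0] * num_states for _ in range(num_frames)]
    best[0][0] = scores[0][0]  # the path must start in the first state

    for t in range(1, num_frames):
        for s in range(num_states):
            stay = best[t - 1][s]
            advance = best[t - 1][s - 1] if s > 0 else neg_inf
            if advance > stay:
                best[t][s] = advance + scores[t][s]
                back[t][s] = s - 1
            else:
                best[t][s] = stay + scores[t][s]
                back[t][s] = s

    # Backtrack from the last state at the last frame.
    path = [num_states - 1]
    for t in range(num_frames - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return best[num_frames - 1][num_states - 1], path


if __name__ == "__main__":
    # Toy scores for 4 frames and 2 state recognizers (made-up values).
    toy_scores = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
    print(align_frames_to_states(toy_scores))
```

In a scheme of the kind the abstract describes, the accumulated score of the best path would serve as the word-level score that a discriminative training criterion then adjusts; here the scores are fixed toy values.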
Pages: 655-669
Number of pages: 15