SPEECH RECOGNITION WITH HIERARCHICAL RECURRENT NEURAL NETWORKS

Cited by: 16
Authors
CHEN, WY
LIAO, YF
CHEN, SH
Affiliation
[1] Department of Communication Engineering, National Chiao Tung University, Hsinchu
Keywords
SPEECH RECOGNITION; HIERARCHICAL; RECURRENT NEURAL NETWORKS; GENERALIZED PROBABILISTIC DESCENT; DISCRIMINATIVE TRAINING;
DOI
10.1016/0031-3203(94)00145-C
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A hierarchical recurrent neural network (HRNN) for speech recognition is presented. The HRNN is trained by a generalized probabilistic descent (GPD) algorithm, which avoids the difficulty of empirically selecting an appropriate target function for training RNNs. Results indicate that the proposed HRNN can absorb the temporal variation of speech patterns while retaining effective discrimination capability. The scaling problem of RNNs is also greatly reduced. To verify its effectiveness, the system is realized using initial/final sub-syllable models for isolated Mandarin syllable recognition, and the experimental results confirm the effectiveness of the proposed HRNN.
Pages: 795-805
Number of pages: 11
Related papers
24 in total
[1] Discriminative Training of Dynamic Programming Based Speech Recognizers [J]. Chang, Pao-Chung; Juang, Biing-Hwang. IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 1993, 1 (02): 135-143
[2] CHEN SH, 1990, IEEE T COMMUN, V38, P1317
[3] CHEN WY, 1991, P IEEE WORKSH NEUR N, P376
[4] CHEN Y, 1992, P INT JT C NEUROL NE, V4, P743
[5] THE META-PI NETWORK - BUILDING DISTRIBUTED KNOWLEDGE REPRESENTATIONS FOR ROBUST MULTISOURCE PATTERN-RECOGNITION [J]. HAMPSHIRE, JB; WAIBEL, A. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 1992, 14 (07): 751-769
[6] HAMPSHIRE JB, 1990, P INT C ACOUST SPEEC, P165
[7] HILD H, 1993, P IEEE INT C ACOUSTI, V2, P255
[8] ISO K, 1993, P INT C AC SPEECH SI, V2, P283
[9] JACOBS R, 1988, P CONN MOD SUMM SCH, P144
[10] Juang B.-H., 1992, Journal of the Acoustical Society of Japan (E), V13, P333, DOI 10.1250/ast.13.333