SPEAKER-INDEPENDENT ISOLATED DIGIT RECOGNITION - MULTILAYER PERCEPTRONS VS DYNAMIC TIME WARPING

Cited by: 12
Authors
BOTTOU, L [1]
SOULIE, FF [1]
BLANCHET, P [1]
LIENARD, JS [1]
Affiliations
[1] LAB INFORMAT MEAN & SCI INGN,ORSAY,FRANCE
Keywords
Dynamic time warping; Hidden cells states clustering; Isolated digits recognition; Speaker independence; Speech recognition; Time delay neural network;
DOI
10.1016/0893-6080(90)90028-J
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Former experiments have shown the benefit of using specific multi-layer architectures, the so-called time delay neural networks, for phoneme recognition (Waibel, Hanazawa, Hinton, Shikano, & Lang, 1988). Similar experiments on a speaker-independent task were also performed on a small set of minimal pairs (Bottou, 1988). In this paper we focus on a speaker-independent, global word recognition task with time delay networks. We first describe these networks as a way for learning feature extractors by constrained back-propagation. Such a time-delay network is shown to be capable of dealing with a near real-sized problem: French digit recognition. The results are discussed and compared, on the same data sets, with those obtained with a classical time warping system. © 1990.
Pages: 453-465
Number of pages: 13
Related papers
25 in total
[1] [Anonymous], 1987, LEARNING INTERNAL RE
[2] BOTTOU L, 1988, P NEURO NIMES 88 EC, V2, P371
[3] BOTTOU L, 1988, NEURO NIMES 88 EC, V2, P197
[4] BOURLARD H, 1988, ADV NEURAL INFORMATI
[5] BRIDLE JS, 1984, 1984 P I AC AUT M
[6] ELMAN JL, 1987, LEARNING HIDDEN STRU
[7] GAUVAIN JL, 1986, 1986 P IEEE C AC SPE
[8] GAUVAIN JL, 1983, P IEEE C ACOUSTIC SP
[9] HAFFNER P, 1988, SPEECH
[10] KOHONEN T, 1988, IEEE COMPUTER MA MAR, P11