LVQ-BASED SHIFT-TOLERANT PHONEME RECOGNITION

Cited by: 14
Authors
MCDERMOTT, E
KATAGIRI, S
Affiliation
[1] ATR Auditory and Visual Perception Research Laboratories, Sanpeidani, Inuidani, Seika-cho, Soraku-gun, Kyoto 619-02, Japan
DOI
10.1109/78.136545
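Chinese Library Classification (CLC) codes
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract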
In this paper we describe a shift-tolerant neural network architecture for phoneme recognition. Our system is based on learning vector quantization (LVQ) algorithms, recently developed by Kohonen (1986, 1988), which pay close attention to approximating optimal decision lines in a discrimination task. Recognition performances in the 98%-99% correct range were obtained for LVQ networks aimed at speaker-dependent recognition of phonemes in small but ambiguous Japanese phonemic classes. A correct recognition rate of 97.7% was achieved by a large LVQ network covering all Japanese consonants. These recognition results are as good as those obtained in the time delay neural network system developed by Waibel et al. (1989), and suggest that LVQ could be the basis for a high performance speech recognition system.
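As a rough illustration of the kind of learning rule the abstract refers to, the sketch below implements the basic LVQ1 update: the reference vector nearest to each training input is moved toward the input when their class labels match and away from it otherwise. This is only a minimal sketch on toy two-dimensional data. It does not reproduce the shift-tolerant architecture or the exact LVQ variant used in the paper, and the function names (nearest, lvq1_train, classify), the linearly decaying learning rate, and the sample data are all assumptions made for the example.

```python
import random

def nearest(codebook, x):
    """Index of the reference vector closest to x (squared Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda i: sum((w - v) ** 2 for w, v in zip(codebook[i][0], x)))

def lvq1_train(codebook, samples, alpha=0.1, epochs=20, seed=0):
    """Basic LVQ1 training loop (hypothetical helper, not the paper's code).

    codebook: list of [vector, label] pairs, updated in place
    samples:  list of (vector, label) pairs
    """
    rng = random.Random(seed)
    order = list(samples)
    for epoch in range(epochs):
        # Linearly decaying learning rate; the schedule is an assumption.
        rate = alpha * (1.0 - epoch / epochs)
        rng.shuffle(order)
        for x, label in order:
            i = nearest(codebook, x)
            w, w_label = codebook[i]
            # Attract the winner if its label matches, repel it otherwise.
            sign = 1.0 if w_label == label else -1.0
            codebook[i][0] = [wj + sign * rate * (xj - wj) for wj, xj in zip(w, x)]
    return codebook

def classify(codebook, x):
    """Label of the reference vector nearest to x."""
    return codebook[nearest(codebook, x)][1]

# Toy data: two well-separated classes in 2-D (illustrative only).
samples = [([0.0, 0.1], "a"), ([0.2, 0.0], "a"), ([0.1, 0.2], "a"),
           ([1.0, 0.9], "b"), ([0.8, 1.0], "b"), ([0.9, 0.8], "b")]
codebook = [[[0.4, 0.4], "a"], [[0.6, 0.6], "b"]]
lvq1_train(codebook, samples)
```

After training, classify(codebook, x) labels a new input by its nearest reference vector, so the boundary between classes is determined by where the reference vectors end up.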
Pages: 1398-1411
Page count: 14
Related papers
20 items in total
[1] BAHL LR, 1988, P ICASSP 88 NEW YORK, P493
[2] DUDA RO, 1973, PATTERN CLASSIFICATI, pCH2
[3] HAFFNER P, 1988, RTI0058 ATR INT TEL
[4] IWAMIDA H, 1990, P INT C ACOUST SPEEC
[5] IWAMIDA H, 1989, TRA0061 ATR TECH REP
[6] KOHONEN T, 1988, SELF ORG ASS MEMORY, P199
[7] KOHONEN T, 1988, IEEE COMPUTER MAG, V21
[8] KOHONEN T, 1986, TKKFA601 HELS U TECH
[9] KOHONEN T, 1988, JUL IEEE P ICNN, V1, P61
[10] LIPPMANN RP, 1989, NEURAL COMPUTATION, V1