SPOKEN-WORD RECOGNITION USING DYNAMIC FEATURES ANALYZED BY TWO-DIMENSIONAL CEPSTRUM

Cited by: 12
Authors
ARIKI, Y [1]
MIZUTA, S [1]
NAGATA, M [1]
SAKAI, T [1]
Affiliations
[1] KYOTO UNIV,FAC ENGN,DEPT INFORMAT SCI,KYOTO 606,JAPAN
Source
IEE PROCEEDINGS-I COMMUNICATIONS SPEECH AND VISION | 1989, Vol. 136, No. 2
Keywords
Pattern recognition
DOI
10.1049/ip-i-2.1989.0017
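Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract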
Two-dimensional cepstrum (TDC) analysis and its application to word and monosyllable recognition are described. The TDC can simultaneously represent several different kinds of information contained in the speech waveform: static and dynamic features, as well as global and fine frequency structure. Noise reduction and speech enhancement can be easily performed using the TDC. Using word and monosyllable recognition experiments based on dynamic programming (DP) matching of a time sequence of the TDC, it is confirmed that the global static features (spectral envelope) and global dynamic features are both effective for speech recognition. A speaker-independent (noisy) word recognition algorithm is proposed which recognises the words based on the similarity of dynamic features. The algorithm employs linear matching instead of DP nonlinear matching, requires a small amount of memory, and shows high speed and high accuracy in recognition. At present, the recognition rate is 89.0% at ∞ dB and 70.0% at 0 dB SNR.
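As a rough illustration of how one matrix can hold both static and dynamic information, the sketch below computes a TDC-like representation with NumPy. It assumes the commonly used definition of the two-dimensional cepstrum as the two-dimensional inverse Fourier transform of the time sequence of log magnitude spectra; the record itself does not give the formula. The function names, frame length, hop size, and Hann window are illustrative choices, not the authors' settings.

```python
import numpy as np


def log_spectrogram(signal, frame_len=64, hop=32, eps=1e-10):
    """Log magnitude spectra of overlapping frames (rows = time frames).

    frame_len, hop, the Hann window and eps are illustrative assumptions.
    """
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    # eps keeps the logarithm finite if a spectral bin happens to be zero.
    return np.log(np.abs(np.fft.fft(frames, axis=1)) + eps)


def two_dimensional_cepstrum(log_spec):
    """2-D inverse DFT over time (axis 0) and frequency (axis 1).

    Assumed definition: the result is indexed by
    (time-modulation frequency, quefrency) and is complex in general.
    """
    return np.fft.ifft2(log_spec)


rng = np.random.default_rng(0)
signal = rng.standard_normal(512)  # stand-in for a speech waveform
log_spec = log_spectrogram(signal)
tdc = two_dimensional_cepstrum(log_spec)
```

Under this definition, row 0 of `tdc` equals the ordinary cepstrum of the time-averaged log spectrum (a static feature), while the remaining rows describe how the spectrum varies over time (dynamic features). Low quefrency columns correspond to the spectral envelope (global structure) and high quefrency columns to fine structure.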
Pages: 133-140
Page count: 8
Related papers
14 in total
[1] ARIKI Y, 1986, P ICASSP86, P97
[2] SEPARATION OF FRICATIVES FROM ASPIRATED PLOSIVES BY MEANS OF TEMPORAL SPECTRAL VARIATION [J]. Chan, Chorkin; Ng, K.W. IEEE Transactions on Acoustics, Speech, and Signal Processing, 1985, ASSP-33 (05): 1130-1137
[3] ON THE ROLE OF SPECTRAL TRANSITION FOR SPEECH-PERCEPTION [J]. FURUI, S. JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1986, 80 (04): 1016-1025
[4] SPEAKER-INDEPENDENT ISOLATED WORD RECOGNITION USING DYNAMIC FEATURES OF SPEECH SPECTRUM [J]. FURUI, S. IEEE TRANSACTIONS ON ACOUSTICS SPEECH AND SIGNAL PROCESSING, 1986, 34 (01): 52-59
[5] Gupta V. N., 1987, Proceedings: ICASSP 87. 1987 International Conference on Acoustics, Speech, and Signal Processing (Cat. No.87CH2396-0), P697
[6] IMAI S, 1976, T I ELECTRON COM A J, V59, P1096
[7] MINIMUM PREDICTION RESIDUAL PRINCIPLE APPLIED TO SPEECH RECOGNITION [J]. ITAKURA, F. IEEE TRANSACTIONS ON ACOUSTICS SPEECH AND SIGNAL PROCESSING, 1975, AS23 (01): 67-72
[8] NISHIMURA M, 1987, P ICASSP87, P1163
[9] SPEAKER-INDEPENDENT RECOGNITION OF ISOLATED WORDS USING CLUSTERING TECHNIQUES [J]. RABINER, LR; LEVINSON, SE; ROSENBERG, AE; WILPON, JG. IEEE TRANSACTIONS ON ACOUSTICS SPEECH AND SIGNAL PROCESSING, 1979, 27 (04): 336-349
[10] RABINER LR, 1987, DIGITAL PROCESSING S, pCH7