High resolution speech feature parametrization for monophone-based stressed speech recognition

Cited by: 43
Authors
Sarikaya, R [1]
Hansen, JHL [1]
Affiliations
[1] Univ Colorado, Ctr Spoken Language Res, Robust Speech Proc Lab, Boulder, CO 80309 USA
Keywords
feature extraction; speech recognition; speech under stress; wavelet analysis
DOI
10.1109/97.847363
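Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809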
Abstract
This letter investigates the impact of stress on monophone speech recognition accuracy and proposes a new set of acoustic parameters based on high resolution wavelet analysis. The two parameter schemes are termed wavelet packet parameters (WPP) and subband-based cepstral parameters (SBC). The performance of these features is compared to that of traditional Mel-frequency cepstral coefficients (MFCC) for stressed speech monophone recognition. The speaking styles considered are neutral, angry, loud, and Lombard effect speech from the SUSAS database. An overall monophone recognition improvement of 20.4% and 17.2% is achieved for loud and angry stressed speech, respectively, with a corresponding increase in the neutral monophone rate of 9.9% over MFCC parameters.
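The abstract names subband-based cepstral parameters without defining them. As a rough illustration of the general idea only, the sketch below splits a frame into subbands with a full wavelet packet tree, takes log subband energies, and applies a cosine transform to obtain cepstrum-like coefficients. Everything specific here is an assumption and not taken from the letter: the Haar filter, the uniform depth-3 tree, the frame length, the number of coefficients, and all function names are hypothetical choices made to keep the example short.

```python
import math


def haar_split(x):
    """One Haar analysis step on an even-length list: (lowpass, highpass)."""
    s = 1.0 / math.sqrt(2.0)
    low = [(x[i] + x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    high = [(x[i] - x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    return low, high


def wavelet_packet_leaves(frame, depth):
    """Full wavelet packet tree (every node is split, not only the lowpass one).

    Leaves are returned in order of increasing frequency. A highpass branch
    mirrors the spectrum, so the children of odd-indexed nodes are swapped.
    The frame length must be divisible by 2**depth.
    """
    nodes = [list(frame)]
    for _ in range(depth):
        next_nodes = []
        for idx, node in enumerate(nodes):
            low, high = haar_split(node)
            next_nodes.extend([high, low] if idx % 2 else [low, high])
        nodes = next_nodes
    return nodes


def subband_log_energies(frame, depth, floor=1e-12):
    """Log of the mean squared coefficient in each subband (floored to avoid log(0))."""
    leaves = wavelet_packet_leaves(frame, depth)
    return [math.log(max(sum(c * c for c in leaf) / len(leaf), floor))
            for leaf in leaves]


def subband_cepstra(frame, depth=3, n_coeffs=4):
    """Cosine transform of the log subband energies (assumed form, k = 1..n_coeffs)."""
    log_e = subband_log_energies(frame, depth)
    n_bands = len(log_e)
    return [sum(log_e[i] * math.cos(k * (i + 0.5) * math.pi / n_bands)
                for i in range(n_bands))
            for k in range(1, n_coeffs + 1)]


# Hypothetical 64-sample frame; real speech frames would come from audio.
frame = [math.sin(0.3 * n) for n in range(64)]
features = subband_cepstra(frame)
```

The structure mirrors the usual MFCC pipeline, with the Mel filterbank replaced by a wavelet packet filterbank. A practical system would likely use a longer wavelet filter and a non-uniform tree matched to auditory frequency resolution.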
Pages: 182-185
Page count: 4