Teager energy based feature parameters for speech recognition in car noise

Cited by: 74
Authors
Jabloun, F [1]
Çetin, AE
Erzin, E
Affiliations
[1] Bilkent Univ, Dept Elect & Elect Engn, TR-06533 Ankara, Turkey
[2] Lucent Technol, Whippany, NJ 07981 USA
Keywords
Mel-scale; multirate signal processing; speech recognition; Teager energy operator
DOI
10.1109/97.789604
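Chinese Library Classification (CLC) codes
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract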
In this letter, a new set of speech feature parameters based on multirate signal processing and the Teager energy operator is introduced. The speech signal is first divided into nonuniform subbands in mel-scale using a multirate filterbank, then the Teager energies of the subsignals are estimated. Finally, the feature vector is constructed by log-compression and inverse discrete cosine transform (DCT) computation. The new feature parameters have robust speech recognition performance in the presence of car engine noise.
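The energy-estimation and log-compression steps described in the abstract can be illustrated with a minimal sketch. It assumes the standard discrete form of the Teager energy operator, psi[n] = x[n]^2 - x[n-1] * x[n+1], and takes the mean absolute value over a frame as the subband energy estimate. The abstract does not specify the exact estimator, so that choice, the helper name `log_teager_energy`, and the `eps` guard are assumptions made for illustration. The mel-scale multirate filterbank and the final inverse DCT are not shown.

```python
import numpy as np

def teager_energy(x):
    """Discrete Teager energy: psi[n] = x[n]^2 - x[n-1] * x[n+1]."""
    x = np.asarray(x, dtype=float)
    # Defined for interior samples only, so the output is two samples shorter.
    return x[1:-1] ** 2 - x[:-2] * x[2:]

def log_teager_energy(subband, eps=1e-12):
    """Log of the mean absolute Teager energy of one subband frame.

    Hypothetical helper: the averaging and the eps guard are assumptions,
    not details taken from the letter.
    """
    psi = teager_energy(subband)
    return float(np.log(np.mean(np.abs(psi)) + eps))

# Sanity check on a pure tone A*cos(omega*n + phi):
# the operator returns the constant A^2 * sin(omega)^2 at every sample.
A, omega = 2.0, 0.3
n = np.arange(200)
tone = A * np.cos(omega * n + 0.1)
psi = teager_energy(tone)
expected = A ** 2 * np.sin(omega) ** 2
```

For a pure tone the output depends on both amplitude and frequency, which is the property that distinguishes the Teager energy from the ordinary squared-amplitude energy.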
Pages: 259-261
Number of pages: 3