Auditory-based wavelet packet filterbank for speech recognition using neural network

Cited by: 12

Authors:

Gandhiraj, R. [1]

Sathidevi, P. S. [2]

Affiliations:

[1] Dr Mahalingam Coll Engg & Tech, ECE Dept, Pollachi, Tamil Nadu, India

[2] Nat Inst Technol Calicut, ECE Dept, Calicut, Kerala, India

Source:

ADCOM 2007: PROCEEDINGS OF THE 15TH INTERNATIONAL CONFERENCE ON ADVANCED COMPUTING AND COMMUNICATIONS | 2007

Keywords:

auditory-based; speech recognition; wavelet packet; neural network

DOI:

10.1109/ADCOM.2007.104

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812 [Computer Science and Technology]

Abstract:

A major problem of most speech recognition systems is their unsatisfactory robustness in noise. 'Feature extraction' based on the human inner ear leads to very robust speech understanding in noise, so a 'Model of Auditory Periphery' serves as the front-end of the speech recognition process. This paper describes two quantitative models of signal processing in the auditory system, (i) the Gamma Tone Filter Bank (GTFB) and (ii) the Wavelet Packet (WP), as front-ends for robust speech recognition. The auditory feature vectors were used to train a neural network, which classified them using the Back Propagation (BP) algorithm. System performance was measured as recognition rate at signal-to-noise ratios from -10 to 10 dB. The proposed system was compared with various front-ends and recognition methods: auditory features with Hidden Markov Model (HMM) and Layered Neural Network (LRNN); auditory features with Mel Frequency Cepstral Coefficient (MFCC) and LRNN; and the vocal tract model: MFCC with HMM, and Dynamic Time Warping (DTW). The proposed models with gamma tone filter bank and wavelet packet front-ends were also compared with each other. The proposed system with wavelet packet as front-end and Back Propagation Neural Network (BPNN) as the recognition method was found to have a good recognition rate over -10 to 10 dB. Both speaker-independent and speaker-dependent recognition systems were designed, implemented and tested.
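As a rough illustration of what a wavelet packet front-end computes, the sketch below splits one speech frame into sub-bands and returns the log energy of each band as a feature vector. It is not the authors' implementation: the Haar filter, the full (uniform) tree, the depth of 3, and the function names `haar_split`, `wavelet_packet_leaves`, and `log_energy_features` are all assumptions made for this example. An auditory-based design would more likely use a non-uniform tree matched to critical bands.

```python
import math

def haar_split(x):
    """One Haar analysis step: return (low-pass, high-pass) halves of x."""
    s = 1.0 / math.sqrt(2.0)
    low = [(x[i] + x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    high = [(x[i] - x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    return low, high

def wavelet_packet_leaves(frame, depth):
    """Full wavelet packet tree: split every node (low and high) at each level.

    Assumption: a uniform tree. The returned 2**depth sub-bands are in
    natural tree order, which is not sorted by frequency.
    """
    nodes = [list(frame)]
    for _ in range(depth):
        next_nodes = []
        for node in nodes:
            low, high = haar_split(node)
            next_nodes.extend([low, high])
        nodes = next_nodes
    return nodes

def log_energy_features(frame, depth=3, floor=1e-12):
    """Log sub-band energies; `floor` avoids log(0) for silent bands."""
    return [
        math.log(sum(v * v for v in band) + floor)
        for band in wavelet_packet_leaves(frame, depth)
    ]

if __name__ == "__main__":
    # Hypothetical 64-sample frame: a sine wave standing in for speech.
    frame = [math.sin(2.0 * math.pi * 5.0 * n / 64.0) for n in range(64)]
    print(log_energy_features(frame))
```

Because the Haar transform is orthonormal, the sub-band energies sum to the energy of the input frame, which is a convenient sanity check. Feature vectors of this kind could then be fed to a classifier such as a back-propagation network.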

Citation

Pages: 666 / +

Page count: 2
