An iterative longest matching segment approach to speech enhancement with additive noise and channel distortion

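Cited by: 9
Authors
Ming, Ji [1]
Crookes, Danny [1]
Affiliations
[1] Queen's University Belfast, School of Electronics, Electrical Engineering and Computer Science, Belfast BT7 1NN, Antrim, Northern Ireland
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK
Keywords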
Corpus-based speech modeling; Longest matching segment; Noisy speech; Channel distortion; Speech enhancement; Speech recognition; HISTOGRAM EQUALIZATION; MODEL; RECOGNITION; REPRESENTATION; COMPENSATION;
DOI
10.1016/j.csl.2014.04.003
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents a new approach to speech enhancement from single-channel measurements involving both noise and channel distortion (i.e., convolutional noise), and demonstrates its applications for robust speech recognition and for improving noisy speech quality. The approach is based on finding longest matching segments (LMS) from a corpus of clean, wideband speech. The approach adds three novel developments to our previous LMS research. First, we address the problem of channel distortion as well as additive noise. Second, we present an improved method for modeling noise for speech estimation. Third, we present an iterative algorithm which updates the noise and channel estimates of the corpus data model. In experiments using speech recognition as a test with the Aurora 4 database, the use of our enhancement approach as a preprocessor for feature extraction significantly improved the performance of a baseline recognition system. In another comparison against conventional enhancement algorithms, both the PESQ and the segmental SNR ratings of the LMS algorithm were superior to the other methods for noisy speech enhancement. (C) 2014 Elsevier Ltd. All rights reserved.
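The abstract reports segmental SNR as one of its two quality measures. As a rough illustration of how that metric is commonly computed, here is a minimal sketch in plain Python. It is not taken from the paper: the function name `segmental_snr`, the frame length of 256 samples, and the clamping range of -10 dB to 35 dB are assumptions based on widespread convention, and the paper's own settings may differ.

```python
import math


def segmental_snr(clean, enhanced, frame_len=256, lo_db=-10.0, hi_db=35.0):
    """Average per-frame SNR in dB between a clean and an enhanced signal.

    frame_len, lo_db and hi_db are illustrative defaults (assumptions),
    not values taken from the paper.
    """
    n = min(len(clean), len(enhanced))
    frame_values = []
    # Use non-overlapping frames and ignore any incomplete final frame.
    for start in range(0, n - frame_len + 1, frame_len):
        s = clean[start:start + frame_len]
        e = enhanced[start:start + frame_len]
        signal_energy = sum(x * x for x in s)
        error_energy = sum((x - y) ** 2 for x, y in zip(s, e))
        if signal_energy == 0.0:
            # A silent reference frame has no defined SNR; skip it.
            continue
        if error_energy == 0.0:
            # Perfect reconstruction: use the upper clamp.
            snr_db = hi_db
        else:
            snr_db = 10.0 * math.log10(signal_energy / error_energy)
        # Clamp so that a few extreme frames do not dominate the mean.
        frame_values.append(min(max(snr_db, lo_db), hi_db))
    if not frame_values:
        raise ValueError("no usable frames")
    return sum(frame_values) / len(frame_values)


if __name__ == "__main__":
    reference = [1.0] * 512
    estimate = [0.9] * 512  # constant 10% amplitude error
    print(round(segmental_snr(reference, estimate), 3))
```

Averaging per-frame values in dB, rather than computing one global ratio, gives quiet frames the same weight as loud ones, which is why this measure is often reported alongside a perceptual score such as PESQ.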
Pages: 1269-1286
Number of pages: 18