Maximum Likelihood PSD Estimation for Speech Enhancement in Reverberation and Noise

Cited by: 60
Authors
Kuklasinski, Adam [1 ,2 ]
Doclo, Simon [3 ,4 ]
Jensen, Soren Holdt [2 ]
Jensen, Jesper [1 ,2 ]
Affiliations
[1] Oticon AS, DK-2765 Smørum, Denmark
[2] Aalborg Univ, Dept Elect Syst, Signal & Informat Proc Sect, DK-9220 Aalborg, Denmark
[3] Carl von Ossietzky Univ Oldenburg, Dept Med Phys & Acoust, D-26111 Oldenburg, Germany
[4] Carl von Ossietzky Univ Oldenburg, Cluster Excellence Hearing4all, D-26111 Oldenburg, Germany
Keywords
Cramér-Rao lower bound; maximum likelihood estimation; microphone array; PSD estimation; reverberation; multichannel Wiener filter; square error estimation; coefficients; reduction
DOI
10.1109/TASLP.2016.2573591
Chinese Library Classification number
O42 [Acoustics];
Subject classification codes
070206 ; 082403 ;
Abstract
In this contribution, we focus on the problem of power spectral density (PSD) estimation from multiple microphone signals in reverberant and noisy environments. The PSD estimation method proposed in this paper is based on the maximum likelihood (ML) methodology. In particular, we derive a novel ML PSD estimation scheme that is suitable for sound scenes which, besides speech and reverberation, consist of an additional noise component whose second-order statistics are known. The proposed algorithm is shown to outperform an existing similar algorithm in terms of PSD estimation accuracy. Moreover, it is shown numerically that the mean-squared estimation error achieved by the proposed method is near the limit set by the corresponding Cramér-Rao lower bound. The speech dereverberation performance of a multichannel Wiener filter based on the proposed PSD estimators is measured using several instrumental measures and is shown to be higher than when the competing estimator is used. Moreover, we perform a speech intelligibility test in which we demonstrate that both the proposed and the competing PSD estimators lead to similar intelligibility improvements.
Pages: 1599-1612
Number of pages: 14