Under-Determined Reverberant Audio Source Separation Using Local Observed Covariance and Auditory-Motivated Time-Frequency Representation

Cited by: 14
Authors
Duong, Ngoc Q. K. [1]
Vincent, Emmanuel [1]
Gribonval, Remi [1]
Affiliation
[1] Ctr Inria Rennes Bretagne Atlantique, INRIA, Rennes, France
Source
LATENT VARIABLE ANALYSIS AND SIGNAL SEPARATION | 2010 / Vol. 6365
Keywords
Under-determined convolutive source separation; nonuniform time-frequency representation; ERB transform; full-rank spatial covariance model;
DOI
10.1007/978-3-642-15995-4_10
Chinese Library Classification number
TP301 [Theory, methods];
Discipline classification code
081202;
Abstract
We consider the local Gaussian modeling framework for under-determined convolutive audio source separation, where the spatial image of each source is modeled as a zero-mean Gaussian variable with full-rank time- and frequency-dependent covariance. We investigate two methods to improve the accuracy of parameter estimation, based on the use of local observed covariance and auditory-motivated time-frequency representation. We derive an iterative expectation-maximization (EM) algorithm with a suitable initialization scheme. Experimental results over stereo synthetic reverberant mixtures of speech show the effectiveness of the proposed methods.
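As a rough illustration of the "local observed covariance" mentioned in the abstract, the sketch below averages the rank-1 outer products of a multichannel time-frequency representation over a small neighborhood of each bin. The function name, the rectangular unweighted window, and the `half_width` parameter are assumptions made for this example; the record does not specify the window shape or size the authors use.

```python
import numpy as np

def local_observed_covariance(X, half_width=1):
    """Empirical mixture covariance around each time-frequency bin.

    X: complex array of shape (frames, bins, channels).
    Returns an array of shape (frames, bins, channels, channels).
    Assumption: a rectangular, unweighted neighborhood truncated at the edges.
    """
    N, F, _ = X.shape
    # Rank-1 outer product x x^H at every time-frequency bin.
    outer = X[..., :, None] * np.conj(X[..., None, :])
    cov = np.zeros_like(outer)
    count = np.zeros((N, F, 1, 1))
    for dn in range(-half_width, half_width + 1):
        for df in range(-half_width, half_width + 1):
            # Destination bins (n, f) whose neighbor (n + dn, f + df) exists.
            n_dst = slice(max(0, -dn), min(N, N - dn))
            f_dst = slice(max(0, -df), min(F, F - df))
            n_src = slice(max(0, dn), min(N, N + dn))
            f_src = slice(max(0, df), min(F, F + df))
            cov[n_dst, f_dst] += outer[n_src, f_src]
            count[n_dst, f_dst] += 1
    return cov / count

# Usage on a random stand-in for a stereo mixture (4 frames, 5 bins).
rng = np.random.default_rng(0)
X = rng.standard_normal((4, 5, 2)) + 1j * rng.standard_normal((4, 5, 2))
R = local_observed_covariance(X, half_width=1)
```

With `half_width=0` the result reduces to the single-bin outer product, which has rank 1; averaging over neighboring bins is what lets the observed covariance carry more than rank-1 information.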
Pages: 73-80
Page count: 8