The fundamental limitation of frequency domain blind source separation for convolutive mixtures of speech

Cited by: 174

Authors:

Araki, S [1]

Mukai, R

Makino, S

Nishikawa, T

Saruwatari, H

Affiliations:

[1] NTT Corp, NTT Commun Sci Labs, Kyoto 6190237, Japan

[2] Nara Inst Sci & Technol, Grad Sch Informat Sci, Nara 6300192, Japan

Source:

IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING | 2003 / Vol. 11 / No. 02

Keywords:

blind source separation; convolutive mixture; frame size; frequency domain; independent component analysis; reverberant speech;

DOI:

10.1109/TSA.2003.809193

Chinese Library Classification:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

Despite several recent proposals to achieve blind source separation (BSS) for realistic acoustic signals, the separation performance is still not good enough. In particular, when the impulse responses are long, performance is highly limited. In this paper, we consider a two-input, two-output convolutive BSS problem. First, we show that it is not good to be constrained by the condition T > P, where T is the frame length of the DFT and P is the length of the room impulse responses. We show that there is an optimum frame size that is determined by the trade-off between maintaining the number of samples in each frequency bin to estimate statistics and covering the whole reverberation. We also clarify the reason for the poor performance of BSS in long reverberant environments, highlighting that the framework of BSS works as two sets of frequency-domain adaptive beamformers. Although BSS can reduce reverberant sounds to some extent, like adaptive beamformers, it mainly removes the sounds from the jammer direction. This is the reason for the difficulty of BSS in reverberant environments.
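The frame-size trade-off described in the abstract can be illustrated with a small sketch. For a recording of fixed length, a longer DFT frame T covers more of a room impulse response of length P, but it leaves fewer frames, and therefore fewer samples per frequency bin, for estimating statistics. The sampling rate, recording length, impulse response length, half-frame hop, and the helper name `frames_per_bin` are illustrative assumptions, not values or code taken from the paper.

```python
def frames_per_bin(num_samples: int, frame_len: int, hop: int) -> int:
    """Count the full analysis frames that fit in the signal.

    Each frame contributes one complex sample to every frequency bin,
    so this is also the number of samples available per bin.
    """
    if num_samples < frame_len:
        return 0
    return 1 + (num_samples - frame_len) // hop


# Assumed (hypothetical) setup: 3 s of speech at 8 kHz, 300 ms impulse response.
FS = 8000
NUM_SAMPLES = 3 * FS      # 24000 samples of observed mixture
P = int(0.3 * FS)         # 2400-tap room impulse response (assumption)

table = {}
for T in (32, 128, 512, 2048, 4096):
    hop = T // 2          # half-frame shift, a common but assumed choice
    table[T] = {
        "samples_per_bin": frames_per_bin(NUM_SAMPLES, T, hop),
        "covers_reverberation": T > P,   # the condition T > P from the abstract
    }

for T, row in table.items():
    print(T, row["samples_per_bin"], row["covers_reverberation"])
```

Under these assumptions, only the longest frame satisfies T > P, and it is also the one with the fewest samples per bin. This is the tension between covering the reverberation and keeping enough data for the statistics that the abstract refers to.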


Pages: 109-116

Number of pages: 8

