Underdetermined blind sparse source separation for arbitrarily arranged multiple sensors

Cited by: 203
Authors
Araki, Shoko
Sawada, Hiroshi
Mukai, Ryo
Makino, Shoji
Affiliations
[1] NTT Corp, NTT Commun Sci Labs, Kyoto 6190237, Japan
[2] Hokkaido Univ, Grad Sch Informat Sci & Technol, Kita Ku, Sapporo, Hokkaido 0600814, Japan
Keywords
blind source separation; sparseness; clustering; normalization; binary mask; speech separation; reverberation
DOI
10.1016/j.sigpro.2007.02.003
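Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract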
This paper presents a new method for blind sparse source separation. Some sparse source separation methods, which rely on source sparseness and an anechoic mixing model, have already been proposed. These methods utilize level ratios and phase differences between sensor observations as their features, and they separate signals by classifying them. However, some of the features cannot form clusters with a well-known clustering algorithm, e.g., the k-means. Moreover, most previous methods utilize a linear sensor array (or only two sensors), and therefore they cannot separate symmetrically positioned sources. To overcome such problems, we propose a new feature that can be clustered by the k-means algorithm and that can be easily applied to more than three sensors arranged non-linearly. We have obtained promising results for two- and three-dimensionally distributed speech separation with non-linear/non-uniform sensor arrays in a real room even in underdetermined situations. We also investigate the way in which the performance of such methods is affected by room reverberation, which may cause the sparseness and anechoic assumptions to collapse. (C) 2007 Elsevier B.V. All rights reserved.
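The general pipeline the abstract describes (per-point features built from level and phase relations between sensors, k-means clustering, then binary masking) can be illustrated with a toy sketch. This is not the paper's feature or implementation: the unit-norm, phase-referenced feature below is a simplified stand-in for a single frequency bin, and the mixing values, noise level, helper names (`feature`, `farthest_point_init`, `kmeans`), and the farthest-point initialization are all assumptions made for illustration.

```python
import cmath
import math
import random

def feature(x):
    """Unit-norm observation vector with its phase referenced to sensor 0.
    (Assumed simplification, not the paper's exact normalization.)"""
    ref = cmath.phase(x[0])
    norm = math.sqrt(sum(abs(v) ** 2 for v in x))
    return [v * cmath.exp(-1j * ref) / norm for v in x]

def dist2(a, b):
    """Squared Euclidean distance between two complex vectors."""
    return sum(abs(p - q) ** 2 for p, q in zip(a, b))

def farthest_point_init(points, k):
    """Deterministic initialization: greedily pick mutually distant points."""
    chosen = [points[0]]
    while len(chosen) < k:
        chosen.append(max(points, key=lambda p: min(dist2(p, c) for c in chosen)))
    return chosen

def kmeans(points, init, iters=20):
    """Plain k-means; returns one cluster label per point."""
    centroids = [list(c) for c in init]
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(len(centroids)), key=lambda k: dist2(p, centroids[k]))
                  for p in points]
        for k in range(len(centroids)):
            members = [p for p, l in zip(points, labels) if l == k]
            if members:
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Toy underdetermined setup: 3 sources, 2 sensors, one frequency bin.
# Each mixing vector is [1, level_ratio * exp(-j * phase_difference)] (made-up values).
rng = random.Random(0)
mixing = [[1.0, a * cmath.exp(-1j * phi)]
          for a, phi in [(1.0, 0.0), (0.5, 1.0), (1.5, -1.2)]]

true_src, observations = [], []
for _ in range(300):
    k = rng.randrange(3)                      # sparseness: one active source per point
    s = cmath.rect(rng.uniform(0.5, 1.5), rng.uniform(-math.pi, math.pi))
    x = [h * s + complex(rng.gauss(0, 0.01), rng.gauss(0, 0.01)) for h in mixing[k]]
    true_src.append(k)
    observations.append(x)

points = [feature(x) for x in observations]
labels = kmeans(points, farthest_point_init(points, 3))

# Binary masks: each point is assigned entirely to one cluster.
masks = [[1 if l == k else 0 for l in labels] for k in range(3)]
separated = [[m * x[0] for m, x in zip(mask, observations)] for mask in masks]
```

Because the feature divides out the source's amplitude and phase, points dominated by the same source land near the same location regardless of the source signal, which is what lets a plain k-means group them. With real reverberant recordings the clusters would be more spread out than in this synthetic case.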
Pages: 1833-1847
Page count: 15