Permutation correction in the frequency domain in blind separation of speech mixtures

Cited by: 8
Authors
Serviere, Ch.
Pham, D. T.
Affiliations
[1] Lab Images & Signaux, F-38402 St Martin Dheres, France
[2] Lab Modelisat & Calcul, F-38041 Grenoble, France
Keywords
Frequency Domain; Quantum Information; Speech Signal; Frequency Bandwidth; Domain Approach;
DOI
10.1155/ASP/2006/75206
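Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Discipline classification codes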
0808 ; 0809 ;
Abstract
This paper presents a method for blind separation of convolutive mixtures of speech signals, based on the joint diagonalization of the time-varying spectral matrices of the observation records. The main, and still largely open, problem in a frequency-domain approach is the permutation ambiguity. An earlier paper by the authors exploited the continuity of the frequency response of the unmixing filters, but that approach leaves some frequency permutation jumps. This paper therefore proposes a new method based on two assumptions. First, the frequency continuity of the unmixing filters is still used, in the initialization of the diagonalization algorithm. Second, the time-frequency representations of the sources are assumed to vary smoothly with frequency. This hypothesis of continuity of the time variation of the source energy is exploited on a sliding frequency bandwidth, which allows the remaining frequency permutation jumps to be detected. The method is compared with other approaches, and results on real-world recordings demonstrate the superior performance of the proposed algorithm. Copyright (C) 2006 Hindawi Publishing Corporation. All rights reserved.
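The second assumption in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm. It assumes that per-bin source energy envelopes are already available as an array `energy[f, k, t]` (a hypothetical layout), and it picks, for each frequency bin, the source ordering whose log-energy time profiles correlate best with the average profile over a sliding window of previously aligned bins. The function name `align_permutations` and the `bandwidth` parameter are illustrative assumptions.

```python
import itertools
import numpy as np

def align_permutations(energy, bandwidth=5):
    """Align source order across frequency bins (illustrative sketch only).

    energy: array of shape (n_freq, n_src, n_frames) holding positive
            per-bin source energy envelopes (assumed input layout).
    Returns (perms, aligned), where perms[f] is the ordering chosen for bin f
    and aligned[f] holds the reordered, normalised profiles.
    """
    n_freq, n_src, _ = energy.shape
    # Log-energy profiles, centred and normalised over time, so that
    # inner products between profiles behave like correlations.
    prof = np.log(energy + 1e-12)
    prof = prof - prof.mean(axis=2, keepdims=True)
    prof = prof / (np.linalg.norm(prof, axis=2, keepdims=True) + 1e-12)

    aligned = prof.copy()
    perms = np.tile(np.arange(n_src), (n_freq, 1))
    candidates = [np.array(p) for p in itertools.permutations(range(n_src))]

    for f in range(1, n_freq):
        # Reference: mean profile over the sliding band of aligned bins below f.
        lo = max(0, f - bandwidth)
        ref = aligned[lo:f].mean(axis=0)
        # Score every candidate ordering by total correlation with the reference.
        scores = [np.sum(ref * prof[f, p]) for p in candidates]
        best = candidates[int(np.argmax(scores))]
        perms[f] = best
        aligned[f] = prof[f, best]
    return perms, aligned
```

The exhaustive search over orderings grows factorially with the number of sources, so this sketch is only practical for a handful of sources. Bin 0 is taken as the reference ordering, which is itself an assumption.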
Pages: 16