Multi-Speaker Recognition Based on Improved Speaker Clustering Initialization and GMM

Cited by: 5

Authors:

Cao Jie (曹洁) [1]

Yu Lizhen (余丽珍) [2]

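Affiliations:

[1] School of Computer and Communication, Lanzhou University of Technology (兰州理工大学计算机与通信学院)

[2] College of Electrical and Information Engineering, Lanzhou University of Technology (兰州理工大学电气工程与信息工程学院)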

Source:

Application Research of Computers (计算机应用研究) | 2012 / Vol. 29 / No. 02

Keywords:

multi-speaker recognition; improved clustering initialization; Gaussian mixture model; average cluster purity

DOI:

Not available

Chinese Library Classification (CLC) number:

TN912.34 [Speech recognition and equipment]

Discipline classification code:

0711

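Abstract:

To address the poor accuracy of linear initialization in multi-speaker clustering, this paper proposes an improved clustering initialization method. The method introduces the Bayesian information criterion (BIC) to detect and split the initial clusters produced by linear initialization, which effectively raises the purity of the initial speaker clusters. The method is then applied to a Gaussian mixture model (GMM) multi-speaker recognition system. Experimental results show that the proposed method improves the average cluster purity (ACP) of the speakers by 48.51% and reduces the system's recognition error rate by 12.09% on average.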
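The abstract does not give the details of the BIC step, so the following is only a minimal sketch of the general idea: split the feature stream uniformly (linear initialization), then test each initial cluster for a speaker change with the standard ΔBIC criterion and split it where the criterion is largest and positive. The function names, the full-covariance single-Gaussian model, the `margin` parameter, and the penalty weight `lam` are all assumptions made for illustration, not the authors' procedure.

```python
import numpy as np

def linear_init(n_frames, n_clusters):
    """Uniformly cut frame indices into contiguous initial clusters."""
    bounds = np.linspace(0, n_frames, n_clusters + 1).astype(int)
    return [(int(bounds[i]), int(bounds[i + 1])) for i in range(n_clusters)]

def _logdet_cov(x):
    """Log-determinant of the (slightly regularized) sample covariance."""
    cov = np.cov(x, rowvar=False) + 1e-6 * np.eye(x.shape[1])
    return np.linalg.slogdet(cov)[1]

def delta_bic(x, t, lam=1.0):
    """Delta-BIC for modeling x as two Gaussians split at frame t versus one.

    Positive values favor the two-Gaussian (speaker change) hypothesis.
    """
    n, d = x.shape
    gain = 0.5 * (n * _logdet_cov(x)
                  - t * _logdet_cov(x[:t])
                  - (n - t) * _logdet_cov(x[t:]))
    penalty = 0.5 * lam * (d + 0.5 * d * (d + 1)) * np.log(n)
    return gain - penalty

def bic_split(x, start, end, margin=50, lam=1.0):
    """Split the cluster [start, end) at its best change point, if any."""
    seg = x[start:end]
    if len(seg) < 2 * margin + 1:
        return [(start, end)]          # too short to test reliably
    candidates = list(range(margin, len(seg) - margin))
    scores = [delta_bic(seg, t, lam) for t in candidates]
    best = int(np.argmax(scores))
    if scores[best] <= 0:
        return [(start, end)]          # no change detected, keep as is
    t = start + candidates[best]
    return [(start, t), (t, end)]

def refine_initial_clusters(x, n_clusters, margin=50, lam=1.0):
    """Linear initialization followed by one BIC detect-and-split pass."""
    refined = []
    for start, end in linear_init(len(x), n_clusters):
        refined.extend(bic_split(x, start, end, margin, lam))
    return refined
```

Here `x` is assumed to be an array of shape `(n_frames, n_features)`, for example MFCC vectors. A single split per initial cluster keeps the sketch short; a fuller implementation would likely apply the test recursively and then merge segments that belong to the same speaker before training the GMMs.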

Pages: 590-593

Page count: 4
