Multi-Speaker Recognition Based on Improved Speaker Clustering Initialization and GMM

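Cited by: 5
Authors
CAO Jie (曹洁) [1]
YU Lizhen (余丽珍) [2]
Affiliations
[1] School of Computer and Communication, Lanzhou University of Technology
[2] College of Electrical and Information Engineering, Lanzhou University of Technology
Keywords
multi-speaker recognition; improved clustering initialization; Gaussian mixture model; average cluster purity
DOI
Not available
Chinese Library Classification (CLC) number
TN912.34 [Speech recognition and equipment]
Discipline classification code
0711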
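Abstract
To address the poor accuracy of linear initialization in multi-speaker clustering, an improved clustering initialization method is proposed. The method uses the Bayesian information criterion (BIC) to detect and split the initial clusters produced by linear initialization, which effectively raises the purity of the initial speaker clusters. The method is then applied to a Gaussian mixture model (GMM) multi-speaker recognition system. Experimental results show that the proposed method raises the average cluster purity (ACP) of the speakers by 48.51% and lowers the system's recognition error rate by 12.09% on average.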
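The abstract does not spell out how the BIC test or the ACP score is computed, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes the widely used Delta-BIC change test with one Gaussian per segment, simplified here to diagonal covariances, and the common definition of ACP as the size-weighted mean of per-cluster purity. The function names, the `penalty_weight` parameter, and the variance floor are hypothetical choices made for illustration.

```python
import math


def log_det_diag(frames):
    """Log-determinant of a diagonal ML covariance: sum of log per-dimension variances."""
    n = len(frames)
    total = 0.0
    for dim in zip(*frames):  # iterate over feature dimensions
        mean = sum(dim) / n
        var = sum((v - mean) ** 2 for v in dim) / n
        total += math.log(max(var, 1e-10))  # assumed floor, guards against log(0)
    return total


def delta_bic(frames, split, penalty_weight=1.0):
    """Delta-BIC for cutting `frames` (list of feature vectors) at index `split`.

    A positive value favours two separate Gaussians over one, which
    suggests the initial cluster mixes speakers and should be split.
    """
    n, d = len(frames), len(frames[0])
    left, right = frames[:split], frames[split:]
    # Extra parameters of the two-Gaussian model: one mean and one
    # variance per dimension (diagonal-covariance assumption).
    n_params = 2 * d
    penalty = 0.5 * penalty_weight * n_params * math.log(n)
    gain = 0.5 * (n * log_det_diag(frames)
                  - len(left) * log_det_diag(left)
                  - len(right) * log_det_diag(right))
    return gain - penalty


def average_cluster_purity(counts):
    """ACP from a clusters x speakers table of frame counts.

    Per-cluster purity is sum_j n_ij^2 / n_i^2; ACP weights each purity
    by the cluster size n_i and divides by the total frame count.
    """
    total = sum(sum(row) for row in counts)
    weighted = sum(sum(c * c for c in row) / sum(row)
                   for row in counts if sum(row) > 0)
    return weighted / total
```

In use, `delta_bic` would be evaluated on each initial cluster produced by linear initialization, splitting wherever the score is positive, and `average_cluster_purity` would then be computed on the resulting clusters against reference speaker labels. With full covariances, the parameter count in the penalty becomes d + d(d+1)/2 instead of 2d.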
Pages: 590-593
Page count: 4