Robust speaker verification based on multi stage vector quantization of MFCC parameters on narrow bandwidth channels

Cited by: 8
Authors
Homayounpour, M. Mehdi [1]
Rezaian, Iman [1]
Affiliations
[1] Amirkabir Univ Technol Tehran Polytech, Comp Engn & Informat Technol Dept, Lab Intelligent Sound & Speech Proc, Tehran, Iran
Source
10TH INTERNATIONAL CONFERENCE ON ADVANCED COMMUNICATION TECHNOLOGY, VOLS I-III: INNOVATIONS TOWARD FUTURE NETWORKS AND SERVICES | 2008
Keywords
speaker verification; robust speaker verification; multi stage vector quantization; codebook design; noisy conditions;
DOI
10.1109/ICACT.2008.4493773
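Chinese Library Classification number
TP31 [Computer Software];
Discipline classification code
081202 [Computer Software and Theory]; 0835 [Software Engineering];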
Abstract
This paper presents a robust, very low bit rate, client-server speaker verification system based on MFCC parameters. Two aspects are proposed and assessed: very low bit rate transmission of test-utterance feature vectors from client to server, and robust speaker verification when the noise conditions of the training and test environments (noise types and SNRs) differ and are unknown to the verification system. Very low bit rate transmission of feature vectors is achieved with multi-stage vector quantization (MSVQ), which quantizes the MFCC feature vectors extracted from the speaker's utterance on the client side. This reduces the rate from 416 bits per frame (bpf) for transmitting 13-dimensional MFCC feature vectors to 36 bpf, i.e. 3600 bps. Robustness is achieved by training several speaker models on a limited number of noise types at different SNRs, instead of training a single speaker model on clean data only. This yields very good performance even when the test-environment noise types and SNRs differ from those of the training phase. Experimental results confirm the effectiveness of the proposed methods.
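The bit-rate figures and the MSVQ idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract states neither the number of stages nor the codebook sizes, so the split into 6 stages of 64 codewords (6 x 6 = 36 bits) is an assumption, the codebooks are random rather than trained, and the 32 bits per coefficient and 100 frames per second are only inferred from 416 / 13 and 3600 / 36. All function and variable names are hypothetical.

```python
import random

DIM = 13                  # MFCC coefficients per frame (from the abstract)
BITS_PER_COEFF = 32       # assumption: 416 / 13, e.g. 32-bit floats
FRAMES_PER_SECOND = 100   # assumption: 3600 bps / 36 bpf
RAW_BITS_PER_FRAME = DIM * BITS_PER_COEFF  # 416


def nearest(codebook, vec):
    """Index of the codeword closest to vec (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((c - v) ** 2 for c, v in zip(codebook[i], vec)),
    )


def msvq_encode(stages, vec):
    """Quantize vec stage by stage; each stage codes the previous residual."""
    residual = list(vec)
    indices = []
    for codebook in stages:
        i = nearest(codebook, residual)
        indices.append(i)
        residual = [r - c for r, c in zip(residual, codebook[i])]
    return indices


def msvq_decode(stages, indices):
    """Reconstruct the vector as the sum of the selected codewords."""
    out = [0.0] * len(stages[0][0])
    for codebook, i in zip(stages, indices):
        out = [o + c for o, c in zip(out, codebook[i])]
    return out


def bits_per_frame(stages):
    """Bits needed to transmit one index per stage."""
    return sum((len(codebook) - 1).bit_length() for codebook in stages)


# Hypothetical configuration: 6 stages x 64 codewords = 36 bits per frame.
# Random, untrained codebooks stand in for codebooks designed from speech data.
rng = random.Random(0)
stages = [
    [[rng.gauss(0.0, 1.0 / (s + 1)) for _ in range(DIM)] for _ in range(64)]
    for s in range(6)
]
frame = [rng.gauss(0.0, 1.0) for _ in range(DIM)]
indices = msvq_encode(stages, frame)       # what the client would transmit
reconstructed = msvq_decode(stages, indices)  # what the server would recover

print(RAW_BITS_PER_FRAME, "->", bits_per_frame(stages), "bits per frame")
print(bits_per_frame(stages) * FRAMES_PER_SECOND, "bps")

# Tiny two-stage example that can be checked by hand.
toy = [[[0.0, 0.0], [10.0, 10.0]], [[0.0, 0.0], [1.0, -1.0]]]
print(msvq_encode(toy, [11.0, 9.0]))  # → [1, 1]
```

In the toy example the first stage picks [10, 10], leaving the residual [1, -1], which the second stage matches exactly. Only the stage indices need to be sent, which is where the reduction from 416 to 36 bits per frame comes from under the assumed configuration.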
Pages: 336-340
Number of pages: 5