A Fast Query-by-Example Spoken Term Retrieval Algorithm Based on Dynamic Time Warping

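Cited by: 7
Authors
Zhang Lianhai (张连海)
Feng Zhiyuan (冯志远)
Chen Qi (陈琦)
Li Bohao (李勃昊)
Institution
[1] School of Information System Engineering, Information Engineering University
Keywords
query-by-example spoken term retrieval; phoneme posterior probability; piecewise aggregate approximation lower-bound estimate; dynamic time warping; inner-product distance;
DOI
Not available
Chinese Library Classification number
TN912.3 [Speech signal processing];
Discipline classification code
0711;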
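Abstract
To speed up speech retrieval systems based on the DTW algorithm, a dynamic time warping algorithm based on a piecewise aggregate approximation (PAA) lower-bound estimate is proposed for fast query-by-example spoken term retrieval. The method first extracts phoneme posterior probabilities from the query example and the test set as feature parameters. It then computes a PAA lower-bound estimate of the actual DTW score between the query example and every candidate segment in the test set. Finally, it combines a K-nearest-neighbor search with DTW to locate the regions most similar to the query example. Experimental results show that the algorithm retrieves 6.32 times faster than applying DTW directly, with no loss in retrieval accuracy.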
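The three steps in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and everything in it beyond the abstract is an assumption: posteriorgrams are taken as already-computed NumPy arrays (one probability vector per frame), candidate segments are sliding windows of the query's length, the frame distance is a smoothed inner-product distance -log(min(1, p·q + EPS)), DTW uses a band of half-width `r`, and the PAA-style bound is derived here from Jensen's inequality, so it may differ from the paper's exact formula. The names `frame_cost`, `dtw`, `paa_envelope`, `paa_lower_bound` and `knn_search` are hypothetical.

```python
import heapq
import numpy as np

EPS = 1e-6  # assumed smoothing constant; keeps the log finite


def frame_cost(x):
    """Smoothed inner-product distance for inner products x >= 0."""
    return -np.log(np.minimum(1.0, x + EPS))


def dtw(Q, S, r):
    """Band-constrained DTW score between equal-length posteriorgrams."""
    n = len(Q)
    cost = frame_cost(Q @ S.T)
    D = np.full((n + 1, n + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - r), min(n, i + r) + 1):
            D[i, j] = cost[i - 1, j - 1] + min(
                D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, n]


def paa_envelope(Q, r, w):
    """Per block of w frames, the component-wise max of the query over
    the block widened by the band half-width r on both sides."""
    n = len(Q)
    blocks = [(a, min(a + w, n)) for a in range(0, n, w)]
    U = np.array([Q[max(0, a - r):min(n, b + r)].max(axis=0)
                  for a, b in blocks])
    return blocks, U


def paa_lower_bound(S, blocks, U):
    """Lower bound on dtw(Q, S, r) computed from block averages of S."""
    lb = 0.0
    for (a, b), u in zip(blocks, U):
        lb += (b - a) * frame_cost(S[a:b].mean(axis=0) @ u)
    return lb


def knn_search(Q, T, k, r, w):
    """Return the k best (score, start) windows of T and the number of
    full DTW evaluations that were actually needed."""
    n = len(Q)
    blocks, U = paa_envelope(Q, r, w)
    bounds = sorted((paa_lower_bound(T[t:t + n], blocks, U), t)
                    for t in range(len(T) - n + 1))
    best = []  # max-heap of the k best scores, stored negated
    n_dtw = 0
    for lb, t in bounds:
        if len(best) == k and lb >= -best[0][0]:
            break  # all remaining bounds are at least this large
        score = dtw(Q, T[t:t + n], r)
        n_dtw += 1
        if len(best) < k:
            heapq.heappush(best, (-score, t))
        elif score < -best[0][0]:
            heapq.heapreplace(best, (-score, t))
    return sorted((-s, t) for s, t in best), n_dtw
```

Calling `knn_search(Q, T, k=3, r=2, w=2)` on a query posteriorgram `Q` and a longer test posteriorgram `T` returns the three lowest-scoring windows. Because the bound never exceeds the true DTW score, pruning on it leaves the returned scores identical to an exhaustive DTW search, which is consistent with the abstract's claim that accuracy is unaffected. The speed-up obtained depends on the data and on `w`.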
Pages: 1688-1692
Page count: 5