Compressive Sensing for Missing Data Imputation in Noise Robust Speech Recognition

Cited by: 77
Authors
Gemmeke, Jort Florent [1]
Van Hamme, Hugo [2]
Cranen, Bert [1]
Boves, Lou [1]
Affiliations
[1] Radboud Univ Nijmegen, Ctr Language & Speech Technol, NL-6500 HD Nijmegen, Netherlands
[2] Katholieke Univ Leuven, Dept Elect Engn, ESAT, B-3001 Heverlee, Belgium
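Keywords
Automatic speech recognition (ASR); compressive sensing (CS); missing data techniques; noise robustness; MASK ESTIMATION; SPARSE
DOI
10.1109/JSTSP.2009.2039171
Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract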
An effective way to increase the noise robustness of automatic speech recognition is to label noisy speech features as either reliable or unreliable (missing), and to replace (impute) the missing ones by clean speech estimates. Conventional imputation techniques employ parametric models and impute the missing features on a frame-by-frame basis. At low signal-to-noise ratios (SNRs), these techniques fail, because too many time frames may contain few, if any, reliable features. In this paper, we introduce a novel non-parametric, exemplar-based method for reconstructing clean speech from noisy observations, based on techniques from the field of Compressive Sensing. The method, dubbed sparse imputation, can impute missing features using larger time windows such as entire words. Using an overcomplete dictionary of clean speech exemplars, the method finds the sparsest combination of exemplars that jointly approximate the reliable features of a noisy utterance. That linear combination of clean speech exemplars is used to replace the missing features. Recognition experiments on noisy isolated digits show that sparse imputation outperforms conventional imputation techniques at SNR = - dB when using an ideal 'oracle' mask. With error-prone estimated masks, sparse imputation performs slightly worse than the best conventional technique.
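The imputation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name a solver, so a greedy orthogonal matching pursuit is used here as a stand-in for finding a sparse combination of exemplars. The function name `sparse_impute`, the parameter `n_nonzero`, and the random toy dictionary are all hypothetical, chosen only for illustration.

```python
import numpy as np

def sparse_impute(y, reliable, A, n_nonzero=5):
    """Replace unreliable entries of y using a sparse combination of exemplars.

    y        : (d,) noisy feature vector
    reliable : (d,) boolean mask, True where the feature is trusted
    A        : (d, n) dictionary whose columns are clean exemplars
    Assumption: greedy orthogonal matching pursuit stands in for the
    sparse solver, which the abstract does not specify.
    """
    A_r = A[reliable]                    # dictionary rows at reliable positions
    y_r = y[reliable]                    # reliable observations only
    norms = np.linalg.norm(A_r, axis=0)
    norms[norms == 0] = 1.0              # avoid division by zero
    residual = y_r.copy()
    support = []
    coef = np.zeros(0)
    for _ in range(n_nonzero):
        # Pick the exemplar most correlated with the current residual.
        corr = np.abs(A_r.T @ residual) / norms
        if support:
            corr[support] = -np.inf      # never pick the same exemplar twice
        support.append(int(np.argmax(corr)))
        # Refit all selected exemplars jointly on the reliable features.
        coef, *_ = np.linalg.lstsq(A_r[:, support], y_r, rcond=None)
        residual = y_r - A_r[:, support] @ coef
        if np.linalg.norm(residual) < 1e-10:
            break
    x = np.zeros(A.shape[1])
    x[support] = coef
    # Keep reliable features, fill missing ones from the exemplar combination.
    y_hat = y.copy()
    y_hat[~reliable] = (A @ x)[~reliable]
    return y_hat, x

# Toy data (hypothetical): the clean vector is a scaled copy of one exemplar.
rng = np.random.default_rng(0)
A = rng.standard_normal((40, 60))
clean = 3.0 * A[:, 3]
reliable = np.zeros(40, dtype=bool)
reliable[:25] = True
noisy = clean.copy()
noisy[~reliable] += 10.0                 # corrupt the unreliable features
restored, x = sparse_impute(noisy, reliable, A, n_nonzero=3)
```

In this toy setting only the rows marked reliable are used to select and weight exemplars. The same weights are then applied to the full exemplars to fill in the missing rows, which mirrors the idea of reusing the linear combination to replace the missing features.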
Pages: 272-287
Number of pages: 16