Bidirectional LSTM Networks for Context-Sensitive Keyword Detection in a Cognitive Virtual Agent Framework

Cited by: 62
Authors
Woellmer, Martin [1]
Eyben, Florian [1]
Graves, Alex [2]
Schuller, Bjoern [1]
Rigoll, Gerhard [1]
Affiliations
[1] Tech Univ Munich, Inst Human Machine Commun, D-80290 Munich, Germany
[2] Tech Univ Munich, Inst Comp Sci 6, D-85748 Munich, Germany
Keywords
Keyword spotting; Long short-term memory; Dynamic Bayesian networks; Cognitive systems; Virtual agents; LONG-TERM DEPENDENCIES; ARCHITECTURES; EMOTION
DOI
10.1007/s12559-010-9041-8
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Robustly detecting keywords in human speech is an important precondition for cognitive systems, which aim at intelligently interacting with users. Conventional techniques for keyword spotting usually show good performance when evaluated on well-articulated read speech. However, modeling natural, spontaneous, and emotionally colored speech is challenging for today's speech recognition systems and thus requires novel approaches with enhanced robustness. In this article, we propose a new architecture for vocabulary-independent keyword detection as needed for cognitive virtual agents such as the SEMAINE system. Our word spotting model is composed of a Dynamic Bayesian Network (DBN) and a bidirectional Long Short-Term Memory (BLSTM) recurrent neural net. The BLSTM network uses a self-learned amount of contextual information to provide a discrete phoneme prediction feature for the DBN, which is able to distinguish between keywords and arbitrary speech. We evaluate our Tandem BLSTM-DBN technique on both read speech and spontaneous emotional speech and show that our method significantly outperforms conventional Hidden Markov Model-based approaches for both application scenarios.
Pages: 180-190
Page count: 11
Related papers
65 in total
[1] [Anonymous], 2007, INT C NEUR INF PROC
[2] [Anonymous], 2005, Neural Netw.
[3] [Anonymous], P 4 INT WORKSH HUM C
[4] [Anonymous], P IEEE INT C AC SPEE
[5] [Anonymous], 1996, An introduction to Bayesian networks
[6] [Anonymous], 2006, HTK BOOK V3 4
[7] A MAXIMIZATION TECHNIQUE OCCURRING IN STATISTICAL ANALYSIS OF PROBABILISTIC FUNCTIONS OF MARKOV CHAINS [J]. BAUM, LE; PETRIE, T; SOULES, G; WEISS, N. ANNALS OF MATHEMATICAL STATISTICS, 1970, 41 (01): 164-&
[8] Benayed Y, 2003, P ICASSP, P588
[9] Bengio S., 2003, ADV NIPS, V15, P1
[10] LEARNING LONG-TERM DEPENDENCIES WITH GRADIENT DESCENT IS DIFFICULT [J]. BENGIO, Y; SIMARD, P; FRASCONI, P. IEEE TRANSACTIONS ON NEURAL NETWORKS, 1994, 5 (02): 157-166