An application of reinforcement learning to dialogue strategy selection in a spoken dialogue system for email

Cited by: 90
Authors
Walker, MA [1]
Affiliations
[1] AT&T Labs Res, Shannon Lab, Florham Pk, NJ 07932 USA
DOI
10.1613/jair.713
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper describes a novel method by which a spoken dialogue system can learn to choose an optimal dialogue strategy from its experience interacting with human users. The method is based on a combination of reinforcement learning and performance modeling of spoken dialogue systems. The reinforcement learning component applies Q-learning (Watkins, 1989), while the performance modeling component applies the PARADISE evaluation framework (Walker et al., 1997) to learn the performance function (reward) used in reinforcement learning. We illustrate the method with a spoken dialogue system named ELVIS (EmaiL Voice Interactive System) that supports access to email over the phone. We conduct a set of experiments for training an optimal dialogue strategy on a corpus of 219 dialogues in which human users interact with ELVIS over the phone. We then test that strategy on a corpus of 18 dialogues. We show that ELVIS can learn to optimize its strategy selection for agent initiative, for reading messages, and for summarizing email folders.
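The Q-learning step named in the abstract can be illustrated with a minimal tabular sketch. It is not the paper's implementation: the state names, strategy names, reward values, and the `alpha` and `gamma` settings below are all hypothetical, and the reward is simply passed in as a number rather than derived from a PARADISE performance function.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, next_actions,
             alpha=0.5, gamma=1.0):
    """Apply one Q-learning backup to the table q and return the new value.

    Q(s, a) moves toward reward + gamma * max over a' of Q(s', a').
    An empty next_actions list marks a terminal state (future value 0).
    """
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
    return q[(state, action)]


# Hypothetical two-step dialogue: choose an initiative strategy, then a
# message-reading strategy. All names and rewards are illustrative only.
q = defaultdict(float)
reading_strategies = ["read_first", "read_summary"]

# The dialogue ends after reading, with a final reward of 1.0.
q_update(q, "reading", "read_first", 1.0, "end", [])

# The earlier initiative choice earns no immediate reward, but its value
# rises because it leads to a state whose best strategy has value 0.5.
q_update(q, "start", "system_initiative", 0.0, "reading", reading_strategies)
```

With these assumed numbers, the reward observed at the end of the dialogue propagates back to the earlier strategy choice, which is the mechanism that lets a system compare strategies from logged dialogues.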
Pages: 387-416
Page count: 30
Related papers
66 items in total
[1] ALLEN JF, 1979, PLAN BASED APPROACH
[2] [Anonymous], AUTOMATIC SPEECH SPE, DOI 10.1007/978-1-4613-1367-0_1
[3] [Anonymous], 1972, UNDERSTANDING NATURA
[4] [Anonymous], P EUR C SPEECH COMM
[5] [Anonymous], 1994, SPOKEN NATURAL LANGU
[6] BAGGIA P, 1998, INTERACTIVE VOICE TE, P97
[7] BARTO AG, BRADTKE SJ, SINGH SP. LEARNING TO ACT USING REAL-TIME DYNAMIC-PROGRAMMING [J]. ARTIFICIAL INTELLIGENCE, 1995, 72 (1-2): 81-138
[8] BELLMAN R. DYNAMIC PROGRAMMING [J]. SCIENCE, 1966, 153 (3731): 34-&
[9] Biermann Alan W., 1996, Proceedings of the 1996 International Symposium on Spoken Dialogue, P97
[10] BRUCE B, 1975, AI21