Scaling question answering to the Web

Cited by: 135
Authors
Kwok, C [1]
Etzioni, O [1]
Weld, D [1]
Affiliations
[1] Univ Washington, Dept Comp Sci & Engn, Seattle, WA 98195 USA
Keywords
algorithms; design; experimentation; human factors; languages; performance; search engines; natural language processing; query formulation; answer extraction; answer selection
DOI
10.1145/502115.502117
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The wealth of information on the web makes it an attractive resource for seeking quick answers to simple, factual questions such as "who was the first American in space?" or "what is the second tallest mountain in the world?" Yet today's most advanced web search services (e.g., Google and AskJeeves) make it surprisingly tedious to locate answers to such questions. In this paper, we extend question-answering techniques, first studied in the information retrieval literature, to the web and experimentally evaluate their performance. First, we introduce MULDER, which we believe to be the first general-purpose, fully-automated question-answering system available on the web. Second, we describe MULDER's architecture, which relies on multiple search-engine queries, natural-language parsing, and a novel voting procedure to yield reliable answers coupled with high recall. Finally, we compare MULDER's performance to that of Google and AskJeeves on questions drawn from the TREC-8 question answering track. We find that MULDER's recall is more than a factor of three higher than that of AskJeeves. In addition, we find that Google requires 6.6 times as much user effort to achieve the same level of recall as MULDER.
Pages: 242-262
Page count: 21