WHIRL: A word-based information representation language

Cited by: 23
Authors
Cohen, WW [1]
Affiliations
[1] AT&T Labs Res, Shannon Lab, Florham Park, NJ 07932 USA
Keywords
knowledge representation; information retrieval; textual similarity; heterogeneous databases; information integration; text categorization; information extraction
DOI
10.1016/S0004-3702(99)00102-2
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We describe WHIRL, an "information representation language" that synergistically combines properties of logic-based and text-based representation systems. WHIRL is a subset of Datalog that has been extended by introducing an atomic type for textual entities, an atomic operation for computing textual similarity, and a "soft" semantics; that is, inferences in WHIRL are associated with numeric scores, and presented to the user in decreasing order by score. This paper briefly describes WHIRL, and then surveys a number of applications. We show that WHIRL strictly generalizes both ranked retrieval of documents, and logical deduction; that nontrivial queries about large databases can be answered efficiently; that WHIRL can be used to accurately integrate data from heterogeneous information sources, such as those found on the Web; that WHIRL can be used effectively for inductive classification of text; and finally, that WHIRL can be used to semi-automatically generate extraction programs for structured documents. © 2000 Published by Elsevier Science B.V. All rights reserved.
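The "soft" join described in the abstract can be illustrated with a small sketch. This is not WHIRL itself or its query syntax: it assumes TF-IDF weighting with cosine similarity as the textual similarity measure, and the function names (`tfidf_vectors`, `cosine`, `similarity_join`), the parameter `k`, and the sample name lists are hypothetical, chosen only for illustration. It pairs entries from two lists by textual similarity and returns the pairs in decreasing order by score.

```python
import math
from collections import Counter


def tfidf_vectors(texts):
    """Return one unit-length TF-IDF vector (dict: term -> weight) per text.

    Assumption: whitespace tokenization and a smoothed IDF, log(1 + N/df).
    """
    tokenized = [t.lower().split() for t in texts]
    n_docs = len(tokenized)
    # Document frequency: number of texts in which each term occurs.
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vec = {
            term: (1.0 + math.log(count)) * math.log(1.0 + n_docs / df[term])
            for term, count in tf.items()
        }
        norm = math.sqrt(sum(w * w for w in vec.values()))
        vectors.append({t: w / norm for t, w in vec.items()} if norm else {})
    return vectors


def cosine(u, v):
    """Dot product of two unit-length sparse vectors, i.e. their cosine."""
    return sum(w * v.get(term, 0.0) for term, w in u.items())


def similarity_join(left, right, k=3):
    """Score every (left, right) pair and return the k best, highest first."""
    vecs = tfidf_vectors(left + right)
    left_vecs, right_vecs = vecs[:len(left)], vecs[len(left):]
    scored = [
        (cosine(a, b), x, y)
        for x, a in zip(left, left_vecs)
        for y, b in zip(right, right_vecs)
    ]
    # Drop pairs with no shared terms, then rank by decreasing score.
    scored = [s for s in scored if s[0] > 0.0]
    scored.sort(key=lambda s: s[0], reverse=True)
    return scored[:k]


# Hypothetical data: the same organizations named differently in two sources.
source_a = ["Acme Labs Research", "Globex Science"]
source_b = ["Acme-Labs Labs", "Globex Science Inc.", "Initech Laboratory"]
matches = similarity_join(source_a, source_b)
for score, x, y in matches:
    print(f"{score:.3f}  {x!r} ~ {y!r}")
```

An exact-match join on these names would return nothing; scoring pairs by similarity and ranking them lets near-matches surface first, which is the behavior the abstract attributes to WHIRL's scored inferences.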
Pages: 163-196
Page count: 34