Exploiting Background Information in Knowledge Discovery from Text

Cited by: 10
Authors
Feldman R. [1]
Hirsh H. [2]
Affiliations
[1] Math. and Comp. Science Department, Bar-Ilan University, Ramat-Gan
[2] Department of Computer Science, Rutgers University, Piscataway
Funding
National Science Foundation (United States)
Keywords
Association mining; Background knowledge; Constraint processing; Query languages; Textual databases
DOI
10.1023/A:1008693204338
Abstract
This paper describes the FACT system for knowledge discovery from text. FACT discovers associations (patterns of co-occurrence) among the keywords that label the items in a collection of textual documents. When background knowledge about those keywords is available, FACT can use it in the discovery process. FACT takes a query-centered view of knowledge discovery: a discovery request is treated as a query over the implicit set of possible results supported by a collection of documents, and background knowledge is used to specify constraints on the desired results of that query. Execution of a knowledge-discovery query is structured so that these background-knowledge constraints can be exploited in the search for possible results. Finally, rather than requiring the user to write an explicit expression in the knowledge-discovery query language, FACT provides a simple graphical interface to that language, and the language gives a well-defined semantics to the discovery actions the user performs through the interface.
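The idea of using background knowledge as a constraint during the search can be illustrated with a minimal sketch. This is not FACT's query language or algorithm, which the abstract does not specify; the function `constrained_associations`, the `background` dictionary, the category names, and the support and confidence thresholds are all hypothetical, chosen only to show how a constraint can restrict which keyword pairs are ever counted.

```python
from collections import Counter

def constrained_associations(docs, background, left_cat, right_cat,
                             min_support, min_confidence):
    """Find keyword associations left => right over keyword-labeled documents.

    Hypothetical sketch: `background` maps a category name to a set of
    keywords, and the query only admits pairs whose left keyword is in
    `left_cat` and whose right keyword is in `right_cat`.
    """
    left_allowed = background[left_cat]
    right_allowed = background[right_cat]
    left_counts = Counter()   # documents containing each admissible left keyword
    pair_counts = Counter()   # documents containing both keywords of a pair

    for keywords in docs:
        # The constraint is applied before counting, so pairs that can
        # never satisfy the query are not enumerated at all.
        lefts = keywords & left_allowed
        rights = keywords & right_allowed
        left_counts.update(lefts)
        for left in lefts:
            for right in rights:
                if left != right:
                    pair_counts[(left, right)] += 1

    results = []
    for (left, right), support in pair_counts.items():
        if support < min_support:
            continue
        confidence = support / left_counts[left]
        if confidence >= min_confidence:
            results.append((left, right, support, confidence))
    return sorted(results)

# Illustrative data only: each document is a set of keyword labels.
docs = [
    {"iran", "usa", "crude"},
    {"iran", "crude"},
    {"usa", "wheat"},
    {"iran", "usa"},
]
background = {
    "country": {"iran", "usa"},
    "commodity": {"crude", "wheat"},
}
found = constrained_associations(docs, background, "country", "commodity",
                                 min_support=2, min_confidence=0.5)
for left, right, support, confidence in found:
    print(f"{left} => {right} (support={support}, confidence={confidence:.2f})")
```

Filtering the keyword sets before pairing them is the design choice of interest here: the constraint shrinks the candidate space instead of being applied to a full list of results afterwards.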
Pages: 83-97
Number of pages: 14