Learning to Understand Information on the Internet: An Example-Based Approach

Cited by: 20
Authors
Perkowitz M. [1]
Doorenbos R.B. [1]
Etzioni O. [1]
Weld D.S. [1]
Affiliation
[1] Dept. of Comp. Sci. and Engineering, Box 352350, University of Washington, Seattle
Funding
U.S. National Science Foundation
Keywords
Internet; Machine learning
DOI
10.1023/A:1008672508721
Abstract
The explosive growth of the Web has made intelligent software assistants increasingly necessary for ordinary computer users. Both traditional approaches - search engines, hierarchical indices - and intelligent software agents require significant amounts of human effort to keep up with the Web. As an alternative, we investigate the problem of automatically learning to interact with information sources on the Internet. We report on ShopBot and ILA, two implemented agents that learn to use such resources. ShopBot learns how to extract information from online vendors using only minimal knowledge about product domains. Given the home pages of several online stores, ShopBot autonomously learns how to shop at those vendors. After its learning is complete, ShopBot is able to speedily visit over a dozen software stores and CD vendors, extract product information, and summarize the results for the user. ILA learns to translate information from Internet sources into its own internal concepts. ILA builds a model of an information source that specifies the translation between the source's output and ILA's model of the world. ILA is capable of leveraging a small amount of knowledge about a domain to learn models of many information sources. We show that ILA's learning is fast and accurate, requiring only a small number of queries per information source.
Pages: 133-153
Number of pages: 20