Semi-automatic wrapper generation for Internet information sources

Cited by: 27
Authors
Ashish, N
Knoblock, CA
Source
PROCEEDINGS OF THE SECOND IFCIS INTERNATIONAL CONFERENCE ON COOPERATIVE INFORMATION SYSTEMS - COOPIS'97 | 1997
DOI
10.1109/COOPIS.1997.613813
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
To simplify the task of obtaining information from the vast number of information sources available on the World Wide Web (WWW), we are building information mediators for extracting and integrating data from multiple Web sources. In a mediator-based approach, wrappers are built around individual information sources to translate between the mediator query language and the individual sources. We present an approach for semi-automatically generating wrappers for structured Internet sources. The key idea is to exploit formatting information in Web pages to hypothesize the underlying structure of a page. From this structure the system generates a wrapper that facilitates querying of a source and possibly integrating it with other sources. We demonstrate the ease with which we are able to build wrappers for a number of Web sources using our implemented wrapper generation toolkit.
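The key idea in the abstract, using formatting cues to hypothesize page structure, can be illustrated with a minimal sketch. This is not the authors' toolkit or algorithm: the choice of heading tags, the class name `HeadingSectionParser`, the function `hypothesize_structure`, and the sample page are all assumptions made for illustration, and a real wrapper generator would likely use richer cues and user correction.

```python
from html.parser import HTMLParser

# Assumption: these tags are treated as formatting cues that mark a heading.
HEADING_TAGS = {"h1", "h2", "h3", "b", "strong"}


class HeadingSectionParser(HTMLParser):
    """Groups page text under the most recent heading-like element."""

    def __init__(self):
        super().__init__()
        self.sections = {}       # heading text -> list of body text fragments
        self._current = None     # heading that is currently collecting text
        self._depth = 0          # nesting depth inside heading tags
        self._heading_buf = []   # text fragments of the heading being read

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            self._depth += 1

    def handle_endtag(self, tag):
        if tag in HEADING_TAGS and self._depth > 0:
            self._depth -= 1
            if self._depth == 0:
                heading = " ".join(self._heading_buf).strip()
                self._heading_buf = []
                if heading:
                    self._current = heading
                    self.sections.setdefault(heading, [])

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self._depth > 0:
            self._heading_buf.append(text)
        elif self._current is not None:
            # Text that appears before any heading is ignored.
            self.sections[self._current].append(text)


def hypothesize_structure(html):
    """Return a dict mapping each hypothesized heading to its body text."""
    parser = HeadingSectionParser()
    parser.feed(html)
    parser.close()
    return parser.sections


# Hypothetical sample page with placeholder content.
sample_page = (
    "<html><body>"
    "<b>Geography</b><p>Location: placeholder text</p>"
    "<b>People</b><p>Population: placeholder text</p>"
    "</body></html>"
)
structure = hypothesize_structure(sample_page)
```

A wrapper could then answer a query such as "Geography" by looking up the corresponding key in `structure`. In a semi-automatic setting, the hypothesized headings would presumably be shown to a user for correction before the wrapper is generated.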
Pages: 160-169
Page count: 10