Learning rules for conceptual structure on the Web

Cited by: 16
Authors
Han, H [1]
Elmasri, R [2]
Affiliations
[1] Drexel Univ, Coll Informat Sci & Technol, Philadelphia, PA 19104 USA
[2] Univ Texas, Dept Comp Sci & Engn, Arlington, TX 76019 USA
Keywords
ontology learning; information extraction; relational rule learning; knowledge discovery
DOI
10.1023/B:JIIS.0000019278.84222.b7
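Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 [Pattern recognition and intelligent systems]; 0812 [Computer science and technology]; 0835 [Software engineering]; 1405 [Intelligence science and technology]
Abstract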
This paper presents an infrastructure and methodology to extract conceptual structure from Web pages, which are mainly constructed by HTML tags and incomplete text. Human beings can easily read Web pages and grasp an idea about the conceptual structure of underlying data, but cannot handle excessive amounts of data due to lack of patience and time. However, it is extremely difficult for machines to accurately determine the content of Web pages due to lack of understanding of context and semantics. Our work provides a methodology and infrastructure to process Web data and extract the underlying conceptual structure, in particular relationships between ontological concepts using Inductive Logic Programming in order to help with automating the processing of the excessive amount of Web data by capturing its conceptual structures.
Pages: 237-256
Page count: 20