Learning domain ontologies from document warehouses and dedicated web sites

Cited by: 188

Authors:

Navigli, R [1]

Velardi, P [1]

Affiliation:

[1] Univ Roma La Sapienza, Dipartimento Informat, I-00198 Rome, Italy

Source:

COMPUTATIONAL LINGUISTICS | 2004 / Vol. 30 / No. 2

DOI:

10.1162/089120104323093276

Chinese Library Classification number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

We present a method and a tool, OntoLearn, aimed at the extraction of domain ontologies from Web sites, and more generally from documents shared among the members of virtual organizations. OntoLearn first extracts a domain terminology from available documents. Then, complex domain terms are semantically interpreted and arranged in a hierarchical fashion. Finally, a general-purpose ontology, WordNet, is trimmed and enriched with the detected domain concepts. The major novel aspect of this approach is semantic interpretation, that is, the association of a complex concept with a complex term. This involves finding the appropriate WordNet concept for each word of a terminological string and the appropriate conceptual relations that hold among the concept components. Semantic interpretation is based on a new word sense disambiguation algorithm, called structural semantic interconnections.

Pages: 151-179

Page count: 29
