Mining the Web to create specialized glossaries

Cited by: 24

Authors:

Velardi, Paola [1]

Navigli, Roberto [1]

D'Amadio, Pierluigi

Affiliations:

[1] Univ Roma La Sapienza, Dept Comp Sci, I-00185 Rome, Italy

Source:

IEEE INTELLIGENT SYSTEMS | 2008 / Vol. 23 / No. 5

DOI:

10.1109/MIS.2008.88

Chinese Library Classification (CLC):

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

A step in establishing a Web community's knowledge domain involves collecting a glossary of domain-relevant terms that constitute the linguistic surface manifestation of domain concepts. TermExtractor and GlossExtractor are two Web-mining-based applications that support glossary building by exploiting the Web's evolving nature to allow continuous updating of an emerging community's vocabulary. These tools acquire a glossary's two basic components, terms and definitions: the terms are harvested from domain text corpora, and the definitions are extracted from different types of Web pages.

Pages: 18-25

Page count: 8
