A New Model to Compute the Information Content of Concepts from Taxonomic Knowledge

Cited by: 41
Authors
Sanchez, David [1]
Batet, Montserrat [1]
Affiliations
[1] Univ Rovira & Virgili, Comp Sci & Math Dept, Tarragona, Catalonia, Spain
Keywords
Computational Linguistics; Information Content; Knowledge Management; Ontologies; Semantic Similarity; Semantic Similarity Estimation; Relatedness
DOI
10.4018/jswis.2012040102
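Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification code
140502 [Artificial Intelligence];
Abstract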
The Information Content (IC) of a concept quantifies the amount of information it provides when it appears in a context. In the past, IC was computed as a function of concept appearance probabilities in corpora, but corpus dependency and data sparseness hampered the results. More recently, other authors have tried to overcome these limitations by estimating IC from the knowledge modeled in an ontology. In this paper, the authors develop this idea by proposing a new model that computes the IC of a concept from the taxonomic knowledge modeled in an ontology. Compared with related works, the proposal aims to better capture the semantic evidence found in the ontology. To test the approach, the authors applied it to well-known semantic similarity measures, which were evaluated using standard benchmarks. The results show that the model produces, in most cases, more accurate similarity estimations than related works.
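As background for the abstract, the classic corpus-based formulation it refers to can be sketched as IC(c) = -log p(c), where p(c) is the probability of encountering concept c (or any of its specializations) in a corpus. The sketch below is only an illustration of that earlier approach, not the model proposed in the paper. The toy taxonomy, the counts, and all function names are hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy taxonomy: child -> parent (None marks the root).
ROOT = "entity"
PARENT = {
    "entity": None,
    "animal": "entity",
    "dog": "animal",
    "cat": "animal",
    "vehicle": "entity",
    "car": "vehicle",
}

# Hypothetical corpus counts of direct mentions of each concept.
COUNTS = Counter(
    {"entity": 1, "animal": 3, "dog": 10, "cat": 6, "vehicle": 2, "car": 8}
)


def ancestors(concept):
    """Return the concept followed by all of its subsumers, up to the root."""
    chain = []
    while concept is not None:
        chain.append(concept)
        concept = PARENT[concept]
    return chain


def cumulative_counts(counts):
    """Credit each mention to every subsumer, so p(c) covers c and its specializations."""
    total = Counter()
    for concept, n in counts.items():
        for subsumer in ancestors(concept):
            total[subsumer] += n
    return total


def information_content(concept, cumulative):
    """Corpus-based IC(c) = -log p(c), with p(c) taken relative to the root count."""
    p = cumulative[concept] / cumulative[ROOT]
    return -math.log(p)


def resnik_similarity(a, b, cumulative):
    """Resnik-style similarity: IC of the most informative common subsumer."""
    common = set(ancestors(a)) & set(ancestors(b))
    return max(information_content(c, cumulative) for c in common)


if __name__ == "__main__":
    cum = cumulative_counts(COUNTS)
    for name in PARENT:
        print(name, round(information_content(name, cum), 3))
    print("sim(dog, cat) =", round(resnik_similarity("dog", "cat", cum), 3))
```

In this toy setting, specific concepts such as "dog" receive a higher IC than their subsumer "animal", and the root receives an IC of zero. The dependence on `COUNTS` is the corpus dependency the abstract mentions; ontology-based models replace those counts with quantities derived from the taxonomy itself.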
Pages: 34-50
Page count: 17