Unveiling the relationship between complex networks metrics and word senses

Cited by: 35
Authors
Amancio, Diego R. [1]
Oliveira, Osvaldo N., Jr. [1]
Costa, Luciano da F. [1]
Affiliations
[1] Univ Sao Paulo, Inst Phys Sao Carlos, POB 369, BR-13560970 Sao Paulo, Brazil
Funding
São Paulo Research Foundation (FAPESP), Brazil
DOI
10.1209/0295-5075/98/18002
Chinese Library Classification (CLC) number
O4 [Physics]
Discipline classification code
0702
Abstract
The automatic disambiguation of word senses (i.e., the identification of which of the meanings is used in a given context for a word that has multiple meanings) is essential for such applications as machine translation and information retrieval, and represents a key step for developing the so-called Semantic Web. Humans disambiguate words in a straightforward fashion, but this does not apply to computers. In this paper we address the problem of Word Sense Disambiguation (WSD) by treating texts as complex networks, and show that word senses can be distinguished upon characterizing the local structure around ambiguous words. Our goal was not to obtain the best possible disambiguation system, but we nevertheless found that in half of the cases our approach outperforms traditional shallow methods. We show that the hierarchical connectivity and clustering of words are usually the most relevant features for WSD. The results reported here shed light on the relationship between semantic and structural parameters of complex networks. They also indicate that when combined with traditional techniques the complex network approach may be useful to enhance the discrimination of senses in large texts. Copyright (C) EPLA, 2012
Pages: 6