Similarity measurement using term negative weight and its application to word similarity

Cited by: 11
Authors
Atlam, ES [1]
Fuketa, M [1]
Morita, K [1]
Aoe, J [1]
Affiliations
[1] Univ Tokushima, Dept Informat Sci & Intelligent Syst, Tokushima 7708506, Japan
Keywords
Functions - Information management;
DOI
10.1016/S0306-4573(00)00009-1
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Term weighting is a useful technique for keyword extraction and document classification. The traditional approach, called the positive weight (PW) function, depends on high-frequency terms. This paper presents a new weighting method, called the negative weight (NW) function, that depends on low-frequency terms. Word similarity between typical verbs and their objects is used as the example application. The negative weighted inverse verb frequency (NWIVF) function is defined in this study, and a new similarity measurement is presented that combines the NWIVF and PWIVF (positive weighted inverse verb frequency) functions. The proposed method is applied to 11,000 relationships between verbs and nouns extracted from a large tagged corpus. With the new method, recall and precision improve by 33% and 18%, respectively, over the positive weight method. (C) 2000 Elsevier Science Ltd. All rights reserved.
Pages: 717-736
Page count: 20