An approach for measuring semantic similarity between words using multiple information sources

Cited by: 36
Authors
Li, YH
Bandar, ZA
McLean, D
Affiliations
[1] Univ Manchester, Manchester Sch Engn, Manchester M13 9PL, Lancs, England
[2] Manchester Metropolitan Univ, Intelligent Syst Grp, Dept Comp & Math, Manchester M1 5GD, Lancs, England
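Keywords
semantic similarity; lexical database; information content; corpus statistics
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 [Pattern recognition and intelligent systems]; 0812 [Computer science and technology]; 0835 [Software engineering]; 1405 [Intelligent science and technology]
Abstract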
Semantic similarity between words is becoming a generic problem for many applications of computational linguistics and artificial intelligence. This paper explores the determination of semantic similarity by a number of information sources, which consist of structural semantic information from a lexical taxonomy and information content from a corpus. To investigate how information sources could be used effectively, a variety of strategies for using various possible information sources are implemented. A new measure is then proposed which combines information sources nonlinearly. Experimental evaluation against a benchmark set of human similarity ratings demonstrates that the proposed measure significantly outperforms traditional similarity measures.
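The abstract says the proposed measure combines information sources nonlinearly but does not give a formula. As an illustration only, the sketch below combines two structural quantities from a lexical taxonomy, the shortest path length between two words and the depth of their common subsumer, as exp(-alpha * l) * tanh(beta * h). The function name, argument names, and the default values of alpha and beta are assumptions made for this sketch and are not taken from this record.

```python
import math


def combined_similarity(path_length, depth, alpha=0.2, beta=0.6):
    """Illustrative nonlinear combination of two taxonomy-based quantities.

    path_length: shortest path length between the two words (assumed >= 0).
    depth: depth of their common subsumer in the taxonomy (assumed >= 0).
    alpha, beta: hypothetical scaling parameters, not given in this record.
    """
    if path_length < 0 or depth < 0:
        raise ValueError("path_length and depth must be non-negative")
    # Similarity decays exponentially as the path between the words grows.
    length_term = math.exp(-alpha * path_length)
    # Similarity grows with subsumer depth and saturates toward 1.
    depth_term = math.tanh(beta * depth)
    return length_term * depth_term


if __name__ == "__main__":
    # A short path under a deep subsumer scores higher than a long path
    # under a shallow one.
    print(combined_similarity(1, 8))
    print(combined_similarity(6, 2))
```

Under these assumptions the score stays between 0 and 1, falls as the path lengthens, and rises as the common subsumer sits deeper in the taxonomy. A corpus-based information content term could be multiplied in the same way, but its form is not specified here.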
Pages: 871-882
Number of pages: 12