Semantic similarity in a taxonomy: An information-based measure and its application to problems of ambiguity in natural language

Cited by: 1106
Authors
Resnik, P [1]
Affiliations
[1] Univ Maryland, Dept Linguist, College Pk, MD 20742 USA
[2] Univ Maryland, Inst Adv Comp Studies, College Pk, MD 20742 USA
DOI
10.1613/jair.514
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This article presents a measure of semantic similarity in an is-a taxonomy based on the notion of shared information content. Experimental evaluation against a benchmark set of human similarity judgments demonstrates that the measure performs better than the traditional edge-counting approach. The article presents algorithms that take advantage of taxonomic similarity in resolving syntactic and semantic ambiguity, along with experimental results demonstrating their effectiveness.
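As a rough illustration of what "shared information content" can mean in an is-a taxonomy, the sketch below scores two concepts by the information content, -log p(c), of the most informative concept that subsumes both. The taxonomy, the counts, and all function names are hypothetical and chosen only for illustration; they are not taken from the article, and the probability estimate is a simplified assumption.

```python
import math

# Hypothetical toy is-a taxonomy: child -> parent (None marks the root).
PARENT = {
    "entity": None,
    "animal": "entity",
    "artifact": "entity",
    "dog": "animal",
    "cat": "animal",
    "car": "artifact",
}

# Made-up corpus counts for words attached directly to each concept.
COUNTS = {"entity": 0, "animal": 2, "artifact": 2, "dog": 5, "cat": 4, "car": 3}


def ancestors(concept):
    """Return the concept itself plus every concept that subsumes it."""
    chain = []
    while concept is not None:
        chain.append(concept)
        concept = PARENT[concept]
    return chain


def concept_probability(concept):
    """Estimate p(c) as the share of all counts at or below concept c."""
    total = sum(COUNTS.values())
    below = sum(n for c, n in COUNTS.items() if concept in ancestors(c))
    return below / total


def information_content(concept):
    """Information content of a concept: -log p(c)."""
    return -math.log(concept_probability(concept))


def similarity(c1, c2):
    """Score two concepts by their most informative shared subsumer."""
    shared = set(ancestors(c1)) & set(ancestors(c2))
    return max(information_content(c) for c in shared)
```

In this toy setup, `similarity("dog", "cat")` is larger than `similarity("dog", "car")`, because "dog" and "cat" share the subsumer "animal", while "dog" and "car" share only the root, whose probability is 1 and whose information content is therefore 0.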
Pages: 95-130
Page count: 36