Empirical distributional semantics: Methods and biomedical applications

Cited by: 90
Authors
Cohen, Trevor [1]
Widdows, Dominic [1]
Affiliations
[1] Arizona State Univ, Sch Comp & Informat, Dept Biomed Informat, Ctr Decis Making & Cognit, Phoenix, AZ 85004 USA
Keywords
Distributional semantics; Methodological review; Latent semantic analysis; Natural language processing; Semantic similarity; Random indexing; Context vectors; INFORMATION; PROTEIN; GENE; REPRESENTATION; RETRIEVAL; ABSTRACTS; KNOWLEDGE
DOI
10.1016/j.jbi.2009.02.002
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Over the past 15 years, a range of methods have been developed that are able to learn human-like estimates of the semantic relatedness between terms from the way in which these terms are distributed in a corpus of unannotated natural language text. These methods have also been evaluated in a number of applications in the cognitive science, computational linguistics and information retrieval literatures. In this paper, we review the available methodologies for derivation of semantic relatedness from free text, as well as their evaluation in a variety of biomedical and other applications. Recent methodological developments, and their applicability to several existing applications, are also discussed. © 2009 Elsevier Inc. All rights reserved.
Pages: 390-405
Page count: 16