A New Multiword Expression Metric and Its Applications

Cited by: 1
Authors
布凡 [1]
朱小燕 [1]
李明 [2]
Affiliations
[1] State Key Laboratory of Intelligent Technology and Systems, Tsinghua National Laboratory for Information Science and Technology, Department of Computer Science and Technology, Tsinghua University
[2] David R. Cheriton School of Computer Science, University of Waterloo, Waterloo N2L 3G1, Canada
Keywords
multiword expressions; information distance; question answering; named entity extraction;
DOI
Not available
Chinese Library Classification number
TP391.1 [Text information processing];
Discipline classification code
081203; 0835;
Abstract
Multiword Expressions (MWEs) appear frequently and ungrammatically in natural languages. Identifying MWEs in free texts is a very challenging problem. This paper proposes a knowledge-free, unsupervised, and language-independent Multiword Expression Distance (MED). The new metric is derived from an accepted physical principle, measures the distance from an n-gram to its semantics, and outperforms other state-of-the-art methods on MWEs in two applications: question answering and named entity extraction.
Pages: 3-13
Page count: 11
Related papers
6 in total
[1] New Information Distance Measure and Its Application in Question Answering System [J]. 张显; 郝宇; 朱小燕; 李明. Journal of Computer Science & Technology, 2008, (04): 557-572
[2] Improving effectiveness of mutual information for substantival multiword expression extraction [J]. Zhang, Wen; Yoshida, Taketoshi; Tang, Xijin; Ho, Tu-Bao. Expert Systems with Applications, 2009, 36 (08): 10919-10930
[3] An information-based sequence distance and its application to whole mitochondrial genome phylogeny [J]. Ming Li; Jonathan H. Badger; Xin Chen; Sam Kwong; Paul Kearney. Bioinformatics, 2001
[4] Technical terminology: some linguistic properties and an algorithm for identification in text [J]. John S. Justeson; Slava M. Katz. Natural Language Engineering, 1995 (1)
[5] Shared information and program plagiarism detection. X. Chen; B. Francia; M. Li; B. McKinnon; A. Seker. IEEE Transactions on Information Theory, 2004
[6] Is It the Right Answer? Exploiting Web Redundancy for Answer Validation. Magnini, B.; Negri, M.; Prevete, R.; Tanev, H. Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL-2002), 2002