Automatic recognition of multi-word terms: The C-value/NC-value method

Cited by: 140
Authors
Frantzi K. [1]
Ananiadou S. [1]
Mima H. [2]
Affiliations
[1] Centre for Computational Linguistics, UMIST, Manchester, M60 1QD
[2] Dept. of Information Science, University of Tokyo, Bunkyo-ku, Tokyo 113
Keywords
Automatic extraction; Automatic term recognition (ATR); Domain independence; Linguistic and statistical information; Terms
DOI
10.1007/s007999900023
Abstract
Technical terms (henceforth called terms) are important elements for digital libraries. In this paper we present a domain-independent method for the automatic extraction of multi-word terms from machine-readable special language corpora. The method (C-value/NC-value) combines linguistic and statistical information. The first part, C-value, enhances the common statistical measure of frequency of occurrence for term extraction, making it sensitive to a particular type of multi-word term, the nested term. The second part, NC-value, gives: 1) a method for the extraction of term context words (words that tend to appear with terms); 2) the incorporation of information from term context words into the extraction of terms. © 2000 Springer-Verlag.
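
The abstract does not spell out the C-value formula, so the following is only a minimal sketch of how the nested-term adjustment is commonly stated: a candidate's frequency is reduced by the average frequency of the longer candidates that contain it, and the result is weighted by log2 of the candidate's length in words. The function names (`c_value`, `contains`) and the toy frequency counts are hypothetical and are not taken from the paper; the linguistic filter and frequency thresholds of the full method are left out.

```python
import math


def contains(longer, shorter):
    """Return True if `shorter` occurs in `longer` as a contiguous word sequence."""
    n = len(shorter)
    return any(longer[i:i + n] == shorter for i in range(len(longer) - n + 1))


def c_value(candidates):
    """Score multi-word candidates.

    `candidates` maps each candidate term (a tuple of at least two words)
    to its corpus frequency. Assumed formula, as commonly stated:
      not nested: log2(len(a)) * f(a)
      nested:     log2(len(a)) * (f(a) - mean of f(b) over longer candidates b containing a)
    """
    scores = {}
    for a, freq_a in candidates.items():
        # Longer candidates in which `a` appears as a nested term.
        containers = [b for b in candidates if len(b) > len(a) and contains(b, a)]
        if containers:
            nested_freq = sum(candidates[b] for b in containers) / len(containers)
            adjusted = freq_a - nested_freq
        else:
            adjusted = freq_a
        scores[a] = math.log2(len(a)) * adjusted
    return scores


# Toy counts, invented for illustration only.
toy_counts = {
    ("basal", "cell", "carcinoma"): 10,
    ("cell", "carcinoma"): 12,
    ("soft", "contact", "lens"): 5,
}
scores = c_value(toy_counts)
```

With these toy counts, "cell carcinoma" occurs 12 times, but 10 of those occurrences are accounted for by the longer candidate "basal cell carcinoma", so its score falls well below what raw frequency alone would suggest. This is the sensitivity to nested terms that the abstract refers to.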
Pages: 115-130
Page count: 15