NEW TECHNIQUE FOR DETECTING PATTERNS OF TERM USAGE IN TEXT CORPORA

Times cited: 3
Authors
OLNEY, J [1]
LAM, V [1]
YEARWOOD, B [1]
Affiliation
[1] SYST DEV CORP, SANTA MONICA, CA 90406 USA
DOI
10.1016/0306-4573(76)90064-9
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Term occurrence A is included in term occurrence B if A is a substring of B. By making a single pass through a slightly non-standard KWIC index, every recurring phrase can be detected, and its inclusion relationships with other phrases and/or single words can be computed. Results obtained by processing a corpus of 2675 medical titles indicate that several properties definable in terms of inclusion relationships among terms have significance for vocabulary control. Preliminary results from a corpus of more than 62,000 medical titles have confirmed this finding.
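
The detection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes terms are compared at the word level, stands in a sorted list of word-level suffixes for the "slightly non-standard KWIC index", and uses hypothetical function names (`kwic_suffixes`, `recurring_phrases`, `is_included`). Because sorting places all suffixes that begin with the same words next to each other, one pass over adjacent entries is enough to find every phrase that occurs at least twice.

```python
def kwic_suffixes(titles):
    """Build a KWIC-like index: one entry per word position, holding the
    words from that position to the end of the title, sorted alphabetically."""
    entries = []
    for title_id, title in enumerate(titles):
        words = title.lower().split()
        for start in range(len(words)):
            entries.append((tuple(words[start:]), title_id))
    entries.sort()
    return entries


def common_prefix_len(a, b):
    """Number of leading words shared by word tuples a and b."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def recurring_phrases(titles, min_words=2):
    """Single pass over adjacent index entries; every shared prefix of at
    least min_words words is a phrase that occurs two or more times."""
    entries = kwic_suffixes(titles)
    phrases = set()
    for (prev, _), (curr, _) in zip(entries, entries[1:]):
        shared = common_prefix_len(prev, curr)
        for length in range(min_words, shared + 1):
            phrases.add(curr[:length])
    return phrases


def is_included(a, b):
    """True if term a is properly included in term b, i.e. a is a
    contiguous word sub-sequence of b and a differs from b (assumption:
    'substring' is interpreted at the word level)."""
    n = len(a)
    return a != b and any(b[i:i + n] == a for i in range(len(b) - n + 1))
```

For example, calling `recurring_phrases` on a handful of titles and then testing each pair of results with `is_included` gives the inclusion relationships among the recurring phrases; single words can be handled the same way by passing `min_words=1`.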
Pages: 235-250
Number of pages: 16