Deriving concept hierarchies from text

Cited by: 232
Authors
Sanderson, M [1]
Croft, B [1]
Affiliation
[1] Univ Sheffield, Dept Informat Studies, Sheffield S10 2TN, S Yorkshire, England
Source
SIGIR'99: PROCEEDINGS OF 22ND INTERNATIONAL CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL | 1999
Keywords
concept hierarchy; subsumption; term co-occurrence; multi-document summary
DOI
10.1145/312624.312679
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper presents a means of automatically deriving a hierarchical organization of concepts from a set of documents without use of training data or standard clustering techniques. Instead, salient words and phrases extracted from the documents are organized hierarchically using a type of co-occurrence known as subsumption. The resulting structure is displayed as a series of hierarchical menus. When generated from a set of retrieved documents, a user browsing the menus is provided with a detailed overview of their content in a manner distinct from existing overview and summarization techniques. The methods used to build the structure are simple, but appear to be effective: a small-scale user study reveals that the generated hierarchy possesses properties expected of such a structure in that general terms are placed at the top levels leading to related and more specific terms below. The formation and presentation of the hierarchy is described along with the user study and some other informal evaluations.
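The abstract names subsumption as the co-occurrence relation but does not spell out the test. The following is a minimal sketch only, assuming document-level co-occurrence and a test of the form P(x|y) >= t and P(y|x) < 1, under which term x is treated as the more general parent of term y. The function name `subsumption_pairs`, the `threshold` parameter and its default of 0.8, and the toy documents are illustrative assumptions, not details taken from this record.

```python
from collections import defaultdict
from itertools import permutations


def subsumption_pairs(docs, threshold=0.8):
    """Return (parent, child) pairs where parent is judged to subsume child.

    docs: iterable of sets of terms, one set per document.
    threshold: assumed lower bound on P(parent | child); illustrative value.
    """
    # Map each term to the set of document indices in which it occurs.
    doc_ids = defaultdict(set)
    for i, terms in enumerate(docs):
        for term in terms:
            doc_ids[term].add(i)

    pairs = []
    # Test every ordered pair of distinct terms.
    for x, y in permutations(sorted(doc_ids), 2):
        both = len(doc_ids[x] & doc_ids[y])
        p_x_given_y = both / len(doc_ids[y])  # share of y's documents that contain x
        p_y_given_x = both / len(doc_ids[x])  # share of x's documents that contain y
        # x subsumes y: x appears in (nearly) all of y's documents, but not vice versa.
        if p_x_given_y >= threshold and p_y_given_x < 1.0:
            pairs.append((x, y))
    return pairs


# Toy collection (hypothetical data).
docs = [
    {"animal", "dog"},
    {"animal", "cat"},
    {"animal", "dog", "poodle"},
    {"animal"},
]
pairs = subsumption_pairs(docs)
print(pairs)
```

On this toy input the general term "animal" ends up as parent of "cat", "dog" and "poodle", and "dog" as parent of "poodle", which mirrors the property the abstract describes: general terms at the top levels, more specific terms below.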
Pages: 206-213
Page count: 8