AN ALGEBRA FOR HIERARCHICALLY ORGANIZED TEXT-DOMINATED DATABASES

Cited by: 14
Author
BURKOWSKI, FJ
Affiliation
[1] Department of Computer Science, University of Waterloo, Waterloo
DOI
10.1016/0306-4573(92)90079-F
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
Structured documents are usually comprised of nested text elements; for example, reports contain chapters, chapters contain sections, . . . , sentences contain words. The containment relationships of these text elements define a text hierarchy that can be exploited during search activities such as database browsing and full-text retrieval. During a database load the system typically constructs concordance lists, each list maintaining the locations of all occurrences of a particular type of text element. Although not necessarily constructed in practice, a complete set of concordance lists would constitute an equivalent representation of the database, namely its inverted form. This paper describes an algebra based on various primitive operators that use concordance lists as operands. These primitives can be used to define higher level filter operators that specify whether a contiguous text extent will be selected or rejected during a search. The main contribution of the paper is the presentation of this algebra as a theoretical model that can be used to define a conceptual schema for the database. This theoretical model provides both a mathematically well defined abstraction for the database and a basis for database implementation since it may be utilized to formally define the search protocols between the database query facilities and the underlying retrieval engine.
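As an illustration of the idea in the abstract, the following is a minimal sketch of concordance lists used as operands of containment-based filter operators. It is not the paper's formal definition: the extent representation (inclusive word positions) and the function names `build_concordance`, `select_containing`, `reject_containing`, and `select_contained_in` are assumptions chosen for this example, and the nested scans favor clarity over efficiency.

```python
from typing import Dict, List, Tuple

# An extent is a contiguous span of text, given here as inclusive
# (start, end) word positions. This representation is an assumption.
Extent = Tuple[int, int]


def build_concordance(words: List[str]) -> Dict[str, List[Extent]]:
    """Build one concordance list per word: the extents where it occurs."""
    lists: Dict[str, List[Extent]] = {}
    for pos, word in enumerate(words):
        lists.setdefault(word.lower(), []).append((pos, pos))
    return lists


def contains(outer: Extent, inner: Extent) -> bool:
    """True if `inner` lies entirely within `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def select_containing(a: List[Extent], b: List[Extent]) -> List[Extent]:
    """Keep extents of `a` that contain at least one extent of `b`."""
    return [x for x in a if any(contains(x, y) for y in b)]


def reject_containing(a: List[Extent], b: List[Extent]) -> List[Extent]:
    """Keep extents of `a` that contain no extent of `b`."""
    return [x for x in a if not any(contains(x, y) for y in b)]


def select_contained_in(a: List[Extent], b: List[Extent]) -> List[Extent]:
    """Keep extents of `a` that lie inside at least one extent of `b`."""
    return [x for x in a if any(contains(y, x) for y in b)]


# A tiny hypothetical database: eleven words grouped into three sentences.
words = ("text algebra defines operators search uses lists "
         "text lists store extents").split()
sentences: List[Extent] = [(0, 3), (4, 6), (7, 10)]
concordance = build_concordance(words)

# Sentences that mention "lists".
with_lists = select_containing(sentences, concordance["lists"])

# Sentences that do not mention "text".
without_text = reject_containing(sentences, concordance["text"])

# Operators compose: sentences mentioning both "text" and "lists".
with_both = select_containing(
    select_containing(sentences, concordance["text"]),
    concordance["lists"],
)

# Occurrences of "text" that fall inside the third sentence.
text_in_third = select_contained_in(concordance["text"], [sentences[2]])
```

Because every operator takes lists of extents and returns a list of extents, results can be fed back in as operands, which is what allows higher-level filters to be composed from a few primitives.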
Pages: 333-348
Page count: 16