ProbView: A flexible probabilistic database system

被引:154
作者
Lakshmanan, LVS
Leone, N
Ross, R
Subrahmanian, VS
机构
[1] VIENNA TECH UNIV, DEPT INFORMAT SYST, A-1040 VIENNA, AUSTRIA
[2] UNIV MARYLAND, DEPT COMP SCI, COLLEGE PK, MD 20742 USA
来源
ACM TRANSACTIONS ON DATABASE SYSTEMS | 1997年 / 22卷 / 03期
关键词
algorithms; languages; performance; theory; algebra; data complexity; performance evaluation; probabilistic databases; view maintenance; query optimization;
D O I
10.1145/261124.261131
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
Probability theory is mathematically the best understood paradigm for modeling and manipulating uncertain information. Probabilities of complex events can be computed from those of basic events on which they depend, using any of a number of strategies. Which strategy is appropriate depends very much on the known interdependencies among the events involved. Previous work on probabilistic databases has assumed a fixed and restrictive combination strategy (e.g., assuming all events are pairwise independent). In this article, we characterize, using postulates, whole classes of strategies for conjunction, disjunction, and negation, meaningful from the viewpoint of probability theory. (1) We propose a probabilistic relational data model and a generic probabilistic relational algebra that neatly captures various strategies satisfying the postulates, within a single unified framework. (2) We show that as long as the chosen strategies can be computed in polynomial time, queries in the positive fragment of the probabilistic relational algebra have essentially the same data complexity as classical relational algebra. (3) We establish various containments and equivalences between algebraic expressions, similar in spirit to those in classical algebra. (4) We develop algorithms for maintaining materialized probabilistic views. (5) Based on these ideas, we have developed a prototype probabilistic database system called ProbView on top of Dbase V.0. We validate our complexity results with experiments and show that rewriting certain types of queries to other equivalent forms often yields substantial savings.
引用
收藏
页码:419 / 469
页数:51
相关论文
共 36 条
[1]  
[Anonymous], 1979, Computers and Intractablity: A Guide to the Theoryof NP-Completeness
[2]  
[Anonymous], OPERATIONS RES
[3]   THE MANAGEMENT OF PROBABILISTIC DATA [J].
BARBARA, D ;
GARCIAMOLINA, H ;
PORTER, D .
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 1992, 4 (05) :487-502
[4]   MIXED-INTEGER PROGRAMMING METHODS FOR COMPUTING NONMONOTONIC DEDUCTIVE DATABASES [J].
BELL, C ;
NERODE, A ;
NG, RT ;
SUBRAHMANIAN, VS .
JOURNAL OF THE ACM, 1994, 41 (06) :1178-1215
[5]   UPDATING DERIVED RELATIONS - DETECTING IRRELEVANT AND AUTONOMOUSLY COMPUTABLE UPDATES [J].
BLAKELEY, JA ;
COBURN, N ;
LARSON, PA .
ACM TRANSACTIONS ON DATABASE SYSTEMS, 1989, 14 (03) :369-400
[6]  
Boole G., 1854, The Laws of Thought
[7]  
CAVALLO R, 1987, 13TH P INT C VER LAR, P71
[8]   VIEW DEFINITION AND GENERALIZATION FOR DATABASE INTEGRATION IN A MULTIDATABASE SYSTEM [J].
DAYAL, U ;
HWANG, HY .
IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 1984, 10 (06) :628-645
[9]  
DAYAL U, 1989, 2ND INT WORKSH DAT P, P80
[10]   DEFAULT REASONING AND POSSIBILITY THEORY [J].
DUBOIS, D ;
PRADE, H .
ARTIFICIAL INTELLIGENCE, 1988, 35 (02) :243-257