Relative Information Completeness

被引:25
作者
Fan, Wenfei [2 ]
Geerts, Floris [1 ]
机构
[1] Univ Edinburgh, Sch Informat, Lab Fdn Comp Sci, Edinburgh EH8 9AB, Midlothian, Scotland
[2] Harbin Inst Technol, Harbin, Peoples R China
来源
ACM TRANSACTIONS ON DATABASE SYSTEMS | 2010年 / 35卷 / 04期
基金
英国工程与自然科学研究理事会;
关键词
Design; Languages; Reliability; Theory; Incomplete information; relative completeness; master data management; partially closed databases; complexity;
D O I
10.1145/1862919.1862924
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
This article investigates the question of whether a partially closed database has complete information to answer a query. In practice an enterprise often maintains master data D-m, a closed-world database. We say that a database D is partially closed if it satisfies a set V of containment constraints of the form q(D) subset of p(D-m), where q is a query in a language L-C and p is a projection query. The part of D not constrained by (D-m, V) is open, from which some tuples may be missing. The database D is said to be complete for a query Q relative to (D-m, V) if for all partially closed extensions D' of D, Q(D') = Q(D), i.e., adding tuples to D either violates some constraints in V or does not change the answer to Q. We first show that the proposed model can also capture the consistency of data, in addition to its relative completeness. Indeed, integrity constraints studied for data consistency can be expressed as containment constraints. We then study two problems. One is to decide, given D-m, V, a query Q in a language L-Q, and a partially closed database D, whether D is complete for Q relative to (D-m, V). The other is to determine, given D-m, V and Q, whether there exists a partially closed database that is complete for Q relative to (D-m, V). We establish matching lower and upper bounds on these problems for a variety of languages L-Q and L-C. We also provide characterizations for a database to be relatively complete, and for a query to allow a relatively complete database, when L-Q and L-C are conjunctive queries.
引用
收藏
页数:44
相关论文
共 33 条
  • [1] Abiteboul S., 1998, Proceedings of the Seventeenth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems. PODS 1998, P254, DOI 10.1145/275487.275516
  • [2] Abiteboul S., 1995, Foundations of databases, V8
  • [3] [Anonymous], 2007, Data quality and record linkage techniques
  • [4] [Anonymous], 1991, Problem of Incomplete Information in Relational Databases
  • [5] [Anonymous], VLDB J
  • [6] [Anonymous], 1979, Computers and Intractablity: A Guide to the Theory of NP-Completeness
  • [7] [Anonymous], 2006, Data Quality: Concepts, Methodologies and Techniques, DOI [DOI 10.1007/3-540-33173-5_1, DOI 10.1007/3-540-33173-5]
  • [8] Arenas M., 1999, Proceedings of the Eighteenth ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, P68, DOI 10.1145/303976.303983
  • [9] BRAVO L, 2007, P INT C VER LARG DAT, P243
  • [10] CALI A, 2003, P ACM S PRINC DAT SY, P260