Solutions and query rewriting in data exchange

被引:4
作者
Arenas, Marcelo [1 ]
Barcelo, Pablo [2 ]
Fagin, Ronald [3 ]
Libkin, Leonid [4 ]
机构
[1] Pontificia Univ Catolica Chile, Dept Comp Sci, Santiago, Chile
[2] Univ Chile, Dept Comp Sci, Santiago, Chile
[3] IBM Corp, Almaden Res Ctr, San Jose, CA 95120 USA
[4] Univ Edinburgh, Lab Fdn Comp Sci, Edinburgh EH8 9AB, Midlothian, Scotland
基金
英国工程与自然科学研究理事会;
关键词
Data exchange; Universal solutions; Query answering; Query rewriting; Locality; INFORMATION; QUANTIFIERS;
D O I
10.1016/j.ic.2013.06.002
中图分类号
TP301 [理论、方法];
学科分类号
080201 [机械制造及其自动化];
摘要
Data exchange is the problem of taking data structured under a source schema and creating an instance of a target schema. Given a source instance, there may be many solutions - target instances that satisfy the constraints of the data exchange problem. Previous work has identified two classes of desirable solutions: canonical universal solutions, and their cores. Query answering in data exchange amounts to rewriting a query over the target schema to another query that, over a materialized target instance, gives the result that is semantically consistent with the source (specifically, the "certain answers"). Basic questions are then: (1) how do these solutions compare in terms of query rewriting? and (2) how can we determine whether a query is rewritable over a particular solution? Our goal is to answer these questions. Our first main result is that, in terms of rewritability by relational algebra queries, the core is strictly less expressive than the canonical universal solution, which in turn is strictly less expressive than the source. To develop techniques for proving queries non-rewritable, we establish structural properties of solutions; in fact they are derived from the technical machinery developed in the rewritability proofs. Our second result is that both the canonical universal solution and the core preserve the local structure of the data, and that every target query rewritable over any of these solutions cannot distinguish tuples whose neighborhoods in the source are similar. This gives us a first simple tool for checking whether a query is non-rewritable over the canonical universal solution or over the core. We also show that these tools generalize to arbitrary transformations that preserve the local structure of the data, and investigate an alternative semantics of query answering in data exchange. (c) 2013 Elsevier Inc. All rights reserved.
引用
收藏
页码:28 / 61
页数:34
相关论文
共 32 条
[1]
Afrati F., 2008, P 27 ACM SIGMOD SIGA, P129
[2]
[Anonymous], 1982, Proceedings of the Herbrand Symposium. Ed. by
[3]
Arenas M., 1999, Proceedings of the Eighteenth ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, P68, DOI 10.1145/303976.303983
[4]
Arenas M., 2010, RELATIONAL XML DATA
[5]
Arenas M., 2004, P 23 ACM SIGMOD SIGA, P229
[6]
Barceló P, 2009, SIGMOD REC, V38, P49, DOI 10.1145/1558334.1558341
[7]
A PROOF PROCEDURE FOR DATA DEPENDENCIES [J].
BEERI, C ;
VARDI, MY .
JOURNAL OF THE ACM, 1984, 31 (04) :718-741
[8]
Duschka OM, 1997, INT JOINT CONF ARTIF, P778
[9]
Ehrenfeucht Andrzej, 1961, Fund. Math, V49, P129, DOI DOI 10.2307/2271711
[10]
Counting quantifiers, successor relations, and logarithmic space [J].
Etessami, K .
JOURNAL OF COMPUTER AND SYSTEM SCIENCES, 1997, 54 (03) :400-411