On the Expressiveness of Implicit Provenance in Query and Update Languages

Cited by: 38
Authors
Buneman, Peter [1]
Cheney, James [1]
Vansummeren, Stijn [2, 3]
Affiliations
[1] Univ Edinburgh, Informat Forum, Sch Informat, LFCS, Edinburgh EH8 9AB, Midlothian, Scotland
[2] Hasselt Univ, Theoret Comp Sci Grp, B-3590 Diepenbeek, Belgium
[3] Transnatl Univ Limburg, Limburg, Belgium
Source
ACM TRANSACTIONS ON DATABASE SYSTEMS | 2008 / Vol. 33 / No. 4
Funding
UK Engineering and Physical Sciences Research Council (EPSRC);
Keywords
Languages; Reliability; Theory; Provenance; nested relational calculus; nested update language; conservativity;
DOI
10.1145/1412331.1412340
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Information describing the origin of data, generally referred to as provenance, is important in scientific and curated databases where it is the basis for the trust one puts in their contents. Since such databases are constructed using operations of both query and update languages, it is of paramount importance to describe the effect of these languages on provenance. In this article we study provenance for query and update languages that are closely related to SQL, and compare two ways in which they can manipulate provenance so that elements of the input are rearranged to elements of the output: implicit provenance, where a query or update only provides the rearranged output, and provenance is provided implicitly by a default provenance semantics; and explicit provenance, where a query or update provides both the output and the description of the provenance of each component of the output. Although explicit provenance is in general more expressive, we show that the classes of implicit provenance operations expressible by query and update languages correspond to natural semantic subclasses of the explicit provenance queries. One of the consequences of this study is that provenance separates the expressive power of query and update languages. The model is also relevant to annotation propagation schemes in which annotations on the input to a query or update have to be transferred to the output or vice versa.
Pages: 47