The LLUNATIC Data-Cleaning Framework

Cited by: 96
Authors
Geerts, Floris [1]
Mecca, Giansalvatore [2]
Papotti, Paolo [3]
Santoro, Donatello [2, 4]
Affiliations
[1] Univ Antwerp, Antwerp, Belgium
[2] Univ Basilicata, Potenza, Italy
[3] Qatar Comp Res Inst, Doha, Qatar
[4] Univ Rome Tre, Rome, Italy
Source
PROCEEDINGS OF THE VLDB ENDOWMENT | 2013 / Vol. 6 / No. 9
Keywords
Data cleaning - Minimal solutions - Uniform framework;
DOI
10.14778/2536360.2536363
Chinese Library Classification number
TP [Automation technology, computer technology];
Subject classification code
0812;
Abstract
Data-cleaning (or data-repairing) is considered a crucial problem in many database-related tasks. It consists of making a database consistent with respect to a set of given constraints. In recent years, repairing methods have been proposed for several classes of constraints. However, these methods rely on ad hoc decisions and tend to hard-code the strategy to repair conflicting values. As a consequence, there is currently no general algorithm to solve database repairing problems that involve different kinds of constraints and different strategies to select preferred values. In this paper we develop a uniform framework to solve this problem. We propose a new semantics for repairs, and a chase-based algorithm to compute minimal solutions. We implemented the framework in a DBMS-based prototype, and we report experimental results that confirm its good scalability and superior quality in computing repairs.
Pages: 625-636
Page count: 12