Adaptive rank-aware query optimization in relational databases

被引:23
作者
Ilyas, Ihab F.
Aref, Walid G.
Elmagarmid, Ahmed K.
Elmongui, Hicham G.
Shah, Rahul
Vitter, Jeffrey Scott
机构
[1] Univ Waterloo, Waterloo, ON N2L 3G1, Canada
[2] Purdue Univ, W Lafayette, IN 47907 USA
来源
ACM TRANSACTIONS ON DATABASE SYSTEMS | 2006年 / 31卷 / 04期
关键词
algorithms; design; experimentation; performance; advanced query processing; ranking; top-k; adaptive processing; rank-aware optimization;
D O I
10.1145/1189769.1189772
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
Rank-aware query processing has emerged as a key requirement in modern applications. In these applications, efficient and adaptive evaluation of top-k queries is an integral part of the application semantics. In this article, we introduce a rank-aware query optimization framework that fully integrates rank-join operators into relational query engines. The framework is based on extending the System R dynamic programming algorithm in both enumeration and pruning. We define ranking as an interesting physical property that triggers the generation of rank-aware query plans. Unlike traditional join operators, optimizing for rank-join operators depends on estimating the input cardinality of these operators. We introduce a probabilistic model for estimating the input cardinality, and hence the cost of a rank-join operator. To our knowledge, this is the first effort in estimating the needed input size for optimal rank aggregation algorithms. Costing ranking plans is key to the full integration of rank-join operators in real-world query processing engines. Since optimal execution strategies picked by static query optimizers lose their optimality due to estimation errors and unexpected changes in the computing environment, we introduce several adaptive execution strategies for top-k queries that respond to these unexpected changes and costing errors. Our reactive reoptimization techniques change the execution plan at runtime to significantly enhance the performance of running queries. Since top-k query plans are usually pipelined and maintain a complex ranking state, altering the execution strategy of a running ranking query is an important and challenging task. We conduct an extensive experimental study to evaluate the performance of the proposed framework. The experimental results are twofold: ( 1) we show the effectiveness of our cost-based approach of integrating ranking plans in dynamic programming cost-based optimizers; and ( 2) we show a significant speedup ( up to 300%) when using our adaptive execution of ranking plans over the state-of-the-art mid-query reoptimization strategies.
引用
收藏
页码:1257 / 1304
页数:48
相关论文
共 37 条
[1]  
AMSALEG L, 1996, DISTRIB PARALLEL DAT, P208
[2]  
Avnur R., SIGMOD REC, P261, DOI 10.1145/342009.335420
[3]  
BRUNO N, 2002, ACM T DATABASE SYST, V27, P369
[4]  
BRUNO N, 2002, P IEEE 18 INT C DAT, P153
[5]  
BRUNO N, 2002, P ACM SIGMOD INT C M, P263
[6]  
Carey M. J., 1997, SIGMOD Record, V26, P219, DOI 10.1145/253262.253302
[7]  
Carey M. J., 1998, Proceedings of the Twenty-Fourth International Conference on Very-Large Databases, P158
[8]   Evaluating refined queries in top-k retrieval systems [J].
Chakrabarti, K ;
Ortega-Binderberger, M ;
Mehrotra, S ;
Porkaew, K .
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2004, 16 (02) :256-270
[9]  
Chang K. C.-C., 2002, PROC ACM SIGMOD INT, P346
[10]  
Deshpande Amol, 2004, Proceedings of the Thirtieth International Conference on Very Large Data Bases-Volume 30, P948