Probe minimization by schedule optimization: Supporting top-k queries with expensive predicates

被引:9
作者
Hwang, Seung-won
Chang, Kevin Chen-Chuan
机构
[1] Pohang Univ Sci & Technol, Dept Comp Sci & Engn, Pohang 790784, South Korea
[2] Univ Illinois, Dept Comp Sci, Urbana, IL 61801 USA
关键词
database query processing; distributed information systems; database systems;
D O I
10.1109/TKDE.2007.1007
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
This paper addresses the problem of evaluating ranked top-k queries with expensive predicates. As major DBMSs now all support expensive user-defined predicates for Boolean queries, we believe such support for ranked queries will be even more important: First, ranked queries often need to model user-specific concepts of preference, relevance, or similarity, which call for dynamic user-defined functions. Second, middleware systems must incorporate external predicates for integrating autonomous sources typically accessible only by per-object queries. Third, ranked queries often accompany Boolean ranking conditions, which may turn predicates into expensive ones, as the index structure on the predicate built on the base table may be no longer effective in retrieving the filtered objects in order. Fourth, fuzzy joins are inherently expensive, as they are essentially user-defined operations that dynamically associate multiple relations. These predicates, being dynamically defined or externally accessed, cannot rely on index mechanisms to provide zero-time sorted output, and must instead require per-object probe to evaluate. To enable probe minimization, we develop the problem as cost-based optimization of searching over potential probe schedules. In particular, we decouple probe scheduling into object and predicate scheduling problems and develop an analytical object scheduling optimization and a dynamic predicate scheduling optimization, which combined together form a cost-effective probe schedule.
引用
收藏
页码:646 / 662
页数:17
相关论文
共 23 条
[1]  
Agrawal R, 2000, P ACM SIGMOD
[2]  
[Anonymous], P ACM SIGMOD
[3]  
BRUNO N, 2002, P INT C DAT ENG
[4]  
BRUNO N, 2002, ACM T DATABASE SYSTE
[5]  
CAO P, 2004, PODC
[6]  
CHANG KC, 2002, P ACM SIGMOD
[7]  
CHAUDHURI S, 1996, P ACM SIGMOD
[8]  
Cormen T. H., 2001, Introduction to Algorithms, V2nd
[9]   Optimal aggregation algorithms for middleware [J].
Fagin, R ;
Lotem, A ;
Naor, M .
JOURNAL OF COMPUTER AND SYSTEM SCIENCES, 2003, 66 (04) :614-656
[10]  
FAGIN R, 1996, P ACM S PRINC DAT SY