Speeding up materialized view selection in data warehouses using a randomized algorithm

Cited by: 50
Authors
Lee, MS [1]
Hammer, J [1]
Affiliation
[1] Univ Florida, Dept Comp & Informat Sci & Engn, Gainesville, FL 32611 USA
Keywords
data warehouse; genetic algorithm; view maintenance; view materialization; view selection; warehouse configuration
DOI
10.1142/S0218843001000370
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
A data warehouse stores information that is collected from multiple, heterogeneous information sources for the purpose of complex querying and analysis. Information in the warehouse is typically stored in the form of materialized views, which represent precomputed portions of frequently asked queries. One of the most important tasks when designing a warehouse is the selection of materialized views to be maintained in the warehouse. The goal is to select a set of views in such a way as to minimize the total query response time over all queries, given a limited amount of time for maintaining the views (the maintenance-cost view selection problem). In this paper, we propose an efficient solution to the maintenance-cost view selection problem using a genetic algorithm for computing a near-optimal set of views. Specifically, we explore the maintenance-cost view selection problem in the context of OR view graphs. We show that our approach represents a dramatic improvement in time complexity over existing search-based approaches using heuristics. Our analysis shows that the algorithm consistently yields a solution that lies within 10% of the optimal query benefit while at the same time exhibiting only a linear increase in execution time. We have implemented a prototype version of our algorithm, which is used to simulate the measurements used in the analysis of our approach.
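To make the problem statement concrete, the following is a minimal sketch of a genetic algorithm for view selection under a maintenance budget. It is not the authors' implementation. The toy instance (`BASE_COST`, `QUERY_COSTS`, `MAINT_COST`, `BUDGET`), the penalty weight, the repair step, and all function names are assumptions made for illustration. Each query may be answered from any one of its candidate views, which loosely mirrors the OR choice in an OR view graph.

```python
import random

# Hypothetical toy instance (not taken from the paper).
# BASE_COST[q]: cost of answering query q with no materialized view.
# QUERY_COSTS[q][v]: cost of answering query q from view v. A query may use
# any ONE of its candidate views (the "OR" choice).
BASE_COST = [100, 80, 120, 60]
QUERY_COSTS = [
    {0: 20, 1: 50},
    {1: 30, 2: 10},
    {0: 70, 3: 25},
    {2: 15, 4: 40},
]
MAINT_COST = [30, 20, 45, 25, 10]  # maintenance cost per view
BUDGET = 70                        # total maintenance time allowed


def query_cost(selection):
    """Total query cost: each query uses its cheapest selected view, if any."""
    total = 0
    for base, options in zip(BASE_COST, QUERY_COSTS):
        usable = [c for v, c in options.items() if selection[v]]
        total += min(usable + [base])
    return total


def maintenance_cost(selection):
    """Sum of maintenance costs over the selected views."""
    return sum(m for m, bit in zip(MAINT_COST, selection) if bit)


def fitness(selection):
    """Query benefit minus an (assumed) linear penalty for budget overrun."""
    benefit = sum(BASE_COST) - query_cost(selection)
    overrun = max(0, maintenance_cost(selection) - BUDGET)
    return benefit - 10 * overrun


def repair(selection, rng):
    """Drop randomly chosen views until the selection fits the budget."""
    sel = list(selection)
    while maintenance_cost(sel) > BUDGET:
        chosen = [i for i, bit in enumerate(sel) if bit]
        sel[rng.choice(chosen)] = 0
    return sel


def select_views(pop_size=20, generations=50, mutation_rate=0.1, seed=0):
    """Evolve bit strings (1 = materialize the view) and return the best one."""
    rng = random.Random(seed)
    n = len(MAINT_COST)
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]

    def pick():
        # Binary tournament selection.
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        nxt = [max(pop, key=fitness)]  # elitism: keep the current best
        while len(nxt) < pop_size:
            p1, p2 = pick(), pick()
            cut = rng.randrange(1, n)  # single-point crossover
            child = p1[:cut] + p2[cut:]
            # Flip each bit with probability mutation_rate.
            child = [bit ^ (rng.random() < mutation_rate) for bit in child]
            nxt.append(child)
        pop = nxt
    # Guarantee feasibility of the returned selection.
    return repair(max(pop, key=fitness), rng)
```

Calling `best = select_views()` returns a bit list over the five hypothetical views; `query_cost(best)` and `maintenance_cost(best)` then report the resulting query cost and the maintenance time consumed. The penalty term steers the search toward feasible selections, and the final repair step is a simple safeguard rather than a claim about how the paper handles infeasible solutions.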
Pages: 327-353
Number of pages: 27