High performance OLAP and data mining on parallel computers

Cited by: 21
Authors
Goil, S [1]
Choudhary, A
Affiliations
[1] Northwestern Univ, Dept Elect & Comp Engn, Evanston, IL 60201 USA
[2] Northwestern Univ, Ctr Parallel & Distributed Comp, Evanston, IL 60201 USA
Funding
US National Science Foundation;
Keywords
Data Cube; parallel computing; high performance; data mining; Attribute Focusing;
DOI
10.1023/A:1009777418785
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
On-Line Analytical Processing (OLAP) techniques are increasingly being used in decision support systems to provide analysis of data. Queries posed on such systems are quite complex and require different views of data. Analytical models need to capture the multidimensionality of the underlying data, a task for which multidimensional databases are well suited. Multidimensional OLAP systems store data in multidimensional arrays on which analytical operations are performed. Knowledge discovery and data mining require complex operations on the underlying data, which can be very expensive in terms of computation time. High performance parallel systems can reduce this analysis time. Precomputed aggregate calculations in a Data Cube can provide efficient query processing for OLAP applications. In this article, we present algorithms for construction of data cubes on distributed-memory parallel computers. Data is loaded from a relational database into a multidimensional array. We present two methods, sort-based and hash-based, for loading the base cube and compare their performance. Data cubes are used to perform consolidation queries used in roll-up operations using dimension hierarchies. Finally, we show how data cubes are used for data mining using Attribute Focusing techniques. We present results for these on the IBM-SP2 parallel machine. Results show that our algorithms and techniques for OLAP and data mining on parallel systems are scalable to a large number of processors, providing a high performance platform for such applications.
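To make the idea of precomputed aggregates concrete, the following is a minimal sequential sketch, not the parallel algorithms of the article. It assumes a SUM aggregate and builds every group-by of a tiny fact table using hash tables. The function name `build_data_cube`, the record layout, and the sample values are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from itertools import combinations


def build_data_cube(records, dimensions, measure):
    """Sum `measure` for every subset of `dimensions` (2^d group-bys).

    Hypothetical helper: a naive single-process illustration that rescans
    the records for each group-by instead of reusing parent aggregates.
    """
    cube = {}
    for k in range(len(dimensions), -1, -1):
        for dims in combinations(dimensions, k):
            cells = defaultdict(float)  # hash table keyed by dimension values
            for rec in records:
                key = tuple(rec[d] for d in dims)
                cells[key] += rec[measure]
            cube[dims] = dict(cells)
    return cube


# Invented sample data, for illustration only.
records = [
    {"product": "pen", "region": "east", "year": 1997, "sales": 10.0},
    {"product": "pen", "region": "west", "year": 1997, "sales": 5.0},
    {"product": "ink", "region": "east", "year": 1998, "sales": 7.0},
]

cube = build_data_cube(records, ("product", "region", "year"), "sales")

# Consolidating over region and year leaves the per-product totals.
print(cube[("product",)])
# The empty group-by holds the grand total.
print(cube[()][()])
```

Once the cube is built, a consolidation query such as total sales per product becomes a dictionary lookup instead of a scan of the base data, which is the benefit the abstract attributes to precomputation.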
Pages: 391-417
Page count: 27