Efficient rule-based attribute-oriented induction for data mining

Times cited: 24
Authors
Cheung, DW [1]
Hwang, HY
Fu, AW
Han, JW
Affiliations
[1] Univ Hong Kong, Dept Comp Sci & Informat Syst, Hong Kong, Hong Kong, Peoples R China
[2] Chinese Univ Hong Kong, Dept Comp Sci & Engn, Hong Kong, Hong Kong, Peoples R China
[3] Simon Fraser Univ, Sch Comp Sci, Burnaby, BC V5A 1S6, Canada
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
data mining; knowledge discovery in databases; rule-based concept generalization; rule-based concept hierarchy; attribute-oriented induction; inductive learning; learning and adaptive systems
DOI
10.1023/A:1008778107391
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Data mining has become an important technique with tremendous potential in many commercial and industrial applications. Attribute-oriented induction is a powerful mining technique and has been successfully implemented in the data mining system DBMiner (Han et al. Proc. 1996 Int'l Conf. on Data Mining and Knowledge Discovery (KDD'96), Portland, Oregon, 1996). However, its induction capability is limited by its reliance on unconditional concept generalization. In this paper, we extend concept generalization to rule-based concept hierarchies, which greatly enhances its induction power. When a previously proposed induction algorithm is applied to the more general rule-based case, an induction anomaly occurs that degrades its efficiency. We have developed an efficient algorithm for induction in the rule-based case that avoids the anomaly. Performance studies have shown that the algorithm is superior to a previously proposed algorithm based on backtracking.
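The distinction the abstract draws between unconditional and rule-based concept generalization can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy relation, the GPA thresholds, and the function names (`generalize_unconditional`, `generalize_rule_based`, `induce`) are hypothetical and are not taken from the paper or from DBMiner. It shows only the basic generalize-and-merge step, not the paper's efficient algorithm or the backtracking algorithm it is compared against.

```python
from collections import Counter

# Hypothetical toy relation of (status, gpa) tuples; not data from the paper.
STUDENTS = [
    ("undergraduate", 3.6),
    ("undergraduate", 3.8),
    ("graduate", 3.6),
    ("graduate", 3.9),
]


def generalize_unconditional(gpa):
    """Unconditional hierarchy: the parent concept depends on the value alone."""
    return "excellent" if gpa >= 3.5 else "good"


def generalize_rule_based(status, gpa):
    """Rule-based hierarchy: the parent concept also depends on another attribute.

    The thresholds are illustrative assumptions.
    """
    threshold = 3.5 if status == "undergraduate" else 3.8
    return "excellent" if gpa >= threshold else "good"


def induce(rows, rule_based):
    """Generalize the gpa attribute, then merge identical tuples with a count."""
    counts = Counter()
    for status, gpa in rows:
        if rule_based:
            level = generalize_rule_based(status, gpa)
        else:
            level = generalize_unconditional(gpa)
        counts[(status, level)] += 1
    return dict(counts)


if __name__ == "__main__":
    print(induce(STUDENTS, rule_based=False))
    print(induce(STUDENTS, rule_based=True))
```

In this toy case the unconditional hierarchy maps every tuple to the same GPA concept, while the rule-based hierarchy separates graduate students by a stricter threshold, so the generalized relation keeps a distinction that the unconditional one loses.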
Pages: 175-200
Page count: 26