CLIP4: Hybrid inductive machine learning algorithm that generates inequality rules

Cited by: 26
Authors
Cios, KJ
Kurgan, LA
Affiliations
[1] Univ Colorado, Dept Comp Sci & Engn, Denver, CO 80217 USA
[2] Univ Alberta, Dept Elect & Comp Engn, Edmonton, AB T6G 2V4, Canada
[3] Univ Colorado, Dept Comp Sci, Boulder, CO 80309 USA
[4] Univ Colorado, Hlth Sci Ctr, Denver, CO 80262 USA
[5] 4cData, Golden, CO 80401 USA
Keywords
CLIP4; inductive machine learning; inequality rules; feature selection; feature ranking; selector ranking
DOI
10.1016/j.ins.2003.03.015
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The paper describes a hybrid inductive machine learning algorithm called CLIP4. The algorithm first partitions data into subsets using a tree structure and then generates production rules only from subsets stored at the leaf nodes. The unique feature of the algorithm is the generation of rules that involve inequalities. The algorithm works with data that have a large number of examples and attributes, can cope with noisy data, and can use numerical, nominal, continuous, and missing-value attributes. The algorithm's flexibility and efficiency are shown on several well-known benchmarking data sets, and the results are compared with other machine learning algorithms. The benchmarking results in each instance show CLIP4's accuracy, CPU time, and rule complexity. CLIP4 has built-in features such as tree pruning, methods for partitioning the data (for data with a large number of examples and attributes, and for data containing noise), a data-independent mechanism for dealing with missing values, genetic operators to improve accuracy on small data, and discretization schemes. CLIP4 generates a model of the data that consists of well-generalized rules, and ranks attributes and selectors that can be used for feature selection. (C) 2003 Elsevier Inc. All rights reserved.
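To make the notion of a rule "that involves inequalities" concrete, the following minimal Python sketch shows one way such a rule could be represented and checked against examples. It is not CLIP4 itself and does not reproduce the tree partitioning or rule generation described in the abstract; the `Selector` type, the `rule_covers` function, and the weather-style attribute names and values are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: a selector (attribute, excluded_value)
# stands for the inequality test "attribute != excluded_value".
Selector = Tuple[str, str]


def rule_covers(rule: List[Selector], example: Dict[str, str]) -> bool:
    """Return True when every inequality selector in the rule holds for the example."""
    return all(example[attr] != value for attr, value in rule)


# Hypothetical rule: IF outlook != "rainy" AND wind != "strong" THEN class = "play"
rule = [("outlook", "rainy"), ("wind", "strong")]

examples = [
    {"outlook": "sunny", "wind": "weak"},       # satisfies both inequalities
    {"outlook": "rainy", "wind": "weak"},       # violates outlook != "rainy"
    {"outlook": "overcast", "wind": "strong"},  # violates wind != "strong"
]

covered = [rule_covers(rule, ex) for ex in examples]
print(covered)  # [True, False, False]
```

In this representation a rule with few selectors excludes few attribute values and so covers many examples, which is one informal way to read "well-generalized rules". How the paper handles missing values is not modeled in this sketch.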
Pages: 37-83
Page count: 47