Global discretization of continuous attributes as preprocessing for machine learning

Cited by: 199
Authors
Chmielewski, MR [1]
Grzymala-Busse, JW [1]
Affiliation
[1] UNIV KANSAS, DEPT ELECT ENGN & COMP SCI, LAWRENCE, KS 66045
Keywords
discretization; quantization; continuous attributes; learning from examples; rough set theory
DOI
10.1016/S0888-613X(96)00074-6
CLC number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Real-life data are usually represented in databases by real numbers. On the other hand, most inductive learning methods require a small number of attribute values. Thus it is necessary to convert input data sets with continuous attributes into input data sets with discrete attributes. Methods of discretization restricted to single continuous attributes will be called local, while methods that simultaneously convert all continuous attributes will be called global. In this paper, a method of transforming any local discretization method into a global one is presented. A global discretization method based on cluster analysis is presented and compared experimentally with three known local methods, transformed into global. Experiments include tenfold cross-validation and leaving-one-out methods for ten real-life data sets. © 1996 Elsevier Science Inc.
Pages: 319-331
Page count: 13