Logical analysis of numerical data

被引:144
作者
Boros, E
Hammer, PL
Ibaraki, T
Kogan, A
机构
[1] RUTGERS STATE UNIV, RUTCOR, NEW BRUNSWICK, NJ 08903 USA
[2] KYOTO UNIV, GRAD SCH ENGN, DEPT APPL MATH & PHYS, KYOTO 606, JAPAN
[3] RUTGERS STATE UNIV, FAC MANAGEMENT, DEPT ACCOUNTING & INFORMAT SYST, NEWARK, NJ 07102 USA
关键词
data analysis; Boolean functions; machine learning; binarization; set covering; monotonicity; thresholdness; computational complexity;
D O I
10.1007/BF02614316
中图分类号
TP31 [计算机软件];
学科分类号
081202 ; 0835 ;
摘要
''Logical analysis of data'' (LAD) is a methodology developed since the late eighties, aimed at discovering hidden structural information in data sets. LAD was originally developed for analyzing binary data by using the theory of partially defined Boolean functions, An extension of LAD for the analysis of numerical data sets is achieved through the process of ''binarization'' consisting in the replacement of each numerical variable by binary ''indicator'' variables, each showing whether the value of the original variable is above or below a certain level. Binarization was successfully applied to the analysis of a variety of real life data sets, This paper develops the theoretical foundations of the binarization process studying the combinatorial optimization problems related to the minimization of the number of binary variables, To provide an algorithmic framework for the practical solution of such problems, we construct compact linear integer programming formulations of them. We develop polynomial time algorithms for some of these minimization problems, and prove NP-hardness of others, (C) 1997 The Mathematical Programming Society, Inc. Published by Elsevier Science B.V.
引用
收藏
页码:163 / 190
页数:28
相关论文
共 24 条
[1]  
Angluin D., 1988, Machine Learning, V2, P319, DOI 10.1007/BF00116828
[2]  
[Anonymous], 1979, Computers and Intractablity: A Guide to the Theoryof NP-Completeness
[3]  
BALAS E, 1980, MATH PROGRAM STUD, V12, P37, DOI 10.1007/BFb0120886
[4]  
Bellman R., 1960, Introduction to matrix analysis
[5]   COMPLEXITY OF IDENTIFICATION AND DUALIZATION OF POSITIVE BOOLEAN FUNCTIONS [J].
BIOCH, JC ;
IBARAKI, T .
INFORMATION AND COMPUTATION, 1995, 123 (01) :50-63
[6]   DECOMPOSABILITY OF PARTIALLY DEFINED BOOLEAN FUNCTIONS [J].
BOROS, E ;
GURVICH, V ;
HAMMER, PL ;
IBARAKI, T ;
KOGAN, A .
DISCRETE APPLIED MATHEMATICS, 1995, 62 (1-3) :51-75
[7]  
BOROS E, 1995, 1495 RUTCOR RRR RUTG
[8]  
BOROS E, 1997, 0497 RUTCOR RRR RUTG
[9]  
BOROS E, 1996, 0696 RUTCOR RRR RUTG
[10]  
BOROS E, 1996, 2296 RUTCOR RRR RUTG