The maximum box problem and its application to data analysis

被引:47
作者
Eckstein, J
Hammer, PL
Liu, Y
Nediak, M
Simeone, B
机构
[1] Rutgers State Univ, Sch Business, Piscataway, NJ 08854 USA
[2] Rutgers State Univ, RUTCOR, Piscataway, NJ 08854 USA
[3] Univ Roma La Sapienza, Dept Stat, I-00185 Rome, Italy
基金
美国国家科学基金会;
关键词
discrete optimization; branch and bound; data analysis; patterns;
D O I
10.1023/A:1020546910706
中图分类号
C93 [管理学]; O22 [运筹学];
学科分类号
070105 ; 12 ; 1201 ; 1202 ; 120202 ;
摘要
Given two finite sets of points X+ and X- in R-n, the maximum box problem consists of finding an interval (" box") B = {x:l less than or equal to x less than or equal to u} such that B boolean AND X- =0, and the cardinality of B boolean AND X- is maximized. A simple generalization can be obtained by instead maximizing a weighted sum of the elements of B boolean AND X-. While polynomial for any fixed n, the maximum box problem is NP-hard in general. We construct an efficient branch-and-bound algorithm for this problem and apply it to a standard problem in data analysis. We test this method on nine data sets, seven of which are drawn from the UCI standard machine learning repository.
引用
收藏
页码:285 / 298
页数:14
相关论文
共 10 条
[1]  
BEALE EML, 1977, DISCRETE OPTIMIZATIO, V2
[2]  
Bealie EML., 1979, ANN DISCRETE MATH, V5, P201, DOI [10.1016/S0167-5060(08)70351-0, DOI 10.1016/S0167-5060(08)70351-0]
[3]  
BIXBY RE, 1999, MIP THEORY PRACTICE, P19
[4]  
Blake C.L., 1998, UCI repository of machine learning databases
[5]   Logical analysis of numerical data [J].
Boros, E ;
Hammer, PL ;
Ibaraki, T ;
Kogan, A .
MATHEMATICAL PROGRAMMING, 1997, 79 (1-3) :163-190
[6]   An implementation of logical analysis of data [J].
Boros, E ;
Hammer, PL ;
Ibaraki, T ;
Kogan, A ;
Mayoraz, E ;
Muchnik, I .
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2000, 12 (02) :292-306
[7]  
BOROS E, 2000, 6 INT S ART INT MATH
[8]  
Crama Y., 1988, Annals of Operations Research, V16, P299, DOI 10.1007/BF02283750
[9]  
HAMMER PL, 2001, 72001 RUTCOR
[10]   A comparison of prediction accuracy, complexity, and training time of thirty-three old and new classification algorithms [J].
Lim, TS ;
Loh, WY ;
Shih, YS .
MACHINE LEARNING, 2000, 40 (03) :203-228