Spanned patterns for the logical analysis of data

被引:45
作者
Alexe, G [1 ]
Hammer, PL [1 ]
机构
[1] Rutgers State Univ, RUTCOR Rutgers, Piscataway, NJ 08854 USA
关键词
pattern; consensus; logical analysis of data; classification;
D O I
10.1016/j.dam.2005.03.031
中图分类号
O29 [应用数学];
学科分类号
070104 ;
摘要
In a finite dataset consisting of positive and negative observations represented as real valued n-vectors, a positive (negative) pattern is an interval in R-n with the property that it contains sufficiently many positive (negative) observations, and sufficiently few negative (positive) ones. A pattern is spanned if it does not include properly any other interval containing the same set of observations. Although large collections of spanned patterns can provide highly accurate classification models within the framework of the Logical Analysis of Data, no efficient method for their generation is currently known. We propose in this paper, an incrementally polynomial time algorithm for the generation of all spanned patterns in a dataset, which runs in linear time in the output; the algorithm resembles closely the Blake and Quine consensus method for finding the prime implicants of Boolean functions. The efficiency of the proposed algorithm is tested on various publicly available datasets. In the last part of the paper, we present the results of a series of computational experiments which show the high degree of robustness of spanned patterns. (c) 2005 Published by Elsevier B.V.
引用
收藏
页码:1039 / 1049
页数:11
相关论文
共 16 条
[1]   Consensus algorithms for the generation of all maximal bicliques [J].
Alexe, G ;
Alexe, S ;
Crama, Y ;
Foldes, S ;
Hammer, PL ;
Simeone, B .
DISCRETE APPLIED MATHEMATICS, 2004, 145 (01) :11-21
[2]  
ALEXE G, IN PRESS DISCRETE AP
[3]   Accelerated algorithm for pattern detection in logical analysis of data [J].
Alexe, S ;
Hammer, PL .
DISCRETE APPLIED MATHEMATICS, 2006, 154 (07) :1050-1063
[4]   Coronary risk prediction by logical analysis of data [J].
Alexe, S ;
Blackstone, E ;
Hammer, PL ;
Ishwaran, H ;
Lauer, MS ;
Snader, CEP .
ANNALS OF OPERATIONS RESEARCH, 2003, 119 (1-4) :15-42
[5]  
Blake A., 1937, Ph.D. dissertation
[6]   Logical analysis of numerical data [J].
Boros, E ;
Hammer, PL ;
Ibaraki, T ;
Kogan, A .
MATHEMATICAL PROGRAMMING, 1997, 79 (1-3) :163-190
[7]   An implementation of logical analysis of data [J].
Boros, E ;
Hammer, PL ;
Ibaraki, T ;
Kogan, A ;
Mayoraz, E ;
Muchnik, I .
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2000, 12 (02) :292-306
[8]  
Crama Y., 1988, Annals of Operations Research, V16, P299, DOI 10.1007/BF02283750
[9]   The maximum box problem and its application to data analysis [J].
Eckstein, J ;
Hammer, PL ;
Liu, Y ;
Nediak, M ;
Simeone, B .
COMPUTATIONAL OPTIMIZATION AND APPLICATIONS, 2002, 23 (03) :285-298
[10]   Disjunctive and conjunctive normal forms of pseudo-Boolean functions [J].
Foldes, S ;
Hammer, PL .
DISCRETE APPLIED MATHEMATICS, 2000, 107 (1-3) :1-26