Analysis of the consistency of a mixed integer programming-based multi-category constrained discriminant model

Cited by: 15
Authors
Brooks, J. Paul [2]
Lee, Eva K. [1]
Affiliations
[1] Georgia Inst Technol, Sch Ind & Syst Engn, Atlanta, GA 30332 USA
[2] Virginia Commonwealth Univ, Dept Stat Sci & Operat Res, Richmond, VA 23284 USA
Funding
National Science Foundation (USA)
Keywords
Constrained discriminant analysis; Mixed integer program; Multi-category classification; Multi-group classification; Consistency; Reserved judgment; Uniform convergence; Classification
DOI
10.1007/s10479-008-0424-0
Chinese Library Classification (CLC) number
C93 [Management science]; O22 [Operations research]
Discipline classification code
070105; 12; 1201; 1202; 120202
Abstract
Classification is concerned with the development of rules for the allocation of observations to groups, and is a fundamental problem in machine learning. Much of the previous work on classification models investigates two-group discrimination. Multi-category classification is less often considered because generalizations of two-group models tend to produce misclassification rates that are higher than desirable. Indeed, producing "good" two-group classification rules is a challenging task for some applications, and producing good multi-category rules is generally more difficult. Additionally, even when the "optimal" classification rule is known, inter-group misclassification rates may be higher than tolerable for a given classification model. We investigate properties of a mixed-integer-programming-based multi-category classification model that allows limits on inter-group misclassification rates to be specified in advance. The limits are satisfied by means of a reserved judgment region, an artificial category for observations whose attributes do not sufficiently indicate membership in any particular group. The method is shown to be a consistent estimator of a classification rule with misclassification limits, and its performance on simulated and real-world data is demonstrated.
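The reserved judgment idea in the abstract can be illustrated with a small sketch. This is not the authors' mixed integer program: it substitutes a simple posterior threshold, and the names `RESERVED`, `classify_with_reserve`, `misclassification_rates`, and the `threshold` parameter are all hypothetical, chosen only for illustration. It shows how diverting ambiguous observations into a reserved category can lower inter-group misclassification rates.

```python
from typing import Dict, Sequence, Tuple

RESERVED = 0  # hypothetical label for the reserved-judgment category


def classify_with_reserve(posteriors: Sequence[float], threshold: float) -> int:
    """Return a 1-based group label, or RESERVED if no group is convincing.

    Assumption: a plain threshold on the largest posterior stands in for
    the optimized rule described in the abstract.
    """
    best = max(range(len(posteriors)), key=lambda g: posteriors[g])
    if posteriors[best] >= threshold:
        return best + 1
    return RESERVED


def misclassification_rates(
    true_labels: Sequence[int], predicted: Sequence[int], n_groups: int
) -> Dict[Tuple[int, int], float]:
    """Map (h, g) to the fraction of group-h observations assigned to g != h.

    Observations placed in the reserved category are not counted as errors.
    """
    totals = {h: 0 for h in range(1, n_groups + 1)}
    errors: Dict[Tuple[int, int], int] = {}
    for h, g in zip(true_labels, predicted):
        totals[h] += 1
        if g != h and g != RESERVED:
            errors[(h, g)] = errors.get((h, g), 0) + 1
    return {pair: n / totals[pair[0]] for pair, n in errors.items()}


# Toy data: four observations from two groups, with made-up posteriors.
truth = [1, 1, 2, 2]
scores = [[0.9, 0.1], [0.3, 0.7], [0.55, 0.45], [0.2, 0.8]]

no_reserve = [classify_with_reserve(p, 0.5) for p in scores]
with_reserve = [classify_with_reserve(p, 0.6) for p in scores]

rates_no_reserve = misclassification_rates(truth, no_reserve, 2)
rates_with_reserve = misclassification_rates(truth, with_reserve, 2)
```

With the higher threshold, the third observation moves into the reserved category instead of being misassigned to group 1, so the group 2 to group 1 error disappears while the group 1 to group 2 error is unchanged. The model in the paper instead chooses the rule by optimization subject to pre-specified limits on these rates.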
Pages: 147-168
Number of pages: 22