Solving multi-instance problems with classifier ensemble based on constructive clustering

Cited by: 115
Authors
Zhou, Zhi-Hua [1 ]
Zhang, Min-Ling [1 ]
Affiliation
[1] Nanjing Univ, Natl Lab Novel Software Technol, Nanjing 210093, Peoples R China
Keywords
machine learning; multi-instance learning; classification; clustering; ensemble learning; knowledge representation; constructive induction;
DOI
10.1007/s10115-006-0029-3
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In multi-instance learning, the training set is composed of labeled bags, each of which consists of many unlabeled instances; that is, an object is represented by a set of feature vectors instead of a single feature vector. Most current multi-instance learning algorithms work by adapting single-instance learning algorithms to the multi-instance representation, whereas this paper proposes a new solution that goes the opposite way, that is, adapting the multi-instance representation to single-instance learning algorithms. In detail, the instances of all the bags are first collected together and clustered into d groups. Each bag is then re-represented by d binary features, where the value of the ith feature is set to one if the bag has instances falling into the ith group and zero otherwise. Thus, each bag is represented by a single feature vector, so single-instance classifiers can be used to distinguish different classes of bags. By repeating the above process with different values of d, many classifiers can be generated and then combined into an ensemble for prediction. Experiments show that the proposed method works well on standard as well as generalized multi-instance problems.
Pages: 155-170
Page count: 16
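
The abstract above describes the clustering-based re-representation in enough detail to sketch. The following is a minimal illustration in Python, assuming scikit-learn as the toolkit; the library choice, the SVC base learner, the particular d values, and the helper names are assumptions for illustration only, not the paper's actual implementation. Bags are taken to be NumPy arrays of shape (n_instances, n_features) and bag labels to be nonnegative integers.

import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

def bags_to_binary_features(bags, d, random_state=0):
    """Cluster all instances into d groups and re-represent each bag as a
    d-dimensional binary vector: feature i is 1 iff the bag has at least
    one instance assigned to cluster i."""
    all_instances = np.vstack(bags)                      # pool instances from every bag
    km = KMeans(n_clusters=d, n_init=10, random_state=random_state).fit(all_instances)
    features = np.zeros((len(bags), d))
    for b, bag in enumerate(bags):
        features[b, np.unique(km.predict(bag))] = 1.0    # mark the clusters this bag touches
    return features, km

def train_ensemble(bags, labels, d_values=(2, 4, 8, 16)):
    """Train one single-instance classifier per value of d; any
    single-instance learner could be used in place of SVC."""
    ensemble = []
    for d in d_values:
        X, km = bags_to_binary_features(bags, d)
        clf = SVC(kernel="rbf").fit(X, labels)
        ensemble.append((km, clf))
    return ensemble

def predict(ensemble, bags):
    """Majority vote over the ensemble; labels assumed to be nonnegative ints."""
    votes = []
    for km, clf in ensemble:
        X = np.zeros((len(bags), km.n_clusters))
        for b, bag in enumerate(bags):
            X[b, np.unique(km.predict(bag))] = 1.0
        votes.append(clf.predict(X))
    votes = np.stack(votes)                              # shape: (n_classifiers, n_bags)
    return np.array([np.bincount(col.astype(int)).argmax() for col in votes.T])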