Multiple instance learning with bag dissimilarities

Cited by: 137
Authors
Cheplygina, Veronika [1]
Tax, David M. J. [1]
Loog, Marco [1, 2]
Affiliations
[1] Delft Univ Technol, Pattern Recognit Lab, NL-2628 CD Delft, Netherlands
[2] Univ Copenhagen, Image Grp, DK-2100 Copenhagen, Denmark
Keywords
Multiple instance learning; Dissimilarity representation; Point set distance; Image classification; Drug activity prediction; Text categorization; ACCURACY; DISTANCE
DOI
10.1016/j.patcog.2014.07.022
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification code
140502 [Artificial intelligence]
Abstract
Multiple instance learning (MIL) is concerned with learning from sets (bags) of objects (instances), where the individual instance labels are ambiguous. In this setting, supervised learning cannot be applied directly. Often, specialized MIL methods learn by making additional assumptions about the relationship of the bag labels and instance labels. Such assumptions may fit a particular dataset, but do not generalize to the whole range of MIL problems. Other MIL methods shift the focus of assumptions from the labels to the overall (dis)similarity of bags, and therefore learn from bags directly. We propose to represent each bag by a vector of its dissimilarities to other bags in the training set, and treat these dissimilarities as a feature representation. We show several alternatives to define a dissimilarity between bags and discuss which definitions are more suitable for particular MIL problems. The experimental results show that the proposed approach is computationally inexpensive, yet very competitive with state-of-the-art algorithms on a wide range of MIL datasets. (C) 2014 Elsevier Ltd. All rights reserved.
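The representation described in the abstract can be illustrated with a minimal sketch. It assumes one particular point set distance (the mean of minimum Euclidean distances, which is asymmetric), uses hypothetical toy bags, and classifies with a simple nearest-neighbour rule in the dissimilarity space. These choices and all function names are illustrative assumptions, not the specific configuration evaluated in the paper.

```python
import numpy as np


def bag_dissimilarity(bag_a, bag_b):
    """Mean over instances in bag_a of the distance to the closest instance in bag_b.

    Illustrative choice of point set distance; note that it is asymmetric.
    """
    # Pairwise Euclidean distances, shape (len(bag_a), len(bag_b)).
    diffs = bag_a[:, None, :] - bag_b[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)
    return dists.min(axis=1).mean()


def dissimilarity_representation(bags, prototypes):
    """Represent each bag as a vector of dissimilarities to the prototype bags."""
    return np.array(
        [[bag_dissimilarity(bag, proto) for proto in prototypes] for bag in bags]
    )


# Hypothetical toy data: two bags near the origin (label 0), two near (5, 5) (label 1).
train_bags = [
    np.array([[0.0, 0.0], [1.0, 0.0]]),
    np.array([[0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]),
    np.array([[5.0, 5.0], [6.0, 5.0]]),
    np.array([[5.0, 6.0], [6.0, 6.0], [5.5, 5.5]]),
]
train_labels = np.array([0, 0, 1, 1])

# Training bags serve as prototypes, so each bag becomes a 4-dimensional feature vector.
D_train = dissimilarity_representation(train_bags, train_bags)

# A new bag is mapped into the same space and classified with 1-nearest-neighbour.
# Any standard supervised classifier could be trained on D_train instead.
test_bag = np.array([[5.2, 5.1], [5.8, 5.9]])
d_test = dissimilarity_representation([test_bag], train_bags)
nearest = np.argmin(np.linalg.norm(D_train - d_test, axis=1))
predicted = train_labels[nearest]
```

Because every bag is reduced to a fixed-length vector, bags with different numbers of instances become directly comparable and no assumption about individual instance labels is needed.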
Pages: 264-275
Number of pages: 12