Learning scale-variant and scale-invariant features for deep image classification

Cited by: 125

Authors:

van Noord, Nanne [1]

Postma, Eric [1]

Affiliations:

[1] Tilburg Univ, Tilburg Ctr Commun & Cognit, Warandelaan 2, NL-5037 AB Tilburg, Netherlands

Source:

PATTERN RECOGNITION | 2017 / Vol. 61

Keywords:

Convolutional Neural Networks; Multi-scale; Artist Attribution; Scale-variant Features; VAN GOGH;

DOI:

10.1016/j.patcog.2016.06.005

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Convolutional Neural Networks (CNNs) require large image corpora to be trained on classification tasks. The variation in image resolutions, sizes of objects and patterns depicted, and image scales hampers CNN training and performance, because the task-relevant information varies over spatial scales. Previous work attempting to deal with such scale variations focused on encouraging scale-invariant CNN representations. However, scale-invariant representations are incomplete representations of images, because images contain scale-variant information as well. This paper addresses the combined development of scale-invariant and scale-variant representations. We propose a multi-scale CNN method to encourage the recognition of both types of features and evaluate it on a challenging image classification task involving task-relevant characteristics at multiple scales. The results show that our multi-scale CNN outperforms a single-scale CNN. This leads to the conclusion that encouraging the combined development of a scale-invariant and scale-variant representation in CNNs is beneficial to image recognition performance. © 2016 The Authors. Published by Elsevier Ltd.

Citation

Pages: 583-592

Number of pages: 10

43 references in total

[21] [Anonymous], 2014, 2 INT C LEARN REPR I

[22] Ciresan D, 2012, PROC CVPR IEEE, P3642, DOI 10.1109/CVPR.2012.6248110

[23] Multi-column deep neural network for traffic sign classification [J]. Ciresan, Dan; Meier, Ueli; Masci, Jonathan; Schmidhuber, Juergen. NEURAL NETWORKS, 2012, 32: 333-338

[24] Histograms of oriented gradients for human detection [J]. Dalal, N; Triggs, B. 2005 IEEE COMPUTER SOCIETY CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, VOL 1, PROCEEDINGS, 2005: 886-893

[25] Gong Y., 2014, ICLR

[26] Quantification of artistic style through sparse coding analysis in the drawings of Pieter Bruegel the Elder [J]. Hughes, James M.; Graham, Daniel J.; Rockmore, Daniel N. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2010, 107 (04): 1279-1283

[27] Jaderberg M., 2015, NIPS 15 P 28 INT C N, DOI 10.48550/ARXIV.1506.02025

[28] Image processing for artist identification [J]. Johnson, C. Richard, Jr.; Hendriks, Ella; Berezhnoy, Igor J.; Brevdo, Eugene; Hughes, Shannon M.; Daubechies, Ingrid; Li, Jia; Postma, Eric; Wang, James Z. IEEE SIGNAL PROCESSING MAGAZINE, 2008, 25 (04): 37-48

[29] Kanazawa A., 2014, ARXIV14125104

[30] Krizhevsky A., 2017, COMMUN ACM, V60, P84, DOI 10.1145/3065386
