Invariant Scattering Convolution Networks

Cited by: 1100
Authors
Bruna, Joan [1]
Mallat, Stephane [2]
Affiliations
[1] NYU, Courant Inst, New York, NY 10003 USA
[2] Ecole Normale Super, F-75005 Paris, France
Keywords
Classification; convolution networks; deformations; invariants; wavelets; MODELS
DOI
10.1109/TPAMI.2012.230
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A wavelet scattering network computes a translation invariant image representation which is stable to deformations and preserves high-frequency information for classification. It cascades wavelet transform convolutions with nonlinear modulus and averaging operators. The first network layer outputs SIFT-type descriptors, whereas the next layers provide complementary invariant information that improves classification. The mathematical analysis of wavelet scattering networks explains important properties of deep convolution networks for classification. A scattering representation of stationary processes incorporates higher order moments and can thus discriminate textures having the same Fourier power spectrum. State-of-the-art classification results are obtained for handwritten digits and texture discrimination, with a Gaussian kernel SVM and a generative PCA classifier.
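The cascade described in the abstract (wavelet convolution, modulus, then averaging) can be illustrated with a minimal one-dimensional sketch. This is not the authors' implementation: the Morlet-like Gaussian filters, the dyadic center frequencies, the FFT-based circular convolution, the absence of subsampling, and the names `morlet_like`, `filter_bank`, `circ_conv`, and `scattering` are all assumptions chosen for illustration.

```python
import numpy as np

def morlet_like(n, center, width):
    """Frequency response of a Gaussian band-pass filter with its DC value
    cancelled, so the filter has zero mean (an illustrative choice)."""
    freqs = np.fft.fftfreq(n)  # cycles per sample, in [-0.5, 0.5)
    band = np.exp(-0.5 * ((freqs - center) / width) ** 2)
    low = np.exp(-0.5 * (freqs / width) ** 2)
    dc = np.exp(-0.5 * (center / width) ** 2)
    return band - dc * low

def filter_bank(n, num_scales):
    """Dyadic band-pass filters psi_j (larger j = lower frequency) and a
    Gaussian low-pass filter phi at the coarsest scale."""
    psis = [morlet_like(n, 0.25 / 2**j, 0.1 / 2**j) for j in range(num_scales)]
    freqs = np.fft.fftfreq(n)
    phi = np.exp(-0.5 * (freqs / (0.1 / 2**num_scales)) ** 2)
    return psis, phi

def circ_conv(x, filt_hat):
    """Circular convolution of x with a filter given by its frequency response."""
    return np.fft.ifft(np.fft.fft(x) * filt_hat)

def scattering(x, num_scales=4):
    """Scattering coefficients of orders 0, 1 and 2, keyed by scale path."""
    psis, phi = filter_bank(len(x), num_scales)

    def average(u):
        return np.real(circ_conv(u, phi))

    coeffs = {(): average(x)}                      # order 0: x * phi
    for j1 in range(num_scales):
        u1 = np.abs(circ_conv(x, psis[j1]))        # |x * psi_j1|
        coeffs[(j1,)] = average(u1)                # order 1
        for j2 in range(j1 + 1, num_scales):       # only coarser second scales
            u2 = np.abs(circ_conv(u1, psis[j2]))   # ||x * psi_j1| * psi_j2|
            coeffs[(j1, j2)] = average(u2)         # order 2
    return coeffs
```

In this sketch the first-order outputs play the role of the locally averaged wavelet amplitudes that the abstract compares to SIFT-type descriptors, and the second-order outputs recover part of the high-frequency information removed by the averaging. Because no subsampling is applied, a circular shift of the input shifts every output by the same amount; invariance to small translations comes only from the low-pass filter `phi`.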
Pages: 1872-1886
Page count: 15