Learning a Generative Model of Images by Factoring Appearance and Shape

Cited by: 51
Authors
Le Roux, Nicolas [1]
Heess, Nicolas [2]
Shotton, Jamie [1]
Winn, John [1]
Affiliations
[1] Microsoft Res Cambridge, Machine Learning & Percept, Cambridge CB3 0FB, England
[2] Univ Edinburgh, Sch Informat, Inst Adapt & Neural Computat, Neuroinformat & Computat Neurosci Doctoral Traini, Edinburgh EH8 9AB, Midlothian, Scotland
Keywords
SEGMENTATION; LIKELIHOOD; EMERGENCE; ALGORITHM
DOI
10.1162/NECO_a_00086
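Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification code
140502 [Artificial intelligence]
Abstract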
Computer vision has grown tremendously in the past two decades. Despite all efforts, existing attempts at matching parts of the human visual system's extraordinary ability to understand visual scenes lack either scope or power. By combining the advantages of general low-level generative models and powerful layer-based and hierarchical models, this work aims at being a first step toward richer, more flexible models of images. After comparing various types of restricted Boltzmann machines (RBMs) able to model continuous-valued data, we introduce our basic model, the masked RBM, which explicitly models occlusion boundaries in image patches by factoring the appearance of any patch region from its shape. We then propose a generative model of larger images using a field of such RBMs. Finally, we discuss how masked RBMs could be stacked to form a deep model able to generate more complicated structures and suitable for various tasks such as segmentation or object recognition.
Pages: 593-650
Page count: 58