Learning recognition and segmentation using the Cresceptron

Cited by: 56

Authors:

Weng, J [1]

Ahuja, N [1]

Huang, TS [1]

Affiliations:

[1] UNIV ILLINOIS, BECKMAN INST, URBANA, IL 61801 USA

Source:

INTERNATIONAL JOURNAL OF COMPUTER VISION | 1997 / Vol. 25 / No. 2

Funding:

U.S. National Science Foundation;

Keywords:

visual learning; face recognition; face detection; object recognition; object segmentation; feature selection; feature extraction; shape representation; self-organization; associative memory;

DOI:

10.1023/A:1007967800668

Chinese Library Classification:

TP18 [Artificial intelligence theory];

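Discipline classification codes:

081104 [Pattern Recognition and Intelligent Systems]; 0812 [Computer Science and Technology]; 0835 [Software Engineering]; 1405 [Intelligence Science and Technology];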

Abstract:

This paper presents a framework called Cresceptron for view-based learning, recognition and segmentation. Specifically, it recognizes and segments image patterns that are similar to those learned, using a stochastic distortion model and view-based interpolation, which allows for viewpoints that are moderately different from those used in learning. The learning phase is interactive. The user trains the system using a collection of training images. For each training image, the user manually draws a polygon outlining the region of interest and types in the label of its class. Then, from the directional edges of each of the segmented regions, the Cresceptron uses a hierarchical self-organization scheme to grow a sparsely connected network automatically, adaptively and incrementally during the learning phase. At each level, the system detects new image structures that need to be learned and assigns a new neural plane for each new feature. The network grows by creating new nodes and connections which memorize the new image structures and their context as they are detected. Thus, the structure of the network is a function of the training exemplars. The Cresceptron incorporates both individual learning and class learning; with the former, each training example is treated as a different individual, while with the latter, each example is a sample of a class. In the performance phase, segmentation and recognition are tightly coupled. No separate foreground extraction is necessary: segmentation is achieved by backtracking the response of the network down the hierarchy to the image parts contributing to recognition. Several stochastic shape distortion models are analyzed to show why multilevel matching such as that in the Cresceptron can deal with more general stochastic distortions that a single-level matching scheme cannot.
The system is demonstrated using images from broadcast television and other video segments to learn faces and other objects, and then later to locate and to recognize similar, but possibly distorted, views of the same objects.
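
The growth rule described in the abstract (a new neural plane is assigned whenever a new image structure is detected) can be illustrated with a minimal sketch. It is not the paper's algorithm: the normalized-correlation match, the `threshold` value, and the names `FeatureBank`, `best_match` and `learn` are all assumptions introduced here for illustration, and the sketch covers a single level rather than the full hierarchy.

```python
import numpy as np


class FeatureBank:
    """Hypothetical single-level store of feature templates that grows on novelty."""

    def __init__(self, threshold=0.9):
        self.threshold = threshold  # assumed similarity cutoff, not taken from the paper
        self.templates = []         # one normalized template per stored feature

    @staticmethod
    def _normalize(patch):
        # Flatten the patch and scale it to unit length (all-zero patches stay zero).
        v = np.asarray(patch, dtype=float).ravel()
        norm = np.linalg.norm(v)
        return v / norm if norm > 0 else v

    def best_match(self, patch):
        """Return (index, similarity) of the closest template, or (None, 0.0) if empty."""
        if not self.templates:
            return None, 0.0
        v = self._normalize(patch)
        sims = [float(t @ v) for t in self.templates]
        i = int(np.argmax(sims))
        return i, sims[i]

    def learn(self, patch):
        """Reuse a matching template if one exists; otherwise store a new one."""
        i, sim = self.best_match(patch)
        if i is not None and sim >= self.threshold:
            return i  # structure already known, no growth
        self.templates.append(self._normalize(patch))
        return len(self.templates) - 1  # index of the newly created feature
```

Under these assumptions, calling `learn` on a patch that resembles a stored template returns the existing index, while a dissimilar patch enlarges the bank, so the stored structure depends on the training exemplars presented.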

Citation

Pages: 109-143

Page count: 35
