On image classification: City images vs. landscapes

Times cited: 257
Authors
Vailaya, A
Jain, A [1]
Zhang, HJ
Affiliations
[1] Michigan State Univ, Dept Comp Sci, E Lansing, MI 48824 USA
[2] Broadband Informat Syst Lab, HP Labs, Palo Alto, CA 94304 USA
Keywords
image classification; clustering; salient features; similarity; image database; content-based retrieval
DOI
10.1016/S0031-3203(98)00079-X
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Grouping images into semantically meaningful categories using low-level visual features is a challenging and important problem in content-based image retrieval. Based on these groupings, effective indices can be built for an image database. In this paper, we show how a specific high-level classification problem (city images vs. landscapes) can be solved from relatively simple low-level features geared for the particular classes. We have developed a procedure to qualitatively measure the saliency of a feature towards a classification problem based on the plot of the intra-class and inter-class distance distributions. We use this approach to determine the discriminative power of the following features: color histogram, color coherence vector, DCT coefficient, edge direction histogram, and edge direction coherence vector. We determine that the edge direction-based features have the most discriminative power for the classification problem of interest here. A weighted k-NN classifier is used for the classification, which results in an accuracy of 93.9% when evaluated on an image database of 2716 images using the leave-one-out method. This approach has been extended to further classify 528 landscape images into forest, mountain, and sunset/sunrise classes. First, the input images are classified as sunset/sunrise images vs. forest and mountain images (94.5% accuracy), and then the forest and mountain images are classified as forest images or mountain images (91.7% accuracy). We are currently identifying further semantic classes to assign to images as well as extracting low-level features which are salient for these classes. Our final goal is to combine multiple 2-class classifiers into a single hierarchical classifier. (C) 1998 Pattern Recognition Society. Published by Elsevier Science Ltd. All rights reserved.
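The evaluation protocol named in the abstract (a weighted k-NN classifier scored with leave-one-out) can be illustrated with a minimal sketch. The abstract does not state the weighting scheme, the value of k, or the distance measure, so the inverse-distance weights, k=3, Euclidean distance, and the two-dimensional toy feature vectors below are all assumptions made for illustration, not the authors' settings or data.

```python
import math
from collections import defaultdict


def weighted_knn_predict(train, query, k=3):
    """Predict a label for `query` from `train`, a list of (vector, label) pairs.

    Assumption: each of the k nearest neighbours votes with weight 1/distance.
    """
    # Euclidean distance from the query to every training vector, nearest first.
    neighbours = sorted((math.dist(vec, query), label) for vec, label in train)
    votes = defaultdict(float)
    for dist, label in neighbours[:k]:
        votes[label] += 1.0 / (dist + 1e-9)  # small constant avoids division by zero
    return max(votes, key=votes.get)


def leave_one_out_accuracy(data, k=3):
    """Hold out each sample in turn, classify it from the rest, return the hit rate."""
    correct = 0
    for i, (vec, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        if weighted_knn_predict(rest, vec, k) == label:
            correct += 1
    return correct / len(data)


# Hypothetical toy feature vectors standing in for edge-direction features.
toy = [
    ((0.90, 0.10), "city"), ((0.80, 0.20), "city"), ((0.85, 0.15), "city"),
    ((0.10, 0.90), "landscape"), ((0.20, 0.80), "landscape"), ((0.15, 0.85), "landscape"),
]

print(leave_one_out_accuracy(toy, k=3))
```

Leave-one-out uses every image once as the test sample, which suits a nearest-neighbour classifier because there is no separate training step to repeat for each held-out sample.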
Pages: 1921-1935
Number of pages: 15
Related papers
21 in total
[1] [Anonymous], 1995, PROC ICJAI, DOI 10.1145/217279.215068
[2] [Anonymous], 1998, IEEE INT WORKSH CONT
[3] BOLLE RM, 1996, IN PRESS IBM J RES D
[4] Faloutsos C., 1994, Journal of Intelligent Information Systems: Integrating Artificial Intelligence and Database Technologies, V3, P231, DOI 10.1007/BF00962238
[5] FORSYTH DA, 1996, INT WORKSH OBJ REC C
[6] GORKANI MM, 1994, 12 INT C PATT REC JE, P459
[7] Virage video engine [J]. Hampapur, A; Gupta, A; Horowitz, B; Shu, CF; Fuller, C; Bach, J; Gorkani, M; Jain, R. STORAGE AND RETRIEVAL FOR IMAGE AND VIDEO DATABASES V, 1997, 3022: 188-198
[8] VISUAL-PATTERN RECOGNITION BY MOMENT INVARIANTS [J]. HU, M. IRE TRANSACTIONS ON INFORMATION THEORY, 1962, 8 (02): 179-&
[9] Image retrieval using color and shape [J]. Jain, AK; Vailaya, A. PATTERN RECOGNITION, 1996, 29 (08): 1233-1244
[10] Jain K, 1988, Algorithms for clustering data