Automatic linguistic indexing of pictures by a statistical modeling approach

Cited by: 617
Authors
Li, J [1]
Wang, JZ
Affiliations
[1] Penn State Univ, Dept Stat, University Pk, PA 16802 USA
[2] Penn State Univ, Sch Informat Sci & Technol, University Pk, PA 16802 USA
Funding
National Science Foundation (USA)
Keywords
content-based image retrieval; image classification; hidden Markov model; computer vision; statistical learning; wavelets
DOI
10.1109/TPAMI.2003.1227984
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Automatic linguistic indexing of pictures is an important but highly challenging problem for researchers in computer vision and content-based image retrieval. In this paper, we introduce a statistical modeling approach to this problem. Categorized images are used to train a dictionary of hundreds of statistical models each representing a concept. Images of any given concept are regarded as instances of a stochastic process that characterizes the concept. To measure the extent of association between an image and the textual description of a concept, the likelihood of the occurrence of the image based on the characterizing stochastic process is computed. A high likelihood indicates a strong association. In our experimental implementation, we focus on a particular group of stochastic processes, that is, the two-dimensional multiresolution hidden Markov models (2D MHMMs). We implemented and tested our ALIP (Automatic Linguistic Indexing of Pictures) system on a photographic image database of 600 different concepts, each with about 40 training images. The system is evaluated quantitatively using more than 4,600 images outside the training database and compared with a random annotation scheme. Experiments have demonstrated the good accuracy of the system and its high potential in linguistic indexing of photographic images.
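The likelihood-based association described in the abstract can be illustrated with a minimal sketch. It is not the ALIP implementation: a diagonal Gaussian over a made-up two-dimensional feature vector stands in for the 2D MHMM, and the concept names, training values, and function names (fit_gaussian, log_likelihood, rank_concepts) are all assumptions introduced for illustration. Only the overall flow follows the abstract: train one model per concept, score a new image by its likelihood under each model, and treat a higher likelihood as a stronger association.

```python
import math

def fit_gaussian(vectors):
    """Fit a diagonal Gaussian (per-dimension mean and variance).
    Stand-in for training one concept model; the paper uses 2D MHMMs."""
    n = len(vectors)
    dim = len(vectors[0])
    mean = [sum(v[d] for v in vectors) / n for d in range(dim)]
    # Floor the variance so a degenerate dimension cannot cause division by zero.
    var = [max(sum((v[d] - mean[d]) ** 2 for v in vectors) / n, 1e-6)
           for d in range(dim)]
    return mean, var

def log_likelihood(x, model):
    """Log-likelihood of feature vector x under a diagonal Gaussian model."""
    mean, var = model
    return sum(-0.5 * (math.log(2 * math.pi * s) + (xi - m) ** 2 / s)
               for xi, m, s in zip(x, mean, var))

def rank_concepts(x, dictionary):
    """Return concept names ordered from strongest to weakest association."""
    scores = {concept: log_likelihood(x, model)
              for concept, model in dictionary.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical training data: a few feature vectors per concept.
training = {
    "sky": [[0.10, 0.90], [0.20, 0.80], [0.15, 0.85]],
    "grass": [[0.80, 0.20], [0.90, 0.10], [0.85, 0.15]],
}
dictionary = {concept: fit_gaussian(vecs) for concept, vecs in training.items()}

# Score an unseen image (again a hypothetical feature vector).
ranking = rank_concepts([0.12, 0.88], dictionary)
print(ranking)
```

In this toy setting the query vector lies close to the "sky" training vectors, so "sky" is ranked first; annotation words would then be chosen from the top-ranked concepts.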
Pages: 1075-1088
Page count: 14