Generalized Hamming distance

Cited by: 100
Authors
Bookstein, A [1]
Kulyukin, VA [2]
Raita, T [3]
Affiliations
[1] Univ Chicago, Ctr Informat & Language Studies, Chicago, IL 60637 USA
[2] Utah State Univ, Dept Comp Sci, Logan, UT 84322 USA
[3] Univ Turku, Dept Comp Sci, FIN-20520 Turku, Finland
Source
INFORMATION RETRIEVAL | 2002 / Vol. 5 / No. 04
Keywords
information retrieval; Hamming distance; metrics; computer vision; image retrieval; object recognition; robot vision
DOI
10.1023/A:1020499411651
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Many problems in information retrieval and related fields depend on a reliable measure of the distance or similarity between objects that, most frequently, are represented as vectors. This paper considers vectors of bits. Such data structures implement entities as diverse as bitmaps that indicate the occurrences of terms and bitstrings indicating the presence of edges in images. For such applications, a popular distance measure is the Hamming distance. The value of the Hamming distance for information retrieval applications is limited by the fact that it counts only exact matches, whereas in information retrieval, corresponding bits that are close by can still be considered to be almost identical. We define a "Generalized Hamming distance" that extends the Hamming concept to give partial credit for near misses, and suggest a dynamic programming algorithm that permits it to be computed efficiently. We envision many uses for such a measure. In this paper we define and prove some basic properties of the "Generalized Hamming distance", and illustrate its use in the area of object recognition. We evaluate our implementation in a series of experiments, using autonomous robots to test the measure's effectiveness in relating similar bitstrings.
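To make the contrast in the abstract concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a cost model in which a 1-bit can be inserted, deleted, or shifted, with the shift cost proportional to the distance moved; the parameter names `c_ins`, `c_del`, and `c_shift` and their default values are hypothetical, and the recurrence is one plausible dynamic program over the positions of the 1-bits.

```python
def hamming_distance(a: str, b: str) -> int:
    """Classic Hamming distance: count positions where the bits differ."""
    if len(a) != len(b):
        raise ValueError("bitstrings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def generalized_hamming_distance(a: str, b: str,
                                 c_ins: float = 1.0,
                                 c_del: float = 1.0,
                                 c_shift: float = 0.5) -> float:
    """Sketch of a distance that gives partial credit for near misses.

    Assumed cost model (not taken from the record above):
      - deleting a 1-bit from `a` costs c_del,
      - inserting a 1-bit into `a` costs c_ins,
      - shifting a 1-bit by k positions costs c_shift * k.
    """
    if len(a) != len(b):
        raise ValueError("bitstrings must have equal length")

    # Work only with the positions of the 1-bits in each string.
    s = [i for i, bit in enumerate(a) if bit == "1"]
    t = [i for i, bit in enumerate(b) if bit == "1"]
    m, n = len(s), len(t)

    # g[i][j]: cheapest way to turn the first i ones of `a`
    # into the first j ones of `b`.
    g = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        g[i][0] = i * c_del
    for j in range(1, n + 1):
        g[0][j] = j * c_ins

    for i in range(1, m + 1):
        for j in range(1, n + 1):
            g[i][j] = min(
                g[i - 1][j] + c_del,                       # drop a one from a
                g[i][j - 1] + c_ins,                       # add a one to match b
                g[i - 1][j - 1]
                + c_shift * abs(s[i - 1] - t[j - 1]),      # shift a one
            )
    return g[m][n]
```

Under these assumed costs, `"0100"` and `"0010"` are at Hamming distance 2 but at generalized distance 0.5, because a single one-position shift is cheaper than a deletion plus an insertion. When the two 1-bits are far apart, the shift becomes more expensive than delete plus insert, so the generalized value is capped at `c_del + c_ins`.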
Pages: 353-375
Number of pages: 23