Multiple Binary Codes for Fast Approximate Similarity Search

Shinichi SHIRAKAWA

doi:10.1587/transinf.2014EDP7212

Multiple Binary Codes for Fast Approximate Similarity Search

Shinichi SHIRAKAWA

Full Text Views

0

Share
Cite this

Summary :

One of the fast approximate similarity search techniques is a binary hashing method that transforms a real-valued vector into a binary code. The similarity between two binary codes is measured by their Hamming distance. In this method, a hash table is often used when undertaking a constant-time similarity search. The number of accesses to the hash table, however, increases when the number of bits lengthens. In this paper, we consider a method that does not access data with a long Hamming radius by using multiple binary codes. Further, we attempt to integrate the proposed approach and the existing multi-index hashing (MIH) method to accelerate the performance of the similarity search in the Hamming space. Then, we propose a learning method of the binary hash functions for multiple binary codes. We conduct an experiment on similarity search utilizing a dataset of up to 50 million items and show that our proposed method achieves a faster similarity search than that possible with the conventional linear scan and hash table search.

Publication: IEICE TRANSACTIONS on Information Vol.E98-D No.3 pp.671-680

Publication Date: 2015/03/01

Publicized: 2014/12/11

Online ISSN: 1745-1361

DOI: 10.1587/transinf.2014EDP7212

Type of Manuscript: PAPER

Category: Pattern Recognition

Authors

Shinichi SHIRAKAWA
Aoyama Gakuin University

Keyword

Cite this

Copy

Shinichi SHIRAKAWA, "Multiple Binary Codes for Fast Approximate Similarity Search" in IEICE TRANSACTIONS on Information, vol. E98-D, no. 3, pp. 671-680, March 2015, doi: 10.1587/transinf.2014EDP7212.
Abstract: One of the fast approximate similarity search techniques is a binary hashing method that transforms a real-valued vector into a binary code. The similarity between two binary codes is measured by their Hamming distance. In this method, a hash table is often used when undertaking a constant-time similarity search. The number of accesses to the hash table, however, increases when the number of bits lengthens. In this paper, we consider a method that does not access data with a long Hamming radius by using multiple binary codes. Further, we attempt to integrate the proposed approach and the existing multi-index hashing (MIH) method to accelerate the performance of the similarity search in the Hamming space. Then, we propose a learning method of the binary hash functions for multiple binary codes. We conduct an experiment on similarity search utilizing a dataset of up to 50 million items and show that our proposed method achieves a faster similarity search than that possible with the conventional linear scan and hash table search.
URL: https://globals.ieice.org/en_transactions/information/10.1587/transinf.2014EDP7212/_p

Copy

@ARTICLE{e98-d_3_671,
author={Shinichi SHIRAKAWA, },
journal={IEICE TRANSACTIONS on Information},
title={Multiple Binary Codes for Fast Approximate Similarity Search},
year={2015},
volume={E98-D},
number={3},
pages={671-680},
abstract={One of the fast approximate similarity search techniques is a binary hashing method that transforms a real-valued vector into a binary code. The similarity between two binary codes is measured by their Hamming distance. In this method, a hash table is often used when undertaking a constant-time similarity search. The number of accesses to the hash table, however, increases when the number of bits lengthens. In this paper, we consider a method that does not access data with a long Hamming radius by using multiple binary codes. Further, we attempt to integrate the proposed approach and the existing multi-index hashing (MIH) method to accelerate the performance of the similarity search in the Hamming space. Then, we propose a learning method of the binary hash functions for multiple binary codes. We conduct an experiment on similarity search utilizing a dataset of up to 50 million items and show that our proposed method achieves a faster similarity search than that possible with the conventional linear scan and hash table search.},
keywords={},
doi={10.1587/transinf.2014EDP7212},
ISSN={1745-1361},
month={March},}

Copy

TY - JOUR
TI - Multiple Binary Codes for Fast Approximate Similarity Search
T2 - IEICE TRANSACTIONS on Information
SP - 671
EP - 680
AU - Shinichi SHIRAKAWA
PY - 2015
DO - 10.1587/transinf.2014EDP7212
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E98-D
IS - 3
JA - IEICE TRANSACTIONS on Information
Y1 - March 2015
AB - One of the fast approximate similarity search techniques is a binary hashing method that transforms a real-valued vector into a binary code. The similarity between two binary codes is measured by their Hamming distance. In this method, a hash table is often used when undertaking a constant-time similarity search. The number of accesses to the hash table, however, increases when the number of bits lengthens. In this paper, we consider a method that does not access data with a long Hamming radius by using multiple binary codes. Further, we attempt to integrate the proposed approach and the existing multi-index hashing (MIH) method to accelerate the performance of the similarity search in the Hamming space. Then, we propose a learning method of the binary hash functions for multiple binary codes. We conduct an experiment on similarity search utilizing a dataset of up to 50 million items and show that our proposed method achieves a faster similarity search than that possible with the conventional linear scan and hash table search.
ER -