One of the fast approximate similarity search techniques is a binary hashing method that transforms a real-valued vector into a binary code. The similarity between two binary codes is measured by their Hamming distance. In this method, a hash table is often used when undertaking a constant-time similarity search. The number of accesses to the hash table, however, increases when the number of bits lengthens. In this paper, we consider a method that does not access data with a long Hamming radius by using multiple binary codes. Further, we attempt to integrate the proposed approach and the existing multi-index hashing (MIH) method to accelerate the performance of the similarity search in the Hamming space. Then, we propose a learning method of the binary hash functions for multiple binary codes. We conduct an experiment on similarity search utilizing a dataset of up to 50 million items and show that our proposed method achieves a faster similarity search than that possible with the conventional linear scan and hash table search.
Shinichi SHIRAKAWA
Aoyama Gakuin University
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Copy
Shinichi SHIRAKAWA, "Multiple Binary Codes for Fast Approximate Similarity Search" in IEICE TRANSACTIONS on Information,
vol. E98-D, no. 3, pp. 671-680, March 2015, doi: 10.1587/transinf.2014EDP7212.
Abstract: One of the fast approximate similarity search techniques is a binary hashing method that transforms a real-valued vector into a binary code. The similarity between two binary codes is measured by their Hamming distance. In this method, a hash table is often used when undertaking a constant-time similarity search. The number of accesses to the hash table, however, increases when the number of bits lengthens. In this paper, we consider a method that does not access data with a long Hamming radius by using multiple binary codes. Further, we attempt to integrate the proposed approach and the existing multi-index hashing (MIH) method to accelerate the performance of the similarity search in the Hamming space. Then, we propose a learning method of the binary hash functions for multiple binary codes. We conduct an experiment on similarity search utilizing a dataset of up to 50 million items and show that our proposed method achieves a faster similarity search than that possible with the conventional linear scan and hash table search.
URL: https://globals.ieice.org/en_transactions/information/10.1587/transinf.2014EDP7212/_p
Copy
@ARTICLE{e98-d_3_671,
author={Shinichi SHIRAKAWA, },
journal={IEICE TRANSACTIONS on Information},
title={Multiple Binary Codes for Fast Approximate Similarity Search},
year={2015},
volume={E98-D},
number={3},
pages={671-680},
abstract={One of the fast approximate similarity search techniques is a binary hashing method that transforms a real-valued vector into a binary code. The similarity between two binary codes is measured by their Hamming distance. In this method, a hash table is often used when undertaking a constant-time similarity search. The number of accesses to the hash table, however, increases when the number of bits lengthens. In this paper, we consider a method that does not access data with a long Hamming radius by using multiple binary codes. Further, we attempt to integrate the proposed approach and the existing multi-index hashing (MIH) method to accelerate the performance of the similarity search in the Hamming space. Then, we propose a learning method of the binary hash functions for multiple binary codes. We conduct an experiment on similarity search utilizing a dataset of up to 50 million items and show that our proposed method achieves a faster similarity search than that possible with the conventional linear scan and hash table search.},
keywords={},
doi={10.1587/transinf.2014EDP7212},
ISSN={1745-1361},
month={March},}
Copy
TY - JOUR
TI - Multiple Binary Codes for Fast Approximate Similarity Search
T2 - IEICE TRANSACTIONS on Information
SP - 671
EP - 680
AU - Shinichi SHIRAKAWA
PY - 2015
DO - 10.1587/transinf.2014EDP7212
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E98-D
IS - 3
JA - IEICE TRANSACTIONS on Information
Y1 - March 2015
AB - One of the fast approximate similarity search techniques is a binary hashing method that transforms a real-valued vector into a binary code. The similarity between two binary codes is measured by their Hamming distance. In this method, a hash table is often used when undertaking a constant-time similarity search. The number of accesses to the hash table, however, increases when the number of bits lengthens. In this paper, we consider a method that does not access data with a long Hamming radius by using multiple binary codes. Further, we attempt to integrate the proposed approach and the existing multi-index hashing (MIH) method to accelerate the performance of the similarity search in the Hamming space. Then, we propose a learning method of the binary hash functions for multiple binary codes. We conduct an experiment on similarity search utilizing a dataset of up to 50 million items and show that our proposed method achieves a faster similarity search than that possible with the conventional linear scan and hash table search.
ER -