文摘
Although distance between binary codes can be computed fast in hamming space, linear search is not practical for large scale dataset. Therefore attention has been paid to the efficiency of performing approximate nearest neighbor search, in which Hierarchical Clustering Trees (HCT) is the state-of-the-art method. However, HCT builds index with the whole binary codes, which degrades search performance. In this paper, we first propose an algorithm to compress binary codes by extracting distinctive bits according to the standard deviation of each bit. Then, a new index is proposed using com-pressed binary codes based on hierarchical decomposition of binary spaces. Experiments conducted on reference datasets and a dataset of one billion binary codes demonstrate the effectiveness and efficiency of our method.