Linear unsupervised hashing for ANN search in Euclidean space
Abstract
Approximate nearest neighbor (ANN) search on large-scale data has attracted considerable attention because large amounts of data are now easily available. Recently, hashing has been widely adopted for similarity search because of its low storage cost and fast query speed. Among hashing methods, supervised ones outperform unsupervised ones when semantic similarity information is available. However, supervised hashing methods need explicit similarity information, which is not available in some scenarios. In addition, they are difficult to optimize and time-consuming to train, which makes them impractical for large-scale data. In this paper, we propose an unsupervised hashing method, Unsupervised Euclidean Hashing (USEH), which learns hash codes that preserve the Euclidean distance relationships between data points. Specifically, USEH first uses Locality-Sensitive Hashing (LSH) to generate pseudo labels; it then adopts a sequential learning strategy to learn the hash functions one bit at a time, which can generate very discriminative codes. Moreover, USEH avoids explicitly computing the similarity matrix by decomposing it into the product of a label matrix and its transpose, which makes the training complexity of USEH linear in the number of training samples when that number is much greater than the feature dimension. Thus, it can work efficiently on large-scale data. We test USEH on two large-scale datasets, SIFT1M and GIST1M. Experimental results show that USEH is comparable to state-of-the-art unsupervised hashing methods.
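The factorization mentioned in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not give the USEH objective, so the sketch assumes sign random projections as the LSH family and uses a generic quadratic form X^T S X with S = Y Y^T as a stand-in for whatever quantity the training step needs. The function names `lsh_pseudo_labels` and `quadratic_form_factored` are hypothetical. The point is only that grouping the product as (X^T Y)(Y^T X) costs O(n d k) and never materializes the n x n similarity matrix.

```python
import numpy as np


def lsh_pseudo_labels(X, n_bits, rng):
    """Assumed LSH variant: sign of random Gaussian projections, in {-1, +1}."""
    W = rng.standard_normal((X.shape[1], n_bits))  # d x k random hyperplanes
    return np.where(X @ W >= 0, 1.0, -1.0)         # n x k pseudo-label matrix


def quadratic_form_factored(X, Y):
    """Compute X^T (Y Y^T) X without forming the n x n matrix Y Y^T."""
    P = X.T @ Y      # d x k, costs O(n * d * k)
    return P @ P.T   # d x d, costs O(d^2 * k)


rng = np.random.default_rng(0)
X = rng.standard_normal((200, 16))   # toy data: n = 200 samples, d = 16 features
X -= X.mean(axis=0)                  # center the features
Y = lsh_pseudo_labels(X, n_bits=8, rng=rng)

M_fast = quadratic_form_factored(X, Y)

# Naive reference that does build the n x n matrix; feasible only for small n.
M_naive = X.T @ (Y @ Y.T) @ X
print(np.allclose(M_fast, M_naive))
```

With n samples, d features, and k bits, the factored form scales linearly in n, which is consistent with the complexity claim in the abstract for the case where n is much greater than d.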
