Effective optimizations of cluster-based nearest neighbor search in high-dimensional space
详细信息    查看全文
文摘
Nearest neighbor (NN) search in high-dimensional space plays a fundamental role in large-scale image retrieval. It seeks efficient indexing and search techniques, both of which are simultaneously essential for similarity search and semantic analysis. However, in recent years, there has been a rare breakthrough. Achievement of current techniques for NN search is far from satisfactory, especially for exact NN search. A recently proposed method, HB, addresses the exact NN search efficiently in high-dimensional space. It benefits from cluster-based techniques which can generate more compact representation of the data set than other techniques by exploiting interdimensional correlations. However, HB suffers from huge cost for lower bound computations and provides no further pruning scheme for points in candidate clusters. In this paper, we extend the HB method to address exact NN search in correlated, high-dimensional vector data sets extracted from large-scale image database by introducing two new pruning/selection techniques and we call it HB+. The first approach aims at selecting more quickly the subset of hyperplanes/clusters that must be considered. The second technique prunes irrelevant points in the selected subset of clusters. Performed experiments show the improvement of HB+ with respect to HB in terms of efficiency (I/O cost and CPU response time) and also demonstrate the superiority over other exact NN indexes.

© 2004-2018 中国地质图书馆版权所有 京ICP备05064691号 京公网安备11010802017129号

地址:北京市海淀区学院路29号 邮编:100083

电话:办公室:(+86 10)66554848;文献借阅、咨询服务、科技查新:66554700