Multitask Hierarchical Image Retrieval Technology Based on Faster RCNNH
  • Title (English): Multitask Hierarchical Image Retrieval Technology Based on Faster RCNNH
  • Authors: HE Xia; TANG Yi-ping; WANG Li-ran; CHEN Peng; YUAN Gong-ping
  • Institution: School of Information Engineering, Zhejiang University of Technology
  • Keywords: deep hash algorithm; large-scale image retrieval; multitask deep learning; region of interest; hash code
  • Journal: Computer Science (计算机科学); CNKI journal code: JSJA
  • Publication date: 2019-03-15
  • Year: 2019
  • Volume/Issue: Vol. 46, No. 03
  • Pages: 309-319 (11 pages)
  • CN: 50-1075/TP
  • Record number: JSJA201903045
  • Funding: Supported by the National Natural Science Foundation of China (61070134, 61379078)
  • Language: Chinese
Abstract
To address the problems of existing image-based image search techniques, namely low levels of automation and intelligence, the absence of deep learning, difficulty in obtaining accurate retrieval results, heavy storage consumption, slow retrieval speed, and the inability to meet the image retrieval demands of the big-data era, this paper proposes a multitask hierarchical image retrieval method based on Faster RCNNH (Faster RCNN Hash). First, a selective retrieval network performs logistic regression on the feature map to obtain a probability vector for each region of interest in the image; on this basis, a compact quantization network encodes these vectors into compact quantized hash codes. Second, a re-screening network is used to obtain the most strongly responding region-aware semantic features among the regions of interest. Next, for each region of interest, a fine retrieval strategy based on the quantized hash matrix is applied to compare images rapidly. Finally, the images most similar to the corresponding region of interest in the query image are selected. The proposed multitask learning method not only obtains the compact quantized hash codes and the region-aware semantic features simultaneously, but also effectively removes interference from the image background and other objects. Experimental results show that the proposed method supports end-to-end training and automatically selects higher-quality region-of-interest features, thereby improving the automation and intelligence of large-scale image retrieval; its retrieval accuracy (0.9478) and retrieval speed (0.306 s) are both clearly superior to those of existing large-scale image retrieval techniques.
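The abstract describes a coarse-to-fine matching procedure: compact quantized hash codes filter candidate regions cheaply, and region-aware semantic features rank the survivors. The sketch below illustrates only that two-stage idea under stated assumptions; the function and parameter names (binarize, coarse_to_fine_search, radius, top_k) are hypothetical and not taken from the paper, and random arrays stand in for the outputs of the compact quantization and re-screening networks.

    import numpy as np

    def binarize(activations):
        # Quantize real-valued network outputs (e.g. sigmoid activations of a
        # compact quantization layer) into a binary hash code.
        return (activations >= 0.5).astype(np.uint8)

    def hamming_distance(codes, query_code):
        # Bitwise Hamming distance between one query code (B,) and a code
        # matrix (N, B); returns an (N,) vector of distances.
        return np.count_nonzero(codes != query_code, axis=1)

    def coarse_to_fine_search(query_code, query_feat, db_codes, db_feats,
                              radius=2, top_k=10):
        # Stage 1 (coarse): keep database regions whose hash code lies within
        # a small Hamming radius of the query code.
        dists = hamming_distance(db_codes, query_code)
        candidates = np.where(dists <= radius)[0]
        if candidates.size == 0:
            candidates = np.argsort(dists)[:top_k]  # fall back to nearest codes
        # Stage 2 (fine): re-rank the surviving candidates by cosine similarity
        # of their real-valued semantic features.
        feats = db_feats[candidates]
        feats = feats / np.linalg.norm(feats, axis=1, keepdims=True)
        q = query_feat / np.linalg.norm(query_feat)
        sims = feats @ q
        order = np.argsort(-sims)[:top_k]
        return candidates[order], sims[order]

    # Toy usage: random data stands in for the outputs of the two networks.
    rng = np.random.default_rng(0)
    db_codes = binarize(rng.random((1000, 48)))                      # 48-bit codes
    db_feats = rng.standard_normal((1000, 256)).astype(np.float32)   # ROI features
    idx, scores = coarse_to_fine_search(db_codes[7], db_feats[7], db_codes, db_feats)
    print(idx[:5], scores[:5])

The point of the two stages is that the cheap binary comparison touches the whole database, while the more expensive feature comparison only runs on the small candidate set that survives the Hamming-radius filter.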
