Abstract
To reduce the heavy computational cost of pairwise image matching and geometric verification in large-scale unordered image collections, this paper proposes a large-scale image ordering method based on a siamese neural network, drawing on advanced techniques from machine learning and image recognition. The algorithm extracts features from the convolutional layers of a pretrained VGG19 network, weights the features from the two branches, concatenates them, applies a further round of convolution and pooling, and uses a response function to decide whether an input image pair is connected. Experiments show that the algorithm effectively identifies image pairs with overlapping scenes and improves both efficiency and accuracy, without requiring exhaustive putative matching and geometric verification; it is applicable to structure from motion, image linking, and other scenarios.