A Real-Time Adaptive Stereo Matching Network Algorithm

English title: Real-time Adaptation for Stereo Matching
Authors: Zeng Junying (曾军英); Feng Wulin (冯武林); Qin Chuanbo (秦传波); Gan Junying (甘俊英); Zhai Yikui (翟懿奎); Wang Fan (王璠); Zhu Boyuan (朱伯远)
Keywords: stereo matching; real-time adaptation; unsupervised loss; residual refinement module; disparity map
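Chinese journal name: XXCN
English journal name: Journal of Signal Processing
Affiliation: School of Information Engineering, Wuyi University
Publication date: 2019-05-25
Publisher: Signal Processing (信号处理)
Year: 2019
Issue: v.35; No.237
Funding: National Natural Science Foundation of China (61771347); Guangdong Province Characteristic Innovation Project (2017KTSCX181); Guangdong Province Young Innovative Talents Project (2017KQNCX206); Jiangmen Science and Technology Plan Project (Jiangke [2017] No. 268); Wuyi University Youth Fund (2015zk11)
Language: Chinese
Page: XXCN201905016
Page count: 7
CN: 05
ISSN: 11-2406/TN
Classification number: 119-125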

Abstract

Stereo matching is a classical computer vision problem. Stereo matching methods based on traditional techniques or on convolutional neural networks (CNNs) do not offer the accuracy and real-time performance that practical online applications require. To address this, this paper proposes a real-time adaptive stereo matching network algorithm. It introduces a new lightweight and effective architecture, the Modularly Adaptive Stereo Network (MASNet), and embeds an unsupervised loss module and a residual refinement module in the network to improve both accuracy and real-time performance. First, a pyramid network extracts multi-scale features; then an initial disparity estimate is computed; finally, the residual refinement module outputs the final disparity map. Experimental results show that the proposed method is more accurate than models of similar complexity and runs at about 25 frames per second on average, which meets the requirements of online use.
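
The abstract names an unsupervised loss and a residual refinement step but does not give their form. The sketch below illustrates one common way such components are built, and it is an assumption rather than the MASNet definition: the loss is a mean absolute photometric error between the left image and the right image warped by the disparity, and refinement adds a predicted residual to a coarse disparity map. The function names `warp_right_to_left`, `photometric_loss`, and `refine_disparity` are hypothetical, and the convention that a left pixel at column x matches the right pixel at column x - d assumes a rectified stereo pair.

```python
import numpy as np


def warp_right_to_left(right, disparity):
    """Rebuild the left view by sampling the right image at column x - d.

    Uses linear interpolation along each row; out-of-range columns are
    clamped to the image border (an assumption made for this sketch).
    """
    h, w = disparity.shape
    src = np.arange(w, dtype=np.float64)[None, :] - disparity
    src = np.clip(src, 0.0, w - 1.0)
    x0 = np.floor(src).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    frac = src - x0
    rows = np.arange(h)[:, None]
    return (1.0 - frac) * right[rows, x0] + frac * right[rows, x1]


def photometric_loss(left, right, disparity):
    """Mean absolute difference between the left image and the warped right
    image. No ground-truth disparity is needed, hence 'unsupervised'."""
    return float(np.mean(np.abs(left - warp_right_to_left(right, disparity))))


def refine_disparity(coarse, residual):
    """Residual refinement: add a predicted correction to a coarse disparity
    map and keep the result non-negative."""
    return np.maximum(coarse + residual, 0.0)
```

Because the loss depends only on the two input images and the predicted disparity, it can be evaluated on incoming frames without labels, which is what makes online adaptation possible in principle. In a real network the residual would be predicted by a learned module; here it is simply passed in as an array.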

