Abstract
Stereo matching is a classical computer vision problem. Existing stereo matching methods, whether traditional or based on convolutional neural networks (CNNs), do not deliver the accuracy and real-time performance required by practical online applications. To address this, this paper proposes a real-time adaptive stereo matching algorithm built on a new lightweight and effective architecture, the Modularly Adaptive Stereo Network (MASNet), which embeds an unsupervised loss module and a residual refinement module to improve both accuracy and speed. The network first extracts multi-scale features with a pyramid network, then computes an initial disparity estimate, and finally outputs the final disparity map through the residual refinement module. Experimental results show that the proposed method is more accurate than models of similar complexity and, at an average of about 25 frames per second, is fast enough for online use.
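The abstract does not spell out the form of the unsupervised loss. As a rough illustration only, the sketch below assumes a common photometric reprojection formulation: the right image is warped into the left view using the predicted left disparity, and the mean absolute difference from the actual left image serves as the training signal. The function names `warp_right_to_left` and `photometric_loss`, the grayscale `(H, W)` array layout, and the L1 error are assumptions made for this example, not details taken from the paper.

```python
import numpy as np


def warp_right_to_left(right, disparity):
    """Reconstruct the left view by sampling the right image at column x - d.

    right, disparity: float arrays of shape (H, W). Sampling uses linear
    interpolation along the row; out-of-range columns are clamped to the border.
    """
    h, w = disparity.shape
    # Source column in the right image for every left-image pixel.
    xs = np.arange(w, dtype=np.float64)[None, :] - disparity
    xs = np.clip(xs, 0.0, w - 1.0)
    x0 = np.floor(xs).astype(np.int64)
    x1 = np.minimum(x0 + 1, w - 1)
    frac = xs - x0
    rows = np.arange(h)[:, None]
    return (1.0 - frac) * right[rows, x0] + frac * right[rows, x1]


def photometric_loss(left, right, disparity):
    """Mean absolute error between the left image and the warped right image."""
    return float(np.mean(np.abs(left - warp_right_to_left(right, disparity))))
```

Because this loss needs only the stereo pair itself and no ground-truth disparity, a loss of this kind can in principle be evaluated on incoming frames, which is one way an online, adaptive setting might be supported.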