Semantic Simultaneous Localization and Mapping Method Based on Convolutional Neural Network

Authors: LIU Zhi-jie (刘智杰); ZHAO Yi-bing (赵一兵); LI Lin-hui (李琳辉); ZHANG Xi-tong (张溪桐); ZHOU Ya-fu (周雅夫)
Keywords: autonomous vehicles; semantic SLAM; convolutional neural network; stereo vision
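Journal code: KXJS
Journal: Science Technology and Engineering
Affiliation: School of Automotive Engineering, Faculty of Vehicle Engineering and Mechanics, Dalian University of Technology; State Key Laboratory of Structural Analysis for Industrial Equipment (Dalian University of Technology)
Publication date: 2019-03-28
Publisher: Science Technology and Engineering
Year: 2019
Volume and issue: v.19; No.478
Funding: National Natural Science Foundation of China (51775082, 61473057); Fundamental Research Funds for the Central Universities (DUT17LAB11, DUT15LK13)
Language: Chinese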
Article ID: KXJS201909024
Page count: 6
Issue number: 09
CN: 11-4688/T
Pages: 153-158

Abstract

Environment perception is a prerequisite for the functions of an intelligent vehicle, and improving the vehicle's cognition of its environment on top of perception is the key to fully autonomous driving. For outdoor traffic scenes, a new simultaneous localization and semantic mapping method for intelligent vehicles is proposed based on convolutional neural networks (CNNs). The method localizes the vehicle and builds a dense 3D semantic map, improving the vehicle's ability to perceive and understand its environment. First, building on stereo ORB-SLAM, a four-thread stereo SLAM (simultaneous localization and mapping) method is proposed to construct a dense 3D point cloud map; the four threads are tracking, local mapping, loop closing, and dense mapping. Second, to improve environmental cognition, images are semantically segmented pixel by pixel with an end-to-end method, and the geometric information of the scene is also fed to the CNN as input to raise segmentation accuracy. Finally, the perception and cognition results are fused to build a semantic map, laying a foundation for fully autonomous driving. The algorithm was tested on the KITTI dataset: the overall system runs at about 10 frames/s, and the global accuracy of semantic segmentation is 73.1%. The resulting semantic maps show that the proposed algorithm can reconstruct a globally consistent map in large-scale outdoor scenes and help the intelligent vehicle interpret its environment.
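
The fusion step described in the abstract, attaching per-pixel semantic labels to stereo geometry, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a rectified pinhole stereo model, and the function names and the parameters `fx`, `fy`, `cx`, `cy`, and `baseline` are hypothetical choices made for illustration. The `global_accuracy` helper assumes the usual definition of global accuracy as the fraction of correctly labeled pixels.

```python
import numpy as np

def backproject_with_labels(disparity, labels, fx, fy, cx, cy, baseline):
    """Back-project valid disparity pixels to 3D and keep their semantic labels.

    Assumes a rectified pinhole stereo pair (an assumption of this sketch):
    depth z = fx * baseline / disparity.
    """
    valid = disparity > 0                 # pixels with a usable disparity
    v, u = np.nonzero(valid)              # row (v) and column (u) indices, row-major order
    z = fx * baseline / disparity[valid]  # depth along the optical axis
    x = (u - cx) * z / fx                 # lateral offset
    y = (v - cy) * z / fy                 # vertical offset
    points = np.stack([x, y, z], axis=1)  # (N, 3) points in the camera frame
    return points, labels[valid]          # labels in the same row-major order

def global_accuracy(predicted, truth):
    """Fraction of pixels whose predicted label equals the ground-truth label."""
    return float(np.mean(np.asarray(predicted) == np.asarray(truth)))
```

In a full system, each labeled point would additionally be transformed by the camera pose estimated by the tracking thread before being merged into the global map; that step is omitted here.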

References

1 Engel J,Stückler J,Cremers D.Large-scale direct SLAM with stereo cameras[C]//2015 IEEE/RSJ International Conference on Intelligent Robots and Systems(IROS).New York:IEEE,2015:1935-1942
2 Mur-Artal R,Tardos J D.ORB-SLAM2:An open-source SLAM system for monocular,stereo,and RGB-D cameras[J].IEEE Transactions on Robotics,2016(99):1-8
3 Endres F,Hess J,Sturm J,et al.3-D mapping with an RGB-D camera[J].IEEE Transactions on Robotics,2014,30(1):177-187
4 Civera J,Gálvez-López D,Riazuelo L,et al.Towards semantic SLAM using a monocular camera[C]//IEEE/RSJ International Conference on Intelligent Robots and Systems.New York:IEEE,2011:1277-1284
5 Salas-Moreno R F,Newcombe R A,Strasdat H,et al.SLAM++:Simultaneous localisation and mapping at the level of objects[C]//IEEE Conference on Computer Vision and Pattern Recognition.New York:IEEE Computer Society,2013:1352-1359
6 Krizhevsky A,Sutskever I,Hinton G E.ImageNet classification with deep convolutional neural networks[C]//Advances in Neural Information Processing Systems.Canada:NIPS,2012:1097-1105
7 Yan Z,Zhang H,Piramuthu R,et al.HD-CNN:Hierarchical deep convolutional neural networks for large scale visual recognition[C]//2015 IEEE International Conference on Computer Vision(ICCV).New York:IEEE,2015:2740-2748
8 Szegedy C,Liu W,Jia Y,et al.Going deeper with convolutions[C]//Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.New York:IEEE,2015:1-9
9 McCormac J,Handa A,Davison A,et al.SemanticFusion:Dense 3D semantic mapping with convolutional neural networks[C]//International Conference on Robotics and Automation.New York:IEEE,2016:4628-4635
10 Kundu A,Li Y,Dellaert F,et al.Joint semantic segmentation and 3D reconstruction from monocular video[C]//European Conference on Computer Vision.Berlin:Springer International Publishing,2014:703-718
11 Min D,Choi S,Lu J,et al.Fast global image smoothing based on weighted least squares[J].IEEE Transactions on Image Processing,2014,23(12):5638-5653
12 Rublee E,Rabaud V,Konolige K,et al.ORB:An efficient alternative to SIFT or SURF[C]//International Conference on Computer Vision.New York:IEEE,2012:2564-2571
13 Cadena C,Carlone L,Carrillo H,et al.Past,present,and future of simultaneous localization and mapping:Toward the robust-perception age[J].IEEE Transactions on Robotics,2016,32(6):1309-1332
14 Gálvez-López D,Tardos J D.Bags of binary words for fast place recognition in image sequences[J].IEEE Transactions on Robotics,2012,28(5):1188-1197
15 Hornung A,Wurm K M,Bennewitz M,et al.OctoMap:An efficient probabilistic 3D mapping framework based on octrees[J].Autonomous Robots,2013,34(3):189-206
16 Li L,Qian B,Lian J,et al.Traffic scene segmentation based on RGB-D image and deep learning[J].IEEE Transactions on Intelligent Transportation Systems,2017(99):1-6
