Design of Augmented Reality Head-up Display System Based on Image Semantic Segmentation

Original title (Chinese): 结合图像语义分割的增强现实型平视显示系统设计与研究
Authors: An Zhe (安喆); Xu Xiping (徐熙平); Yang Jinhua (杨进华); Qiao Yang (乔杨); Liu Yang (刘洋)
Keywords: image processing; augmented reality; image semantic segmentation; virtual-real registration
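Journal (Chinese abbreviation): GXXB
Journal (English): Acta Optica Sinica
Affiliation: School of Optoelectronic Engineering, Changchun University of Science and Technology
Publication date: 2018-07-10
Publisher: Acta Optica Sinica (光学学报)
Year: 2018
Issue: v.38; No.436
Funding: National Natural Science Foundation of China (61605016)
Language: Chinese
Page: GXXB201807011
Page count: 7
CN: 07
ISSN: 31-1252/O4
Classification number: 85-91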

Abstract

To improve driver safety while the vehicle is in motion, an augmented reality head-up display (AR-HUD) system that incorporates image semantic segmentation is designed. First, an improved single shot multibox detector (SSD) network is proposed for semantic segmentation of road scene images. The front end of the network uses VGG-16 to extract image features, and the back end upsamples the resulting feature maps so that they can be segmented pixel by pixel. Training the network yields pixel-level classification results for the objects in the scene, that is, the semantic content of the environment. Then, by analyzing the relationship among the real scene, the optical display system, and the driver, the computer-generated virtual information is superimposed on the real scene and the displayed content is registered in the driver's field of view, which improves driving safety. Experimental results show that the accuracy of the semantic segmentation algorithm reaches 77.8%, and that the virtual-real registration algorithm takes 45 ms on average to process each frame, or about 22 frames per second.
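The two quantitative ideas in the abstract, pixel-level classification from upsampled feature maps and the conversion from per-frame latency to frame rate, can be illustrated with a minimal sketch. This is not the authors' network: the abstract does not specify the upsampling method, so nearest-neighbour upsampling is assumed here, and the function names (`upsample_nearest`, `segment`, `frames_per_second`) and the toy score map are hypothetical.

```python
import numpy as np

def upsample_nearest(scores, factor):
    """Repeat each cell of a (classes, H, W) score map `factor` times along H and W.

    Assumption: nearest-neighbour upsampling stands in for whatever
    upsampling the network back end actually uses.
    """
    return scores.repeat(factor, axis=1).repeat(factor, axis=2)

def segment(scores, factor):
    """Upsample coarse class scores, then pick the highest-scoring class per pixel."""
    return upsample_nearest(scores, factor).argmax(axis=0)

def frames_per_second(ms_per_frame):
    """Convert an average per-frame processing time in milliseconds to a frame rate."""
    return 1000.0 / ms_per_frame

# Toy input (made up for illustration): 2 classes on a 2x2 feature map.
coarse = np.array([
    [[0.9, 0.2], [0.1, 0.3]],  # scores for class 0
    [[0.1, 0.8], [0.9, 0.7]],  # scores for class 1
])

labels = segment(coarse, 2)          # 4x4 map of per-pixel class indices
fps = frames_per_second(45)          # 45 ms per frame, as reported in the abstract

print(labels)
print(round(fps, 1))
```

With the 45 ms figure from the abstract, the conversion gives 1000 / 45 ≈ 22.2, which matches the reported rate of about 22 frames per second.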
