Low-Parameter Real-Time Image Segmentation Algorithm Based on Convolutional Neural Network

Original title: 基于卷积神经网络的低参数量实时图像分割算法
Authors: Tan Guanghong (谭光鸿); Hou Jin (侯进); Han Yanpeng (韩雁鹏); Luo Shuo (罗朔)
Keywords: image processing; image segmentation; real-time image; low parameter count; convolution module; multiscale feature
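Chinese journal title: 激光与光电子学进展 (code JGDJ)
English journal title: Laser & Optoelectronics Progress
Affiliation: School of Information Science and Technology, Southwest Jiaotong University
Publication date: 2018-12-06 20:43
Publisher: Laser & Optoelectronics Progress (激光与光电子学进展)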
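Year: 2019
Volume and serial number: v.56; No.644
Funding: Open Project of the State Key Laboratory of CAD&CG, Zhejiang University (A1923); Chengdu Science and Technology Project (2015-HM01-00050-SF)
Language: Chinese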
File ID: JGDJ201909011
Page count: 9
Issue: 09
CN: 31-1690/TN
Pages: 100-108

Abstract

We propose a low-parameter, real-time image semantic segmentation network model named Atrous-squeezeseg. At its minimum parameter count of 2.1×10^7, the model runs at 45.3 frame/s, and its pixel accuracy and mean intersection over union reach 59.5% and 62.9%, respectively. On the embedded device NVIDIA TX2, the frame rate reaches 8.3 frame/s. The experimental results show that, compared with other segmentation algorithms, the proposed model improves on both speed and parameter count.
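
The abstract reports two accuracy metrics, pixel accuracy and mean intersection over union. This record does not describe the paper's evaluation protocol, so the following is only a minimal sketch of the standard definitions of those two metrics, computed from a confusion matrix. The function names and the toy label lists are hypothetical and are not taken from the paper.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels; rows are ground-truth classes, columns are predictions."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def pixel_accuracy(cm):
    """Fraction of all pixels whose predicted class equals the true class."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


def mean_iou(cm):
    """Average over classes of TP / (TP + FP + FN)."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fp = sum(cm[r][c] for r in range(n)) - tp  # predicted c, truly another class
        fn = sum(cm[c]) - tp                       # truly c, predicted another class
        denom = tp + fp + fn
        if denom > 0:  # skip classes absent from both labels and predictions
            ious.append(tp / denom)
    return sum(ious) / len(ious)


# Toy example (assumed data): flattened label maps for a 6-pixel image, 3 classes.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, 3)
print(pixel_accuracy(cm), mean_iou(cm))
```

In this toy case 4 of 6 pixels are correct, and the per-class IoU values are 1/3, 2/3, and 1/2, which average to 0.5.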

