Abstract
We propose Atrous-squeezeseg, a low-parameter network model for real-time semantic image segmentation. At its minimum parameter count of 2.1×10^7, the model runs at 45.3 frame/s, with pixel accuracy and mean intersection over union reaching 59.5% and 62.9%, respectively. On the NVIDIA TX2 embedded device, it runs at up to 8.3 frame/s. The experimental results show that, compared with other segmentation algorithms, the proposed model improves in both speed and parameter count.