Image Semantic Segmentation Model Based on Semantic Segmentation-Adversarial Network

Authors: WANG Xin (王鑫); YU Chong-chong (于重重); MA Xian-qin (马先钦); CHEN Xiu-xin (陈秀新)
Affiliation: School of Computer and Information Engineering, Beijing Technology and Business University
Keywords: image semantic segmentation; semantic segmentation-adversarial (SSGAN) model; end-to-end training; autonomous learning
Journal: Computer Simulation (计算机仿真)
Journal code: JSJZ
Publication date: 2019-02-15
Publisher: Computer Simulation (计算机仿真)
Year: 2019
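Volume: 36
Funding: Key Project of the Beijing Natural Science Foundation (Category B, KZ201410011014); General Project of the Scientific Research Program of the Beijing Municipal Education Commission (KM201510011010)
Language: Chinese
Record ID: JSJZ201902041
Page count: 5
Issue: 02
CN: 11-3724/TP
Pages: 201-205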

Abstract

Image semantic segmentation plays an important role in tasks such as scene understanding and is an active research topic in computer vision. To address the limited accuracy of current image semantic segmentation methods, this paper proposes the Semantic Segmentation Generative Adversarial Network (SSGAN). The model uses DeepLab-VGG16 as the generator, which learns from the input real samples to produce semantic segmentation maps, and uses Atrous Spatial Pyramid Pooling (ASPP) as the discriminator, which captures higher-order statistical regularities in the ground-truth label maps and the generated segmentation maps. Experiments on the PASCAL VOC 2012 dataset give an mIoU of 0.823, which is 0.24 higher than the Adversarial method. By combining an adversarial model with a traditional semantic segmentation model, SSGAN keeps the end-to-end training of traditional segmentation models while gaining the self-learning ability of adversarial networks, which avoids the mismatch introduced by hand-designed higher-order loss terms. Finally, pruning together with weight quantization and sharing compresses the model to 0.045 of its original size. The experiments show that the proposed method is feasible.
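
The mIoU figure reported in the abstract is a mean intersection-over-union score. As a rough illustration of how such a score is usually computed, the sketch below builds a per-class confusion matrix from two label maps and averages the per-class IoU. It is not the authors' evaluation code: the function name `mean_iou`, the flat-list input format, the `ignore_label=255` convention for unlabeled pixels, and the toy data are all assumptions made for this example.

```python
def mean_iou(pred, truth, num_classes, ignore_label=255):
    """Mean intersection-over-union of two label maps given as flat lists.

    Hypothetical helper for illustration; labels are ints in [0, num_classes).
    """
    # confusion[t][p] counts pixels whose true class is t and predicted class is p.
    confusion = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, truth):
        if t == ignore_label:  # assumed convention: skip unlabeled pixels
            continue
        confusion[t][p] += 1

    ious = []
    for c in range(num_classes):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp
        fp = sum(confusion[r][c] for r in range(num_classes)) - tp
        union = tp + fp + fn
        if union > 0:  # classes absent from both maps do not count
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy 2x3 label maps, flattened row by row (made-up data).
truth = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
print(round(mean_iou(pred, truth, num_classes=3), 3))  # prints 0.5
```

In the toy case the per-class IoUs are 1/3, 2/3 and 1/2, so the mean is 0.5. A benchmark evaluation would accumulate the confusion matrix over every image in the dataset before taking the per-class ratios.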

