Image Semantic Segmentation Algorithm Based on Deconvolution Feature Learning

Original title: 基于反卷积特征学习的图像语义分割算法
Authors: ZHENG Fei (郑菲); MENG Zhao-Hui (孟朝晖); GUO Chuang-Shi (郭闯世)
Keywords: deep learning; semantic segmentation; batch centralization; multi-scale features; deconvolution network
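Journal code: XTYY
Journal: Computer Systems & Applications (计算机系统应用)
Affiliation: College of Computer and Information, Hohai University
Publication date: 2019-01-15
Publisher: Computer Systems & Applications (计算机系统应用)
Year: 2019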
Volume: 28
Issue: 01
Language: Chinese
Record ID: XTYY201901022
Pages: 149-157
Page count: 9
CN: 11-2854/TP

Abstract

Advances in deep learning have resolved many difficult problems in semantic segmentation, laying a solid foundation for image understanding. The proposed algorithm makes two contributions. First, it uses a deconvolution network to fuse the multi-scale features extracted by convolutional layers at different depths of the convolutional network, then applies a further deconvolution to upsample the fused feature maps to the size of the original image, and finally predicts a semantic category for every pixel. Second, to improve the performance of the network, it introduces a new data-processing method, the batch centralization algorithm. In experiments on the SIFT-Flow dataset, the algorithm reaches a mean IoU of 45.2% for semantic segmentation and an accuracy of 96.8% for geometric segmentation; on the PASCAL VOC2012 dataset, it reaches a mean IoU of 73.5% for semantic segmentation. (The Chinese abstract labels the two semantic segmentation figures 平均准确率, mean accuracy, where the English abstract says mean IoU.)
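
The pipeline outlined in the abstract (upsample coarse features by deconvolution, fuse them with finer features, upsample again to image size, then label each pixel) can be illustrated with a minimal pure-Python sketch. Everything in it is an assumption made for illustration rather than the authors' implementation: the function names, the fixed 2x2 all-ones kernel (a trained network would learn its kernels), the stride of 2, the element-wise sum used as the fusion rule, and the toy score maps.

```python
# Illustrative sketch only: names, kernel, stride, and the additive fusion
# rule are assumptions, not the method published in the paper.

def transposed_conv2d(fmap, kernel, stride):
    """Upsample a 2-D map with a transposed convolution (no padding).

    Output side length is (n - 1) * stride + k for input side n, kernel side k.
    """
    h, w = len(fmap), len(fmap[0])
    k = len(kernel)
    out_h = (h - 1) * stride + k
    out_w = (w - 1) * stride + k
    out = [[0.0] * out_w for _ in range(out_h)]
    for i in range(h):
        for j in range(w):
            # Each input value scatters a scaled copy of the kernel.
            for a in range(k):
                for b in range(k):
                    out[i * stride + a][j * stride + b] += fmap[i][j] * kernel[a][b]
    return out


def fuse(a, b):
    """Fuse two equally sized maps by element-wise sum (assumed fusion rule)."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def predict_labels(score_maps):
    """Pick, for every pixel, the class whose score map is largest there."""
    h, w = len(score_maps[0]), len(score_maps[0][0])
    classes = range(len(score_maps))
    return [[max(classes, key=lambda c: score_maps[c][i][j]) for j in range(w)]
            for i in range(h)]


# With stride 2, a 2x2 all-ones kernel doubles the resolution exactly.
KERNEL = [[1.0, 1.0], [1.0, 1.0]]

# Toy per-class score maps: 2x2 from a deep layer, 4x4 from a shallow layer.
coarse = [
    [[2.0, 0.0], [0.0, 0.0]],  # class 0
    [[0.0, 1.0], [1.0, 1.0]],  # class 1
]
fine = [
    [[0.0] * 4 for _ in range(4)],  # class 0
    [[0.0] * 4 for _ in range(4)],  # class 1
]
fine[1][1][1] = 3.0  # a fine-scale detail the coarse map cannot resolve

# Upsample coarse scores to 4x4, fuse with fine scores, upsample to 8x8.
fused = [fuse(transposed_conv2d(c, KERNEL, 2), f) for c, f in zip(coarse, fine)]
full = [transposed_conv2d(m, KERNEL, 2) for m in fused]
labels = predict_labels(full)

for row in labels:
    print(row)
```

In this toy input the fine-scale map overrides the coarse prediction in one quadrant of the top-left block, which is the kind of detail recovery that multi-scale fusion is meant to provide.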

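The results above are reported as mean IoU (intersection over union averaged over classes). As a reminder of how that metric is usually computed, here is a small sketch; the function name `mean_iou`, the choice to skip classes absent from both maps, and the toy label maps are assumptions for illustration, and the paper's exact evaluation protocol may differ.

```python
# Illustrative sketch of the standard mean IoU metric; the handling of
# absent classes is an assumption, not taken from the paper.

def mean_iou(pred, truth, num_classes):
    """Average per-class IoU over classes present in pred or truth."""
    inter = [0] * num_classes
    union = [0] * num_classes
    for p_row, t_row in zip(pred, truth):
        for p, t in zip(p_row, t_row):
            if p == t:
                inter[p] += 1
                union[p] += 1
            else:
                # A mismatched pixel belongs to the union of both classes.
                union[p] += 1
                union[t] += 1
    ious = [inter[c] / union[c] for c in range(num_classes) if union[c] > 0]
    return sum(ious) / len(ious) if ious else 0.0


pred = [[0, 0],
        [1, 1]]
truth = [[0, 1],
         [1, 1]]
# Class 0: IoU 1/2; class 1: IoU 2/3; the mean is 7/12.
score = mean_iou(pred, truth, 2)
print(round(score, 4))
```
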