Research on an Automatic Segmentation Algorithm for Dragon Patterns in Imperial Costumes Based on a Bi-Level Model

English title: Automatic Segmentation of Dragon Design Based on Bi-Level Model in Chinese Imperial Costume Images
Authors: ZHAO Hai-ying (赵海英); YANG Ting (杨婷)
Author affiliations: School of Digital Media & Design Arts, Beijing University of Posts and Telecommunications; Beijing Key Laboratory of Mobile Media and Cultural Computing, Beijing University of Posts and Telecommunications
Keywords: automatic segmentation; bi-level model; object detection layer; segmentation layer; Chinese imperial costume image
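Chinese journal title: GCTX
English journal title: Journal of Graphics
Institutions: School of Digital Media and Design Arts, Beijing University of Posts and Telecommunications; Beijing Key Laboratory of Mobile Media and Cultural Computing, Century College, Beijing University of Posts and Telecommunications
Publication date: 2019-02-15
Publisher: Journal of Graphics (图学学报)
Year: 2019
Volume and issue: v.40; No.143
Funding: National Natural Science Foundation of China (61163044); Beijing Municipal Science and Technology Commission Fund Project (D171100003717003); Gansu Province Talent Introduction Project (2015-RC-47)
Language: Chinese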
File name: GCTX201901022
Page count: 8
Issue: 01
CN: 10-1034/T
Pages: 152-159

Abstract

Patterns on Chinese imperial costumes carry rich cultural connotations, but the lack of a dataset with pixel-level semantic annotations makes their accurate segmentation a very challenging problem. To address this, the paper proposes a bi-level model that combines deep learning with the GrabCut algorithm to perform object detection and segmentation. After analyzing the characteristics of different deep convolutional neural networks, the object detection layer (ODL) of the model adopts R-FCN, a two-stage object detection framework; the segmentation layer (SL) uses the graph-based GrabCut algorithm to produce the final segmentation result. Simulation experiments on a Chinese imperial costume image dataset show that the bi-level model based on deep convolutional neural networks and GrabCut produces good segmentation results.
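The hand-off between the two layers can be illustrated with a minimal sketch. It assumes the detection layer returns one bounding box as `(x, y, w, h)` in pixels; that format, the function names, and the toy image size are illustrative assumptions, not details from the paper. The sketch only covers how a box could seed a GrabCut label mask and how the refined labels collapse to a binary result; the GrabCut iterations themselves are left to a library such as OpenCV's `cv2.grabCut`.

```python
# Label values follow the convention used by OpenCV's GrabCut implementation.
GC_BGD, GC_FGD, GC_PR_BGD, GC_PR_FGD = 0, 1, 2, 3


def init_mask_from_box(height, width, box):
    """Build an initial GrabCut label mask from a detection box.

    box is (x, y, w, h) in pixels; this is a hypothetical output format
    for the detection layer, not one specified by the paper.
    """
    x, y, w, h = box
    # Clip the box to the image so a slightly oversized detection is tolerated.
    x0, y0 = max(0, x), max(0, y)
    x1, y1 = min(width, x + w), min(height, y + h)
    # Everything outside the box is treated as definite background.
    mask = [[GC_BGD] * width for _ in range(height)]
    # Everything inside the box starts as probable foreground.
    for row in range(y0, y1):
        for col in range(x0, x1):
            mask[row][col] = GC_PR_FGD
    return mask


def to_binary(mask):
    """Collapse GrabCut labels into a 0/1 foreground mask."""
    return [[1 if v in (GC_FGD, GC_PR_FGD) else 0 for v in row] for row in mask]


if __name__ == "__main__":
    # Toy 4x6 image with a detection covering a 3x2 region.
    mask = init_mask_from_box(4, 6, (1, 1, 3, 2))
    for row in to_binary(mask):
        print(row)
```

Seeding the segmentation from a detected box replaces the rectangle a user would normally draw for GrabCut, which is what lets the combined pipeline run without manual interaction.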

