Abstract
The patterns on Chinese imperial costumes carry rich cultural connotations. However, because no dataset with pixel-level semantic annotations is available, accurately segmenting these patterns is a very challenging problem. To address this, this paper proposes a bi-level model that combines deep learning with the GrabCut algorithm to perform object detection and segmentation. After analyzing the characteristics of different deep convolutional neural networks, the two-stage object detector R-FCN is selected for the object detection layer (ODL) of the model. The segmentation layer (SL) uses the graph-theory-based GrabCut algorithm to produce the final segmentation result. Simulation experiments on a Chinese imperial costume image dataset show that the proposed bi-level model produces good segmentation results.
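The hand-off between the two layers can be illustrated with a minimal sketch. It assumes that a bounding box from the ODL is used to seed GrabCut in the SL, with pixels outside the box marked as definite background and pixels inside as probable foreground; the abstract does not spell out this step, so the function name `init_grabcut_mask` and the `(x1, y1, x2, y2)` box format are hypothetical. The label values follow OpenCV's GrabCut convention.

```python
# Label values follow OpenCV's GrabCut convention.
GC_BGD = 0      # definite background
GC_FGD = 1      # definite foreground
GC_PR_BGD = 2   # probable background
GC_PR_FGD = 3   # probable foreground


def init_grabcut_mask(height, width, box):
    """Build an initial GrabCut label mask from one detection box.

    box is (x1, y1, x2, y2) in pixels, with x2 and y2 exclusive.
    This format is an assumption; a real detector's output may differ.
    """
    x1, y1, x2, y2 = box
    # Clamp the box to the image so out-of-range detections do not fail.
    x1, x2 = max(0, x1), min(width, x2)
    y1, y2 = max(0, y1), min(height, y2)
    # Everything starts as definite background ...
    mask = [[GC_BGD] * width for _ in range(height)]
    # ... and the inside of the box becomes probable foreground.
    for y in range(y1, y2):
        for x in range(x1, x2):
            mask[y][x] = GC_PR_FGD
    return mask


if __name__ == "__main__":
    # Toy 4x6 image with a detection box covering columns 1-3, rows 1-2.
    for row in init_grabcut_mask(4, 6, (1, 1, 4, 3)):
        print(row)
```

With OpenCV, a mask of this kind (converted to a `uint8` array) could be passed to `cv2.grabCut` in `cv2.GC_INIT_WITH_MASK` mode, which then iteratively refines the probable labels into a final foreground/background segmentation.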