Zero-Shot Multi-Label Image Classification Based on Deep Instance Differentiation

Authors: JI Zhong; LI Huihui; HE Yuqing
Keywords: zero-shot learning; multi-label classification; cross-modal mapping; multi-instance learning
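Journal code: KXTS
Journal: Journal of Frontiers of Computer Science and Technology (计算机科学与探索)
Affiliation: School of Electrical and Information Engineering, Tianjin University
Publication date: 2018-03-26 15:34
Publisher: Journal of Frontiers of Computer Science and Technology (计算机科学与探索)
Year: 2019
Issue: v.13; No.124
Funding: National Natural Science Foundation of China, Nos. 61771329, 61472274
Language: Chinese
Page: KXTS201901009
Page count: 9
CN: 01
ISSN: 11-5602/TP
Classification number: 101-109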

Abstract

Zero-shot multi-label image classification assigns multiple labels to an image when the test labels have no corresponding training samples. Previous studies show that the labels of multi-label images are correlated, and that exploiting these correlations is key to multi-label image classification. Zero-shot multi-label classification therefore has to solve two problems: transferring a model from seen classes to unseen classes, and using label correlations to classify the unseen classes. To address this challenging learning task, a deep instance differentiation classification algorithm is proposed. A deep embedding network first performs a cross-modal mapping from the visual feature space of images to the semantic feature space of labels; an instance differentiation algorithm then performs multi-label classification in the semantic space. Comparative experiments against existing algorithms on the widely used Natural Scene and IAPRTC-12 datasets demonstrate that the proposed method is effective and compares favorably with them, and also verify the advantage of the embedding network.
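
The two-stage pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a ridge-regression linear map stands in for the deep embedding network, and instance differentiation is reduced to a simplified reading in which an embedded image becomes a bag of difference vectors, one per candidate label. All names (`fit_linear_embedding`, `differentiate`, `predict_labels`), the toy label vectors, and the mixing matrix `A` are hypothetical and chosen only for illustration.

```python
import numpy as np

def fit_linear_embedding(X, S, reg=1e-3):
    """Ridge-regression map W such that X @ W approximates S.

    Assumption: a linear map replaces the deep embedding network.
    X holds visual features (n x d), S holds semantic targets (n x m).
    """
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + reg * np.eye(d), X.T @ S)

def differentiate(z, label_vectors):
    """Turn one embedded image z into a bag of instances.

    Each instance is the difference between z and one label's
    semantic vector (simplified instance differentiation).
    """
    return z[None, :] - label_vectors

def predict_labels(x, W, label_vectors, k=2):
    """Return the indices of the k labels whose difference instance is smallest."""
    bag = differentiate(x @ W, label_vectors)
    distances = np.linalg.norm(bag, axis=1)
    return np.argsort(distances)[:k]

# Toy semantic space: labels 0 and 1 are seen, labels 2 and 3 are unseen.
label_vectors = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])

# Hypothetical linear relation between semantic and visual space.
A = np.array([[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])

# Training images carry only seen labels: {0}, {1}, and {0, 1}.
# Assumption: a multi-label target is the mean of its label vectors.
S_train = np.array([
    label_vectors[0],
    label_vectors[1],
    label_vectors[[0, 1]].mean(axis=0),
])
X_train = S_train @ A

W = fit_linear_embedding(X_train, S_train)

# Test image whose true labels are the unseen pair {2, 3}.
x_test = label_vectors[[2, 3]].mean(axis=0) @ A
predicted = predict_labels(x_test, W, label_vectors, k=2)
print(sorted(predicted.tolist()))
```

The sketch ranks labels purely by distance in the semantic space and ignores label correlations; it only shows how a map learned on seen labels can score unseen ones that share the same semantic space.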
