Abstract
Zero-shot learning (ZSL) transfers discriminative knowledge through a model that associates visual and semantic features. However, visual and semantic data do not stand in a simple correspondence, so a mapping function between them is difficult to establish directly. This paper proposes a locality-sensitive double dictionary method with two main contributions. (1) Double dictionary method. A single-dictionary visual-semantic mapping lacks a common variable that directly links the two modalities; the proposed method adds a descriptive dictionary of shared structure for the visual and semantic features, which builds a more reasonable visual-semantic association channel. (2) Locality-sensitive manifold preservation. Describing local structure information is key in double dictionary learning; a manifold structure graph is constructed to define a locality-sensitive constraint term, and dictionary learning and local manifold preservation are optimized jointly. Experimental results on the AwA and CUB datasets show that the proposed method achieves higher classification accuracy than the compared algorithms.
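The abstract does not give the objective function, so the following is only a minimal sketch of how a double dictionary loss with a graph-based locality term might be evaluated. It rests on several assumptions that are not taken from the paper: the two dictionaries (`Dv` for visual features, `Ds` for semantic features) are linked through a shared code matrix `A`, the manifold structure graph is a symmetric k-nearest-neighbor graph over the visual features, and the locality term is the standard Laplacian regularizer tr(A L Aᵀ). All function names, parameter names, and weights (`lam`, `beta`) are hypothetical.

```python
import numpy as np

def knn_graph_laplacian(X, k=5):
    """Unnormalized Laplacian L = D - W of a symmetric k-NN graph.

    X has shape (d, n): one sample per column. Requires k < n.
    The 0/1 edge weights are an assumption made for illustration.
    """
    n = X.shape[1]
    # Pairwise squared Euclidean distances between columns.
    sq = np.sum(X ** 2, axis=0)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X.T @ X)
    np.fill_diagonal(d2, np.inf)  # a sample is not its own neighbor
    # Indices of the k nearest neighbors of each sample.
    nbrs = np.argsort(d2, axis=1)[:, :k]
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), nbrs.ravel()] = 1.0
    W = np.maximum(W, W.T)  # symmetrize the adjacency matrix
    return np.diag(W.sum(axis=1)) - W

def double_dictionary_objective(X, S, Dv, Ds, A, L, lam=1.0, beta=0.1):
    """Value of an assumed objective (not the paper's exact formulation):

        ||X - Dv A||_F^2 + lam * ||S - Ds A||_F^2 + beta * tr(A L A^T)

    X: (d_v, n) visual features, S: (d_s, n) semantic features,
    Dv: (d_v, m), Ds: (d_s, m) dictionaries, A: (m, n) shared codes.
    """
    visual = np.linalg.norm(X - Dv @ A, "fro") ** 2
    semantic = np.linalg.norm(S - Ds @ A, "fro") ** 2
    # Small when samples that are neighbors in the graph have similar codes.
    manifold = np.trace(A @ L @ A.T)
    return visual + lam * semantic + beta * manifold
```

In a sketch like this, joint optimization would alternate between updating `A` with the dictionaries fixed and updating `Dv` and `Ds` with `A` fixed; the update rules themselves depend on details the abstract does not state.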