One-shot learning based on improved matching network
  • Chinese title: 基于改进匹配网络的单样本学习
  • Authors: JIANG Liubing (蒋留兵); ZHOU Xiaolong (周小龙); JIANG Fengwei (姜风伟); CHE Li (车俐)
  • Affiliations: School of Computer and Information Security, Guilin University of Electronic Technology; Key Laboratory of Wireless Broadband Communication and Signal Processing in Guangxi, Guilin University of Electronic Technology; School of Information and Communication, Guilin University of Electronic Technology
  • Keywords: deep learning; few-shot; improved matching network; squared Euclidean distance; LSTM
  • Journal: Systems Engineering and Electronics (系统工程与电子技术), journal code XTYD
  • Publication date: 2019-03-22
  • Year: 2019
  • Issue: Vol. 41, No. 477 (2019, Issue 6)
  • Pages: 43-50 (8 pages)
  • Record number: XTYD201906006
  • CN: 11-2422/TN
  • Funding: National Natural Science Foundation of China (61561010); Natural Science Foundation of Guangxi (2017GXNSFAA198089); Guangxi Key Research and Development Program (Guike AB18126003, AB18221016)
  • Language: Chinese
Abstract
Current deep learning methods rely on large numbers of labeled samples to train multi-layer networks for automatic recognition. In many specialized scenarios, however, large amounts of labeled sample data are difficult to obtain, and recognizing objects from only a few samples remains a key open problem for deep learning. To address this problem, a four-layer deep convolutional neural network (DCNN) is first used to extract high-level semantic features from the training and test samples. Then, within the improved matching network, a bidirectional LSTM and an attLSTM are used to further extract and encode the more critical and useful features of the training and test samples, respectively. Finally, a softmax nonlinear classifier operating on the squared Euclidean distance classifies the test samples. The improved model is evaluated on the Omniglot dataset and achieves very good results: even in the most difficult 20-way 1-shot setting it reaches a 93.2% recognition rate, whereas Vinyals' original matching network model reaches only 88.2% in the same setting. Compared with the original matching network, the improved model therefore performs better in complex scenarios with more classes and fewer samples per class.
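As a rough illustration of the classification step described above (a softmax over negative squared Euclidean distances between query and support embeddings), the following PyTorch sketch may help. It is a minimal sketch under assumed details: the ConvEmbedding layer sizes and the matching_classify helper are illustrative names, not taken from the paper, and the bidirectional LSTM / attLSTM context encoders are omitted.

import torch
import torch.nn as nn
import torch.nn.functional as F

class ConvEmbedding(nn.Module):
    """Four-layer convolutional embedding network (hypothetical layer sizes)."""
    def __init__(self, in_channels=1, hidden=64):
        super().__init__()
        def block(cin, cout):
            return nn.Sequential(
                nn.Conv2d(cin, cout, 3, padding=1),
                nn.BatchNorm2d(cout),
                nn.ReLU(),
                nn.MaxPool2d(2),
            )
        self.encoder = nn.Sequential(
            block(in_channels, hidden),
            block(hidden, hidden),
            block(hidden, hidden),
            block(hidden, hidden),
        )

    def forward(self, x):
        # Returns a flat feature vector per image: (batch, feature_dim)
        return self.encoder(x).flatten(start_dim=1)

def matching_classify(support_emb, support_labels, query_emb, n_way):
    """Softmax attention over negative squared Euclidean distances to the
    support embeddings, aggregated per class into query class probabilities."""
    # (n_query, n_support) squared Euclidean distances
    dists = torch.cdist(query_emb, support_emb, p=2) ** 2
    # Closer support samples receive higher attention weight
    attention = F.softmax(-dists, dim=1)
    # Sum attention within each class to get a distribution over classes
    one_hot = F.one_hot(support_labels, num_classes=n_way).float()
    return attention @ one_hot  # (n_query, n_way), rows sum to 1

if __name__ == "__main__":
    # Toy 20-way 1-shot episode on 28x28 single-channel images (Omniglot-sized)
    n_way, n_query = 20, 5
    net = ConvEmbedding()
    support = torch.randn(n_way, 1, 28, 28)   # one labeled image per class
    support_labels = torch.arange(n_way)
    queries = torch.randn(n_query, 1, 28, 28)
    probs = matching_classify(net(support), support_labels, net(queries), n_way)
    print(probs.shape, probs.sum(dim=1))      # torch.Size([5, 20]), each row ~1.0

Here the squared Euclidean distance plays the role that cosine similarity plays in Vinyals' original matching network, which, per the abstract, is one of the changes in the improved model.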
References
[1]FEI-FEI L,FERGUS R,PERONA P.One-shot learning of object categories[J].IEEE Trans.on Pattern Analysis and Machine Intelligence,2006,28(4):594-611.
    [2]LAKE B M,SALAKHUTDINOV R,TENENBAUM J B.Human-level concept learning through probabilistic program induction[J].Science,2015,350(6266):1332-1338.
    [3]VINYALS O,BLUNDELL C,LILLICRAP T,et al.Matching networks for one shot learning[C]∥Proc.of the Advances in Neural Information Processing Systems,2016:3630-3638.
    [4]SNELL J,SWERSKY K,ZEMEL R.Prototypical networks for few-shot learning[C]∥Proc.of the Advances in Neural Information Processing Systems,2017:4080-4090.
    [5]RAVI S,LAROCHELLE H.Optimization as a model for few-shot learning[C]∥Proc.of the International Conference on Learning Representations,2017:1-11.
    [6]MUNKHDALAI T,YU H.Meta networks[C]∥Proc.of the International Conference on Machine Learning,2017:2554-2563.
    [7]SIMARD P Y,STEINKRAUS D,PLATT J C.Best practices for convolutional neural networks applied to visual document analysis[C]∥Proc.of the 7th International Conference on Document Analysis and Recognition,2003,3:958-962.
    [8]KESHARI R,VATSA M,SINGH R,et al.Learning structure and strength of CNN filters for small sample size training[C]∥Proc.of the IEEE Conference on Computer Vision and Pattern Recognition,2018:9349-9358.
    [9]DUAN M,LI K,YANG C,et al.A hybrid deep learning CNN-ELM for age and gender classification[J].Neurocomputing,2018,275:448-461.
    [10]KOCH G,ZEMEL R,SALAKHUTDINOV R.Siamese neural networks for one-shot image recognition[C]∥Proc.of the ICML Deep Learning Workshop,2015,2.
    [11]BELLET A,HABRARD A,SEBBAN M.A survey on metric learning for feature vectors and structured data[J].Computer Science,2013.
    [12]CHOROWSKI J K,BAHDANAU D,SERDYUK D,et al.Attention-based models for speech recognition[C]∥Proc.of the Advances in Neural Information Processing Systems,2015:577-585.
    [13]LIU Y,SUN C,LIN L,et al.Learning natural language inference using bi-directional LSTM model and inner attention[J].ArXiv Preprint ArXiv:1605.09090,2016.
    [14]VINYALS O,BLUNDELL C,LILLICRAP T,et al.Matching networks for one shot learning[C]∥Proc.of the Advances in Neural Information Processing Systems,2016:3630-3638.
    [15]EDWARDS H,STORKEY A.Towards a neural statistician[J].Stat,2017,1050:20.
    [16]MEHROTRA A,DUKKIPATI A.Generative adversarial residual pairwise networks for one shot learning[J].ArXiv Preprint ArXiv:1703.08033,2017.
    [17]KAISER L,NACHUM O,ROY A,et al.Learning to remember rare events[J].ArXiv Preprint ArXiv:1703.03129,2017.
    [18]FINN C,ABBEEL P,LEVINE S.Model-agnostic meta-learning for fast adaptation of deep networks[C]∥Proc.of the International Conference on Machine Learning,2017:1126-1135.
    [19]SANTORO A,BARTUNOV S,BOTVINICK M,et al.Meta-learning with memory-augmented neural networks[C]∥Proc.of the International Conference on Machine Learning,2016:1842-1850.
    [20]MOCANU D C,MOCANU E.One-shot learning using mixture of variational auto-encoders:a generalization learning approach[C]∥Proc.of the 17th International Conference on Autonomous Agents and Multiagent Systems,2018:2016-2018.
    [21]SANTORO A,BARTUNOV S,BOTVINICK M,et al.One-shot learning with memory-augmented neural networks[J].ArXiv Preprint ArXiv:1605.06065,2016.
