A Label Proportion Learning Model Based on LapESVR

English title: Learning with Proportions Based on LapESVR
Authors: Shi Yong (石勇); Meng Fan (孟凡); Qi Zhiquan (齐志泉)
Keywords: Learning with Label Proportions; LLP; Manifold Learning; Lap-InvCal; LapESVR
Chinese journal title: ZWGD
English journal title: Management Review
Institutions: School of Economics and Management, University of Chinese Academy of Sciences; Research Center on Fictitious Economy & Data Science, Chinese Academy of Sciences; Key Laboratory of Big Data Mining and Knowledge Management, Chinese Academy of Sciences; School of Management Science and Engineering, Central University of Finance and Economics
Publication date: 2019-06-30
Publisher: Management Review (管理评论)
Year: 2019
Issue: v.31
Funding: Major Research Plan of the National Natural Science Foundation of China (91546201); Young Scientists Fund of the National Natural Science Foundation of China (61402429; 61702099; 71801232)
Language: Chinese
Page: ZWGD201906121
Page count: 9
CN: 06
ISSN: 11-5057/F
Classification number: 137-145

Abstract

In the big data era, the volume of data encountered in real applications has grown dramatically, and labeling every collected sample in detail is difficult and extremely costly. As a result, weakly labeled data has become the dominant kind of data in real-world applications. Data labeled with class proportions is an important category of weakly labeled data; it has a wide range of application scenarios but has so far attracted little attention. Existing methods for the learning with label proportions (LLP) problem usually have high complexity and are slow on large-scale problems. To speed up learning, this paper proposes Lap-InvCal, a novel LLP model motivated by LapESVR and InvCal that incorporates the idea of manifold learning into LLP. Extensive experiments show that Lap-InvCal maintains high accuracy while greatly increasing training speed, which indicates its potential for large-scale LLP problems.

