Unsupervised Feature Selection Algorithm Based on Self-Paced Learning
  • English title: Unsupervised feature selection algorithm based on self-paced learning
  • Authors: GONG Yonghong; ZHENG Wei; WU Lin; TAN Malong; YU Hao
  • Affiliations: Library, Guilin University of Aerospace Technology; Guangxi Key Laboratory of Multi-source Information Mining and Security, Guangxi Normal University
  • Keywords: unsupervised learning; feature selection; self-paced learning; self-representation; sparse learning
  • Journal: Journal of Computer Applications (计算机应用); CNKI journal code JSJY
  • Publication date: 2018-05-17
  • Year: 2018
  • Volume/Issue: Vol. 38, Issue 10 (cumulative No. 338)
  • Funding: National Natural Science Foundation of China (61573270); Natural Science Foundation of Guangxi (2015GXNSFCB139011); Innovation Project of Guangxi Graduate Education (YCSW2018093)
  • Language: Chinese
  • Pages: 110-115 (6 pages)
  • CN: 51-1307/TP
  • CNKI article ID: JSJY201810019
Abstract
Existing feature selection algorithms treat all samples equally and ignore the differences among them, so the learned model cannot avoid the influence of noisy samples. To address this, an Unsupervised Feature Selection algorithm based on Self-Paced Learning (UFS-SPL) is proposed. First, a subset of important samples is automatically selected and used to train a robust initial feature selection model; then less important samples are gradually and automatically introduced to improve the model's generalization ability, until a feature selection model is obtained that both resists noise and generalizes well. Compared with Convex Semi-supervised multi-label Feature Selection (CSFS), Regularized Self-Representation (RSR) and Coupled Dictionary Learning for unsupervised Feature Selection (CDLFS) on real-world data sets, UFS-SPL improves clustering accuracy, normalized mutual information and purity by 12.06%, 10.54% and 10.5% on average, respectively. The experimental results show that UFS-SPL can effectively reduce the influence of irrelevant information in data sets.
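The train-on-easy-samples-first loop described in the abstract can be sketched in a few lines. This is a minimal illustration, not the paper's actual method: it assumes a hard self-paced weighting scheme (a sample is admitted once its loss falls below a growing threshold) and substitutes a ridge-regularized self-representation objective for the paper's sparse regularizer; the function name and all hyper-parameters (`lam`, `mu`, `alpha`) are hypothetical.

```python
import numpy as np

def ufs_spl(X, n_selected=5, lam=1.0, mu=1.3, alpha=0.1, n_iter=20):
    """Sketch of self-paced unsupervised feature selection.

    X:     (n_samples, n_features) data matrix
    lam:   initial self-paced "age" threshold (assumed hyper-parameter)
    mu:    growth factor for lam; each round admits harder samples
    alpha: ridge penalty standing in for the paper's sparsity term
    """
    n, d = X.shape
    W = np.zeros((d, d))                      # self-representation matrix, X ~= X @ W
    for _ in range(n_iter):
        # Per-sample self-representation loss ||x_i - x_i W||^2
        losses = np.sum((X - X @ W) ** 2, axis=1)
        # Hard self-paced weights: keep only the currently "easy" samples
        v = (losses < lam).astype(float)
        if v.sum() == 0:                      # always admit the easiest sample
            v[np.argmin(losses)] = 1.0
        Xv = X * np.sqrt(v)[:, None]          # sample-weighted data
        # Closed-form weighted ridge update of W on the admitted samples
        W = np.linalg.solve(Xv.T @ Xv + alpha * np.eye(d), Xv.T @ Xv)
        lam *= mu                             # grow the age: include harder samples
    scores = np.linalg.norm(W, axis=1)        # row-norm importance per feature
    return np.argsort(scores)[::-1][:n_selected]
```

Because the threshold starts small, the first models are fit only on low-loss (noise-free) samples, which is what gives the initial model its robustness; raising the threshold then trades robustness for generalization, mirroring the two stages the abstract describes.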
References
[1]ZHU X,HUANG Z,YANG Y,et al.Self-taught dimensionality reduction on the high-dimensional small-sized data[J].Pattern Recognition,2013,46(1):215-229.
    [2]ZHU X,LI X,ZHANG S,et al.Robust joint graph sparse coding for unsupervised spectral feature selection[J].IEEE Transactions on Neural Networks and Learning Systems,2017,28(6):1263-1275.
    [3]ZHU X,HUANG Z,SHEN H T,et al.Dimensionality reduction by mixed kernel canonical correlation analysis[J].Pattern Recognition,2012,45(8):3003-3016.
    [4]ZHU X,SUK H I,THUNG K H,et al.Joint discriminative and representative feature selection for Alzheimer's disease diagnosis[C]//Proceedings of the 2016 Machine Learning in Medical Imaging,LNCS 10019.Berlin:Springer,2016:77-85.
    [5]ZHU X,JIN Z,JI R.Learning high-dimensional multimedia data[J].Multimedia Systems,2017,23(3):281-283.
    [6]HUANG G,SONG S J,GUPTA J N,et al.Semi-supervised and unsupervised extreme learning machines[J].IEEE Transactions on Cybernetics,2014,44(12):2405-2417.
    [7]MITRA P,MURTHY C A,PAL S K.Unsupervised feature selection using feature similarity[J].IEEE Transactions on Pattern Analysis and Machine Intelligence,2002,24(3):301-312.
    [8]WAN J,YANG M,CHEN Y.Discriminative cost sensitive Laplacian score for face recognition[J].Neurocomputing,2015,152:333-344.
    [9]YANG Y,SHEN H T,MA Z,et al.L2,1-norm regularized discriminative feature selection for unsupervised learning[C]//IJCAI 2011:Proceedings of the 22nd International Joint Conference on Artificial Intelligence.Menlo Park:AAAI Press,2011,2:1589-1594.
    [10]ZHU P,ZUO W,ZHANG L,et al.Unsupervised feature selection by regularized self-representation[J].Pattern Recognition,2015,48(2):438-446.
    [11]REITMAIER T,CALMA A,SICK B.Transductive active learning-a new semi-supervised learning approach based on iteratively refined generative models to capture structure in data[J].Information Sciences,2015,293:275-298.
    [12]DORNAIKA F,TRABOULSI Y E.Learning flexible graph-based semi-supervised embedding[J].IEEE Transactions on Cybernetics,2015,46(1):206-218.
    [13]KUMAR M P,PACKER B,KOLLER D.Self-paced learning for latent variable models[C]//NIPS 2010:Proceedings of the 23rd International Conference on Neural Information Processing Systems.[S.l.]:Curran Associates Inc.,2010,1:1189-1197.
    [14]JIANG L,MENG D,YU S I,et al.Self-paced learning with diversity[EB/OL].[2017-12-12].http://www.jdl.ac.cn/doc/2011/201511618293424634_2014_nips_self-paced%20learning%20with%20divers.pdf.
    [15]HOU C,NIE F,LI X,et al.Joint embedding learning and sparse regression:a framework for unsupervised feature selection[J].IEEE Transactions on Cybernetics,2014,44(6):793-804.
    [16]MENG D,ZHAO Q,JIANG L.A theoretical understanding of self-paced learning[J].Information Sciences,2017,414:319-328.
    [17]ZHAO Q,MENG D,JIANG L,et al.Self-paced learning for matrix factorization[C]//AAAI 2015:Proceedings of the 29th AAAI Conference on Artificial Intelligence.Menlo Park,CA:AAAI Press,2015:3196-3202.
    [18]CHANG X,NIE F,YANG Y,et al.A convex formulation for semi-supervised multi-label feature selection[C]//AAAI 2014:Proceedings of the 28th AAAI Conference on Artificial Intelligence.Menlo Park:AAAI Press,2014:1171-1177.
    [19]ZHU P,HU Q,ZHANG C,et al.Coupled dictionary learning for unsupervised feature selection[EB/OL].[2017-12-12].https://www.researchgate.net/publication/301895225_Coupled_Dictionary_Learning_for_Unsupervised_Feature_Selection.
    [20]UCI.Machine learning repository[DB/OL].[2018-01-10].http://archive.ics.uci.edu/ml/index.php.
    [21]LCUN Y,BOTTOU L,BENGIO Y,et al.Gradient-based learning applied to document recognition[J].Proceedings of the IEEE,1998,86(11):2278-2324.
    [22]KAMACHI M,LYONS M,GYOBA J.The Japanese Female Facial Expression(JAFFE)database[DB/OL].[2018-01-10].http://www.kasrl.org/jaffe.html.
    [23]LI J,CHENG K,WANG S,et al.Feature selection:a data perspective[J].ACM Computing Surveys,2017,50(6):Article No.94.
