Attribute Selection Algorithm Based on Low Rank and Graph Laplacian
  • English title: Attribute selection algorithm based on low rank and graph Laplacian
  • Authors: CAO Zaihui; WU Qingtao; SHI Jinfa
  • Keywords: attribute selection; low-rank constraint; graph Laplacian; subspace learning; sparse regularization
  • Journal abbreviation: JSGG
  • Journal: Computer Engineering and Applications (计算机工程与应用)
  • Affiliations: Collaborative Innovation Center for Aviation Economy Development, Henan Province; Zhengzhou University of Aeronautics; North China University of Water Resources and Electric Power
  • Publication date: 2018-09-01
  • Year: 2018
  • Volume/Issue: v.54; No.912; Issue 17
  • Funding: National Natural Science Foundation of China (No.71371172); Key Scientific Research Project of Henan Higher Education Institutions (No.18A520051); Youth Fund of Zhengzhou University of Aeronautics (No.2016143001)
  • Language: Chinese
  • Record ID: JSGG201817018
  • Pages: 115-120+126 (7 pages)
Abstract
To address the problems that existing unsupervised attribute selection algorithms rely on a single method and ignore both the intrinsic correlations among data and the presence of noise, a low-rank unsupervised attribute selection algorithm based on attribute self-expression is proposed. First, sparse regularization (the l2,1-norm) is introduced into the attribute self-expression loss function to achieve unsupervised sparse learning. Second, a low-rank constraint is imposed on the coefficient matrix to reduce the influence of noise and outliers. Then, the low-rank structure and graph Laplacian regularization allow subspace learning to account for both the global and local structure of the data. Finally, unsupervised learning is achieved through attribute self-expression. Repeated iterations on benchmark data sets show that the algorithm converges quickly to the global optimum. Compared with the SOGFS, PCA, LPP, and RSR algorithms, its classification accuracy improves by 16.11%, 14.03%, 9.92%, and 4.2% on average, and it also attains the highest average mutual information on every data set, demonstrating that the algorithm is both effective and efficient.
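Two of the ingredients named in the abstract, the graph Laplacian regularizer and the l2,1-regularized attribute self-expression, can be sketched numerically. The following is a minimal illustration, not the paper's implementation: it assumes an unnormalized k-NN graph Laplacian and an iteratively reweighted least-squares treatment of the l2,1 term, omits the low-rank constraint on the coefficient matrix, and all function names and parameter values are hypothetical.

```python
import numpy as np

def knn_graph_laplacian(X, k=5):
    """Unnormalized graph Laplacian L = D - S of a symmetrized k-NN graph over rows of X."""
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1, keepdims=True)
    dist = sq + sq.T - 2.0 * X @ X.T          # pairwise squared Euclidean distances
    S = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(dist[i])[1:k + 1]   # k nearest neighbors, skipping the point itself
        S[i, nbrs] = 1.0
    S = np.maximum(S, S.T)                    # symmetrize the adjacency
    return np.diag(S.sum(axis=1)) - S

def l21_self_expression(X, L, alpha=0.1, beta=0.1, iters=50):
    """Solve min_W ||X - XW||_F^2 + alpha*||W||_{2,1} + beta*tr(W^T X^T L X W)
    by iterative reweighting of the l2,1 term. Rows of W with large norm
    correspond to attributes that help reconstruct the others."""
    d = X.shape[1]
    W = np.eye(d)
    XtX = X.T @ X
    XtLX = X.T @ L @ X
    for _ in range(iters):
        row_norms = np.sqrt(np.sum(W ** 2, axis=1)) + 1e-8
        D = np.diag(0.5 / row_norms)          # subgradient weights of the l2,1 norm
        # Stationarity condition: (X^T X + alpha*D + beta*X^T L X) W = X^T X
        W = np.linalg.solve(XtX + alpha * D + beta * XtLX, XtX)
    return W
```

Attributes would then be ranked by the row norms of `W` (e.g. `np.argsort(-np.linalg.norm(W, axis=1))`), keeping the top-ranked ones.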
References
[1] Zhu X, Huang Z, Shen H, et al. Dimensionality reduction by mixed kernel canonical correlation analysis[J]. Pattern Recognition, 2012, 45(8): 3003-3016.
    [2] Zhu X, Zhang S, Jin Z, et al. Missing value estimation for mixed-attribute data sets[J]. IEEE Transactions on Knowledge and Data Engineering, 2010, 23(1): 110-121.
    [3] Zhu X, Zhang S, Jin Z, et al. Missing value estimation for mixed-attribute data sets[J]. IEEE Transactions on Knowledge and Data Engineering, 2010, 23(1): 110-121.
    [4] Zhu X, Suk H I, Shen D. Matrix-similarity based loss function and feature selection for Alzheimer's disease diagnosis[C]//2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Columbus, OH: IEEE, 2014: 3089-3096.
    [5] Zhu X, Li X, Zhang S. Block-row sparse multiview multilabel learning for image classification[J]. IEEE Transactions on Cybernetics, 2015.
    [6] Zhu X, Suk H I, Shen D. A novel matrix-similarity based loss function for joint regression and classification in AD diagnosis[J]. NeuroImage, 2014, 100: 91-105.
    [7] Zhu X, Huang Z, Cheng H, et al. Sparse hashing for fast multimedia search[J]. ACM Transactions on Information Systems, 2013, 31(2): 595-605.
    [8] Zhu X, Huang Z, Yang Y, et al. Self-taught dimensionality reduction on the high-dimensional small-sized data[J]. Pattern Recognition, 2013, 46(1): 215-229.
    [9] Pyatykh S, Hesser J, Zheng L. Image noise level estimation by principal component analysis[J]. IEEE Transactions on Image Processing, 2013, 22(2): 687-699.
    [10] Fan Z, Xu Y, Zhang D. Local linear discriminant analysis framework using sample neighbors[J]. IEEE Transactions on Neural Networks, 2011, 22(7): 1119-1132.
    [11] He X, Niyogi P. Locality preserving projections[D]. USA: University of Chicago, 2005: 137-145.
    [12] Konietschke F, Pauly M. Bootstrapping and permuting paired t-test type statistics[J]. Statistics and Computing, 2013, 24(3): 283-296.
    [13] Liimatainen K, Heikkilä R, Yliharja O, et al. Sparse logistic regression and polynomial modelling for detection of artificial drainage networks[J]. Remote Sensing Letters, 2015, 6(4): 311-320.
    [14] 种智, 何威, 程德波, et al. Graph sparse attribute selection algorithm based on subspace learning[J]. Application Research of Computers, 2016, 33(9): 2679-2683.
    [15] 倪志伟, 肖宏旺, 伍章俊, et al. Attribute selection method based on an improved discrete glowworm swarm optimization algorithm and fractal dimension[J]. Pattern Recognition and Artificial Intelligence, 2013, 26(12): 1169-1179.
    [16] 李敬明, 倪志伟, 许莹, et al. Attribute selection method based on the binary firefly algorithm[J]. Journal of Systems Science and Mathematical Sciences, 2017, 32(2): 407-425.
    [17] 胡荣耀, 刘星毅, 程德波, et al. Low-rank attribute selection algorithm based on sparse learning[J]. Computer Engineering and Applications, 2017, 53(10): 132-139.
    [18] 魏浩, 丁要军. An improved correlation-based attribute selection algorithm[J]. Computer Applications and Software, 2014, 31(8): 280-285.
    [19] Gu Q, Li Z, Han J. Joint feature selection and subspace learning[C]//IJCAI, 2011: 1294-1299.
    [20] Cai X, Ding C, Nie F, et al. On the equivalent of low-rank linear regressions and linear discriminant analysis based regressions[C]//ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2013: 1124-1132.
    [21] Merris R. Laplacian matrices of graphs: a survey[J]. Linear Algebra and Its Applications, 1994, 197/198(2): 143-176.
    [22] Zhu X, Suk H I, Wang L, et al. A novel relational regularization feature selection method for joint regression and classification in AD diagnosis[J]. Medical Image Analysis, 2015, 75(6): 570-577.
    [23] Zhu X, Zhang S, Zhang J, et al. Cost-sensitive imputing missing values with ordering[C]//AAAI, 2007, 2.
    [24] Zhu P, Zuo W, Zhang L, et al. Unsupervised feature selection by regularized self-representation[J]. Pattern Recognition, 2015, 48(2): 438-446.
    [25] Nie F, Zhu W, Li X. Unsupervised feature selection with structured graph optimization[C]//AAAI, 2016: 1302-1308.
    [26] Graham D, Allinson N. Characterizing virtual eigensignatures for general-purpose face recognition[J]. Journal of Nursing Management, 1998, 3(2): 87-91.
