Online Streaming Feature Selection Algorithm Regularized by the ℓ2,1-Norm
  • Original title (Chinese): 基于ℓ2,1范数的在线流特征选择算法
  • Authors: WU Zhonghua (吴中华); ZHENG Wei (郑玮)
  • Author affiliation (English, as given): School of Computer Science and Technology, Nanjing University of Science and Technology
  • Keywords: streaming features; feature selection; ℓ2,1-norm
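  • Chinese journal title: JSSG
  • English journal title: Computer & Digital Engineering
  • Institution: 南京理工大学计算机科学与工程学院 (School of Computer Science and Engineering, Nanjing University of Science and Technology)
  • Publication date: 2019-06-20
  • Publisher: 计算机与数字工程 (Computer & Digital Engineering)
  • Year: 2019
  • Issue: v.47; No.356
  • Funding: Supported by the 2017 Jiangsu Province Graduate Research and Innovation Program (Grant No. KYCX17_0361)
  • Language: Chinese
  • Pages: JSSG201906006
  • Page count: 8
  • CN: 06
  • ISSN: 42-1372/TP
  • Classification number: 29-36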
Abstract
High-dimensional feature data contain large amounts of irrelevant and redundant information, which can greatly reduce the efficiency of learning algorithms. Feature selection plays an important role in many applications: it speeds up machine learning algorithms, improves the generalization ability of learned models, and mitigates the curse of dimensionality. When the feature space is unknown and changes dynamically, traditional feature selection algorithms designed for a static feature space are unsuitable because of their low efficiency. To address feature selection in this streaming-feature setting, where the feature space is dynamic and unknown, the paper proposes an online streaming feature selection algorithm regularized by the ℓ2,1-norm. The feature selection model is built on the row-sparsity property of the ℓ2,1-norm and its insensitivity to noise. Experiments on multiple high-dimensional datasets show that the proposed algorithm achieves higher classification accuracy and stability than other streaming feature selection algorithms.
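For readers unfamiliar with the row-sparsity property mentioned in the abstract, the following is a minimal sketch of the standard ℓ2,1-norm definition (the sum of the ℓ2 norms of a matrix's rows). It is not the paper's algorithm, which this record does not describe. The weight matrix W and the helper names row_norms, l21_norm, and rank_features are hypothetical and chosen only for illustration: each row of W is assumed to hold one feature's weights across classes, so an all-zero row corresponds to a feature the model ignores.

```python
import math


def row_norms(W):
    """Return the l2 norm of each row of W (a list of equal-length lists)."""
    return [math.sqrt(sum(x * x for x in row)) for row in W]


def l21_norm(W):
    """l2,1-norm of W: the sum over rows of each row's l2 norm."""
    return sum(row_norms(W))


def rank_features(W):
    """Order feature (row) indices by decreasing row norm.

    Illustrative scoring only; the paper's selection rule is not given here.
    """
    norms = row_norms(W)
    return sorted(range(len(W)), key=lambda i: norms[i], reverse=True)


# Hypothetical weights for 4 features and 2 classes.
# Rows 1 and 3 are entirely zero, which is what "row sparsity" looks like.
W = [
    [3.0, 4.0],
    [0.0, 0.0],
    [5.0, 12.0],
    [0.0, 0.0],
]

print(l21_norm(W))       # 5 + 0 + 13 + 0 = 18.0
print(rank_features(W))  # features with nonzero rows come first
```

Penalizing this norm pushes whole rows toward zero at once, which is why it is commonly used to discard a feature across all classes rather than zeroing individual weights.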
