Multi-label Streaming Feature Selection Combining Sliding Window and Fuzzy Mutual Information
  • Authors: 程玉 ; 李雨 ; 王一宾 ; 陈飞
  • Authors (English): CHENG Yu-sheng; LI Yu; WANG Yi-bin; CHEN Fei; School of Computer and Information, Anqing Normal University; The University Key Laboratory of Data Science and Intelligence Application of Fujian; The University Key Laboratory of Intelligent Perception and Computing of Anhui Province
  • Keywords: fuzzy mutual information; multi-label learning; streaming data; feature selection
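  • Journal (Chinese): XXWX
  • Journal (English): Journal of Chinese Computer Systems
  • Affiliations: School of Computer and Information, Anqing Normal University; The University Key Laboratory of Data Science and Intelligence Application of Fujian; The University Key Laboratory of Intelligent Perception and Computing of Anhui Province
  • Publication date: 2019-02-15
  • Publisher: Journal of Chinese Computer Systems (小型微型计算机系统)
  • Year: 2019
  • Issue: v.40
  • Funding: Key Scientific Research Project of Anhui Provincial Universities (KJ2017A352); Open Project of the Fujian Provincial University Key Laboratory (D1801); Anhui Provincial University Key Laboratory Fund Project (ACAIM160102)
  • Language: Chinese
  • Pages: XXWX201902016
  • Page count: 8
  • CN: 02
  • ISSN: 21-1106/TP
  • Classification number: 82-89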
Abstract
Feature selection is an effective way to handle high-dimensional problems in multi-label learning, but most traditional algorithms assume a static feature space. In some problems, however, the features cannot all be known in advance, and both the feature space and the label space are incremental or dynamic, so traditional feature selection algorithms no longer apply. To address this, the paper combines a sliding window mechanism with fuzzy mutual information and proposes a multi-label streaming feature selection method. In addition, to weaken the influence of mutual information on the judgment of feature importance, the fuzzy mutual information is regularized, and the feature-importance objective function is re-optimized through this regularization. The proposed algorithm was tested extensively on multi-label data sets; the experimental results and statistical hypothesis tests indicate that it is effective.
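The abstract names its two ingredients but gives no formulas, so the following Python sketch is only an illustration under stated assumptions, not the authors' algorithm. It assumes one common definition of fuzzy mutual information built from fuzzy similarity relations (r_ij = 1 - |v_i - v_j| on values scaled to [0, 1], with fuzzy entropy -1/n Σ log2(|[x_i]|/n) and element-wise minimum as intersection). The window rule in `select_streaming`, the `min_score` parameter, and all function names are hypothetical, and the regularization step mentioned in the abstract is not modeled.

```python
import math
from collections import deque
from typing import Iterable, List, Sequence

Relation = List[List[float]]


def fuzzy_relation(values: Sequence[float]) -> Relation:
    """Assumed fuzzy similarity: r_ij = 1 - |v_i - v_j|, values in [0, 1]."""
    return [[1.0 - abs(a - b) for b in values] for a in values]


def fuzzy_entropy(rel: Relation) -> float:
    """Fuzzy entropy: -1/n * sum_i log2(|[x_i]| / n), |[x_i]| = row sum."""
    n = len(rel)
    return -sum(math.log2(sum(row) / n) for row in rel) / n


def fuzzy_mutual_information(x: Sequence[float], y: Sequence[float]) -> float:
    """FMI(x; y) = H(R_x) + H(R_y) - H(R_x ∩ R_y), intersection = min."""
    r, s = fuzzy_relation(x), fuzzy_relation(y)
    joint = [[min(a, b) for a, b in zip(ri, si)] for ri, si in zip(r, s)]
    return fuzzy_entropy(r) + fuzzy_entropy(s) - fuzzy_entropy(joint)


def relevance(feature: Sequence[float], labels: Sequence[Sequence[float]]) -> float:
    """Mean FMI between one feature column and every label column."""
    return sum(fuzzy_mutual_information(feature, lab) for lab in labels) / len(labels)


def select_streaming(
    feature_stream: Iterable[Sequence[float]],
    labels: Sequence[Sequence[float]],
    window_size: int = 3,
    min_score: float = 1e-9,
) -> List[int]:
    """Hypothetical sliding-window rule: a feature leaving the window is kept
    if its score exceeds min_score and is at least the window's mean score."""
    window: deque = deque()  # holds (feature index, relevance score) pairs
    selected: List[int] = []

    def evict() -> None:
        mean_score = sum(score for _, score in window) / len(window)
        idx, score = window.popleft()
        if score > min_score and score >= mean_score:
            selected.append(idx)

    for idx, feature in enumerate(feature_stream):
        window.append((idx, relevance(feature, labels)))
        if len(window) > window_size:
            evict()  # the oldest feature slides out of the window
    while window:  # the stream has ended: flush the remaining features
        evict()
    return selected


labels = [[0.0, 0.0, 1.0, 1.0]]
stream = [
    [0.0, 0.0, 1.0, 1.0],  # identical to the label column
    [0.5, 0.5, 0.5, 0.5],  # constant, carries no information
    [0.0, 1.0, 0.0, 1.0],  # independent of the label column
]
chosen = select_streaming(stream, labels, window_size=2)
```

Under these assumptions a feature arrives, is scored against the labels once, and is judged only against the features that share its window, so the full feature space never has to be known in advance.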
