Random Balance Sampling Classification Algorithm for Concept-Drifting Imbalanced Data Streams
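  • English title: Mining concept-drifting imbalanced streams using random sampling algorithm
  • Authors: YUAN Lei (袁磊); JI Mengyao (季梦遥)
  • Authors (English): YUAN Lei; JI Mengyao; Department of Information Center, Renmin Hospital of Wuhan University; Department of Gastroenterology, Renmin Hospital of Wuhan University
  • Keywords: concept drift; imbalanced stream data; sampling; ensemble classifier
  • English keywords: imbalanced data; imbalanced streams; sampling; ensemble classifier
  • Chinese journal title: HDZK
  • English journal title: Journal of Hubei University (Natural Science)
  • Institutions: Information Center, Renmin Hospital of Wuhan University; Department of Gastroenterology, Renmin Hospital of Wuhan University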
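  • Publication date: 2019-01-05
  • Publisher: Journal of Hubei University (Natural Science) (湖北大学学报(自然科学版))
  • Year: 2019
  • Issue: v.41; No.153
  • Funding: National Natural Science Foundation of China (61401263); Natural Science Foundation of Hubei Province (2018CFB136); Fundamental Research Funds for the Central Universities (302-410500195); Wuhan University Independent Research Project (302-410500195, 302-410500195)
  • Language: Chinese
  • Page: HDZK201901018
  • Number of pages: 6
  • CN: 01
  • ISSN: 42-1212/N
  • Classification number: 100-105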
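Abstract
Classifying imbalanced data streams with concept drift is both a difficult and an actively studied problem in machine learning and in real-world applications. To address the dynamic and imbalanced nature of such streams, this paper proposes a random balance sampling algorithm for rebalancing imbalanced data streams. Building on that algorithm, it then proposes a new ensemble classification algorithm for imbalanced data streams with concept drift, designed to withstand both the drift and the imbalance. Theoretical analysis and experiments indicate that the proposed ensemble classification algorithm has strong diversity and generalization ability when handling imbalanced data streams with concept drift.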
Mining imbalanced data streams with concept drift has become an important and challenging task in machine learning and real-world applications. In this paper, we propose a new data sampling algorithm, called the random balance sampling algorithm (RBS), to counter the imbalance in the data stream. A new ensemble classifier, called the random balance sampling concept-drifting imbalanced streaming ensemble algorithm (RBSCISE), is then built for the imbalanced, concept-drifting streaming scenario. Theoretical and empirical studies show that the new ensemble classifier is superior and more robust on concept-drifting imbalanced data streams.