用户名: 密码: 验证码:
Fixed-size ensemble classifier system evolutionarily adapted to a recurring context with an unlimited pool of classifiers
详细信息    查看全文
  • 作者:Konrad Jackowski (1)
  • 关键词:Concept drift ; Multiple classifier systems
  • 刊名:Pattern Analysis & Applications
  • 出版年:2014
  • 出版时间:November 2014
  • 年:2014
  • 卷:17
  • 期:4
  • 页码:709-724
  • 全文大小:3,157 KB
  • 参考文献:1. Bifet A (2009) Adaptive learning and mining for data streams and frequent patterns. PhD thesis, Universitat Politecnica de Catalunya
    2. Widmer G, Kubat M (1996) Learning in the presence of concept drift and hidden contexts. Mach Learn 23(1): 69鈥?01
    3. Zliobait藱e I (2010) Adaptive training set formation. PhD thesis, Vilnius University, Lithuania
    4. Hilas C (2009) Designing an expert system for fraud detection in private telecommunications networks. Expert Syst Appl 36(9):11559鈥?1569 CrossRef
    5. Black M, Hickey R (2002) Classification of customer call data in the presence of concept drift and noise, In: Soft-Ware 2002, Proceedings of the 1st International Conference on Computing in an Imperfect World, Springer, Berlin, pp 74鈥?7
    6. Delany SJ, Cunningham P, Tsymbal A (2005) A comparison of ensemble and case-based maintenance techniques for handling concept drift in spam filtering. Technical Report TCD-CS-2005-19, Trinity College Dublin
    7. Cunningham P, Nowlan N (2003) A case-based approach to spam filtering that can track concept drift. In: ICCBR-2003 Workshop on Long-Lived CBR Systems. Springer, London, pp 3鈥?6
    8. Kelly MG, Hand DJ, Adams NM (1999) The impact of changing populations on classifier performance. In: Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining-KDD鈥?9, ACM Press, New York, USA, pp 367鈥?71
    9. Markou M, Singh S (2003) Novelty detection: a review鈥攑art 1: statistical approaches. Signal Process 83:2481鈥?497 CrossRef
    10. Widmer G, Kubat M (1993) Effective learning in dynamic environments by explicit context tracking. Proceedings of the 6th European Conference on Machine Learning ECML-1993, Springer. Lecture Notes Comput Sci 667:227鈥?43
    11. Kuncheva LI (2008) Classifier ensembles for detecting concept change in streaming data: overview and perspectives. In: Proceedings of the 2nd Workshop SUEMA, ECAI 2008, Patras, Greece, pp 5鈥?
    12. Mak L-O, Krause P (2006) Detection and management of concept drift, machine learning and cybernetics. In: International Conference on, 2006, pp 3486鈥?491
    13. Ouyang Z, Zhou M, Wang T, Wu Q (2009) Mining concept-drifting and noisy data streams using ensemble classifiers, artificial intelligence and computational intelligence, 2009. AICI鈥?9. International Conference, vol 4, pp 360鈥?64
    14. Tsymbal A (2004) The problem of concept drift: definitions and related work, Technical Report TCD-CS-2004-15. Department of Computer Science, Trinity College Dublin, Ireland
    15. Klinkenberg R, Renz I (1998) Adaptive information filtering: Learning in the presence of concept drifts. In: Learning for text categorization. AAAI Press, Marina del Rey, pp 33鈥?0
    16. Klinkenberg R (2004) Learning drifting concepts: example selection vs. example weighting. Intell data anal 8:281鈥?00
    17. Chen S, Wang H, Zhou S, Yu P (2008) Stop chasing trends: Discovering high order models in evolving data. In: Proceedings of the 24th International Conference on Data Engineering, 2008, pp 923鈥?32
    18. Kuncheva LI (2004) Classifier ensembles for changing environments. In: 5th International Workshop on Multiple Classifier Systems, MCS 04, LNCS, vol. 3077, Springer, Berlin, pp 1鈥?5
    19. Kuncheva LI (2004) Combining pattern classifiers. Methods and algorithms. Wiley, New York CrossRef
    20. Littlestone N, Warmuth MK (1994) The weighted majority algorithm. Inf Comput 108:212鈥?61 CrossRef
    21. Schlimmer J, Granger R (1986) Incremental learning from noisy data. Mach Learn 1(3):317鈥?54
    22. Aha DW, Kibler D, Albert MK (1991) Instance-based learning algorithms. Mach Learn 6(1):37鈥?6
    23. Ouyang Z, Gao Y, Zhao Z, Wang T (2011) Study on the classification of data streams with concept drift. In: 2011 Eighth International Conference on Fuzzy Systems and Knowledge Discovery (FSKD), pp 1673鈥?677
    24. Domingos P, Hulten G (2003) A general framework for mining massive data streams. J Comput Graph Stat 12:945鈥?49 CrossRef
    25. Polikar R, Udpa L, Udpa SS, Honavar V (2001) Learn++: An incremental learning algorithm for supervised neural networks. IEEE Trans Syst Man Cybern Part C Appl Rev 31(4): 497鈥?08
    26. Hulten G, Spencer L, Domingos P (2001) Mining time-changing data streams. In 7th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM Press, San Francisco, pp 97鈥?06
    27. Bifet A, Holmes G, Pfahringer B, Read J, Kranen P, Kremer Hardy, Jansen H, Seidl T (2011) MOA: a real-time analytics open source framework. ECML/PKDD 3:617鈥?20
    28. Bifet A, Holmes G, Pfahringer B, Kirkby R, Gavalda R (2009) New ensemble methods for evolving data streams. In: KDD鈥?9 Proceedings of the 15th ACM SIGKDD international conference on knowledge discovery and data mining, ACM, pp 139鈥?48
    29. Chu F, Zaniolo C (20074) Fast and light boosting for adaptive mining of data streams. In: Dai H, Srikant R, Zhang C (eds) PAKDD. Springer, Berlin,*** pp 282鈥?92
    30. Bifet A, Gavalda R (2006) Learning from time-changing data with adaptive windowing. Technical report, Universitat Politecnica de Catalunya, 2006. (http://www.lsi.upc.edu/~abifet)
    31. Lazarescu M, Venkatesh S, Bui H (2003) Using multiple windows to track concept drift. Technical report. Faculty of Computer Science, Curtin University
    32. Kurlej B, Wo藕niak M (2011) Learning Curve in Concept Drift While Using Active Learning Paradigm. In: Bouchachia A (ed) Adaptive and Intelligent Systems. Springer Berlin Heidelberg, pp 98鈥?06
    33. Koychev I (2000) Gradual forgetting for adaptation to concept drift. In: Proceedings of ECAI 2000 Workshop Current Issues in Spatio-Temporal Reasoning, pp 101鈥?06
    34. Stanley K (2003) Learning concept drift with a committee of decision trees. Technical Report UT-AI-TR-03-302, Computer Sciences Department, University of Texas
    35. Tsymbal A, Pechenizkiy M, Cunningham P, Puuronen S (2008) Dynamic integration of classifiers for handling concept drift. Inf Fusion 1:56鈥?8 CrossRef
    36. Shipp CA, Kuncheva LI (2002) Relationships between combination methods and measures of diversity in combining classifiers. Inf Fusion 3(2):135鈥?48 CrossRef
    37. Littlestone N, Warmuth M (1994) The weighted majority algorithm. Inf Comput 108(2):212鈥?61 CrossRef
    38. Jacobs RA, Jordan MI, Nowlan SJ, Hinton GE (1991) Adaptive mixtures of local experts. Neural Comput 3:79鈥?7 CrossRef
    39. Nikunj CO (2000) Online ensemble learning. In: AAAI/IAAI. AAAI Press/The MIT Press, USA
    40. Rodriguez J, Kuncheva L (2008) Combining online classification approaches for changing environments. In: SSPR/SPR, vol 5342 of LNCS, Springer, Berlin, pp 520鈥?29
    41. Street N, Kim Y (2001) A streaming ensemble algorithm (sea) for large scale classification. In: KDD鈥?1 Proceedings of the 7th ACM SIGKDD international conference on knowledge discovery and data mining, ACM, pp 377鈥?82
    42. Wang H, Fan W, Yu PS, Han J (2003) Mining concept-drifting data streams using ensemble classifiers. In: Getoor L, Senator TE, Domingos P, Faloutsos C (eds) KDD, ACM, pp 226鈥?35
    43. Kolter J, Maloof M (2003) Dynamic weighted majority: a new ensemble method for tracking concept drift. In: ICDM, IEEE, pp 123鈥?30
    44. Zliobaite I (2009) Learning under concept drift: an overview. Technical report, Vilnius University, Faculty of Mathematics and Informatics
    45. Gaber MM, Yu PS (2006) Classification of changes in evolving data streams using online clustering result deviation. In: Third international workshop on knowledge discovery in data streams, Pittsburgh, PA, USA
    46. Salganicoff M (1993) Density-adaptive learning and forgetting. In: Proceedings of the 10th International Conference on Machine Learning, pp 276鈥?83
    47. Klinkenberg R, Joachims T (2000) Detecting concept drift with support vector machines. In: Proceedings of the 17th international conference on machine learning (ICML), San Francisco, CA, USA, pp 487鈥?94
    48. Baena-Garc谋a M, Campo-Avila J, Fidalgo R, Bifet A, Gavalda R, Morales-Bueno R (2006) Early drift detection method. In: 4th international workshop on knowledge discovery from data streams, pp 77鈥?6
    49. Kurlej B, Wozniak M (2012) Active learning approach to concept drift problem. Log J IGPL 20:550鈥?59 CrossRef
    50. Ramamurthy S, Bhatnagar R (2007) Tracking recurrent concept drift in streaming data using ensemble classifiers. In: Proceedings of the 6th international conference on machine learning and applications, pp 404鈥?09
    51. Sasthakumar R, Raj B (2007) Tracking recurrent concept drift in streaming data using ensemble classifiers. In: ICMLA鈥?7 Proceedings of the 6th international conference on machine learning and applications, IEEE Computer Society, Washington, DC, pp 404鈥?09
    52. Turney P (1993) Exploiting context when learning to classify. In: Proceedings of the European conference on machine learning (ECML-93), pp 402鈥?07
    53. Widmer G (1997) Tracking context changes through meta-learning. Mach Learn 27(3):59鈥?86 CrossRef
    54. Gomes JB, Ruiz EM, Sousa PAC (2011) Learning recurring concepts from data streams with a context-aware ensemble. In: Proceedings of SAC, 2011, pp 994鈥?99
    55. Katakis I, Tsoumakas G, Vlahavas I (2010) Tracking recurring contexts using ensemble classifiers: an application to email filtering. Knowl Inf Syst 22(3):371鈥?91 CrossRef
    56. Hosseini MJ, Ahmadi Z, Beigy H (2011) Pool and accuracy based stream classification: a new ensemble algorithm on data stream classification using recurring concepts detection. In: 2011 IEEE 11th International Conference on Data Mining Workshops, pp 588鈥?95
    57. B盲ck T, Fogel D, Michalewicz Z (1997) Handbook of evolutionary computation. Oxford University Press, Oxford CrossRef
    58. Jackowski K, Wozniak M (2010) Method of classifier selection using the genetic approach. Expert Syst 27(2):114鈥?28 CrossRef
    59. Alpaydin E (2004) Introduction to machine learning. The MIT Press, Cambridge
    60. Duin RPW, Juszczak P, Paclik P, Pekalska E, de Ridder D, Tax DMJ (2004) PRTools4, a Matlab toolbox for pattern recognition, Delft University of Technology, The Netherlands
    61. UCI Machine Learning Repository. http://www.archive.ics.uci.edu/ml/
    62. Harries M (1999) Splice-2 comparative evaluation: electricity pricing, Technical report, The University of South Wales, Australia
    63. Bifet A, Gavald谩 R (2009) Adaptive learning from evolving data streams in IDA 2009
    64. Dietterich TG (1998) Approximate statistical tests for comparing supervised classification learning algorithms. Neural Comput 10(7):1895鈥?924 CrossRef
  • 作者单位:Konrad Jackowski (1)

    1. Department of Systems and Computer Networks, Wroclaw University of Technology, Wybrzeze Wyspianskiego 27, 50-370, Wroclaw, Poland
  • ISSN:1433-755X
文摘
This paper presents a novel ensemble classifier system designed to process data streams featuring occasional changes in their characteristics (concept drift). The ensemble is especially effective when the concepts reappear (recurring context). The system collects information on emerging contexts in a pool of elementary classifiers trained on subsequent data chunks. The pool is updated only when concept drift is detected. In contrast to other ensemble solutions, classifiers are not removed from the pool, and therefore, knowledge of past contexts is preserved for future use. To ensure high classification performance, the number of classifiers contributing to decision-making is fixed and limited. Only selected elements from the pool can join the decision-making ensemble. The process of selecting classifiers and adjusting their weights is realized by an evolutionary-based optimization algorithm that aims to minimize the system misclassification rate. Performance of the system is evaluated through a series of experiments presenting some key features of the system.

© 2004-2018 中国地质图书馆版权所有 京ICP备05064691号 京公网安备11010802017129号

地址:北京市海淀区学院路29号 邮编:100083

电话:办公室:(+86 10)66554848;文献借阅、咨询服务、科技查新:66554700