Self-adaptive and local strategies for a smooth treatment of drifts in data streams
  • Authors: Ammar Shaker (1)
    Edwin Lughofer (2)
  • Keywords: Data stream learning; Adaptive local and global drift handling; Drift and forgetting intensity; Component divergence; Early drift recognition; Evolving granular models
  • Journal: Evolving Systems
  • Publication year: 2014
  • Publication date: December 2014
  • Volume: 5
  • Issue: 4
  • Pages: 239-257
  • Author affiliations: Ammar Shaker (1)
    Edwin Lughofer (2)

    1. Department of Mathematics and Computer Science, Philipps-University Marburg, Marburg, Germany
    2. Department of Knowledge-based Mathematical Systems, Johannes Kepler University of Linz, Linz, Austria
  • ISSN: 1868-6486
Abstract
This paper presents a new concept for handling drifts in data streams during on-line, evolving modeling processes in a regression context. Drifts require specific attention in evolving modeling methods, as they usually change the underlying data distribution, making previously learnt model parameters and structure outdated. Our approach comprises three new stages for appropriate drift handling: (1) drifts are not only detected but also quantified with a new extended version of the Page-Hinkley test; (2) we integrate an adaptive forgetting factor that changes over time and steers the degree of forgetting according to the current drift intensity in the data stream; (3) we introduce local forgetting factors that address different local regions of the feature space with different forgetting intensities; this is achieved by using a fuzzy model architecture within stream learning whose structural components (fuzzy rules) provide a local partitioning of the feature space and furthermore ensure smooth transitions of the drift handling topology between neighboring regions. Additionally, our approach includes an early drift recognition variant that relies on divergence measures, which indicate the degree of divergence in local parts of the feature space separately, before the global model error starts to rise significantly. It can thus be seen as an attempt at drift prevention at the global model level. The new approach is successfully evaluated and compared with fixed forgetting and no forgetting on high-dimensional real-world data streams that include different types of drifts.
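
As a rough illustration of stages (1) and (2), the sketch below implements the standard Page-Hinkley test, not the extended version proposed in the paper, whose details are not given in this record. It then maps the test statistic to a forgetting factor. The class name `PageHinkley`, the function `forgetting_factor`, the linear mapping, and all parameter values are assumptions made for this example only.

```python
class PageHinkley:
    """Standard Page-Hinkley test for an upward shift in the mean of a signal.

    Illustrative only: the paper uses an extended variant that is not
    reproduced here.
    """

    def __init__(self, delta=0.005, threshold=5.0):
        self.delta = delta          # tolerated magnitude of change (assumed value)
        self.threshold = threshold  # alarm threshold (assumed value)
        self.n = 0                  # number of values seen so far
        self.mean = 0.0             # running mean of the monitored signal
        self.cum = 0.0              # cumulative deviation m_t
        self.cum_min = 0.0          # running minimum M_t of m_t

    def update(self, x):
        """Feed one value (e.g. an absolute model error); return PH_t = m_t - M_t."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.cum_min = min(self.cum_min, self.cum)
        return self.cum - self.cum_min

    def drift_detected(self, ph):
        """Signal a drift when the test statistic exceeds the threshold."""
        return ph > self.threshold


def forgetting_factor(ph, threshold, lam_min=0.9, lam_max=1.0):
    """Hypothetical linear mapping from drift intensity to a forgetting factor.

    A statistic of 0 yields lam_max (no forgetting); a statistic at or above
    the threshold yields lam_min (strongest forgetting).
    """
    intensity = min(max(ph / threshold, 0.0), 1.0)
    return lam_max - intensity * (lam_max - lam_min)


if __name__ == "__main__":
    detector = PageHinkley()
    errors = [0.0] * 200 + [1.0] * 20  # synthetic error stream with an abrupt shift
    for t, err in enumerate(errors):
        ph = detector.update(err)
        if detector.drift_detected(ph):
            lam = forgetting_factor(ph, detector.threshold)
            print(f"drift flagged at sample {t}, forgetting factor {lam:.3f}")
            break
```

In this sketch a single global statistic drives a single forgetting factor. The local variant described in stage (3) would instead maintain one such factor per fuzzy rule, which is beyond what this example attempts.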
