Abstract
Classification is an important data mining problem. Emerging Patterns (EPs) are itemsets whose supports change significantly from one data class to another. Previous studies have shown that classifiers based on EPs are competitive with other state-of-the-art classification systems. In this paper, we propose a new type of Emerging Pattern, called Maximal Emerging Patterns (MaxEPs), which are the longest EPs satisfying certain constraints. MaxEPs can be used to condense the vast amount of information, resulting in a significantly smaller set of high-quality patterns for classification. We also develop a new mechanism, based on overlap or intersection, to exploit the properties of MaxEPs. Our new classifier, Classification by Maximal Emerging Patterns (CMaxEP), combines the advantages of the Bayesian approach and EP-based classifiers. Experimental results on 36 benchmark datasets from the UCI machine learning repository demonstrate that our method achieves better overall classification accuracy than the JEP-classifier, CBA, C5.0 and NB.

Keywords: Emerging Patterns, classification, Bayesian learning, Maximal Emerging Patterns.
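
To make the notion of an itemset whose support "changes significantly from one data class to another" concrete, the following minimal Python sketch computes the support of an itemset in two classes and the ratio between them. It is an illustration only, not the CMaxEP algorithm: the growth-rate ratio follows the usual convention in the EP literature, which this abstract does not spell out, and the function names, the toy transactions and the `min_growth` threshold are all hypothetical.

```python
from typing import FrozenSet, List

Transaction = FrozenSet[str]


def support(itemset: FrozenSet[str], transactions: List[Transaction]) -> float:
    """Fraction of transactions that contain every item of `itemset`."""
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def growth_rate(itemset: FrozenSet[str],
                background: List[Transaction],
                target: List[Transaction]) -> float:
    """Ratio of support in `target` to support in `background`.

    Assumed convention: 0 if the itemset occurs in neither class,
    infinity if it occurs only in the target class.
    """
    s_bg = support(itemset, background)
    s_tg = support(itemset, target)
    if s_bg == 0.0:
        return float("inf") if s_tg > 0.0 else 0.0
    return s_tg / s_bg


def is_emerging(itemset: FrozenSet[str],
                background: List[Transaction],
                target: List[Transaction],
                min_growth: float = 2.0) -> bool:
    """True if support grows by at least `min_growth` (hypothetical threshold)."""
    return growth_rate(itemset, background, target) >= min_growth


# Toy data: two classes of four transactions each (illustrative only).
class_a = [frozenset(t) for t in ({"a", "b"}, {"a", "c"}, {"b", "c"}, {"c"})]
class_b = [frozenset(t) for t in ({"a", "b"}, {"a", "b", "c"}, {"a", "b"}, {"b"})]

pattern = frozenset({"a", "b"})
print(support(pattern, class_a))               # 0.25
print(support(pattern, class_b))               # 0.75
print(growth_rate(pattern, class_a, class_b))  # 3.0
print(is_emerging(pattern, class_a, class_b))  # True
```

In this toy example the itemset {a, b} appears in one of four transactions in the first class and three of four in the second, so its support triples between classes. A MaxEP, as described above, would additionally be required to be a longest such pattern under the paper's constraints, which this sketch does not attempt to model.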