Centroid-Based Classification of Categorical Data
  • Authors: Lifei Chen (20)
    Gongde Guo (20)
  • Keywords: Categorical data; classification; centroid; weighted distance
  • Series: Lecture Notes in Computer Science
  • Year: 2014
  • Volume: 8485
  • Issue: 1
  • Pages: 472-475
  • References: 1. Han, E.-H(S.), Karypis, G.: Centroid-based document classification: Analysis and experimental results. In: Zighed, D.A., Komorowski, J., Żytkow, J.M. (eds.) PKDD 2000. LNCS (LNAI), vol. 1910, pp. 424–431. Springer, Heidelberg (2000)
    2. Chen, L., Ye, Y., Jiang, Q.: New centroid-based classifier for text categorization. In: Proceedings of the AINAW, pp. 1217–1222 (2008)
    3. Sen, P.: Gini diversity index, Hamming distance and curse of dimensionality. Metron - International Journal of Statistics LXIII(3), 329–349 (2005)
    4. Weinberger, K., Saul, L.: Distance Metric Learning for Large Margin Nearest Neighbor Classification. Journal of Machine Learning Research 10, 207–244 (2009)
    5. Peng, H., Long, F., Ding, C.: Feature selection based on mutual information: Criteria of max-dependency, max-relevance, and min-redundancy. IEEE Transactions on Pattern Analysis and Machine Intelligence 27, 1226–1238 (2005)
    6. Hall, M., Frank, E., et al.: The WEKA data mining software: An update. SIGKDD Explorations 11 (2009)
  • Affiliations: Lifei Chen (20)
    Gongde Guo (20)

    20. School of Mathematics and Computer Science, Fujian Normal University, China
  • ISSN:1611-3349
Abstract
Traditional centroid-based classifiers cannot be applied directly to categorical data for two reasons: the centroid of a categorical class is undefined, and there is no effective distance measure for categorical objects. In this paper, two centroid-based classifiers are proposed for categorical data classification. A new formulation of the centroid of a categorical class addresses the first problem, and two weighted distance measures are defined to address the second. Experiments on real-world data sets show the effectiveness of the proposed methods.
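
The abstract does not give the paper's centroid formulation or its two weighted distance measures, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes the "centroid" of a class is, per attribute, the vector of relative category frequencies, and that the distance is a weighted squared Euclidean distance between an object's one-hot coding and that frequency vector. The function names (`fit_centroids`, `distance`, `predict`), the uniform weights, and the toy data are all hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict


def fit_centroids(rows, labels):
    """Return {label: [dict of category -> relative frequency, one per attribute]}.

    Assumption: a frequency vector per attribute stands in for the class centroid.
    """
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)
    centroids = {}
    for label, members in grouped.items():
        n = len(members)
        centroids[label] = [
            {cat: count / n for cat, count in Counter(column).items()}
            for column in zip(*members)  # iterate attribute by attribute
        ]
    return centroids


def distance(row, centroid, weights):
    """Weighted squared distance between a row and a frequency-vector centroid.

    For one attribute with one-hot vector x and frequency vector f,
    ||x - f||^2 = sum(f_c^2) - 2 * f_value + 1.
    """
    total = 0.0
    for value, freqs, weight in zip(row, centroid, weights):
        sum_sq = sum(f * f for f in freqs.values())
        total += weight * (sum_sq - 2.0 * freqs.get(value, 0.0) + 1.0)
    return total


def predict(row, centroids, weights):
    """Assign the row to the class whose centroid is nearest."""
    return min(centroids, key=lambda label: distance(row, centroids[label], weights))


# Toy data (hypothetical): two attributes, two classes.
rows = [("red", "small"), ("red", "large"), ("blue", "small"), ("blue", "large")]
labels = ["a", "a", "b", "b"]
weights = [1.0, 1.0]  # assumption: uniform; the paper defines its own weighting

centroids = fit_centroids(rows, labels)
print(predict(("red", "small"), centroids, weights))
```

In this sketch an attribute weight simply scales that attribute's contribution to the distance; how the weights are actually learned or derived is the subject of the paper and is not reproduced here.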