Probabilistic consensus clustering using evidence accumulation
  • Authors: André Lourenço ; Samuel Rota Bulò ; Nicola Rebagliati ; Ana L. N. Fred…
  • Keywords: Consensus clustering ; Evidence Accumulation ; Ensemble clustering ; Bregman divergence
  • Journal: Machine Learning
  • Publication year: 2015
  • Publication date: January 2015
  • Volume: 98
  • Issue: 1-2
  • Pages: 331-357
  • Full-text size: 1,806 KB
  • Journal category: Computer Science
  • Journal subjects: Artificial Intelligence and Robotics
    Automation and Robotics
    Computing Methodologies
    Simulation and Modeling
    Language Translation and Linguistics
  • Publisher: Springer Netherlands
  • ISSN: 1573-0565
Abstract
Clustering ensemble methods produce a consensus partition of a set of data points by combining the results of a collection of base clustering algorithms. In the evidence accumulation clustering (EAC) paradigm, the clustering ensemble is transformed into a pairwise co-association matrix, thus avoiding the label correspondence problem, which is intrinsic to other clustering ensemble schemes. In this paper, we propose a consensus clustering approach based on the EAC paradigm, which is not limited to crisp partitions and fully exploits the nature of the co-association matrix. Our solution determines probabilistic assignments of data points to clusters by minimizing a Bregman divergence between the observed co-association frequencies and the corresponding co-occurrence probabilities expressed as functions of the unknown assignments. We additionally propose an optimization algorithm to find a solution under any double-convex Bregman divergence. Experiments on both synthetic and real benchmark data show the effectiveness of the proposed approach.
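
As a rough illustration of the two ingredients named in the abstract, the sketch below builds a co-association matrix from a small ensemble of base clusterings and evaluates one possible fit criterion for a candidate probabilistic assignment. It is not the authors' algorithm: the function names, the toy ensemble, the choice of the squared Euclidean distance (one particular Bregman divergence), and the modeling of the co-occurrence probability of points i and j as the dot product of their assignment vectors are all assumptions made for this example.

```python
import numpy as np


def co_association(labelings):
    """Fraction of base clusterings in which each pair of points shares a cluster.

    `labelings` has shape (n_clusterings, n_points). Only label equality within
    a clustering is used, so no label correspondence across clusterings is needed.
    """
    labelings = np.asarray(labelings)
    same = labelings[:, :, None] == labelings[:, None, :]
    return same.mean(axis=0)


def squared_error_objective(C, Y):
    """Sum over pairs i < j of (C[i, j] - y_i . y_j) ** 2.

    Assumption for this sketch: each row of Y is a probability vector over
    clusters, and y_i . y_j models the probability that i and j co-occur.
    """
    P = Y @ Y.T
    iu = np.triu_indices_from(C, k=1)
    return float(np.sum((C[iu] - P[iu]) ** 2))


# Hypothetical ensemble of three base clusterings over four points.
ensemble = [
    [0, 0, 1, 1],
    [1, 1, 0, 0],  # same partition as the first, with labels permuted
    [0, 0, 0, 1],
]
C = co_association(ensemble)

# A crisp candidate assignment: points 0, 1 in one cluster and 2, 3 in the other.
Y = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
objective = squared_error_objective(C, Y)
```

The first two base clusterings describe the same partition under different labels, yet they contribute identically to C, which is how the pairwise representation sidesteps the label correspondence problem. Minimizing the objective over rows of Y constrained to the probability simplex would be the optimization step, which this sketch leaves out.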
