Semi-supervised hierarchical clustering ensemble and its application
Abstract
Clustering ensembles are an important part of ensemble learning. They integrate multiple clustering results for the same dataset, obtained either from different clustering algorithms or from the same algorithm with different initial parameters. CHAMELEON is a hierarchical clustering algorithm that can discover natural clusters of different shapes and sizes, because its merging decisions adapt dynamically to the characteristics of the clusters under consideration. Inspired by the idea of CHAMELEON, the paper proposes a novel clustering ensemble model, including a semi-supervised method, and discusses its application to fault diagnosis of high-speed train (HST) running gear. The contributions of this paper include: constructing a sparse graph from the similarity matrix that aggregates multiple clustering results; partitioning the sparse graph (vertex = object, edge weight = similarity) into a large number of relatively small sub-clusters; and obtaining the final clustering partition by repeatedly merging these sub-clusters. The experimental results demonstrate that the method outperforms several state-of-the-art ensemble algorithms in accuracy and stability, and that it recognizes fault patterns of HST running gear effectively.
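The three steps listed in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and every function name, parameter, and the toy data are assumptions made for illustration. The similarity matrix is taken to be a co-association matrix (the fraction of base clusterings that place two objects together). The graph partitioning step is replaced by a simple stand-in (connected components over strong edges), and the merging step uses plain average similarity rather than CHAMELEON's relative interconnectivity and closeness. The semi-supervised part is not covered.

```python
from itertools import combinations

def co_association(partitions):
    """Similarity = fraction of base clusterings that co-cluster objects i and j."""
    n, m = len(partitions[0]), len(partitions)
    return [[sum(p[i] == p[j] for p in partitions) / m for j in range(n)]
            for i in range(n)]

def knn_graph(sim, k):
    """Sparse graph: keep each vertex's k most similar neighbours (symmetrized)."""
    n, edges = len(sim), {}
    for i in range(n):
        nbrs = sorted((j for j in range(n) if j != i), key=lambda j: -sim[i][j])[:k]
        for j in nbrs:
            if sim[i][j] > 0:
                edges[(min(i, j), max(i, j))] = sim[i][j]
    return edges

def sub_clusters(n, edges, threshold):
    """Stand-in partitioner (assumption): components over edges >= threshold."""
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x
    for (i, j), w in edges.items():
        if w >= threshold:
            parent[find(i)] = find(j)
    groups = {}
    for v in range(n):
        groups.setdefault(find(v), []).append(v)
    return list(groups.values())

def merge(clusters, sim, k):
    """Repeatedly merge the pair with the highest average similarity until k remain."""
    clusters = [list(c) for c in clusters]
    def link(a, b):
        return sum(sim[i][j] for i in a for j in b) / (len(a) * len(b))
    while len(clusters) > k:
        a, b = max(combinations(range(len(clusters)), 2),
                   key=lambda p: link(clusters[p[0]], clusters[p[1]]))
        clusters[a].extend(clusters[b])  # a < b, so deleting b keeps a valid
        del clusters[b]
    return clusters

# Toy ensemble: three base clusterings of six objects (hypothetical labels).
partitions = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 2, 2, 2],
    [0, 0, 0, 0, 1, 1],
]
sim = co_association(partitions)
edges = knn_graph(sim, k=2)
small = sub_clusters(len(sim), edges, threshold=0.9)
final = merge(small, sim, k=2)
print(small, final)
```

In this toy run the strong edges yield four small sub-clusters, which the merging loop then combines into two final clusters. A real pipeline would likely use a dedicated graph partitioner and a merging criterion closer to CHAMELEON's.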
