Learning mid-perpendicular hyperplane similarity from cannot-link constraints
Abstract
Pairwise constraints, known as must-link and cannot-link constraints, are frequently used in semi-supervised clustering. In this paper, we propose a novel use of cannot-link constraints and develop a method called Mid-Perpendicular Hyperplane Similarity (MPHS) for semi-supervised clustering. Because a cannot-link constraint states that the two objects it links do not belong to the same class, the mid-perpendicular hyperplane of the two objects can be used to separate them. For each cannot-link constraint, we first compute the corresponding mid-perpendicular hyperplane and then use the distances of objects to this hyperplane to learn a new data representation and a similarity matrix. Finally, we combine the similarity matrices from all cannot-link constraints into a single similarity matrix and perform kernel k-means on it to obtain the partition. We implement MPHS for two cases: a simple one performed in the original input space, for data sets that are nearly linearly separable, and an advanced one in a kernel-induced feature space, for data sets that are complex and not linearly separable. Experimental results on several UCI data sets and some image data sets show the effectiveness of our method.
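
The input-space case can be illustrated with a minimal sketch. Only the hyperplane geometry is certain here: for a cannot-link pair (a, b), the mid-perpendicular hyperplane has normal a - b and passes through the midpoint (a + b) / 2. The abstract does not say how distances are turned into similarities or how the matrices are combined, so the Gaussian similarity on signed distances, the `gamma` parameter, the averaging step, and the function names `signed_distances` and `combined_similarity` are all assumptions made for illustration, not the paper's definitions.

```python
import numpy as np

def signed_distances(X, a, b):
    """Signed distance of each row of X to the mid-perpendicular
    hyperplane of points a and b (normal a - b, through the midpoint)."""
    w = a - b                      # hyperplane normal
    m = (a + b) / 2.0              # a point on the hyperplane
    return (X - m) @ w / np.linalg.norm(w)

def combined_similarity(X, cannot_links, gamma=1.0):
    """Average one similarity matrix per cannot-link constraint.
    ASSUMPTION: Gaussian similarity on the 1-D signed-distance
    representation, combined by a plain mean; the paper may differ."""
    n = X.shape[0]
    S = np.zeros((n, n))
    for i, j in cannot_links:
        d = signed_distances(X, X[i], X[j])   # new 1-D representation
        diff = d[:, None] - d[None, :]        # pairwise differences
        S += np.exp(-gamma * diff ** 2)
    return S / len(cannot_links)

# Toy data: objects 0 and 1 are joined by a cannot-link constraint.
X = np.array([[0.0, 0.0], [2.0, 0.0], [0.1, 1.0], [1.9, -1.0]])
S = combined_similarity(X, [(0, 1)])
```

In this toy example, objects on the same side of the hyperplane (0 and 2) receive a higher similarity than the cannot-linked pair (0 and 1). The resulting matrix `S` could then be passed to a kernel k-means routine, which is not shown here.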
