A spectral clustering algorithm based on Canopy clustering

Authors: ZHOU Wei; XIAO Yang
Affiliation: School of Mechanical Engineering, Hubei University of Technology
Keywords: K-Means; spectral clustering; initialization sensitivity; Canopy
Journal code: JSJK
Journal: Computer Engineering & Science
Publication date: 2019-06-15
Publisher: Computer Engineering & Science
Year: 2019
Volume/Number: v.41; No.294
Funding: National Natural Science Foundation of China, Young Scientists Fund (51405144)
Language: Chinese
File name: JSJK201906019
Page count: 6
Issue: 06
CN: 43-1258/TP
Pages: 145-150

Abstract

Traditional spectral clustering is sensitive to initialization. To address this defect, we use the Canopy algorithm to perform a coarse clustering of the samples and obtain initial cluster centers, which then serve as the input to the K-Means algorithm. On this basis we propose a clustering algorithm that combines Canopy with spectral clustering (Canopy-SC), which reduces the blindness with which traditional spectral clustering selects its initial centers, and we apply it to face image clustering. Compared with traditional spectral clustering, Canopy-SC obtains better cluster centers and clustering results and achieves higher clustering accuracy. Experimental results demonstrate the effectiveness and feasibility of the algorithm.
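The pipeline outlined in the abstract can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the record does not specify the affinity function, the Laplacian variant, or where Canopy is applied, so the sketch assumes a Gaussian affinity, a symmetric normalized Laplacian with row-normalized eigenvectors, and a Canopy pass over the embedded rows whose canopy means seed K-Means. The function names (`canopy_centers`, `spectral_embedding`, `kmeans`, `canopy_sc`) and the parameters `sigma`, `t1`, and `t2` are hypothetical.

```python
import numpy as np

def canopy_centers(X, t1, t2, rng):
    """Coarse Canopy pass (assumes t1 > t2 > 0); returns one center per canopy."""
    remaining = list(rng.permutation(len(X)))
    centers = []
    while remaining:
        seed = X[remaining[0]]
        dist = np.linalg.norm(X[remaining] - seed, axis=1)
        # Points within the loose threshold t1 form this canopy;
        # its mean is used as the center (an assumption of this sketch).
        centers.append(X[remaining][dist < t1].mean(axis=0))
        # Points within the tight threshold t2 can no longer seed a canopy.
        remaining = [i for i, d in zip(remaining, dist) if d >= t2]
    return np.array(centers)

def spectral_embedding(X, k, sigma):
    """Rows of the k smallest eigenvectors of the normalized Laplacian."""
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2)
    W = np.exp(-sq / (2.0 * sigma ** 2))  # Gaussian affinity (assumed)
    np.fill_diagonal(W, 0.0)
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    L_sym = np.eye(len(X)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L_sym)  # eigenvalues in ascending order
    U = vecs[:, :k]
    return U / np.linalg.norm(U, axis=1, keepdims=True)

def kmeans(X, centers, n_iter=100):
    """Plain Lloyd iterations started from the supplied centers."""
    for _ in range(n_iter):
        dist = np.linalg.norm(X[:, None] - centers[None], axis=2)
        labels = np.argmin(dist, axis=1)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(len(centers))
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

def canopy_sc(X, k, sigma=1.0, t1=1.0, t2=0.5, seed=0):
    """Spectral embedding, then Canopy-seeded K-Means on the embedded rows."""
    rng = np.random.default_rng(seed)
    Y = spectral_embedding(X, k, sigma)
    init = canopy_centers(Y, t1, t2, rng)  # the canopy count sets the cluster count
    labels, _ = kmeans(Y, init)
    return labels

# Toy data: three well-separated 2-D blobs of 20 points each.
rng = np.random.default_rng(42)
blob_means = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X = np.vstack([m + 0.5 * rng.standard_normal((20, 2)) for m in blob_means])
labels = canopy_sc(X, k=3)
print(labels)
```

In this sketch the number of canopies determines how many clusters K-Means produces, so the thresholds `t1` and `t2` would need tuning on real data such as face images; the values above are chosen only to suit the toy blobs.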
