A Spectral Clustering Algorithm Based on SimRank Score (一种基于SimRank得分的谱聚类算法)
  • Title: Spectral Clustering Algorithm Based on SimRank Score
  • Authors: LI Peng-qing (李鹏清); LI Yang-ding (李扬定); DENG Xue-lian (邓雪莲); LI Yong-gang (李永钢); FANG Yue (方月)
  • Affiliations: Guangxi Key Lab of Multi-source Information Mining & Security, Guangxi Normal University; School of Public Health and Management, Guangxi University of Chinese Medicine
  • Keywords: spectral clustering; similarity matrix; SimRank score; adjacency matrix; Laplacian matrix; k-means clustering
  • Journal code: JSJA
  • Journal: Computer Science (计算机科学)
  • Publication date: 2018-11-15
  • Publisher: Computer Science (计算机科学)
  • Year: 2018
  • Volume: 45
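  • Funding: National Key R&D Program of China (2016YFB1000905); National Natural Science Foundation of China (61363009, 61672177, 61573270, 81701780); Guangxi Natural Science/Youth Foundation (2015GXNSFCB139011, 2017GXNSFBA198221); Open Fund of the Guangxi Key Lab of Multi-source Information Mining & Security (16-A-01-01, 16-A-01-02); Innovation Project of Guangxi Graduate Education (XYCSZ2017064, XYCSZ2017067, YCSW2017065); Guangxi Graduate Innovation Program (YCSW2018094)
  • Language: Chinese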
  • Article ID: JSJA2018S2095
  • Page count: 5
  • Issue: S2
  • CN: 50-1075/TP
  • Pages: 468-471+477
Abstract
Traditional spectral clustering algorithms consider only the pairwise distances between data points when building the similarity matrix, ignoring the implicit intrinsic relationships among the points. To address this problem, a spectral clustering algorithm based on SimRank is proposed. The algorithm first builds an adjacency matrix from undirected graph data and computes a SimRank-based similarity matrix. It then constructs the Laplacian matrix from the similarity matrix, normalizes it, and performs a spectral decomposition. Finally, k-means clustering is applied to the resulting eigenvectors to obtain the final clusters. Experiments on Zoo and other benchmark datasets from the UCI repository show that the proposed algorithm outperforms existing spectral clustering algorithms based on distance similarity, such as LRR (Low-Rank Representation), on three evaluation metrics: clustering accuracy, normalized mutual information, and purity.
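The pipeline outlined in the abstract (adjacency matrix, SimRank similarity, normalized Laplacian, spectral decomposition, k-means) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, the decay factor of 0.8, the fixed number of SimRank iterations, the choice of the symmetric normalization L_sym = I - D^{-1/2} S D^{-1/2}, the row normalization of the embedding, and the deterministic farthest-point k-means initialization are all assumptions made for illustration. How the undirected graph is built from a feature dataset such as Zoo is not stated in this record and is not sketched here.

```python
import numpy as np


def simrank_matrix(adj, decay=0.8, iters=10):
    """Iterative SimRank scores for an undirected graph (0/1 adjacency matrix)."""
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=0)
    # Column-normalise: W[i, a] = 1/deg(a) if i is a neighbour of a, else 0.
    W = adj / np.where(deg > 0, deg, 1.0)
    S = np.eye(adj.shape[0])
    for _ in range(iters):
        # s(a, b) = decay * mean of s(i, j) over neighbours i of a and j of b.
        S = decay * (W.T @ S @ W)
        np.fill_diagonal(S, 1.0)  # every node is maximally similar to itself
    return S


def spectral_embedding(S, k):
    """Row-normalised eigenvectors for the k smallest eigenvalues of L_sym."""
    d_inv_sqrt = 1.0 / np.sqrt(S.sum(axis=1))  # row sums are >= 1 (unit diagonal)
    # Assumed normalisation: L_sym = I - D^{-1/2} S D^{-1/2}.
    L = np.eye(S.shape[0]) - d_inv_sqrt[:, None] * S * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)  # eigenvalues are returned in ascending order
    X = vecs[:, :k]
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    return X / np.maximum(norms, 1e-12)


def kmeans(X, k, iters=100):
    """Plain Lloyd's k-means with a deterministic farthest-point start."""
    centers = [X[0]]
    for _ in range(1, k):
        d = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(iters):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


def simrank_spectral_clustering(adj, k, decay=0.8, iters=10):
    """Adjacency -> SimRank similarity -> normalised Laplacian -> k-means."""
    S = simrank_matrix(adj, decay=decay, iters=iters)
    return kmeans(spectral_embedding(S, k), k)


if __name__ == "__main__":
    # Toy graph (hypothetical): two 4-node cliques joined by the edge 3-4.
    A = np.zeros((8, 8))
    for nodes in (range(0, 4), range(4, 8)):
        for i in nodes:
            for j in nodes:
                if i != j:
                    A[i, j] = 1.0
    A[3, 4] = A[4, 3] = 1.0
    print(simrank_spectral_clustering(A, k=2))
```

The dense matrix form of SimRank used here costs O(n^3) per iteration, which is acceptable for small benchmark datasets but would need a sparse or approximate variant on larger graphs.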
