Hierarchical Clustering Algorithm Based on Attribute Partitioning and Curve Distance
  • English title: Hierarchical Clustering Algorithm Based on Attribute Partitioning and Curve Distance
  • Authors: XIA Zhuoqun (夏卓群); OU Hui (欧慧); WU Zhiwei (武志伟); FAN Kaiqin (范开钦)
  • Affiliations: School of Computer and Communication Engineering, Changsha University of Science and Technology; The State Taxation Bureau of Hunan Province
  • Keywords: curve distance; attribute partitioning; max-min distance; cluster classification; class label
  • Journal code: JSJC
  • Journal: Computer Engineering (计算机工程)
  • Publication date: 2015-08-15
  • Publisher: Computer Engineering (计算机工程)
  • Year: 2015
  • Issue: v.41; No.454
  • Funding: Natural Science Foundation of Hunan Province (14JJ7043); Science and Technology Progress and Innovation Fund of the Hunan Provincial Department of Transportation (201405)
  • Language: Chinese
  • Page: JSJC201508033
  • Page count: 6
  • CN: 08
  • ISSN: 31-1289/TP
  • Classification number: 180-185
Abstract
Traditional k-means selects its initial centers at random, and over large scales a similarity measure based on manifold distance does not reflect the global consistency of a data set well. To address both problems, a hierarchical clustering algorithm based on attribute partitioning and curve distance is proposed. Initial-stage class exemplars are selected using the attribute-partitioning idea from granular computing together with the max-min distance rule, and a coarse clustering is then performed with k-means. A new distance measure, the curve distance, and a criterion function that favors high within-class similarity and low between-class similarity are used to cluster and merge the initial-stage exemplars into the expected exemplars. Each data point then takes the class label of its exemplar. Experimental results show that, compared with other algorithms, the proposed algorithm better reflects the global consistency of the data set and reduces running time.
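The three stages named in the abstract (max-min seeding, coarse k-means, and label propagation through exemplars) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The attribute-partitioning step is omitted. The abstract does not define the curve distance or the criterion function, so the exemplar-merging step substitutes plain single-linkage merging with Euclidean distance; a different measure can be passed through the `distance` parameter. All function and parameter names are hypothetical.

```python
import math


def max_min_centers(points, k):
    """Pick k initial centers with the max-min distance rule."""
    centers = [points[0]]  # assumption: seed with the first point
    # nearest[i] = distance from points[i] to its closest chosen center
    nearest = [math.dist(p, centers[0]) for p in points]
    while len(centers) < k:
        far = max(range(len(points)), key=nearest.__getitem__)
        centers.append(points[far])
        nearest = [min(d, math.dist(p, points[far]))
                   for d, p in zip(nearest, points)]
    return centers


def assign(points, centers):
    """Index of the closest center for every point."""
    return [min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            for p in points]


def kmeans(points, centers, max_iter=100):
    """Coarse clustering: returns (exemplars, owner index per point)."""
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        owner = assign(points, centers)
        updated = []
        for j, old in enumerate(centers):
            members = [p for p, a in zip(points, owner) if a == j]
            # An empty cluster keeps its previous center.
            updated.append(
                tuple(sum(x) / len(members) for x in zip(*members))
                if members else old)
        if updated == centers:
            break
        centers = updated
    return centers, assign(points, centers)


def merge_exemplars(exemplars, n_classes, distance=math.dist):
    """Stand-in for the paper's exemplar grouping step.

    Single-linkage merging until n_classes groups remain. The paper uses
    a curve distance and a criterion function that the abstract does not
    specify, so Euclidean distance is used here by default.
    """
    labels = list(range(len(exemplars)))
    while len(set(labels)) > max(n_classes, 1):
        best = None
        for i in range(len(exemplars)):
            for j in range(i + 1, len(exemplars)):
                if labels[i] != labels[j]:
                    d = distance(exemplars[i], exemplars[j])
                    if best is None or d < best[0]:
                        best = (d, labels[i], labels[j])
        _, keep, drop = best
        labels = [keep if lab == drop else lab for lab in labels]
    return labels


def cluster(points, n_exemplars, n_classes):
    """Each point inherits the class label of its exemplar."""
    seeds = max_min_centers(points, n_exemplars)
    exemplars, owner = kmeans(points, seeds)
    exemplar_labels = merge_exemplars(exemplars, n_classes)
    return [exemplar_labels[a] for a in owner]


points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]
labels = cluster(points, n_exemplars=4, n_classes=2)
```

On this toy input the four points near the origin share one label and the four points near (10, 10) share another. Only the exemplars take part in the merging step, so its pairwise cost depends on the number of exemplars and not on the number of data points.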