Research on an Optimization Method Based on the K-means Clustering Algorithm

Authors: LIU Ye; WU Sheng; ZHOU Hai-he; WU Xing-jiao; HAN Lin-yi
Keywords: K-means clustering; K-means++ clustering; K-medoids clustering; two-step clustering
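Chinese journal title: HDZJ
English journal title: Information Technology
Affiliation: School of Information Engineering and Automation, Kunming University of Science and Technology
Publication date: 2019-01-17
Publisher: Information Technology (信息技术)
Year: 2019
Issue: v.43; No.326
Language: Chinese
Page: HDZJ201901018
Page count: 5
CN: 01
ISSN: 23-1557/TN
Classification code: 74-78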

Abstract

To address several problems with traditional K-means clustering, this paper proposes an improved algorithm based on K-means. The algorithm first uses K-means++ to select K objects that are as far apart as possible as the initial cluster centers, then uses K-medoids clustering to take the median of the data samples as each cluster center, and finally combines the result with two-step clustering. Simulation experiments on several commonly used UCI benchmark datasets show that the proposed algorithm outperforms the traditional algorithm.
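
The first two stages named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes standard K-means++ seeding (sampling proportional to squared distance, which favors far-apart centers rather than guaranteeing them), reads the "median" center as a medoid (the cluster member with the smallest total distance to the other members), and omits the two-step clustering stage entirely. All function names are hypothetical, and the input is assumed to contain at least k distinct points given as tuples.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeanspp_seeds(points, k, rng):
    """Pick k initial centers by D^2-weighted sampling (K-means++ seeding).

    Assumes at least k distinct points; otherwise all weights become zero.
    """
    centers = [rng.choice(points)]
    while len(centers) < k:
        # Weight each point by its squared distance to the nearest chosen center.
        weights = [min(sq_dist(p, c) for c in centers) for p in points]
        centers.append(rng.choices(points, weights=weights, k=1)[0])
    return centers


def assign(points, centers):
    """Return, for each point, the index of its nearest center."""
    return [min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            for p in points]


def medoid(members):
    """Return the member with the smallest total distance to all members."""
    return min(members, key=lambda m: sum(sq_dist(m, q) ** 0.5 for q in members))


def cluster(points, k, seed=0, max_iter=100):
    """K-means++ seeding followed by medoid updates until the centers settle."""
    rng = random.Random(seed)
    centers = kmeanspp_seeds(points, k, rng)
    for _ in range(max_iter):
        labels = assign(points, centers)
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # Keep the previous center if a cluster happens to be empty.
            new_centers.append(medoid(members) if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return centers, assign(points, centers)
```

Calling `cluster(points, k)` returns the final centers, each of which is one of the input points, together with a cluster index for every point. Because the centers are always actual samples, a single outlier moves them less than it would move a mean, which is the usual motivation for pairing K-medoids updates with K-means++ seeding.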
