Abstract
To cluster data when the number of clusters is unknown, a two-phase clustering algorithm based on Self-organizing Feature Maps (SOFM) is proposed. First, the dataset is coarsely clustered through the self-organizing learning process of the SOFM network, which divides it into several clusters, with each winning neuron representing all samples in its cluster. Next, the winning neurons are re-clustered by agglomerative hierarchical clustering, and the result is visualized as a dendrogram. Finally, the results of the two phases are combined to obtain the final clustering. Experimental results show that the proposed algorithm achieves good clustering results.
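The two phases described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. Several details are assumptions made only for illustration: the grid initialization of the weights, the linearly decaying learning rate and neighborhood width, the use of average linkage, the rule that cuts the merge history at the largest jump in merge distance (the abstract only states that a dendrogram is produced), the toy three-blob dataset, and all function names.

```python
import math
import random


def winner(weights, x):
    """Index of the neuron whose weight vector is closest to sample x."""
    return min(range(len(weights)), key=lambda j: math.dist(weights[j], x))


def train_sofm(data, rows, cols, epochs=40, lr0=0.5, seed=0):
    """Phase 1: train a rows x cols SOFM and return its weight vectors."""
    rng = random.Random(seed)
    dim = len(data[0])
    lo = [min(col) for col in zip(*data)]
    hi = [max(col) for col in zip(*data)]
    # Assumed initialization: spread the neurons on a regular grid over the data range.
    weights = []
    for r in range(rows):
        for c in range(cols):
            t = (r / max(rows - 1, 1), c / max(cols - 1, 1))
            weights.append([lo[k] + (hi[k] - lo[k]) * t[k % 2] for k in range(dim)])
    sigma0 = max(rows, cols) / 2.0
    total, step = epochs * len(data), 0
    for _ in range(epochs):
        order = list(data)
        rng.shuffle(order)
        for x in order:
            frac = step / total
            lr = lr0 * (1.0 - frac)                  # assumed linear decay
            sigma = max(sigma0 * (1.0 - frac), 0.3)  # shrinking neighborhood
            br, bc = divmod(winner(weights, x), cols)
            for j, w in enumerate(weights):
                r, c = divmod(j, cols)
                h = math.exp(-((r - br) ** 2 + (c - bc) ** 2) / (2 * sigma ** 2))
                for k in range(dim):
                    w[k] += lr * h * (x[k] - w[k])
            step += 1
    return weights


def agglomerate(points):
    """Phase 2: average-linkage agglomerative clustering.

    Returns the merge history as (merge_distance, partition) pairs, which is
    the information a dendrogram displays.
    """
    clusters = [[i] for i in range(len(points))]
    history = [(0.0, [c[:] for c in clusters])]
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = sum(math.dist(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                d /= len(clusters[a]) * len(clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
        history.append((d, [c[:] for c in clusters]))
    return history


def cut_at_largest_gap(history):
    """Assumed heuristic: keep the partition just before the biggest jump in merge distance."""
    if len(history) == 1:
        return history[0][1]
    i = max(range(len(history) - 1), key=lambda i: history[i + 1][0] - history[i][0])
    return history[i][1]


def two_phase_cluster(data, rows=3, cols=3):
    """Combine both phases: each sample inherits the label of its winning neuron."""
    weights = train_sofm(data, rows, cols)
    bmus = [winner(weights, x) for x in data]
    winners = sorted(set(bmus))  # neurons that won at least one sample
    partition = cut_at_largest_gap(agglomerate([weights[j] for j in winners]))
    neuron_label = {winners[i]: lab for lab, group in enumerate(partition) for i in group}
    return [neuron_label[b] for b in bmus]


# Toy data (hypothetical): three well-separated 2-D blobs of 30 samples each.
rng = random.Random(1)
centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
data = [[cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5)]
        for cx, cy in centers for _ in range(30)]
labels = two_phase_cluster(data)
print(len(set(labels)), "clusters found")
```

Only neurons that win at least one sample enter the second phase, mirroring the abstract's statement that winning neurons stand for the samples of their clusters. The number of final clusters is not passed in; in this sketch it falls out of where the merge history is cut, and a different cut rule or manual inspection of the dendrogram could be substituted.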