An Improved Dynamic SOM Algorithm and Its Application to Clustering
Abstract
Clustering is an unsupervised learning process whose main purpose is to partition unlabeled data into groups that share common characteristics. The main families of methods are partitioning, hierarchical, density-based, grid-based, and model-based. Among the model-based methods, the SOM (Self-Organizing Map) neural network is a typical unsupervised clustering algorithm. It is the self-organizing feature map model proposed in 1981 by the Finnish scholar Kohonen. Because it preserves topology and probability distribution, learns without supervision, and supports visualization, it is widely used in information processing, including speech recognition, image compression, robot control, optimization, control theory, financial analysis, experimental physics, chemistry, and medicine.
     This paper studies self-organizing neural network algorithms. Current dynamic SOMs still have two shortcomings: the placement of newly generated neurons is constrained by the network structure, and neuron generation requires a threshold to be set in advance. To address both, the paper proposes an improved GSOM algorithm and applies it to the automatic clustering of cell-indicator data measured by a researcher; a comparison with the traditional SOM algorithm is used to verify the algorithm's efficiency. The improved algorithm has four main innovations: (1) the number of neurons does not have to be set before the experiment, so learning is fully self-organizing and unsupervised and clustering is automatic; (2) growth is based on the idea of analysis of variance, so a suitable growth-control factor SF does not have to be chosen by experience or computed separately; (3) a pruning step specifically removes noisy, abnormal data; (4) the network has a circular structure, so a newly generated neuron can always be placed and the network can grow freely. The algorithm is implemented in MATLAB, which is used to carry out the classification and to compare the clustering processes.
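The abstract does not give the growth rule, the pruning rule, or the stopping criterion, so the following Python sketch is only an illustration of the general idea of a growing SOM on a circular (ring) topology, not the paper's method. Everything specific in it is an assumption: the function names, the rule of growing next to the node with the largest within-node sum of squares (a simple stand-in for the variance-based criterion), the fixed number of growth rounds, and the `min_hits` pruning parameter are all hypothetical.

```python
import numpy as np

def ring_distance(i, j, n):
    """Shortest hop count between nodes i and j on a ring of n nodes."""
    d = abs(i - j)
    return min(d, n - d)

def train_epoch(weights, data, lr, sigma):
    """One pass of the classic SOM update, with neighbours measured along the ring."""
    n = len(weights)
    for x in data:
        bmu = int(np.argmin(np.linalg.norm(weights - x, axis=1)))  # best-matching unit
        hops = np.array([ring_distance(bmu, j, n) for j in range(n)])
        h = np.exp(-(hops ** 2) / (2.0 * sigma ** 2))  # Gaussian neighbourhood
        weights += lr * h[:, None] * (x - weights)     # in-place update
    return weights

def assign(weights, data):
    """Index of the best-matching unit for every sample."""
    dists = np.linalg.norm(data[:, None, :] - weights[None, :, :], axis=2)
    return np.argmin(dists, axis=1)

def grow(weights, data):
    """Insert one node after the node with the largest within-node scatter.

    ASSUMPTION: this rule is a stand-in; the paper's variance-analysis
    criterion is not specified in the abstract.
    """
    labels = assign(weights, data)
    n = len(weights)
    scatter = np.zeros(n)
    for k in range(n):
        members = data[labels == k]
        if len(members) > 0:
            scatter[k] = np.sum((members - weights[k]) ** 2)
    worst = int(np.argmax(scatter))
    nxt = (worst + 1) % n  # ring neighbour; wraps around at the end
    new_node = (weights[worst] + weights[nxt]) / 2.0
    # On a ring there is always a free slot between two adjacent nodes.
    return np.insert(weights, worst + 1, new_node, axis=0)

def prune(weights, data, min_hits=1):
    """Drop nodes that win fewer than min_hits samples (hypothetical rule)."""
    hits = np.bincount(assign(weights, data), minlength=len(weights))
    return weights[hits >= min_hits]

def fit(data, rounds=4, epochs=20, seed=0):
    """Start from two nodes, alternate training and growth, then prune."""
    rng = np.random.default_rng(seed)
    weights = data[rng.choice(len(data), size=2, replace=False)].astype(float)
    for _ in range(rounds):  # ASSUMPTION: fixed number of growth rounds
        for e in range(epochs):
            train_epoch(weights, data, lr=0.5 * (1.0 - e / epochs), sigma=1.0)
        weights = grow(weights, data)
    for e in range(epochs):  # final smoothing pass after the last insertion
        train_epoch(weights, data, lr=0.5 * (1.0 - e / epochs), sigma=1.0)
    return prune(weights, data)
```

Calling `fit(data)` on an array of shape `(samples, features)` returns the surviving node weights, and `assign` then gives each sample's cluster index. The ring is what makes insertion unconstrained here: a new node only needs two neighbours, which any adjacent pair on the ring provides.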
