基于免疫遗传算法和粒子群算法的聚类研究
详细信息    本馆镜像全文|  推荐本文 |  |   获取CNKI官网全文
摘要
随着信息科学技术的发展,人们越来越倾向于选择用计算机来统计和管理数据,数据库的规模也随之不断地扩大。当人们积累了大量的商业数据以后,如何从汪洋大海般的数据中发现有价值的信息成为一个急需解决的重要问题。由此数据挖掘技术应运而生,它是目前数据库和信息决策领域最前沿的研究方向之一。聚类分析作为数据挖掘的一个重要分支,是通过分析数据的相似性把大型数据集合划分成组,使得同一个组里面的数据彼此最为相似,而不同组中的数据彼此相异。聚类是发现有用信息的一种有效手段。目前,聚类分析已经广泛地应用于模式识别,数据分析,图像处理以及市场研究等领域。
     目前在文献中存在大量的聚类算法。算法的选择取决于数据的类型、聚类的目的和应用。本文探讨了基于免疫遗传算法和基于粒子群算法的C—均值聚类方法。所做的主要工作如下:
     1.用免疫遗传算法完成聚类工作。首先,分析现有遗传算法的优缺点,将免疫机制引入遗传算法,用来克服了标准遗传算法的早熟现象;其次,将C—均值算法和免疫遗传算法有机结合,形成一种混合算法;最后,根据聚类问题的实际情况设计遗传选择、交叉和变异算子,使得混合算法更快、更有效地收敛到全局最优解。
     2.用改进后的粒子群算法实现聚类。首先,分析现有粒子群算法的优缺点;其次,将局部搜索能力强的C—均值算法和基于遗传算法的交叉、变异操作同时结合到粒子群算法中;最后,通过适当调节,发挥各自的优点。既提高了PSO算法的局部搜索能力,又因为增加了种群的多样性,防止了算法的早熟。
     3.将改进后的算法选择一些数据集用MATLAB编程做聚类实验,并与其他算法结果进行对比,分析试验结果。
With the development of information science and technology, people have inclined to collecting and organizing all kinds of data by computers, then the size of data has expended as well. When people have accumulated massive amount of business data, how to find the valuable information in the vast ocean-like data have become an urgent need to be solved. For this data mining techniques have emerged,which is one of the most cutting-edge research of the database and information decision-making. Cluster analysis as an important branch of data mining is the analysis of data’s similarity, and divided the large data sets into groups, in which the data inside the same group was most similar to each other and the data in different groups was differ from each other. Clustering is an effective means of finding useful information. At present, Cluster analysis has been widely used in pattern recognition, data analysis, image processing, market research and many other fields.
     There is a large number of clustering algorithms in the literature. The choice of algorithm depends on the type of data, the purpose and applications of clustering. This paper discussed C-means clustering method which based on the immune genetic algorithm and particle swarm optimization algorithm separately. Following is the main work has been done:
     1. Complemented clustering algorithm with immune genetic algorithm. First, analyzed the strengths and weaknesses of the existing genetic algorithm, the immune mechanism was introduced into the standard genetic algorithm to overcome the premature phenomenon; Second, the C-means algorithm and the immune genetic algorithm were combined to form a hybrid algorithm; Finally, based on the actual situation of the clustering problem designed the genetic selection, crossover and mutation operators, made the hybrid algorithm converge to the global optimal solution much faster and more efficiently.
     2. Clustering with the improved particle swarm algorithm. First, the advantages and disadvantages of the existing particle swarm optimization were analyzed; second, the C-means algorithm which has strong local search ability and the genetic algorithm-based crossover and mutation operations were mixed into the particle swarm algorithm; finally, them have played their advantages respectively through appropriate regulation. Not only the PSO algorithm’s ability of local search is improved, but also the diversity of the population was increased, at last achieved the purpose of prevent premature problem of the algorithm.
     3. Selected some data sets and the clustering experiments were implemented through MATLAB programming by the improved algorithms, and results were compared with other algorithms, and analyzed the result of the experiment.
引文
[1] Jiawei Han, Micheline Kamber数据挖掘.概念与技术[M].范明.北京:机械工业出版社, 2007,3.
    [2]数据挖掘讨论组.数据挖掘资料汇编[DB/OL].http://www.dmgroup.org.cn,2008.5.
    [3]陈栋.Knight.一个通用知识挖掘工具[J].计算机研究与发展,1998,35(4):338-343.
    [4]中国人民大学统计系数数据挖掘中心.数据挖掘中的聚类分析[J].统计与信息论坛,2002,17(3):28-33.
    [5]马仲来.系统聚类分析中应注意的两类问题.数理统计与管理[J],1994,12(6): 55-56.
    [6]马飞.数据挖掘中的聚类算法研究:[南京理工大学硕士学位论文].南京:南京理工大学,2008,1,5.
    [7]王鑫,王洪国,王瑶等.数据挖掘中聚类方法比较研究[J].计算机技术与发展, 2006,16(10):20-25.
    [8] Holland J.H. Adaption in natural and artificial system[M]. Michigan:The University of MichiganPress. 1975.
    [9] Hall L O, Ozyurt I B, Bezdek J C. Clustering with a genetically optimized approach [J].IEEE Transactions on Evolutionary Computation, 1999,3(2):103-112.
    [10]刘健庄,谢维信.聚类分析的遗传算法方法[J].电子学报,1995,23(11):81-83.
    [11]刘静,钟伟才.免疫进化聚类算法[J].电子学报, 2001,29(12):1868-1872.
    [12] Sheng W, Tucker A, LiuX. Clustering with Niching genetic K-means algorithm [A]. Proceedings of Genetic and Evolutionary Computation Conference[C]. Berlin: Springer -Verlag, 2004.162-173.
    [13] Krishna K, Murty M N. Genetic K-means algorithm[J]. IEEE Transactions on Systems, Man and Cybernetics,PartB: Cybernetics,1999,29(3):433-439.
    [14] Maulik U, Bandyopadhyay S. Genetic algorithm-based clustering technique[J]. Pattern Recognition, 1997,30(7):1109-1119.
    [15]高坚.基于C—均值和免疫遗传算法的聚类.[J].计算机工程,2003,29(12): 65-66.
    [16]周春光,梁艳春.计算智能[M].长春:吉林大学出版社,2001,1.
    [17]田小梅,龚静.实数编码遗传算法的评述[J].湖南环境生物职业技术学院学报,2005,11(1): 25-31.
    [18]李鹏,董聪.基于实数编码的广义遗传算法及其在优化问题中的应用[J].控制与决策, 2002,17(4):487-490.
    [19]张志顺,胡勇刚,赵宏伟等.基于改进形式的遗传算法研究[J].微电子学,2002,32(4):273- 275.
    [20]田小梅,郑金华,李合军.基于父个体相似度的自适应遗传算法[J].计算机工程与应用, 2005,31(18):61-63.
    [21]聂冲,王维平,赵雯等.基于学习算子的自学习遗传算法设计[J].计算机仿真,2006,23(9): 168-171.
    [22]吴养会,王乃信,刘瀛洲.多种群竞争遗传算法及其性能分析[J].西北农业科技大学学报, 2005,33(4):154-156.
    [23]蔡良伟,李霞.一种带融合操作的实数多种群遗传算法[J].计算机工程与应用,2005,31 (13):61-63.
    [24]王文义,任刚.多种群退火贪婪混合遗传算法[J].计算机工程与应用,2005,31 (23):60-62.
    [25] Suresh Chandra Satapathy,Venkatesh Katari, Rohit Parimi et al. A New Approach of Integrating PSO & Improved GA for Clustering with Parallel and Transitional Technique[C]. IEEE, Third International Conference on Natural Computation,2007.
    [26]周远晖,陆玉昌,石纯一.基于克服过早收敛的自适应并行遗传算法[J].清华大学学报, 1998,38(3):93-95.
    [27]吴佳英,李平,胡宁静等.一种改进多亲遗传算法的并行模型研究[J].计算机工程,2007, 33(5):190-192.
    [28]李同强,周天弋,吴斌.基于改进遗传算法的加权模糊C均值聚类算法[J].计算机应用,2009, 29(B12):260-262.
    [29]曹志宇,张忠林,李元韬.快速查找初始聚类中心的K-means算法[J].兰州交通大学学报,2009, 28(6):15-18.
    [30]司徒莹.新型免疫遗传算法研究[J].计算机应用与软件,2009, 26(11):266-268.
    [31]雷英杰,张善文,李续武等.MATLAB遗传算法工具箱及应用[M].西安:西安电子科技大学出版社,2005,4.
    [32]初级电气化验收规程中华人民共和国国家标准GB/T15659-95.
    [33]陈慰峰.医学免疫学[M].北京:人民卫生出版社,2001:1-18.
    [34]黎竹娟.免疫遗传算法WEB使用挖掘中的研究:[广西大学硕士学位论文].南宁:广西大学,2007.5.
    [35] Nasaroui O, Gonzalez F, Dasgupta D. The fuzzy immune system: motivations, basic concepts, and application to clustering and d Web profiling, Fuzzy Systems[C]. In:FUZZ- IEEE’02, Proceedings of the 2002 IEEE International Conference, Vol1 2002-05:711-716.
    [36] D Dasgupta. Immune-based intrusion detection system: a general framework[C]. In: Proceedings of the 22nd national information systems security conference,1999.
    [37] Ishida Y. Fully Distributed Diagnosis by PDP Learning Algorithm: Towards Immune Network PDP Model[C].In: Proc of IJCNN90, San Diego.1990.
    [38] Cserey, G., Falus, A., Pored,W., Roska, T. An artificial immune system for visual applications with CNN-UM[A].Circuits and Systems,2004,ISCAS 04,Proceedings of the 2004 International Symposium[C].2004,17-20.
    [39] P.J.Costa Branco,J.A.Dente,and R.Vilela Mendes. Using Immunology Principles for Fault Detection [J].IEEE Transactions on Industrial Electronics,2003,50(2):362- 373.
    [40] A Tarakanov. D Dasgupta. An immune chip architecture and its emulation[q].In: 2002 NASA Conference on Evolvable Hardware, 2002.
    [41] Forrest S, Perelson A, Cherukuri R. Self-nonself discrimination in a computer [C].In: Proceedings of 1994 IEEE Computer Society Symposium on Research in Security and Privacy, Los Almitos, CA,USA: IEEE Computer Society,1994:202-212.
    [42] Moshref,H., VanLandingham,H., Immune network simulation of reactive control of a robot arm manipulator[A].Soft Computing in Industrial Applications,2001, SMCia/01, Proceedings of the 2001 IEEE Mountain Workshop on[C].2001,81-85.
    [43] You Yong, Wang Sunhn, heng Wanxing. Short-term load forecasting using artificial immune network[A].Power System Technology,2002 Proceedings. PowerCon2002, International Conference [C].2002,322-325.
    [44] J Hunt,J Timmis. D Cooke etal. The Development of an Artificial Immune system for Real World Application [J].Artificial Immune system and their Applications.1999.
    [45]王磊,潘进,焦李成.免疫规划[J].计算机学报,2000,23(8):806-812.
    [46]焦李成,杜海峰.人工免疫系统进展与展望[J].电子学报,2003,31(l0):1540-1548.
    [47]谈英姿,沈炯,吕震中.免疫PID控制器在气温控制系统中的应用研究[J].中国电机工程学报,2002,22(10):148-152.
    [48]陈慰峰.医学免疫学[M].北京:人民卫生出版社,2001,8.
    [49]王磊,潘进,焦李成.免疫算法[J].电子学报,2000,28(7):74-78.
    [50] Burnet F M. The Clonal Selection Theory of Acquired Immunity [M]. Cambridge: Cambridge University Press, 1959,2.
    [51] Kepler T B, Perelson A S, Somatic Hypermutation in B cells: An Optional ontrol Treatment [J], Journal of theoretical Biology, 1993, (164): 37-64.
    [52]高新波,谢维信.模糊聚类理论发展及应用研究进展[J].科学通报,1999,44 (21):2241-2251.
    [53] Bezdek J C. Mumerical taxonomy with Fuzzy sets [J]. Journal Mathematical Biology, 1974, (14):57-71.
    [54] Berek C, Ziegner. The Maturation of the immune response today[J].1993,14(8) 400-402.
    [55] Bezdek J C. Pattern Recognition with Fuzzy Objective Function Algorithms [M]. New York: Plenum Press,1981.
    [56] Kennedy J, Eberhert R. Particle swarm optimization[C]//IEEE International Conference on Neural Networks, 1995:1942-1948.8.
    [57] Shi Y, Eberhart R C.Particle Swarm Optimization: developments, applications and resources [C].In: Proc Congress on Evolutionary Computation 2001 NJ: Piscataway, IEEE Press, 2001:81-86,101-106.
    [58]海风吹.蚁群算法,PSO算法以及两种算法可以融合的几种方法[DB/OL]. (2008-5-20). http://www.cnblogs.com/hetonghai/archive/2008/04/09/1144145 .html.
    [59] Shi Y, Eberhart R C. Fuzzy Adaptive Particle Swarm Optimization[C].In: Proc Congress on Evolutionary Computation, 2001:101-106.
    [60] Lovbjerg M, Rasmussen T k, Krink T.hybrid Particle Swarm Optimizer with Breeding and Subpopulation[C].In:Proc Congress on Evolutionary Computation,2001.
    [61] Ciuprina G, Inan D, Munteanu I. Use of Intelligent-Particle Swarm Optimization in Electromagnetics[J].IEEE Trans on Magnetics,2002:38(2):1037-1040.
    [62] Brits R,Engelbrecht AP,van den Bergh F.A Niching Particle Swarm Optimizer [C]In: 4th Asia-Pacific Conference on Simulated Evolution and Learning, 2002.
    [63]李峻金,向阳,芦英明.粒子群聚类算法综述[J].计算机应用与软件,2009,26(11):266-268.
    [64] van den Berth F, Engelbercht AP.A New Locally Convergent Particle Swarm Optimizer[C]. In: IEEE Conference on Systems ,Man, and Cyber-netics,2002.
    [65] Millonas M, Swarms M. Phase transition and collective intelligence. In:C.G.Langton, (Eds.), Artificial life III. Addison– Wesley, MA,1994.
    [66]高岳林,任子晖.带有变异算子的自适应粒子群优化算法[J].计算机工程与应用,2007,43 (25):43-47.
    [67]吕振肃,侯志荣.自适应变异的粒子群优化算法[J].电子学报,2004,32(3):416 -420.
    [68]寇保华,杨涛,张晓今等.基于交叉变异的混合粒子群优化算法[J].计算机工程与应用,2007,43(17):85-88.
    [69]刘向东,沙秋夫,刘勇奎等.基于粒子群算法的聚类分析[J].计算机工程,2006,32(6): 201-202.
    [70]高尚,杨静宇.一种新的基于粒子群算法的聚类方法[J].南京航空航天大学学报,2006,38(6):62-64.

© 2004-2018 中国地质图书馆版权所有 京ICP备05064691号 京公网安备11010802017129号

地址:北京市海淀区学院路29号 邮编:100083

电话:办公室:(+86 10)66554848;文献借阅、咨询服务、科技查新:66554700