基于可变规模粒子群的聚类分析方法
详细信息    本馆镜像全文|  推荐本文 |  |   获取CNKI官网全文
摘要
粒子群算法(Particle Swarm Optimization,PSO)源于鸟群捕食行为的研究,是一种新的群体智能优化算法,作为群智能算法的重要分支,在演化计算领域发挥着举足轻重的作用。PSO算法一经提出,因其自身的优良特性,引起学者们的极大的关注,目前已在组合优化、神经网络、机器人路径规划等领域获得了广泛应用。粒子群算法发展至今,虽取得大量研究成果,但它自身的缺陷仍值得继续研究。
     近十几年来,人们利用信息技术生产和搜集数据的能力大幅度提高,很多领域都积累了大量的数据。为了从数据中发现有价值的知识和规律,人们结合数据库、统计学及机器学习等技术,提出数据挖掘来解决这一难题。聚类分析技术是数据挖掘中的重要内容和挖掘方法,是各学科研究的重要工具。
     本论文针对PSO算法多样性缺失的缺陷,提出改进策略。由于聚类分析中的数据分类可以看作是一种分组的策略,原始PSO算法不适应求解此类问题,因此提出另一种改进策略,来使粒子群算法适应聚类分析的要求。并通过对图像分割的实验,验证算法的应用价值。本文的工作内容如下:
     (1)提出了动态种群规模的PSO算法。随机选取一些粒子,利用遗传算子按照一定的概率生成新个体,以新个体来改善种群的多样性。由于遗传算子每次迭代都可能生成一定规模的新个体,所以种群规模始终上升。为控制种群规模,引入疾病算子。当种群规模超过预先设置好的阈值时,将种群规模降为初始状态。
     (2)为求解聚类问题,将PSO算法修改为离散化PSO。首先,将粒子编码为样本的分类情况,粒子维数为样本个数,粒子的每一维代表当前样本的所属的类号;然后,定义粒子之间的距离;最后,修改更新公式,使粒子的每一维类号能够朝向最优解进化。
     (3)将新算法用来进行图形分割的实验,以此来验证算法的应用价值和算法的有效性。
     实验结果证明,动态种群规模可以很好的改善种群的多样性,为算法搜索全局最优解提供帮助。基于这动态种群粒子群算法的聚类分析方法不仅可以得到很好的数据集聚类结果,而且将聚类分析问题分割为聚类方法和聚类评价,使算法具有一定的通用性。同时算法在图像分割上得到了良好的结果,有一定的应用价值。
Originated from the research of birds' predatory behavior, particle swarm optimization (PSO) algorithm is a new swarm intelligent algorithm. As an part of swarm intelligent algorithms, PSO plays an important role in the field of evolutionary computation. Due to the excellent characteristics of the algorithm, PSO has attracted wide attentions of scholars. So that it has been extensively applied in many fields such as combinatorial optimization, neural network and robot path planning etc. Although the research of PSO has gained great achievements, it is still worth to be further investigated because of its defaults.
     In the past decade, the ability of human being to produce and search data with information technique has been greatly improved. Therefore, huge amount of data was accumulated in many fields. To find valuable knowledge and regularity from data, data base, statistics and machine learning are combined to generate data mining technique for solving the question. Cluster analysis is an important tool in subject research, whose contents and methods belong to data mining field.
     In accordance with the loss variety of PSO, an improved algorithm is proposed in this dissertation. In cluster analysis, data classification is taken as a grouping strategy. While traditional PSO does not fit for solving this kind of problem, the algorithm is improved to meet the demand of cluster analysis. Furthermore, the validity of the algorithm is tested by an image segmentation experiment. The main works are listed as follows:
     Firstly, a new PSO based on dynamic population size is presented. The first step in the algorithm is to randomly choose some particles to generate new individuals by genetic operators. And then, new individuals are used to improve the variety of population. Because new amount of individuals are generated after every iteration, the population size keeps increasing. To control population size, disease operator is introduced. While the population size exceed presetting threshold, the population size decrease to original state.
     Then, to solve discrete optimization problem, PSO algorithm is modified to discrete one. At first, particle was encoded to sample's classification, take dimensions of particle as numbers of sample and every dimension as class number of current sample. After defining the distance between particles, iterative formula is updated to impel every class number to evolve the direction of optimal solution.
     Finally, the new algorithm is used to image segmentation to test its validity. Meanwhile characters of the algorithm are also summarized.
     The experimental results show that the population variety can be improved by dynamic population size which does good to searching optimal solution. Based on the dynamic population size PSO, the cluster analysis algorithm can better classify dataset properly. Further more, dividing cluster analysis into cluster and evaluation makes the algorithm generality. Meanwhile, the algorithm is of some application value which is proved by good results of image segmentation.
引文
1 J.Kennedy,C.R.Eberhart,S.Yuhui.Swarm Intelligence.Morgan Kaufmann.2005:3-34
    2 M.Dorigo,L.M.Gambardella.Ant Colonies for the Traveling Salesman Problem.Biosystems.1997,(43):73-81
    3 吴迪,崔荣一,蜂群遗传算法.中国人工智能学会第11届全国学术年会论文集.武汉,2005:733-736
    4 A.Colomi,M.Dorigo,V.Maniezzo.Distributed Optimization by Ant Colonies.Proceedings of the First European Conference on Artificial Life,Paris,France,1991
    5 C.Zhu,Z.Liu,W.Zhang,et al.An Efficient Decentralized Grid Service Discovery Approach Based on Service Ontology.Proceedings of the 2004IEEE/WIC/ACM International conference on Web Intelligence.Washington D.C.:IEEE Computer Society,2004:570-573
    6 C Zhou,L Chia,B Lee.DAML-QoS Ontology for Web Service.Proceedings of International Conference on Web Service(ICWS).San Diego,California,USA,2004:472-479
    7 李晓磊,邵之江,钱积新.一种基于动物自治体的寻优模式:鱼群算法.系统工程理论与实践.2002,22(11):32-38
    8 李晓磊,钱积新.基于分解协调的人工鱼群算法研究.电路与系统学报.2003,8(1):1-6
    9 张梅风,邵诚,甘勇等.基于变异算子与模拟退火混合的人工鱼群优化算法.电子学报.2006,34(8):1381-1385
    10 李晓磊,薛云灿,路飞等.基于人工鱼群的参数估计方法.山东大学学报.2004,34(3):84-87
    11 李晓磊,冯少辉,钱积新等.基于人工鱼群算法的鲁棒PID控制器参数整定方法研究.信息与控制.2004,33(1):112-115
    12 马建伟,张国立.人工鱼群神经网络在电力系统短期负荷预测中的应用.电网技术.2005,29(11):36-39
    13 J.Kennedy,C.R.Eberhart.Particle Swarm Optimization.Proceedings of IEEE International Conference on Neural Networks(ICNN'95). Australia,1995:1942-1948
    14 R.C.Eberhart,J.Kennedy.A new optimizer using particle swarm theory.Proceedings of the Sixth International Symposium on Micro Machine and Human Science.Piscataway,USA:IEEE Service Center,1995:39-43.
    15 R.Eberhart,Y.Shi.Particle Swarm Optimization:Developments,Applications and Resources.Proceedings of the IEEE Congress on Evolutionary Computation.Piscataway,NJ:IEEE Service Center,2001:81-86
    16 冯翔,陈国龙,郭文中.粒子群优化算法中加速因子的设置与试验分析,集美大学学报(自然科学版).2006,11(2):146-151
    17 Y.Shi,R.Eberhart.A Modified Particle Swarm Optimizer.Proceedings of the IEEE Congress on Evolutionary Computation.Piscataway,USA:IEEE Service Center,1998:69-73
    18 王存睿,段晓东,刘向东,周福才.改进的基本粒子群优化算法,计算机工程.2004,30(21):35-37
    19 Y.Shi,R.C.Eberhart.Empirical Study of Particle Swarm Optimization.Proceeding of Congress on Evolutionary Computation.Piscataway,NJ:IEEE Service Center.1999:1945-1949
    20 Y.Shi,R.C Eberhart.Proceedings of the Congress on Evolutionary Computation.Seoul,Korea,2001:101-106
    21 A.Chatterjee,P.Siarry.Nonlinear Inertia Weight Variation for Dynamic Adaptation in Particle Swarm Optimization.Computers and Operations Research.2006,33(3):859-871
    22 朱小六,熊伟丽,徐保国.基于动态惯性因子的PSO算法的研究.计算机仿真.2007,24(5):154-157
    23 M.Clerc.The Swarm and the Queen:towards a deterministic and adaptive particle swarm optimization.Proceedings of the Congress on Evolutionary Computation.Piscataway,NJ:IEEE Service Center,1999:1951-1957
    24 X.H.Shi,Y.C.Liang,H.P.Lee,C.Lu,L.M.Wang.An improved GA and a Novel PSO-GA-based Hybrid Algorithm.Information Processing Letters.2005.93():255-261
    25 于万霞,杜太行,郑宏兴.基于蚁群粒子群的模糊神经网络交通流量预 测,计算机工程与应用.2007,43(25):197-199
    26 寇晓丽,六三阳.基于模拟退火的粒子群算法求解约束优化问题.吉林大学学报(工学版).2007,7(1):136-140
    27 胡春霞,曾建潮,王清华,夏小翔.一种免疫微粒群优化算法,计算机工程.2007,33(19):213-214
    28 徐俊杰,忻展红.基于两阶段策略的粒子群优化,北京邮电大学学报.2007,30(1):136-139
    29 郝武伟,曾建潮.基于聚类分析的微粒子群算法,计算机工程与应用.2008,44(20):41-44
    30 F van den Bergh,Engelbrecht A.P.Using Neighborhoods with the Guaranteed Convergence PSO.Proceedings of the 2003 IEEE,Swarm Intelligence Symposium,2003:234-242
    31 高海兵,周驰,高亮.广义粒子群优化模型.计算机学报.2005,28(12):1980-1987
    32 V.Adirkamanathan,K.Elvarajah,P.J.Fleming.Stability Analysis of the Particle Dynamics in Particle Swarm Optimizer.IEEE Transactions on Evolutionary Computation.2006,10(3):245-255
    33 R.K.Ursem,P.Vadstrup,Parameter Identification of Induction Motors Using Stochastic Optimization Algorithms.Applied Computing,2004:49-64
    34 陈长忆,叶永春.基于粒子群算法的非线性方程组求解.计算机应用与软件.2006,23(5):137-139
    35 祝成虎,彭宏.基于排序优化的微粒子群算法.计算机工程与设计.2006,27(21):4025-4027
    36 徐小慧,张安.基于粒子群优化算法的最佳熵阈值图像分割.计算机工程与应用.2006,42(10):8-11
    37 李欢,焦健民.简化的粒子群优化快速KNN分类算法.计算机工程与应用.2008,44(32):57-59
    38 孙吉贵,刘杰,赵连宇.聚类算法研究.软件学报.2008,19(1):48-61
    39 R.Gelbard,O.Goldman,I.Spiegler.Investigating Diversity of Clustering Methods:An empirical comparison.Data & Knowledge Engineering.2007,63(1):155-166
    40 陈黎飞,姜青山,王声瑞.基于层次划分的最佳聚类数确定方法.软件学报.2008,19(1):62-72
    41 诸克军,苏顺华,黎金玲.模糊C-均值中的最优聚类与最佳聚类数.系统工程理论与实践.2005,(3):52-61
    42 倪巍伟,陈耿,吴英杰,孙志挥.一种基于局部密度的分布式聚类挖掘算法.软件学报.2008,19(9):2339-2348
    43 C.F.Tsai,C.W.Tsar et al.ACODF:A Novel Data Clustering Approach for Data Mining in Large Databases.Journal of Systems and Software.2004,73(1):133-145
    44 M.Halkidi,Y.Batistakis,M.Vazirgiannis.Clustering Validity Checking Methods:Part Ⅱ.ACM SIGMOD Record Archive.2002,31(3):19-27
    45 M.Bouguessa,S.Wang,H.Sun.An Objective Approach to Cluster Validation.Pattern Recognition Letters.2006,27(13):1419-1430
    46 X.Xie,G.Beni,A Validity Measure for Fuzzy Clustering.IEEE Trans.On Pattern Analysis and Machine Intelligence.1991,13(8):841-847
    47 A.V.Kapp,R.Tibshirani.Are Clusters Found in One Dataset Present in Another Dataset? Biostatistics.2007,8(1):9-31
    48 Y.Fukuyama.Fundamentals of Particle Swarm Techniques.Lee K Y,E Sharkawi M A.Modern Heuristic Optimization Techniques with Applications to Power Systems.IEEE Power Engineering Society,2002:45-51
    49 陈海雷等.用基于二进制编码的一步粒子群算法解0/1背包问题.长春理工大学学报.2006,29(2):69-71
    50 杨维,李歧强.粒子群优化算法综述.中国工程科学.2004,6(5):87-94
    51 李秀等.基于遗传交叉因子的改进粒子群优化算法.计算机工程.2008,34(2):181-183

© 2004-2018 中国地质图书馆版权所有 京ICP备05064691号 京公网安备11010802017129号

地址:北京市海淀区学院路29号 邮编:100083

电话:办公室:(+86 10)66554848;文献借阅、咨询服务、科技查新:66554700