基于多目标遗传算法的复杂网络社区划分
详细信息    本馆镜像全文|  推荐本文 |  |   获取CNKI官网全文
摘要
近年来,分析复杂的真实世界的网络已经引起了一些重大发现。研究显示社区结构是许多来自各个领域的真实网络普遍拥有的拓扑性质,包括各种各样的社会、生物、Internet、经济、政治网络等。例如,在社会网络中,拥有共同背景和兴趣的人被发现经常组织在一起并且在组织之内的交流比之间更频繁。通常,社区(或者称为模块)是指在一个网络中内部节点连接密集而外部连接稀疏的节点集。揭示网络的社区结构对于我们更深刻地理解分析网络的功能、发现网络潜在模式、预测网络行为等都具有非常重要的意义。
     目前常见的方法是把社区划分问题转换成优化问题,群智能优化算法被越来越多的应用到该领域,但大多数算法都是优化单个目标。尽管基于单目标的社区划分算法在理论和应用中都取得了成功,但也存在一些诸如复杂度高、解限制等问题。为了解决这些问题,一个自然的方法就是把社区划分问题当做一个多目标优化问题。
     本文研究了在复杂网络中查找社区结构的一种多目标遗传算法。首先构建了社区分值和社区适应度这两个能够识别内部联系紧密但相互之间联系稀疏的节点群的目标函数。其次在多目标遗传算法返回一组两个目标函数之间折衷的非支配解后,通过模块度和规范化互信息这两个评估指标来选取最合适的解。该算法能够发现网络的层次结构,在这些层次中,拥有更多数目社区的深层次的解被包含在拥有较少数目社区的解之中。社区的个数自动取决于目标函数更佳的权衡值。最后通过在模拟和真实网络进行的实验对比表明,该算法能够成功发现复杂网络社区结构,并且与目前其他算法相比具有一定的竞争力。而且在大型网络上的实验表明算法能够有效应用于大规模复杂网络的划分。
Analysis of complex real-world networks has led to some significant discoveries in recent years. Studies have shown that community structure is prevalent in the real network topology from fields in nature, including various social, biological, Internet, economy, political and other networks. For instance, in social networks, individuals with common backgrounds or interests often found organizations together and communicate more frequently within the organizations than between them. Generally, a community, or the so called module in a network is considered as a node subset which has dense node-node connections internally but sparse connections externally. Revealing community structure of network is very important for analyzing function of network, discovering potential modes of network and forecasting action of network.
     At present, community detection is used to be transformed into optimization problem. And more and more swarm intelligence optimization algorithms have been used in community detection, but most of the algorithms optimized single objective. Although we achieved success on the theory and application of community detection based on single objective, but there are some problems, such as high complexity, the solution restrictions. In order to solve these problems, a natural method is to divide the community into as a multi-objective optimization problem.
     A multiobjective genetic algorithm to uncover community structure in complex network is studied.firstly, we constructed two objective functions,named community score and community fitness,which can identify inter-connections each other and have sparsely relationship.After returning non-dominated solution of a pair of objective functions by multiobjective genetic algorithm, the most suitable solution is selected through modularity and normalized mutual information of these two assessment indicators.This algorithm can discover hierarchical structure of network, consisting of a higher number of modules, are contained in solutions having a lower number of communities. The number of modules is automatically determined by the better tradeoff values of the objective functions. Finally, Experiments on synthetic and real life networks show that the algorithm successfully detects the network structure and it is competitive with state-of-the-art approaches. And the experiments also show that the algorithm can be effectively applied to large-scale complex network is divided in large networks.
引文
[1]汪小帆,李翔,陈关荣.复杂网络理论及其应用[M].北京:清华大学出版社.2006.
    [2]Watts DJ, Strogatz SH. Collective dynamics of Small-World networks[J]. Nature.1998, 393(6638):440-442.
    [3]Barabasi AL, Albert R. Emergence of scaling in random networks[J]. Science.1999, 286(5439):509-512.
    [4]Barabasi AL, Albert R, Jeong H, Bianconi G. Power-Law distribution of the World Wide Web[J]. Science.2000,287(5461):2115.
    [5]Albert R, Barabasi AL, Jeong H. The Internet's Achilles heel:Error and attack tolerance of complex networks[J]. Nature.2000,406(2115):378-382.
    [6]Girvan M, Newman M E J.Community structure in social and biological networks[J]. Proc. Natl. Acad. Sci.2002,9(12):7821-7826.
    [7]Palla G, Derenyi I, Farkas I, Vicsek T. Uncovering the overlapping community structures of complex networks in nature and society[J]. Nature.2005,435(7043):814-818.
    [8]Wilkinson D, Huberman B A. A method for finding communities of related genes[J].Proc. Natl. Acad. Sci.2004,101(Suppl.1):5241-5248.
    [9]Flake G W, Lawrence S, Giles C L, Coetzee F M. Self-Organization and identification of Web communities[J]. IEEE Computer.2002,35(3):66-71.
    [10]Kemighan B W, Lin S. An efficient heuristic procedure for partitioning graphs[J]. Bell System Technical Journal.1970,49(2):291-307.
    [11]Fiedler M. Algebraic connectivity of graphs[J]. Czech.Math.J.1973,23(2):298-305.
    [12]Newman M E J. Fast algorithm for detecting community structure in networks [J]. Phys. Rev. E.2004,69(6):066133.
    [13]Radicchi F, Castellano C, Cecconi F, et al. Defining and identifying communities in networks [J]. Proc.Natl. Acad. Sci.2004,101(9):2658-2663.
    [14]Wu F, Huberman B A. Finding communities in linear time:A physics approach[J]. Euro. Phys. JB.2003,38:331-338.
    [15]Clauset A, Newman M E J, Moore C. Finding community structure in very large networks [J]. Phys. Rev.E.2004,70(6):066111.
    [16]Donetti L,Munoz M A. Detecting network communities:A new systematic and efficient algorithm[J]. J. Stat. Mech. Theor. Exp.2004,P10012.
    [17]Guimera R, Amaral L A N. Functional cartography of complex metabolic networks[J]. Nature.2005,433(2):895-900.
    [18]Fortunato S, Barthelemy M. Resolution limit in community detection[J]. Proc. Natl. Acad. Sci.2007,104(1):36-41.
    [19]Duch J, Arenas A. Community detection in complex networks using extremal optimization[J]. Phys. Rev. E.2005,72(2):027104.
    [20]陈国强,王宇平.基于极值优化模块密度的复杂网络社区检测[J].华中科技大学学报 (自然版).2011,39(4):82-85.
    [21]Palla G, Derenyi I, Farkas I, Vicsek T. Uncovering the overlapping community structure of complex networks in nature and sociery[J].Nature.2005,435(7043):814-818.
    [22]彭佳扬,杨路明,王建新.一种发现交叠社团的快速层次化算法[J].中南大学学报(自然版).2010,41(5):834-839.
    [23]Watts D J, Strogatz.S H. Collective dynamics of Small-World networks[J]. Nature.1998, 393:440-442.
    [24]F. Radicchi, C. Castellano, F. Cecconi, V. Loreto, D. Parisi. Defining and identifying communities in networks [J]. Proc. Natl. Acad. Sci. USA.2004,101 (9):2658-2663.
    [25]A. Arenas, A. Diaz-Guilera. Synchronization and modularity in complex networks[J]. Eur. Phys. J.2007,143(1):19-25.
    [26]A. Lancichinetti, S. Fortunato, F. Radicchi. New benchmark in community detection.2008, arXiv:0805.4770v2 [physics.soc-ph] [Online]. Available:http://arxiv.org/pdf/0805.4770v2
    [27]P. Schuetz, A. Caflish. Multistep greedy algorithm identifies community structure in real-world and computer-generated networks[J]. Phys.Rev. E.2008,78 (2):026112.
    [28]Newman M E J, Girvan M. Finding and evaluating community structure in networks[J]. Phys.Rev.E.2004,69(2):026113.
    [29]V. D. Blondel, J.-L. Guillaume, R. Lambiotte, E. Lefevre. Fast unfolding of communities in large networks[J]. J. Stat. Mech. Theor. Exp.2008,2008 (10):1008.
    [30]Duch J, Arenas A. Community detection in complex networks using extremal optimization[J]. Phys. Rev. E.2005,72(2):027104.
    [31]J. M. Pujol, J. Bejar, J. Delgado. Clustering algorithm for determining community structure in large networks[J]. Phys. Rev. E.2006,74 (1):016107.
    [32]A. Lancichinetti, S. Fortunato, J. Kertesz. Detecting the overlapping and hierarchical community structure of complex networks[J]. New J. Phys.2009,11 (3):033015.
    [33]J. Du, E. E. Korkmaz, R. Alhajj, K. Barker. Novel clustering that employs genetic algorithm with new representation scheme and multiple objectives[C]. Proc.6th Int. Conf. DaWaK. 2004,219-228.
    [34]K. Faceli, A. C. P. L. F. de Carvalho, M. C. P. de Souto. Multiobjective clustering ensemble[J]. Int. J. Hybrid Intell. Syst.2007,4(3):145-156.
    [35]J. Handl, J. Knowles. An evolutionary approach to multiobjective clustering[J]. IEEE Trans. Evol. Comput.2007,11 (1):56-76.
    [36]N. Matake, T. Hiroyasu, M. Miki, T. Senda. Multiobjective clustering with automatic k-determination for large-scale data[C]. Proc.9th Int. Conf. GECCO.2007,861-868.
    [37]A. Mukhopadhyay, U. Maulik, S. Bandyopadhyay. Multiobjective genetic algorithm-based fuzzy clustering of categorical attributes[J]. IEEE Trans. Evol. Comput.2009, 13(5):991-1005.
    [38]Y. J. Park, M. S. Song. A genetic algorithm for clustering problems[C]. Proc.3rd Annu. Conf. Genet. Algorithms.1989,2-9.
    [39]D. Datta, J. R. Figueira, C. M. Fonseca, F. Tavares-Pereira. Graph partitioning through a multiobjective evolutionary algorithm:A preliminary study[C]. Proc.10th Int.Conf. GECCO. 2008,625-632.
    [40]G. Nildem Demir, A. Sima Uyar, S. Gunduz Oguducu. Multiobjective evolutionary clustering of Web user sessions:a case study in web page recommendation[J].Soft Comput.2010,14(6):579-597.
    [41]M. Ehrgott. Multicriteria Optimization[M].Berlin, Germany:Springer,2005.
    [42]C. Pizzuti. GA-NET:A genetic algorithm for community detection in social networks[C]. Proc.10th Int. Conf. PPSN.2008,1081-1090.
    [43]郑金华.多目标进化算法及其应用[M].北京:科学出版社.2007.
    [44]C. A. Coello Coello, G. B. Lamont, D. A. Van Veldhuizen. Evolutionary Algorithms for Solving Multiobjective Problems[M]. Berlin,Germany:Springer,2007.
    [45]K. Deb, Multi-Objective Optimization Using Evolutionary Algorithms[M]. Chichester, U.K.: Wiley,2001.
    [46]S. Fortunato.Community detection in graphs[J]. Phys. Rep.2010,486:75-174.
    [47]L. Danon, J. Duch, A. Arenas, A. D'iaz-Guilera. Comparing community structure identification[J]. J. Stat. Mech.2005,2005 (9):09008.
    [48]K. Deb, A. Pratap, S. Agarwal, T. Meyarivan. A fast and elitist multiobjective genetic algorithm:NSGA-Ⅱ[J]. IEEE Trans. Evol. Comput.2002,6(2):182-197.
    [49]周斌.复杂网络的社团结构挖掘及应用研究[D].南宁:广西师范学院.2010.
    [50]D. A. van Veldhuizen, G. B. Lamon. Multiobjective evolutionary algorithm test suites[C]. Proc. ACM SAC.1999,551-557.
    [51]E. Zitzler, L. Thiele. Multiobjective evolutionary algorithms:A comparative case study and the strength Pareto approach[J]. IEEE Trans.Evol. Comput.1999,3(4);257-271.
    [52]J. R. Schott. Fault tolerant design using single and multicriteria genetic algorithm optimization[M.S. thesis]. Dept. Aeronautics Astronautics, Massachusetts Instit. Technol., Cambridge,1995.
    [53]S. Bandyopadhyay, S. K. Pal. Multiobjective GAs, quantitative indices, and pattern classification[J]. IEEE Trans. Syst. Man Cybern. B Cybern.2004,34 (5):2088-2099.
    [54]杜楠.复杂网络中社区结构发现算法研究及建模[D].北京:北京邮电大学.2009.
    [55]HISTCITE [Online]. Available:http://www.garfield.library.upenn.edu/histcomp
    [56]S. Fortunato, C. Castellano. Community structure in graphs[J]. Physics.soc-ph.2007, 1141-1163.
    [57]B. H. Good, Y.-A. de Montjoye, A. Clauset. The performance ofmodularity maximization in practical contexts[J]. Phys. Rev. E.2010,81 (4):046106.