基于线性阈值模型与协同方法的社团检测算法研究
详细信息    本馆镜像全文|  推荐本文 |  |   获取CNKI官网全文
摘要
21世纪,随着信息技术的普及发展以及各种社交网站的流行,研究人员可以便捷地收集复杂社会网络数据集,这极大程度上促进了人们对复杂社会网络的研究分析。研究表明复杂社会网络具有小世界性、无尺度性、模块性等特性。社团目前尚无明确统一的定义,一般认为是满足社团内部节点联系紧密而社团间节点联系稀疏的节点集合。社团检测算法一直是复杂社会网络中的一大研究热点,主要研究如何从复杂社会网络中发现高质量的社团结构。清晰的社团结构对研究分析复杂社会网络问题有极大的帮助。
     本文的主要工作是社团检测算法研究,文中共提出了三个社团检测相关算法。第一,本文提出一种基于随机游走改进的节点相似性计算方法(RWS),可用于根据网络的拓扑图及权重信息快速计算网络中节点的相似性:第二,本文提出一种基于线性阈值模型的社团检测算法(LTCD),该算法通过初始社团不断激活边界节点的方式来扩展社团;第三,本文还提出一种基于协同方法的社团检测算法(CMCD),该算法旨在提高社团检测算法的性能。这三个算法出发点和侧重点各不相同,RWS是用来计算节点相似性,LTCD是一种简单的社团检测算法,而CMCD是一种提升社团结构检测质量的框架算法。
     本文在多个真实和人工合成社会网络数据集上进行实验,从模块度、准确率和标准互信息等方面对实验结果进行分析。实验结果表明本文提出的算法能得到较高质量的社团结构。
Since the21st century, with the development and popularization of information technology, and a variety of popular social networking sites, researchers can easily collect the complex social network datasets, which has largely inspired researchers to study complex social networks. Studies have shown that complex social networks have certain features such as small-world, scale-free and modularity. Generally, community is identified as a set of nodes whose links inside are more densely than links outside. Community detection algorithm which has always been a hot research topic, mainly studies how to find the high-quality of the community structure in complex social network. Good community structure is useful for complex social network analysis.
     Community detection is the key issue of finding the structure and understanding the function of many natural and artificial systems. In this paper, the community detection algorithm is the main study which is we focused on, we have achieved some results. First, we present a similarity calculation method based on random walks (RWS), which is used to quickly calculate the nodes' similarity according to the topology and weight information of the network. Second, a community detection algorithm (LTCD) based on the linear threshold model is proposed, which continuously activated the boundary node to extend the initial community. Finally, the paper also proposes a method based on cooperative method (CMCD), which is used to optimize the community structure found by community detection algorithm and improve its stability.
     In this paper, experiments on some real and artificial synthetic social network datasets prove that three algorithms mentioned above are feasible and can detect high-quality community structure.
引文
[1]林聚任.社会网络分析:理论、方法与应用[M].北京:北京师范大学出版社,2009.
    [2]Newman M E J, Girvan M. Finding and evaluating community structure in nerworks[J]. Physical review E,2004.69(2):026113.
    [3]Jiawei Han, Micheline Kamber. Data Mining Concepts and Techniques[M].2nd edition. BeiJing:China Machine Press,2007.
    [4]Girvan M, Newman M E J. Community structure in social and biological networks[J]. Proceedings of the National Academy of Sciences,2002,99(12):7821-7826.
    [5]Newman M E J. Fast algorithm for detecting community structure in networks[J]. Physical review E,2004,69(6):066133.
    [6]Clauset A, Newman M E J, Moore C. Finding community structure in very large networks[J]. Physical review E,2004,70(6):066111.
    [7]Raghavan U N, Albert R, Kumara S. Near linear time algorithm to detect community structures in large-scale networks[J]. Physical Review E,2007,76(3):036106.
    [8]Pothen A, Simon H D, Liou K P. Partitioning sparse matrices with eigenvectors of graphs[J]. SIAM.Journal on Matrix Analysis and Applications,1990,11(3):430-452.
    [9]Long B. Wu X, Zhang Z, et al. Community learning by graph approximation[C]//Data Mining,2007. ICDM 2007. Seventh IEEE International Conference on. IEEE,2007: 232-241.
    [10]Barber M J, Clark J W. Detecting network communities by propagating labels under constraints[J]. Physical Review E,2009,80(2):026129.
    [11]Liu X, Murata T. Advanced modularity-specialized label propagation algorithm for detecting communities in networks[J]. Physica A:Statistical Mechanics and its Applications,2010, 389(7):1493-1500.
    [12]Herbiet G J, Bouvry P. SHARC:community-based partitioning for mobile ad hoc networks using neighborhood similarity[C]//World of Wireless Mobile and Multimedia Networks (WoWMoM),2010 IEEE International Symposium on a. IEEE,2010:1-9.
    [13]Vasudevan M, Balakrishnan H, Deo N. Community discovery algorithms:an overview[J]. Congressus Numerantium,2009,196:127-142.
    [14]Steinhaeuser K, Chawla N V. Identifying and evaluating community structure in complex networks[J]. Pattern Recognition Letters,2010,31(5):413-421.
    [15]Fortunato S, Barthelemy M. Resolution limit in community detection[J]. Proceedings of the National Academy of Sciences,2007,104(1):36-41.
    [16]王林,戴冠中,赵焕成.一种新的评价社区结构的模块度研究[J].Computer Engineering, 2010,36(14).
    [17]Ana L N F, Jain A K. Robust data clustering[C]//Computer Vision and Pattern Recognition, 2003. Proceedings.2003 IEEE Computer Society Conference on. IEEE,2003,2: 11-128-11-133 vol.2.
    [18]Danon L, Diaz-Guilera A, Duch J, et al. Comparing community structure identification[J]. Journal of Statistical Mechanics:Theory and Experiment,2005,2005(09):P09008.
    [19]Leicht E A, Holme P, Newman M E J. Vertex similarity in networks[J]. Physical Review E, 2006,73(2):026120.
    [20]Liu Z, Li P, Zheng Y, et al. Community detection by affinity propagation[R]. Technical Report,2008.
    [21]van Dongen S M. Graph clustering by flow simulation[J].2000.
    [22]Pons P, Latapy M. Computing communities in large networks using random walks[M]//Computer and Information Sciences-ISCIS 2005. Springer Berlin Heidelberg, 2005:284-293.
    [23]Zachary W W. An information flow model for conflict and fission in small groups[J]. Journal of anthropological research,1977:452-473.
    [24]Lusseau D, Schneider K., Boisseau O J, et al. The bottlenose dolphin community of Doubtful Sound features a large proportion of long-lasting associations[J]. Behavioral Ecology and Sociobiology,2003,54(4):396-405.
    [25]Lusseau D, Newman M E J. Identifying the role that animals play in their social networks[J]. Proceedings of the Royal Society of London. Series B:Biological Sciences,2004,271(Suppl 6):S477-S481.
    [26]Chen D, Shang M, Lv Z, et al. Detecting overlapping communities of weighted networks via a local algorithm[J]. Physica A:Statistical Mechanics and its Applications,2010,389(19): 4177-4187.
    [27]Kempe D. Kleinberg J, Tardos E. Maximizing the spread of influence through a social network[C]//Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining. ACM,2003:137-146.
    [28]Hartigan J A. Direct clustering of a data matrix[J]. Journal of the american statistical association,1972,67(337):123-129.
    [29]Pedrycz W. Collaborative fuzzy clustering[J]. Pattern Recognition Letters,2002,23(14): 1675-1686.
    [30]Kashef R, Kamel M S. Cooperative clustering[J]. Pattern Recognition,2010,43(6): 2315-2329.

© 2004-2018 中国地质图书馆版权所有 京ICP备05064691号 京公网安备11010802017129号

地址:北京市海淀区学院路29号 邮编:100083

电话:办公室:(+86 10)66554848;文献借阅、咨询服务、科技查新:66554700