摘要
重叠社区发现是复杂网络研究的重要课题.提出一种基于标签传播的重叠社区发现算法.首先利用标签传播算法得到初始无重叠社区划分结果,之后通过设计新的重叠节点识别算法确定重叠节点,最后再根据重叠节点的识别结果对社区进行合并从而得到最终的重叠社区划分结果.该算法克服了已有算法重叠节点占比过大的弊端.为验证算法的有效性,在LFR人工数据集、3个标准公开测试集以及真实的大豆基因共表达网络上进行实验,并与已有算法进行对比.实验结果表明,该算法性能明显优于对比算法,极大地改善了重叠节点比重过大问题.
Overlapping community identification is an important problem in complex network study.An overlapping community indentification algorithm based on label propagation is proposed.Firstly,label propagation algorithm is used to achieve the initial non-overlapping community structure.And then,new overlapping node detection algorithm is proposed to identify overlapping nodes.At last,according to the identification results of overlapping nodes,the communities are merged to get the final result of overlapping community partition.The proposed algorithm overcomes the disadvantages of the oversize overlapped nodes in existing algorithms.To verify the effectiveness of the algorithm,the experiments and comparison with existing algorithms are carried out on LFR artificial datasets,three benchmark open test datasets,and real soybean gene co-expression networks.The experimental result shows that this algorithm is clear superior to the existing algorithms,overwhelmingly improves the problem of great proportion of overlapping nodes.
引文
[1]WATTS D J,STROGATZ S H.Collective dynamics of′small-world′networks[J].Nature,1998,393(6684):440-442.
[2]ADAMIC L A,HUBERMAN B A,BARABSI A L,et al.Power-law distribution of the World Wide Web[J].Science,2000,287(5461):2115.
[3]GIRVAN M,NEWMAN M E J.Community structure in social and biological networks[J].Proceedings of the National Academy of Sciences of the United States of America,2002,99(12):7821-7826.
[4]RADICCHI F,CASTELLANO C,CECCONI F,et al.Defining and identifying communities in networks[J].Proceedings of the National Academy of Sciences of the United States of America,2004,101(9):2658-2663.
[5]杨博,刘大有,LIU Jiming,等.复杂网络聚类方法[J].软件学报,2009,20(1):54-66.YANG Bo,LIU Dayou,LIU Jiming,et al.Complex network clustering algorithms[J].Journal of Software,2009,20(1):54-66.(in Chinese)
[6]XIE Jierui,KELLEY S,SZYMANSKI B K.Overlapping community detection in networks:The state-of-the-art and comparative study[J].ACM Computing Surveys,2013,45(4):43.
[7]PALLA G,DERNYI I,FARKAS I,et al.Uncovering the overlapping community structure of complex networks in nature and society[J].Nature,2005,435(7043):814-818.
[8]AHN Y Y,BAGROW J P,LEHMANN S.Link communities reveal multiscale complexity in networks[J].Nature,2010,466(7307):761-764.
[9]EVANS T S,LAMBIOTTE R.Line graphs,link partitions,and overlapping communities[J].Physical Review E—Statistical,Nonlinear,and Soft Matter Physics,2009,80(1):016105.
[10]BAUMES J,GOLDBERG M,KRISHNAMOORTHY M,et al.Finding communities by clustering agraph into overlapping subgraphs[C]//IADIS International Conference on Applied Computing 2005.Algarve:IADIS,2005:97-104.
[11]LANCICHINETTI A,FORTUNATO S,KERTSZ J.Detecting the overlapping and hierarchical community structure in complex networks[J].New Journal of Physics,2009,11(3):033015.
[12]ESQUIVEL A V,ROSVALL M.Compression of flow can reveal overlapping-module organization in networks[J].Physical Review X,2011,1(2):021025.
[13]GREGORY S.Finding overlapping communities in networks by label propagation[J].New Journal of Physics,2010,12(10):103018.
[14]RAGHAVAN U N,ALBERT R,KUMARA S.Near linear time algorithm to detect community structures in large-scale networks[J].Physical Review E,2007,76(3):036106.
[15]XIE Jierui,SZYMANSKI B K,LIU Xiaoming.SLPA:Uncovering overlapping communities in social networks via a speaker-listener interaction dynamic process[C]//Proceedings—11th IEEE International Conference on Data Mining Workshops,ICDMW 2011.Piscataway:IEEE,2011:6137400.
[16]GAITERI C,CHEN Mingming,SZYMANSKI B,et al.Identifying robust communities and multicommunity nodes by combining top-down and bottom-up approaches to clustering[J].Scientific Reports,2015,5:16361.
[17]LANCICHINETTI A,FORTUNATO S.Benchmarks for testing community detection algorithms on directed and weighted graphs with overlapping communities[J].Physical Review E—Statistical,Nonlinear,and Soft Matter Physics,2009,80(1):016118.
[18]SHEN Huawei,CHENG Xueqi,CAI Kai,et al.Detect overlapping and hierarchical community structure in networks[J].Physica A:Statistical Mechanics and Its Applications,2009,388(8):1706-1712.
[19]ROSSI R A,AHMED N K.The network data repository with interactive graph analytics and visualization[C]//Proceedings of the 29th AAAI Conference on Artificial Intelligence,AAAI 2015and the 27th Innovative Applications of Artificial Intelligence Conference,IAAI 2015.Austin:AI Access Foundation,2015:4292-4293.
[20]LESKOVEC J,KLEINBERG J,FALOUTSOS C.Graph evolution:Densification and shrinking diameters[J].ACM Transactions on Knowledge Discovery from Data,2006,1(1):1217301.
[21]PANTHEE D R,YUAN J S,WRIGHT D L,et al.Gene expression analysis in soybean in response to the causal agent of Asian soybean rust(Phakopsora pachyrhizi Sydow)in an early growth stage[J].Functional&Integrative Genomics,2007,7(4):291-301.
[22]GREGORY A W,ROAYAEI J A,QUIONES O A.A microarray analysis for differential gene expression in the soybean genome using Bioconductor and R[J].Briefings in Bioinformatics,2007,8(6):415-431.
[23]SCHNEIDER K T,VAN DE MORTEL M,BANCROFT T J,et al.Biphasic gene expression changes elicited by phakopsora pachyrhizi in soybean correlate with fungal penetration and haustoria formation[J].Plant Physiology,2011,157(1):355-371.
[24]VAN DE MORTEL M,RECKNOR J C,GRAHAM M A,et al.Distinct biphasic mRNA changes in response to Asian soybean rust infection[J].Molecular Plant-Microbe Interactions:MPMI,2007,20(8):887-899.