基于属性图的社交网络建模与态势分析理论研究
详细信息    本馆镜像全文|  推荐本文 |  |   获取CNKI官网全文
摘要
随着互联网的迅速发展,对基于Web的社交网络的研究引起了越来越多的研究者的关注,并取得了一批研究成果。但目前对社交网络的研究大多数是基于典型的图论理论,忽略了顶点和边的属性及其关联关系,不能很好地反映出Web上社交网络的动态性、隐含模糊性、信息粗糙性、不确定性以及关系多维性的特征。本文针对社交网络的复杂特征,研究构建社交网络数学模型的理论基础,探索对社交网络上个体及其结构进行态势分析的有效方法,从而达到全面、准确地了解多维不确定社交网络的状态,对社交网络进行高效挖掘、态势分析的目的。
     首先,针对社交网络中个体和链接均具有属性的特征,在传统图论的基础上,构建一种描述复杂社交网络的新结构——属性图,并对属性图的基本性质进行了研究。在属性图的基础上,运用粗糙集理论,研究构建了粗糙属性图模型,以描述社交网络中结点和链接属性的不完备性和关系的多样性。进一步结合社交网络的动态性,融合S-粗糙集理论,构建S-粗糙属性图模型,分析S-粗糙属性图、粗糙属性图、属性图以及传统图之间的关系。
     其次,基于属性图模型,研究社交网络进行图查询、图搜索时的子图匹配问题,提出了粗匹配属性子图的随机游走判定算法;基于粗糙属性图模型,提出了粗糙中心区挖掘算法;基于S-粗糙属性图,定义了S-图精度和粗糙度,并证明了迁移函数与图粗糙度的关系,提出了一种社交网络的动态分析方法;通过实例验证了粗糙属性图和S-粗糙属性图在实际应用中的有效性。
     再次,在已构建的社交网络数学模型基础上,考虑社交网络中结点及其关系的多维性、不确定性等特征,从复杂系统思想出发,运用集对分析方法构建集对社交网络分析模型,提出λ网络中心区和α关系社区等概念,并设计了相应的静态和动态挖掘算法,通过实验验证方法的有效性和合理性。
     最后,拓展集对势为广义集对势,构建对应的态势级别表,对社交网络结点关系与个体关系强度进行态势分析;拓展联系熵为广义联系熵,定义了社交网络关系社区的联系熵,提出了一种对社交网络态势及稳定性进行分析的方法。基于广义集对势同时考虑社交网络上实体间结点和边属性存在的差异性,提出了一种基于属性-关系的网络实体相似度计算方法,并设计基于此方法的网络社团检测算法,通过实例验证方法的合理性。
With the constant growth of internet, it has great significance to study the socialnetwork based on Web which has attracted more and more wide attention and anumber of research results have been made. However, the most researches on socialnetwork are based on the typical graph theory which ignores the attribute of node,edge and their relationship, so it could not well reflect the characteristic of dynamic,implicit fuzziness, information roughness and uncertainty and multidimensionalrelation of social network on the Web.In allusion to the complex features of socialnetwork, the theory foundation of modeling the social network is studied and theefficient method for analyzing the trend of the entity on the social and its structure,so as to achieve the purposes that can understand the state of the multi-dimensionaluncertain social networks accurately, and mining the social network efficiently.
     First, in allusion to the feature that individual and link have attributes in thesocial network, on the basis of traditional graph theory, it is built a new structure todescribe the complex social network—“attribute graph”. The basic property of theattribute graph is studied. On the basis of the attribute graph theory, the model ofrough attribute graph is built by integrating the rough set so as to describe theincomplete nodes and links relationship in the social network. Further more,considering the dynamic of nodes and links relationship in the social network, theS-rough attribute graph model is built by integrating the S-rough set theory. Therelationship between the S-rough attribute graph and rough attribute graph isdemonstrated.
     Secondly, it studies the sub graph matching problem in graph query and graphsearch of social networks based on the attribute graph.The decision algorithms areput forward. Based on the rough graph, the rough center area is defined and itsmining algorithm is designed which is proved efficient by an example.Based on theS-rough attribute graph, the relation between the transfer function and the graphroughness is proved so that there will have a simple and convenient method for analyzing the social network dynamically.
     Thirdly, On the basis of mathematical model constructed in the social network,considering the characteristic of multi-dimensional, uncertainty of nodes and theirrelationships in the social network, the model of set pair social network analysis isbuilt by applying the set pair analytical method. In view of the model, the conceptsof λ network central community and α relationship community are proposed.Further more, the mining algorithms of static and dynamic are designed which isdemonstrated the effectiveness and rationality by the experiments.
     Finally, the set pair potential is expanded into the generalized set pair potentialand the corresponding trend level table is built. The relation entropy is expanded intothe generalized relation entropy. The trends of the individual, relationship and thewhole social network are analyzed.In view of the generalized set pair potential, themethod is given how to calculate the network entity similarity based on attributesand relationships. It is demonstrated the method is reasonable by an example.
引文
[1] Scott J.Social Network Analysis:a Handbook[M].London:Sage Publications,2000:5-20.
    [2]谢幸,於志文.移动社交网络[J].中国计算机学会通讯,2012,8(5):6-7.
    [3]马帅,李佳,刘旭东,怀进鹏.图查询:社会计算时代的新型搜索[J].中国计算机学会通讯,2012,8(11):26-31.
    [4] Infographic: Instagram Statistics2012[EB/OL].2012-5-13. http://www.digitalbuzzblog. com/infographic-instagram-stats/.
    [5] Twitter has105,779,710Registered Users, Adding300K A Day [EB/OL].2010-4-14.http://techcrunch.com/2010/04/14/twitter-has-105779710-registered-users-adding-300.
    [6] Maria Giatsoglou, Symeon Papadopoulos and Athena Vakali.Massive Graph Management forthe Web and Web2.0, New Directions in Web Data Management[M].Springer,2011:20-35.
    [7] Mark Newman, Albert-László Barabási, and DuncanJ. Watts.The Structure and Dynamics ofNetworks [M]. Princeton: Princeton University Press,2006:25-50.
    [8] Eytan Adar and Christopher Re. Managing Uncertainty in Social Networks [J], IEEE DataEng. Bull,2007,30(2):15-22.
    [9] Gueorgi Kossinets. Effects of missing data in social networks [J]. Social Networks,2006(28):247-268.
    [10] Flake G, Lawrence S, Giles C. Efficient identification of Web communities[C].//Proceedingsof the6th ACM SIGKDD International Conference on Knowledge Discovery and DataMining,2000:150-160.
    [11]祝园园,秦璐,于旭.图匹配问题的应用和研究[J].中国计算机学会通讯,2012,8(11):21-25.
    [12] Boyd D M, Ellison N B.Socical network sites: definition, history and scholarship [J].Journalof Computer-Mediated Communication,2007,13(1):210-230.
    [13]胡军,王国胤.粗糙集的不确定性度量准则[J].模式识别与人工智能,2010,23(5):606-615.
    [14] Liang J Y,Shi Z Z.The information entropy,rough entropy and knowledge granulation inrough set theory[J].International Journal of Uncertainty,Fuzziness and Knowledge-BasedSystem,2004,12(1):37-46.
    [15]杨琴.网络拓扑模型的演化机制及抗毁性研究[D].郑州:解放军信息工程大学硕士学位论文,2009:28-36.
    [16] Whittaker J.Graphical Models in Applied Multivariate Statistics [M].New York,John Wileyand Sons Inc,1990:30-50.
    [17]张硕,高宏,李建中,邹兆年.不确定图数据库中高效查询处理[J].计算机学报,2009,32(10):2065-2079.
    [18] Asrhana S, King O D, Gibbons F D, etc.Predicting protein complex membership usingprobabilistic network reliability [J].Genome Research,2004,14(6):1170-1175.
    [19]张旭,何向南,金澈清,周傲英.面向不确定图的k最近邻查询[J].计算机研究与发展,2011,48(10):1871-1878.
    [20]李文凤,彭智勇,李德毅.不确定性top-k查询处理[J].软件学报,2012,23(6):1542-1560.
    [21] Hintsanen P.The most reliable subgraph problem[C].//Proceedings of the11thEuropeanConference on Principles and Practice of Knowledge Discovery in Databases.Berlin:Springer-Verlag,2007:471-478.
    [22] De Raedt L,Kersting K,Kimmig A,etc.Compressing probabilistic prolog programs[J].MachineLearning,2008,70(2/3):151-168.
    [23] Hintsanen P,Toivonen H.Finding reliable subgraphs from large probabilistic graphs[J].DataMining and Knowledge Discovery,2008,17(1):3-23.
    [24] Zou Zhao-nian,Li Jian-zhong,Gao Hong,etc.Frequent subgraph pattern mining on uncertaingraph data[C].//Proceedings of the18thACM Conference on Information and KnowledgeManagement.New York:ACM Press,2009:583-592.
    [25]韩蒙,张炜,李建中.RANKING:一种高效的不确定图K-极大频繁模式挖掘算法[J].计算机学报,2010,33(8):1387-1395.
    [26]袁野,王国仁.基于阈值的概率图可达查询[J].计算机学报,2010,33(12):2219-2228.
    [27] He Tong,Shi Kaiquan.Application of Rough Graph in Relationship Mining[J].Journal ofSystems Engineering and Electronics,2008,19(4):742-747.
    [28]何童,史开泉.粗糙集代数关系的图结构分析[J].系统工程与电子技术,2008,30(9):1679-1682.
    [29]何童,卢昌荆,史开泉.粗糙图与它的结构[J].山东大学学报(理学版),2006,41(6):46-50.
    [30]何童.粗糙图与它的应用[D].山东大学博士学位论文,2008:15-30.
    [31] He Tong,Shi Kaiquan Weighted Rough Graph and Its Application[C],//Proceedings of SixthIEEE International Conference on Intelligent Systems Design and Applications,Jinan,2006,3(1):486-491.
    [32]罗乐.基于核心成员识别的网络社区发现及跟踪方法[D].哈尔滨:哈尔滨工业大学硕士学位论文,2010:23-35.
    [33]曾王辉.微博网络的社区发现研究[D].昆明:云南大学硕士学位论文,2012:16-35.
    [34] Hanneman R, Riddle M. Introduction to social network methods [D]. Riverside, CA,University of California,2005:20-30.
    [35] Dourisboure Y, Geraci F, Pellegrini M. Extraction and classification of dense communities inthe Web[C].//Proceedings of the16th International Conference on World Wide Web,2007:461-470.
    [36] Flake G, Lawrence S, Giles C. Efficient identification of Web communities[C]//Proceedingsof the6th ACM SIGKDD International Conference on Knowledge Discovery and DataMining,2000.
    [37] Newman M. Finding community structure in networks using the eigenvectors of matrices [J].Physical Review E,2006,74(3):1-22.
    [38] Pei J, Zhou B, Tang Z. A spam city approach to Web spam detection[C].//Proceedings of the2008SIAM International Conference on Data Mining (SDM’08), Atlanta, USA,2008:277-288.
    [39]才华,周春光等.动态社会网络中的社区挖掘算法研究[J].吉林大学学报(信息科学版),2008,26(4):380-385.
    [40]魏绪仲,唐常杰等.CCDCD:基于图密度的动态约束社团核心挖掘方法[J].计算机科学与探索,2009,3(3):309-320.
    [41]韩毅,王智慧,贾焰.社会网络中面向多准则约束的社区发现算法[J].计算机科学与探索,2010,4(8):683-691.
    [42]高琰.基于多特征的Web社区发现关键技术研究[D].长沙:中南大学博士学位论文,2007:19-44.
    [43]淦文燕,赫南,李德毅等.一种基于拓扑势的网络社区发现方法[J].软件学报,2009,20(8):2241-2253.
    [44]方平,郭正彪,李芝棠等.基于共同好友数的在线社会网络社区发现算法[J].计算机科学与探索,2012,5::456-453.
    [45]林友芳,王天宇,唐锐等.一种有效的社会网络社区发现模型和算法[J].计算机研究与发展,2012,49(2):337-345.
    [46]马瑞新,邓贵仕.基于角色划分的动态社区挖掘算法研究[J].计算机科学,2012,39(9):60-64.
    [47]林旺群,邓镭,丁兆云等.一种新型的层次化动态社区并行计算方法[J].计算机学报,2012,35(8):1712-1725.
    [48] S.Fortunato, V.Latora, M.Marchiori. A method to find community structures based oninformation centrality [J].Phys Rev E,2004,70(5):056104.
    [49] V.Latora, M.Marchiori.Efficient behavior of small-world networks [J].Phys Rev Lett,2002,87(9):198701.
    [50]曹红.Blog社区的发现与演变追踪技术研究[D].哈尔滨:哈尔滨工业大学硕士学位论文,2009:38-48.
    [51]郭瑞,钟宁,李文彬.基于图熵的社会网络演化分析[J].模式识别与人工智能,2009,22(3):360-365.
    [52]黄静,殷保群,巫续敏.Gnutella网络上的动态社区结构分析[J].小型微型计算机系统,2012,33(8):1655-1659.
    [53]王飞跃,曾大军,曹志冬.应急2.0:万维社会媒体及群体态势建模与分析[J].中国应急管理,2009,1:21-25.
    [54]阳德清,肖仰华,汪卫.基于统计模型的社会网络群体关注度的分析与预测[J].计算机研究与发展,2010,47(Suppl.):378-384.
    [55]赵克勤.集对分析及其初步应用[M].杭州:浙江科学技术出版社,2000:11-37,68-78.
    [56]黄兵,钟斌,周献中.改进集对粗集模型[J].计算机工程与应用,2004,2:82-84.
    [57]黄兵,周献中.基于集对分析的不完备信息系统粗糙集模型[J].计算机科学,2002,29(9专刊):1-3.
    [58]刘富春.基于限制容差关系的集对粗糙集模型[J].计算机科学,2005,32(6):124-128.
    [59]刘富春.基于集对分析方法的不完备信息系统的扩充粗糙集模型[J].计算机科学,2006,33(2):169-172.
    [60]朱红宁,张斌.特征加权集对分析方法[J].计算机科学,2009,36(9):211-214.
    [61] Pawlak.Rough sets[J]. International Journal of Computer and Information Sciences,1982,11(5):341-353.
    [62]马建敏,张文修,朱朝晖.基于信息量的序信息系统的属性约简[J].系统工程理论与实践,2010,30(9):1679-1683.
    [63]胡军,王国胤.粗糙集的不确定性度量准则[J].模式识别与人工智能,2010,23(5):606-615.
    [64]张志飞,苗夺谦.基于粗糙集的文本分类特征选择算法[J].智能系统学报,2009,4(5):453-457.
    [65] Heckerman D. Bayesian networks for data mining [J].Data Mining and Knowledge Discory,1997,1(1):79-119.
    [66] Friedman N, Murphy K, Russell S. Learning the structure of dynamic probabilisticnetworks[C].//Proceedings of the14thInternational Conference on Uncertainty in ArtificialIntelligence, Madison,1998:139-147.
    [67]王双成,林士敏,陆玉昌.用Bayesian网络处理具有不完整数据的问题分析[J].清华大学学报(自然科学版),2000,40(9):65-68.
    [68]杨士强,孙立峰,崔鹏.Web社会网络分析[J].中国计算机学会通讯,2011,7(2):52-58.
    [69]马建敏,张文修,朱朝晖.基于信息量的序信息系统的属性约简[J].系统工程理论与实践,2010,30(9):1679-1683.
    [70]程玉胜,张佑生,胡学钢.基于边界域的知识粗糙熵与粗集粗糙熵[J].系统仿真学报,2007,19(9):2008-2011.
    [71] Liang J Y,Shi Z Z.The information entropy,rough entropy and knowledge granulation inrough set theory[J].International Journal of Uncertainty,Fuzziness and Knowledge-BasedSystem,2004,12(1):37-46.
    [72]史开泉,崔玉泉.S-粗集和它的一般结构[J].山东大学学报(理学版),2002,37(6):471-474.
    [73]王晶晶,史开泉,雷英杰.粗集、 S-粗集、函数S-粗集及其关系定理[J].计算机科学,2007,34(6):156-157.
    [74]王辉,施佺,徐波,徐晓旻.基于Web社会网络的结点间关系多样性分析[J].解放军理工大学学报(自然科学版),2011,12(6):593-598.
    [75] Zhang L., Wu J., Zhuang Y., Zhang Y., and Yang C. Review-oriented metadata enrichment:a case study[C].//Proceedings of JCDL, Austin, TX, USA,2009:173-182.
    [76] Zhu X., Goldberg A., Van Gael J., and Andrzejewski D. Improving diversity in rankingusing absorbing random walks[C].//Proceedings of NAACL HLT, Rochester, New York,USA,2007:97–104.
    [77]徐晓华.图上的随机游走学习[D].南京:南京航空航天大学博士学位论文,2008:19-32.
    [78]雷钰丽,李阳等.基于权重的马尔可夫随机游走相似度度量的实体识别方法[J].河北师范大学学报(自然科学版),2010,34(1):26-30.
    [79]郑伟,王朝坤,刘璋,王建民.一种基于随机游走模型的多标签分类算法[J].计算机学报.2010,38(8):1418-1425.
    [80] Albert R, Barabási A. Statistical mechanics of complex network [J]. Review of ModernPhysics,2002,74(1):47-97.
    [81]李宜敏,罗爱民等.集对分析联系度聚类方法在信息分类中的应用[J].火力与指挥控制,2008,33(8):145-148.
    [82]黄德才,张丽君等.基于a+bi型联系数的不确定网格静态调度算法[J].计算机科学,2007,34(8):126-129,179.
    [83]童英伟,刘志斌等.集对分析在河流水质评价中的应用[J].安全与环境学报,2008,8(6):84-86.
    [84]吴开亚,金菊良等.集对分析聚类预测方法在区域生态足迹趋势预测中的应用[J].武汉大学学报(信息科学版),2008,33(9):973-977.
    [85]赵克勤,黄德才.一种基于集对分析的同异反海量数据挖掘[C].中国人工智能学会第10届全国学术年会,2003:1468-1470.
    [86]张春英,郭景峰.概率决策空间上的SPA可能度格序结构[J].广西师范大学学报(自然科学版),2008,26(4):45-48.
    [87] Deng Cai, Zheng Shao, Xiaofei He,Xifeng Yan,Jiawei Han, Mining Hidden Community inHeterogeneous Social Networks[C]. LinkKDD’05, Chicago, IL, USA,2005:1-8.
    [88]王金龙,徐从富,骆国靖.面向异质关系的社区挖掘[J].计算机应用,2007,27(12):3016-3018.
    [89] R Kumar,J Novak,A Tomkins. Structure and evolution of online socialnetworks[C].Proceedings of the2006International Conference on Knowledge Discoveryand Data Mining(KDD’06),Philadelphia:Research Track Poster,2006:611-617.
    [90] Asur S’Parthasarathy S’Ucar D.An event-based framework for characterizing theevolutionary behavior of interaction graphs[C].In KDD07: Proceedings of the13th ACMSIGKDD International Conference on Knowledge Discovery and Data Mining’ACM.NewYork.NY.USA.2007:913-921.
    [91]单波,姜守旭,张硕,高宏,李建中.IC:动态社会关系网络社区结构的增量识别算法[C].第26届中国数据库学术会议,南昌,2009:228-236.
    [92]黄静,殷保群,巫续敏.Gnutella网络上的动态社区结构分析[J].小型微型计算机系统,2012,33,8:1655-1659.
    [93]于卓尔,周春光,杨滨,等.基于静态和动态的社会网络挖掘算法[D].吉林大学学报(理学版),2008,46(5):897-902.
    [94]李德顺.基于广义集对分析的系统危险性评价研究[D].沈阳:东北大学博士学位论文,2010:80-88.
    [95] Cheng Qiyue,Qiu Wanhua, Liu Xiaofeng. Relation Entropy and Transferable Entropy Thinkof Aggregation on Group Decision Making[J]. Journal of Systems Science and SystemsEngineering,2002,11(1):13-18.
    [96] Liu Hongwei. Community detection by affinity propagation with various similaritymeasures [C].//4th International Joint Conference on Computational Sciences andOptimization, CSO2011:182-186.
    [97] Hu Yanqing,Li Menghui,Zhang Peng,etc. Community detection by signaling on complexnetworks[J].Phys Rev E,2008,78.
    [98] Guo Jingfeng,Hao Dandan,Zheng Chao. An Algorithm Based on Attributed RelationalGraph for Name Disambiguation [J], ICIC Express Letters,2010,5(1):113-118.
    [99] Xiang B,Chen B H,Zhou T.Finding community structure based on subgraphsimilarity[J].Studies in Computational Intelligence,2009,207(5):73-81.
    [100] Claset A,Newman M E J,Moore C.Finding community structure in very largenetworks[J].Physical Review E,2004,70(6):096-111.
    [101] Pan Y,Li D H,Liu GJ,etc. Detecting community structure in complex networks via nodesimilarity[J].Physica A,2010,389(14):2849-2857.
    [102] Leicht E A,Holme P,Newman M E J. Vertex similarity in networks[J]. Physical ReviewE,2006,73(2):026120.
    [103] Guo Jingfeng, Zhang Chunying, Chen Xiao. Attribute Graph and Its Structure[J]. ICICExpress Letters,2011.l5(8A):2611-2616.
    [104]张春英,梁瑞涛,刘璐.集对社会网络图分析模型及其应用[J].河北理工大学学报,2011,33(3):99-103.
    [105]杜方.复杂网络系统间相似性识别及其应用[D].杭州:浙江大学博士学位论文,2010:38-47.
    [106] G.Jeh, J.Widom.SimRank: A measure of structural-context similarity[C].Proceedings ofthe8thACM SIGKDD International Conference on Knowledge Discovery and DataMining, New York: Association of Computing Machinery,2002:538-543.
    [107] Weiss R,VelezB,Sheldon M.HyPursuit: a hierarchical network search engine that exploitscontent-link hypertext clustering[C].//Proceedings of the7tthACM conference onHypertext, New York: ACM Press,1996:180-193.
    [108] Dharmendra S. Modha, Dharmendra S. Modha, W. Scott Spangler, W. ScottSpangler.Clustering hypertext with applications to Web searching[C].//Proceedings of the11th ACM Conference on Hypertext and Hypermedia,2000:143-182
    [109]吴玲玉,高学东,武森.基于属性-关系综合相似度的聚类算法[J].计算机应用研究,2011,28(1):44-47.

© 2004-2018 中国地质图书馆版权所有 京ICP备05064691号 京公网安备11010802017129号

地址:北京市海淀区学院路29号 邮编:100083

电话:办公室:(+86 10)66554848;文献借阅、咨询服务、科技查新:66554700