基于随机游走的语义结构查询算法

英文篇名：A SEMANTIC STRUCTURE QUERY ALGORITHM BASED ON RANDOM WALK
作者：朱玉 ; 游进国 ; 付子玉
英文作者：Zhu Yu;You Jinguo;Fu Ziyu;College of Information Engineering and Automation, Kunming University of Science and Technology;College of Computer Science and Engineering, Dalian Minzu University;
关键词：聚集 ; 语义结构 ; 查询 ; 强弱关联 ; 随机游走
英文关键词：Aggregation;;Semantic structure;;Query;;Strong and weak association;;Random walk
中文刊名：JYRJ
英文刊名：Computer Applications and Software
机构：昆明理工大学信息工程与自动化学院;大连民族大学计算机科学与工程学院;
出版日期：2019-05-12
出版单位：计算机应用与软件
年：2019
期：v.36
基金：国家自然科学基金项目(61462050,61562054);; 云南省自然科学基金项目(KKSY201303095)
语种：中文;
页：JYRJ201905043
页数：7
CN：05
ISSN：31-1260/TP
分类号：248-254

摘要

在语义结构查询问题中,传统的查询方法无法快速直观地描述一个具有百万节点的大图,并衡量语义结构的重要性。针对该问题,VoG算法利用子图分割方法并最大化对语义结构进行匹配。提出一种MRQ算法,解决传统算法中查询时间长等问题。利用ApxGreedy算法对输入图进行聚集;通过聚集后超点强弱关联生成随机游走图;使用随机游走算法对语义结构进行查询,根据聚集过程与查询过程中产生的损失进行加权求和,并排序输出语义结构。随机游走查询算法有效地降低了时间复杂度。MRQ算法与VoG算法在真实数据集上的对比实验表明,MRQ算法在时间上比VoG快10倍,误差率降低3.75%。
In the semantic structure query problem, the traditional query methods cannot describe a large graph with millions of nodes quickly and intuitively, and cannot measure the importance of the semantic structure. To solve this problem, the VoG algorithm uses the subgraph segmentation method and maximizes the matching of the semantic structure. This paper proposed MRQ algorithm to solve the problem of long query time. The ApxGreedy algorithm was adopted to aggregate the input graph; random walk graphs were generated by the strong and weak correlation after the aggregation; random walk algorithm was used to query the semantic structure, weighted sum was made according to the loss of aggregation and query process, and the output semantic structure was ranked. The random walk query algorithm effectively reduces the time complexity. The comparison between the MRQ algorithm and the VoG algorithm on the real dataset shows that the MRQ algorithm is 10 times faster than VoG and the error rate is reduced by 3.75%.

引文

[1] Wuechner T ,Cislak A ,Ochoa M ,et al.Leveraging Compression-based Graph Mining for Behavior-based Malware Detection[J].IEEE Transactions on Dependable and Secure Computing,2019,16(1):99-112.
    [2] Liu S Q,Kozan E.New graph-based algorithms to efficiently solve large scale open pit mining optimisation problems[M].Pergamon Press,Inc.2016.
    [3] De Sá C R,Soares C,Knobbe A.Entropy-based discretization methods for ranking data[J].Information Sciences,2016,329:921-936.
    [4] Li H ,Wang Y ,Zhang D ,et al.Pfp:parallel fp-growth for query recommendation[C]// Proceedings of the 2008 ACM conference on Recommender systems.ACM,2008:107-114.
    [5] Yan X,Han J.gSpan:graph-based substructure pattern mining[C]// IEEE International Conference on Data Mining.IEEE Computer Society,2002:721.
    [6] Lakshmi K.A Comparative Study Of Frequent Subgraph Mining Algorithms[J].International Journal of Information Technology Convergence & Se,2012,2(2):23-39.
    [7] Wang W,Wang C,Zhu Y,et al.GraphMiner:a structural pattern-mining system for large disk-based graph databases and its applications[C]// ACM SIGMOD International Conference on Management of Data,Baltimore,Maryland,USA June.DBLP,2005:879-881.
    [8] Zhu F,Yan X,Han J,et al.gPrune:A Constraint Pushing Framework for Graph Pattern Mining[M]// Advances in Knowledge Discovery and Data Mining.Springer Berlin Heidelberg,2007:388-400.
    [9] Liu Y,Tian D,Li B.A Wireless Intrusion Detection Method Based on Dynamic Growing Neural Network[C]// International Multi-Symposiums on Computer and Computational Sciences.IEEE,2006:611-615.
    [10] Koutra D,Kang U,Vreeken J,et al.Summarizing and understanding large graphs[J].Statistical Analysis and Data Mining,2015,8(3):183-202.
    [11] Lefevre K,Terzi E.GraSS:Graph Structure Summarization[C]// Siam International Conference on Data Mining,SDM 2010,April 29-May 1,2010,Columbus,Ohio,Usa.DBLP,2010:454-465.
    [12] Navlakha S,Rastogi R,Shrivastava N.Graph Summarization With Bounded Error[C]// ACM SIGMOD International Conference on Management of Data,SIGMOD 2008,Vancouver,Bc,Canada,June.DBLP,2012:419-432.
    [13] Tian Y,Hankins R A,Patel J M.Efficient aggregation for graph summarization[C]// ACM SIGMOD International Conference on Management of Data,SIGMOD 2008,Vancouver,Bc,Canada,June.DBLP,2008:567-580.
    [14] Liu Y,Safavi T,Dighe A,et al.Graph Summarization Methods and Applications[J].ACM Computing Surveys,2018,51(3):1-34.
    [15] Riondato M,García-Soriano D,Bonchi F.Graph summarization with quality guarantees[C]// IEEE International Conference on Data Mining.IEEE,2015:947-952.
    [16] 潘秋萍.基于条件熵的图聚集算法研究[D].昆明:昆明理工大学,2014.
    [17] Bellaachia A,Al-Dhelaan M.NE-Rank:A Novel Graph-Based Keyphrase Extraction in Twitter[C]// IEEE/WIC/ACM International Conferences on Web Intelligence & Intelligent Agent Technology.IEEE,2012.